Software Application Testing in Insurance, Part II: Getting Test Data
17 December 2008
Previous posts about testing: Topic 1: Automated Testing; Topic 2: Getting Test Data

Bad test data can mean that even the best tests fail to predict real-world problems. While many testing topics apply to all industries, insurance carriers face some unique issues when it comes to getting good test data. Under HIPAA and other industry regulations, using real data for testing is a gray area, because the test team does not necessarily need real data to do its job. It’s a difficult task to take real data and “clean” it for testing. It’s also a difficult task to generate good test data from scratch, though this is really the best solution.

An insurer should take the time to have a developer create a small application or utility that generates test data specifically for the application being tested. This utility should generate random data but follow a set of rules that keep the data within the bounds of reality. It should intentionally create “edge cases” that might stress the system and reveal errors. It should be easy to adjust, so that it can create small data sets for simple tests and very large data sets for performance and scalability tests.

While it may take a few days to implement this utility, it will save a lot of time later. Instead of struggling for half a day every time tests need to be run (a common complaint), the team completes the work of managing test data up front. Unfortunately, since every software application has different data needs, this kind of utility will likely have to be written separately (or at least significantly rewritten) for each new application that needs to be tested.

Ninety percent or more of tests should be run against a very small set of data. Running tests against a huge database is unnecessary, slows down the tests themselves, and complicates things. Most tests are meant to verify very specific issues, and there is no reason for the database to contain any more than the bare minimum of data. Only a very small number of tests, such as performance and scalability tests, need to be run against a large database. Many insurers simply copy over their entire real-world database and then run tests against it. This not only creates security issues but also makes the job harder for the development and quality assurance teams.
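To make the idea of a data-generation utility more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Policy` fields, the business rules (holders aged 18 to 100, one-year policy terms, positive premiums), the edge-case values, and the function names are all hypothetical. A real utility would need to encode the rules of the specific application being tested.

```python
import random
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Policy:
    # Hypothetical record layout; a real application defines its own fields.
    policy_id: str
    holder_name: str
    birth_date: date
    effective_date: date
    expiration_date: date
    annual_premium: float


# Hand-picked values that tend to expose bugs (illustrative assumptions).
EDGE_NAMES = [
    "O'Brien-Smith",         # apostrophe and hyphen
    "Jos\u00e9 Nu\u00f1ez",  # non-ASCII characters
    "X",                     # very short name
    "A" * 100,               # very long name
]
EDGE_EFFECTIVE_DATES = [date(2008, 2, 29), date(2007, 12, 31), date(2008, 1, 1)]
FIRST_NAMES = ["Alice", "Bob", "Carol", "David"]
LAST_NAMES = ["Jones", "Nguyen", "Garcia", "Miller"]


def make_policy(rng, index, edge_case=False):
    """Build one synthetic policy that obeys a few simple rules."""
    if edge_case:
        name = rng.choice(EDGE_NAMES)
        effective = rng.choice(EDGE_EFFECTIVE_DATES)
        age_years = rng.choice([18, 100])  # boundaries of the allowed range
        premium = rng.choice([0.01, 999_999.99])
    else:
        name = f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"
        effective = date(2008, 1, 1) + timedelta(days=rng.randrange(366))
        age_years = rng.randint(18, 100)
        premium = round(rng.uniform(200.0, 5000.0), 2)

    # Rule: the holder is between 18 and 100 on the effective date.
    # A January 1 birthday keeps the date arithmetic trivial.
    birth = date(effective.year - age_years, 1, 1)
    # Rule: every policy runs for a 365-day term after it takes effect.
    expiration = effective + timedelta(days=365)

    return Policy(
        policy_id=f"POL-{index:08d}",
        holder_name=name,
        birth_date=birth,
        effective_date=effective,
        expiration_date=expiration,
        annual_premium=premium,
    )


def generate_policies(count, edge_case_ratio=0.1, seed=0):
    """Return `count` policies; a fixed seed makes the data set repeatable."""
    rng = random.Random(seed)
    return [
        make_policy(rng, i, edge_case=rng.random() < edge_case_ratio)
        for i in range(count)
    ]
```

With a utility along these lines, the same code could serve both kinds of testing: a call such as `generate_policies(25)` would produce a tiny data set for everyday functional tests, while `generate_policies(1_000_000)` would produce a large one for the occasional performance run. Seeding the random generator is a design choice that keeps a failing test reproducible, since the same seed always yields the same records.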