GDPR - Test data management and data masking

Address the issues of disk space, data verification, data confidentiality, and protracted test duration

IBM i Application Quality Management

Data is at the heart of any enterprise application, and test data is at the heart of a good test environment. Regulations such as GDPR and HIPAA mandate the masking of personally identifiable information. Test data should not contain any information that can be used to identify an individual.

Our Test Data Management solution addresses issues of disk space, data verification, data confidentiality, and protracted test duration. Control and management of test data ensure that every test starts with a consistent and known data state, which is essential for effective testing.

Checking both the visible test results and the changes to the underlying data is a key principle of Application Quality Management (AQM), and it is a task that is practically impossible to accomplish manually.

Understand the key principles and techniques relating to test data environments on the IBM i, IBM iSeries & IBM AS/400.

What is GDPR?

The General Data Protection Regulation (GDPR) is a European Union (EU) regulation that came into effect in May 2018. It protects the privacy of EU residents and gives them control of their data, how it is used, and the right to see that data. If you are wondering what happens in the UK after the Brexit transition period, these regulations have also been enshrined in UK law in the form of the Data Protection Act 2018.

The regulations apply to any personal data held in the EU irrespective of where the individual lives, and to personal data of any EU or UK citizen held outside of the EU. Access to information that can be used to identify an individual must be limited to those with a need to access that information.

For example, a person responsible for shipping orders needs access to a customer’s name and address, and the items that they have ordered, to fulfill the customer’s request. They do not need, nor should they have, access to that individual’s credit card details, date of birth, etc.

Data relating to an individual must also be accurate.

What do the regulations mean for Test Data?

Imagine that your test system holds details of an individual’s bank account balance and, for testing purposes, you have made the account overdrawn. What happens when that individual requests details of the information you hold about them? Will they be happy to learn that, in one of your systems, their account is showing a negative balance? More importantly, have they permitted you to use that data for testing purposes?

It has generally been accepted that ‘live’ data should not be used for testing, and in some cases this has effectively been mandated (HIPAA in the US, for example). With GDPR, it is wholly inappropriate to use data that can be used to identify an individual for testing purposes.

GDPR refers to pseudonymization, a process that transforms personal data in such a way that it cannot be attributed to a real person. Pseudonymized data can “no longer be attributed to a specific data subject without the use of additional information”, according to GDPR legislation. That means that you need to limit the potential exposure and pseudonymize the data.
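
As a rough illustration of the idea, the sketch below replaces a name with a stable pseudonym derived from a keyed hash. It is only one possible technique and is not taken from any particular product; the key, the `pseudonymize` function, and the sample records are all hypothetical. The key plays the role of the "additional information" and would need to be kept away from the test environment.

```python
import hashlib
import hmac

# Hypothetical secret: the "additional information" that must be stored
# separately from the test data.
SECRET_KEY = b"example-key-kept-outside-test-env"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable pseudonym for value using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # A short prefix of the digest is enough for an illustrative pseudonym.
    return "P" + digest[:12]


# Hypothetical sample records, not real data.
customers = [
    {"name": "Alice Example", "balance": 120.50},
    {"name": "Bob Sample", "balance": -30.00},
]

# Replace the identifying field and leave the non-identifying field untouched.
pseudonymized = [{**c, "name": pseudonymize(c["name"])} for c in customers]
```

Because the same input always yields the same pseudonym, records that refer to the same person still line up after the transformation.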

So, what can you do about Test Data?

It is often claimed by organizations that the Terms and Conditions of doing business with them state that personal information may be used for testing purposes. While this may be true, it does not excuse them from taking all precautions to protect that data.

The obvious thing to do is to scramble, de-identify, obfuscate, and/or mask sensitive information. This requires an analysis of the data to highlight where and how sensitive data is stored. This is likely to already have been done during preparations for GDPR. Next, you need to decide what to do to the data to remove the ability to identify a real person from it. This means choosing the most appropriate pseudonymization method.

Reducing the size of the test data, through sub-setting, will limit the potential for non-compliance; sub-setting will also make testing easier and quicker, saving time and increasing the speed at which you can get business-critical changes into production.

Test Data Management Solutions

Original Software’s TestBench provides code-free, easy-to-define mechanisms for the pseudonymization of data and uses data masking techniques to mask personally identifiable information, while still retaining the formatting and other data properties that are important for testing.

TestBench provides an easy-to-use, code-free solution for the extraction and sub-setting of data to create test data environments. As with pseudonymization, data referential integrity is always maintained.
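
To show what retaining the formatting can mean in practice, here is a minimal generic sketch. It does not describe how TestBench works internally; the `mask_preserving_format` function and the sample value are assumptions made for illustration. Digits are swapped for random digits and letters for random letters of the same case, while separators stay where they are.

```python
import random
import string


def mask_preserving_format(value: str, rng: random.Random) -> str:
    """Replace digits and ASCII letters with random ones, keeping the layout."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isascii() and ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            # Separators, spaces and other characters are left in place.
            out.append(ch)
    return "".join(out)


# A seeded generator keeps the illustration repeatable.
rng = random.Random(0)
original = "4111-1111-1111-1111"  # hypothetical card-style value
masked = mask_preserving_format(original, rng)
```

The masked value has the same length and separator positions as the original, so field-length and format validations in the application under test should still behave as they would with real data.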
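
The following sketch illustrates sub-setting with referential integrity in general terms, using an in-memory SQLite database as a stand-in. The table names, columns, sample rows, and the `extract_subset` function are all hypothetical and are not part of TestBench. The point is that child rows are selected through the same keys as their parents, so no extracted order refers to a customer that was left behind.

```python
import sqlite3

# Hypothetical source database standing in for the live system.
src = sqlite3.connect(":memory:")
src.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    item TEXT
);
INSERT INTO customers VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
INSERT INTO orders VALUES
    (10, 1, 'widget'), (11, 2, 'gadget'), (12, 2, 'bolt'), (13, 4, 'nut');
""")


def extract_subset(conn, customer_ids):
    """Return the chosen customers and only the orders that belong to them."""
    placeholders = ",".join("?" for _ in customer_ids)
    customers = conn.execute(
        f"SELECT id, name FROM customers WHERE id IN ({placeholders}) ORDER BY id",
        customer_ids,
    ).fetchall()
    orders = conn.execute(
        "SELECT id, customer_id, item FROM orders "
        f"WHERE customer_id IN ({placeholders}) ORDER BY id",
        customer_ids,
    ).fetchall()
    return customers, orders


subset_customers, subset_orders = extract_subset(src, [2, 4])
```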

The result is a smaller, more focused test data environment with no information that can be attributed to a real individual.

Our Products

A modular, proven test data management and verification solution.

Data Extraction

Stop copying the entire live database and home in on the data you really need. Select or sample data with full referential integrity preserved.

Data Masking

Simply decide which fields need to be protected and use a variety of obfuscation methods to protect your data.

Data Validation

Track every insert, update and delete including intervening data states. Create rules so that data failures are flagged to you automatically.
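
As a simplified illustration of checking data effects, the sketch below compares a "before" and an "after" snapshot of a table keyed by primary key and applies one rule to the result. It is an assumption-laden stand-in: the `diff_snapshots` function, the rule, and the sample rows are hypothetical, and a snapshot comparison of this kind does not capture intervening data states.

```python
def diff_snapshots(before, after):
    """Classify rows as inserted, updated or deleted by comparing two snapshots."""
    inserted = {k: after[k] for k in after.keys() - before.keys()}
    deleted = {k: before[k] for k in before.keys() - after.keys()}
    updated = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return inserted, updated, deleted


# Hypothetical table contents keyed by primary key.
before = {
    1: {"name": "A", "balance": 100},
    2: {"name": "B", "balance": 50},
    3: {"name": "C", "balance": 10},
}
after = {
    1: {"name": "A", "balance": -20},  # updated
    3: {"name": "C", "balance": 10},   # unchanged
    4: {"name": "D", "balance": 5},    # inserted
}

inserted, updated, deleted = diff_snapshots(before, after)


def negative_balance(row):
    """Hypothetical rule: a balance below zero counts as a data failure."""
    return row["balance"] < 0


# Keys of rows that break the rule after the test has run.
failures = sorted(k for k, row in after.items() if negative_balance(row))
```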

Data Reset

Avoid the painful save/restores and stop attempting to explain bad test results based on poor initial data.
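
The idea of returning to a known data state can be sketched with SQLite's backup facility from the Python standard library. This is only an analogy on a small in-memory database, not a description of how a reset works on IBM i; the `snapshot` and `reset` helpers and the `accounts` table are hypothetical.

```python
import sqlite3


def snapshot(conn):
    """Copy the current database state into a separate in-memory baseline."""
    baseline = sqlite3.connect(":memory:")
    conn.backup(baseline)
    return baseline


def reset(conn, baseline):
    """Discard any open transaction, then overwrite conn with the baseline."""
    conn.rollback()
    baseline.backup(conn)


# Hypothetical test database in a known starting state.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER);
INSERT INTO accounts VALUES (1, 100);
""")
baseline = snapshot(db)

# A test run changes the data...
db.execute("UPDATE accounts SET balance = -20 WHERE id = 1")
db.commit()
balance_after_test = db.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]

# ...and the reset returns it to the known state before the next run.
reset(db, baseline)
balance_after_reset = db.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]
```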

File and Report Compare

Comparing outputs is a well-proven method to verify your test results, but it can be laborious and prone to error. This unique solution can save hours.
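
For a sense of what automated output comparison involves, here is a minimal sketch using the `difflib` module from the Python standard library. The two reports are invented sample data, and this is a generic illustration rather than the product's own comparison method.

```python
import difflib

# Hypothetical report lines from a baseline run and from the run under test.
expected = [
    "Order 10  widget  5.00",
    "Order 11  gadget  7.50",
    "Total  12.50",
]
actual = [
    "Order 10  widget  5.00",
    "Order 11  gadget  7.75",
    "Total  12.75",
]

# Produce a unified diff; lineterm="" because the lines carry no newlines.
diff = list(
    difflib.unified_diff(
        expected, actual, fromfile="expected", tofile="actual", lineterm=""
    )
)

# Keep only the added and removed lines, skipping the two file header lines.
changed = [
    line
    for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

for line in changed:
    print(line)
```

Only the lines that differ are reported, so a reviewer does not need to read the unchanged parts of the output.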

Unit Testing

Get under the covers and analyse at a program/module level what happens in the database, APIs, parameters, messages and beyond.


