Methods of data masking in a data lakehouse
To Nha Notes | July 22, 2022, 9:44 a.m.

- Encryption: We covered encryption in detail in the last section. You can also use any of those encryption methods for data masking.
- Scrambling: Scrambling is a basic masking technique that jumbles the characters and numbers into a random order, thus hiding the original content. For example, an ID number of 1234 in a production database could be replaced by 4321 in a test database.
- Nulling Out: The nulling out data masking technique replaces the sensitive data with a null value so that unauthorized users don't see the actual data. The data appears to be null or missing.
- Value Variance: The original values are replaced by the output of a function applied to them. For example, if a customer purchases several products, the masking method can replace each purchase price with the range between the lowest and highest price paid.
- Substitution: In the substitution data masking technique, the sensitive data is substituted with another value. Substitution is one of the most effective data masking methods because it preserves the original look and feel of the data.
- Shuffling: The data shuffling technique moves values between rows within the same column. It is similar to substitution, except that the replacement values come from the same dataset: the values in each column are rearranged using a random sequence.
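
The non-encryption methods in the list above can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: the function names, the sample rows, and the list of fake names are all assumptions made up for the example, and a real lakehouse would normally apply masking through its engine's own policies or views.

```python
import random


def scramble(value: str, rng: random.Random) -> str:
    """Scrambling: jumble the characters of a value into a random order."""
    chars = list(value)
    rng.shuffle(chars)
    return "".join(chars)


def null_out(value):
    """Nulling out: replace the sensitive value with a null."""
    return None


def value_variance(prices):
    """Value variance: replace each price with the (lowest, highest) range."""
    low, high = min(prices), max(prices)
    return [(low, high) for _ in prices]


def substitute(value: str, fake_values, rng: random.Random) -> str:
    """Substitution: swap the value for a realistic-looking stand-in."""
    return rng.choice(fake_values)


def shuffle_column(rows, column, rng: random.Random):
    """Shuffling: rearrange one column's values across the rows."""
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: v} for row, v in zip(rows, values)]


# Hypothetical sample data, used only for illustration.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
fake_names = ["Alex Doe", "Sam Roe", "Kim Poe"]
rows = [
    {"id": "1234", "name": "Alice Smith", "ssn": "111-22-3333", "price": 5},
    {"id": "5678", "name": "Bob Jones", "ssn": "444-55-6666", "price": 20},
    {"id": "9012", "name": "Carol White", "ssn": "777-88-9999", "price": 10},
]

ranges = value_variance([row["price"] for row in rows])
masked = [
    {
        "id": scramble(row["id"], rng),
        "name": substitute(row["name"], fake_names, rng),
        "ssn": null_out(row["ssn"]),
        "price": price_range,
    }
    for row, price_range in zip(rows, ranges)
]
shuffled = shuffle_column(rows, "name", rng)
```

Note that scrambling and shuffling are random, so a masked value can occasionally equal the original; the sketch does not guard against that.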
Source: the book Data Lakehouse in Action