Building a Practical Encryption System
In this section, I’ll wrap up the chapter by describing a practical, real-world system that illustrates the encryption and hashing concepts we’ve been discussing throughout this chapter.
Sometimes your encrypted data will need to be matched with incoming data. For instance, many Customer Relationship Management (CRM) applications use different attributes of customers, such as credit card numbers, passport numbers, etc., to identify unique customers. Medical applications may need to go through the patient’s diagnosis history to project a pattern and suggest treatment options. Insurance applications may need to search patient diagnoses to assess the validity of the claims, and so on. Because these data items are stored in an encrypted manner, the matching applications cannot simply match against the stored data.
There are two options for handling such situations:
- Encrypt the data to be matched, and match it against the stored encrypted values
This option is possible only if the encryption key is known. If your approach is to have one key per database (or table or schema), then you know the exact key that must have been used to encrypt the values. On the other hand, if your approach is to use one key per row, then you will have an idea of what key must have been used to encrypt the value in that particular row. Hence, you can’t use this approach.
The other issue with using this option is indexing. If you have an index on this encrypted column, then the index will be useful when an equality predicate is specified (e.g., "
ssn = encrypt ('123-45-6789')
“). The query will locate on the index the encrypted value of the string ’123-45-6789’ and then get the other values of the row. Because this is an equality predicate, an exact value on the index is searched and located. However, if you specified a likeness predicate (e.g., "ssn like '123-%'
“), the index will be useless. Because the b-tree structure of the index provides for the values starting with similar first parts close together, this likeness operation would have been helped by the index, had it been in clear-text. For instance, index entries for ’123-45-6789’ and ’123-67-8945’ would have been close together. But when they are encrypted, the values could be something like this:076A5703A745D03934B56F7500C1DCB4 178F45A983D5D03934B56F7500C1DCB4
Because the first characters are very different, they will be on different parts of the index. Using an index match first to determine the location in the table will be slower than doing a full table scan.
- Decrypt the encrypted data in each row and match it against the cleartext value to be matched
If you use one single key per row, this is your only option. But each decryption consumes several precious CPU cycles and may affect overall database performance.
So, how can you design a system that more efficiently matches against encrypted columns? The trick is to match against a hash value, rather than the encrypted value. Creating a hash value is significantly faster than encryption, and it consumes fewer CPU cycles. Because the hashing of an input value will always produce the same value, we could store the hash value of the sensitive data, create a hash value of the match data, and match it against the stored hash value.
Here is a proposed system design. Assume that you have a table named CUSTOMERS where the credit card numbers are stored, which needs to be encrypted. Instead of storing the credit card number in the CUSTOMERS table, you would create two additional tables. Figure 4-10 shows the tables and their relationships.
- CUSTOMERS table
CUST_ID (primary key)
CC (the hash value of the credit card, not the actual credit card number itself)
- CC_MASTER table
CC_HASH (primary key)
ENC_CC# (the encrypted value of the credit card number)
- CC_KEYS table
CC_HASH (primary key)
ENC_KEY (the encryption key used to encrypt this credit card number)
The cleartext entry of the credit card is not stored anywhere. You could write a BEFORE-row INSERT or UPDATE trigger that follows the pseudo-code shown below.
1 Calculate the hash value 2 Set the value of the column CC to the hash value calculated earlier 3 Search for this hash value in CC_MASTER table 4 IF found THEN 5 Do nothing 6 ELSE 7 Generate a key 8 Use this key to generate the encrypted value of the cleartext credit card number 9 Insert a record into the CC_KEYS table for this hash value and the key 10 Insert a record in the CC_MASTER table with the encrypted value and the key. 11 END IF
This logic ensures that the cleartext credit card is not stored in the database. Applications will continue to insert the cleartext value, but the trigger will change it to a hash value. Here is the actual code for the trigger:
1 CREATE OR REPLACE TRIGGER tr_aiu_customers 2 BEFORE INSERT OR UPDATE 3 ON customers 4 FOR EACH ROW 5 DECLARE 6 l_hash VARCHAR2 (64); 7 l_enc RAW (2000); 8 l_key RAW (2000); 9 BEGIN 10 l_hash := get_hash_val (:NEW.cc); 11 12 BEGIN 13 SELECT cc_enc 14 INTO l_enc 15 FROM cc_master 16 WHERE cc_hash = l_hash; 17 EXCEPTION 18 WHEN NO_DATA_FOUND 19 THEN 20 BEGIN 21 l_key := get_key; 22 l_enc := get_enc_val (:NEW.cc, l_key); 23 24 INSERT INTO cc_master 25 (cc_hash, cc_enc 26 ) 27 VALUES (l_hash, l_enc 28 ); 29 30 INSERT INTO cc_keys 31 (cc_hash, cc_key 32 ) 33 VALUES (l_hash, l_key 34 ); 35 END; 36 WHEN OTHERS 37 THEN 38 RAISE; 39 END; 40 41 :NEW.cc := l_hash; 42 END;
The following table summarizes the logic of this code.
Lines |
Description |
10 |
First I get the hash value of the cleartext credit card number entered by the user. |
13–16 |
I see whether this value (i.e., this credit card number) exists on the CC_MASTER table. |
21 |
If the hash value is not found, then this is a new credit card. To encrypt it, I need to generate a key first. |
22 |
I use this key to encrypt the cleartext credit card. |
24–28 |
I insert this encrypted credit card number into the CC_MASTER table. |
30–34 |
I store the key in the CC_KEYS table. |
41 |
I replace the cleartext credit card number with its hash value and store that. |
Because this trigger changes the cleartext to a hash value, the application need not be changed. Programs that match credit card numbers will need to find a match against hash values, not against cleartext or encrypted values. Using this trigger and the functions described in this chapter, you can build an effective and efficient encryption infrastructure.
Get Oracle PL/SQL for DBAs now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.