Change My Mind: Column-based Data Encryption in RDBMS is the right choice.

Julyanto Sutandang
6 min readNov 5, 2024

--

Why is it the right one? Why not? Let us talk about it.

RDBMS and Encryption

We consider the reader to have enough knowledge about RDBMS and Encryption unless the content of this writing is not completely understood. But here, we show you what RDBMS is: RDBMS is a shorthand for Relational Database Management System, which has a unique purpose in the business world. Businesses need RDBMS to support and manage their transactions; no other database software can replace it. PostgreSQL and Oracle are two famous database servers used by corporations.

Encryption is a CPU time-consuming operation

Encryption is an effort made by mathematicians to ‘encrypt’ data by applying mathematical methods and approaches so that it loses its meaning and cannot be recovered without any reverse method (decryption) and the key. Therefore, encryption is a very heavy operation, occupying the CPU almost 100% when employed. Some encryption standards exist for information, such as AES, RSA, and others.

What for RDBMS encrypts data?

Some confidential data is gathered by applications since it is required by the business process and will be analyzed by the algorithm. Those data are very sensitive if revealed, and somehow, data processing, which usually involves so many parties, has a great risk of revealing data if it is not well protected with proper technology.

Protecting sensitive data is very simple: encrypt and save it in storage. However, the consequence is more manageable if we design and manage well: (a) Where will the encryption key be saved? (b) who and when the data is encrypted? © who and when decrypt data? (d) How do you put in and get the key? (e) Who has access to the key? (f) how do we control the access? (g) how to avoid the breach? (h) How can we escalate the breach as soon as possible when it happens? So on and so forth… (there are many other questions)

Moreover, Why is Column based? What are the other options?

RDBMS manages information into small chunks of data and structures it into tables, rows, and columns, a two-dimensional representation. It not only contains a two-dimensional structure but can also be a one-dimensional representation, up to n dimensions of data. However, table model representation is the simplest form and most accessible to maintain normalization. N-dimensional can be achieved by applying partitions, schema, database logic, etc.

By understanding RDBMS structures, we also perceived some concepts on how some encryption methods can be applied: table-based, Column-based, row-based, database-based, schema-based, and others. Then again, why did we choose Column? The answer is straightforward: A Column represents an attribute, a property, a feature, a dedicated meaning. So, there is less likely a need in business for having all Columns in a table encrypted. Another idea is that not all tables are treated the same; there is a transaction type table, master or reference table, and account type table. Only tables containing confidential data should be encrypted; others are unnecessary. We do not need to encrypt all columns on those tables; only some are substantial. And some of them need to be masked.

Based on those behaviors, we can conclude that not every information in the table needs encryption; only some require confidentiality. Therefore, column-based encryption is the best choice for RDBMS regarding data confidentiality. However, column-based encryption has a drawback. Since the Column is encrypted, its data cannot be searched efficiently. It usually employs sequential scanning, which results in a slow process and takes much CPU time and latency.

Deterministic Encryption

Column encryption is the fastest one to encrypt, but without specific index mechanism, run SELECT will become very slow than before, since require decryption for every row scans. Indeed that encrypted data can be indexed, only if the encrypted data is deterministic, we can index whole data, for comparison later. When a data is encrypted without using Initialization Vector (IV) or use the same one, the exact same data with resulted in the exact same encrypted data. So the data and its encrypted one has one to one mapping, that is deterministic. Having this mechanism, raising idea to create an index for the deterministic encrypted data, so there is a chance to have exact matching comparison. But, there are 2 drawbacks:

  1. Deterministic encrypted data (encryption without using Initialization Vector) lowers entropy and therefore higher chance for brute force attack.
  2. Indexing a whole encrypted data, only useful for exact match comparison, and this is not really useful in business. String comparison require: (a) partial matching (SUB STRING), and (b) similar matching (LIKE). Numerical comparison require inequality matching (>,<,≥, ≤).

The right index for Column Encryption

Strong encryption require dynamic Initialization Vector to enhanced security and avoid encryption method becoming deterministic function. Therefore exact matching index is out of the scope, and we should have a better approach on how to develop an index mechanism to support column based encryption. Our Invention of Indexing has very special algorithm, it can be used for LIKE search and therefore it answers the need of business. The beauty of this index is:

  1. It doesn’t reveal any confidential information inside the index file.
  2. It didn’t require any decryption computing when employed.
  3. In operational, it has very small overhead, roughly only 2%.

This killer feature has been awaited by most business users, only time will tell when this product will be used all around the world.

11DB/Postgres has a new name

11DB has been evolved lately, now it has very strong and secure encryption feature as explained above. This column encryption with our invention index mechanism will the best answer for Data At-rest protection against any Data At-rest attack such as: Data Center breach, physical security breach, internal access fraud, data maintenance fraud, and many others. This Encryption Feature and its Indexing called as: ESE, as shorthand of Equnix Seamless Encryption. 11DB/Postgres now called as 11DB/Postgres Enterprise, and 11DB/Postgres EnterpriseX which contains ESE, including: ESEC, and ESEN, etc. So, 11DB/Postgres becoming two version in which EnterpriseX has full fledged feature: ESE, Caraka, etc.

Equnix Seamless Encryption

Actually, this feature is not only securing Data At-rest, but also securing Data In-use in higher degree, since the only opportunity to reveal the plain data is when the column is being SELECT. This feature called as seamless because Developer or DBA doesn’t need to bother about encryption key, and to encrypt/decrypt is seamless, they are not require to use any function. ESE uses separated Key Management, and therefore there is no possibility to expose data leaks caused by missed management of the encryption key. Key management employed by ESE is relying on HSM (Hardware Security Module), Online HSM, or some trivial key management such TPM and Local Key for demo or testing.

ESE is also strengthened by EPL (Equnix Privilege Level) which consist of set of specific role being used by ESE to enable hierarchy of privileged access. Data which written by role with EPL2 will only be read by role with EPL2 or higher (EPL3, EPL4, …). Every user (which defined in the connection string) which connect to the Database, should have EPLx role so as to enable ESE.

ESE implementation can use EPL hierarchy of access, or just simple make all users has the same EPL role for enabling ESE without any privileges access, very simple. The lowest role of EPL is EPL1 and the highest role of EPL is EPL5.

Session based Connection String

Accessing Database from Application is very easy, Developer only need to have a connection string or some certificate, and then it connects seamlessly and somehow we can connect to database server multiple times without any hassle. Multiple connection being established by application is prepared for creation of connection pooling. Usually application uses username and password in the connection string, and therefore it would be very easy for attacker when they already breach the application server (which usually the weakest point of penetration is Application’s OS Host) to get into database, as simple as mimicking on how application connect to the database.

Single Sign On (SSO) and any kind of certification is useless when the attacker has gain administration access to the Application Host. with a small python script or any scripting provided we can iterate the process and gain access to the database as if the application is accessing the database. To get more secure in accessing Database, Equnix also creates a session mechanism in connecting string, therefore there is no possibility exist for any attacker to breach into Database Server even when they already beach the application host.

ESE, EPL and Caraka is the right answer for your database security needs

That is right, actually having only encryption is not enough, we should be more and more complement support and mitigation in order to complete the solution and prevention any breaches. Data Encryption, with the Indexes, Access is secured using granular of privileged access for each rights and users.

5 Nov 2024

Julyanto Sutandang

--

--

Julyanto Sutandang
Julyanto Sutandang

Written by Julyanto Sutandang

Technopreneur, System Developer, PostgreSQL Expert, run IT Solution for High Performance System, call me: +628111188812 (julyanto@equnix.asia)

Responses (2)