Why In-Memory Database is not for RDBMS?

Julyanto Sutandang
5 min readOct 9, 2022

RDBMS is a shorthand of Relational Database Management System, which by definition it should follow ACID Principle: Atomicity, Consistency, Isolation and Durability.

In the other hand, In-Memory Database is relate to a data management system, which can be any kind of database, not only referring to a RDBMS. In-memory here refers to a data processing which will going around only (mostly?) in memory, and therefore it should be very fast.

There are so many product which claims to be In-Memory Database with their own intention and actually has directing different perspective regarding their aim and its product characteristic.

The real intention of having a jargon In-Memory thing is an understanding that if every processing is only happen in memory, not necessarily going into I/O (Disk especially, or Network is another mitigation) then the system would be super-fast like a spreading of sophistry nowadays. Since everybody understand that even PCI-Express speed is way below the memory speed. So, anything below memory is way to slow to cope with.

Addressing In-Memory and Persistence is an appetizer of why RDBMS is not appropriate to be called as In-Memory Thing. When something gets called as In-Memory, then it shouldn’t or need to be persistent, because when it becomes persistent, it loses its title as the fastest one.

Why? simple: one should only choose 1 of the 2, cannot be both. At least at this time the article is writing, there is no technology ready to enable that at the same time: keeping data manipulation in memory, while writes to disk for every update, and that operation is done independently. Unfortunately, at this time, those cannot be done in very short time, there is a lag/latency and that violates the persistence.

Time is the most precious factor, well. Indeed it is precious all the time.

Photo by Icons8 Team on Unsplash

The main reason this article is made is there are always more people being intrigue and tempted by the irresponsible sales which tells those can be happen at the same time, or at least The User didn’t really understand the risk at the time of buying.

So, What is the actual reason on why both of them cannot be happen at the same time? The Rule is simple: When it require persist, then it should write. Write to a persistent storage, the fastest one right now is still way slower about 4 up to 10 times than memory. It will get slower along with the data grows for Insert or Update. More update, drags more for them to catch up: getting slower, since to persist require writing to disk for every update.

So, how about not really persistent? or late persistent? Then it is not persistent. When data is not updated to the last changes, their discrepancy of actuality is a big stake of business. Most business cannot afford discrepancy of data other than disaster. Therefore, well said: Data Discrepancy is a Disaster.

Then why in-memory thing is not able to be happen in RDBMS? Because at least there are 3 reasons:

1. RDBMS as a compliant system to ACID principle needs to be Durable, therefore should be PERSISTENT

As already explained above, when RDBMS guard their operation to persist the Data, their would be doing the best action in the world: running all the way down to disk for every commit. So, when there is a commit, there is a flush (in a very short sequence after commit). By logging the transaction and writing to a super fast NVMe disk, it save a lot time of process, and still achieve persistence.

Our test shows, in a single machine we still able to reach more than 150,000 SQL Transactions per Second with only a single x86 Machine. This is real fast some people might think this uses In-Memory Thing, which is not. This is the real RDBMS operation with a great care of tuning and maintenance.

PostgreSQL’s performance tested in every fastest CPU in their time

2. If by only if in memory transient data manipulation can be considered as In-Memory Thing, then All RDBMS is In-Memory Database by default without any tuning or hacking

Perhaps it is a joke, let say it is a joke. But it is true.

At least for the most widely used RDBMS Architecture: PostgreSQL, Oracle, etc. They use Share Buffer Pool, to store bunches pages of Data as it is memory cache. Every operation actually going there in the memory, until it need to commit. When commit required, then WAL (Write Ahead Log: PostgreSQL) will take place to ensure persistence by writing and flushing (avoid kernel write caching). Another round if there is another process require those pages of data, they should only take it from the memory pool, not necessary going to the Disk.

By those explanation, and by some shallow definition, perhaps it can be considered. Anyway, for the sake of consistence of meaning, it should NOT be considered.

3. The only possible RDBMS can have both of in-memory and persistence, is running on Fault-Tolerant System.

Unfortunately, Fault Tolerant System which claims to have 99.99% of availability has never proven well all the time. There are some risks which we never able to get guarantee that its overcome: Cosmic Ray, Earth Quake, Electric Spike, Short circuit inside a system, etc.

All in all, Database is not only RDBMS, therefore claiming In-Memory for Data management other than RDBMS is a correct term. It can be apply for Redis, for example. Redis is also suffer when we enable their persistence, still complying the above principle: we cannot get both, choose one of the two.

But, we should also remember than any Data Management which being used in business require ACID principle, their Atomicity, and Isolation, and Durability and last but not least its Consistency is everything we need to able to cope with the business needs in ever growing bigger market every time. Concurrency control which complying ACID is all their need.

It has been said, it is up to us to choose. But at least no more hype and misconduct from sales person. Feel free to discuss further with me, please chat with me: julyanto[at]equnix.asia

May all being always be happy

--

--

Julyanto Sutandang

Technopreneur, System Developer, System Optimization Expert, Open Source Enthusiasts, PostgreSQL Expert, run IT Solution Business in South East Asian