Is it wise to use kafka as the "truth source" for mission critical data?
The configuration is:
- Kafka is the underlying truth source for the data.
-the consultation is done in caches (ie, Redis, ktables) hydrated kafka
- Kafka configured for durability (infinite subject retention, 3+ replication factor, etc.)
- the architecture follows the CQRS pattern (write in kafka, read from the caches)
- The architecture allows an eventual consistency between readings and writings.
We are not allowed to lose data under any circumstances.
In theory, replication must guarantee durability and resistance. The confluents stimulate the previous pattern.
The only flaws I can think of are:
- the cache explodes and must be rehydrated from scratch -> query
- the agent's disk is deleted / corrupted -> the kafka rebalancing, resulting in a prolonged downtime if the themes contain mountains of data
Has anyone run and tested in battle this type of configuration in production? That is to say. Did you find damage to the disk, the runners went down, but still retain data?
My intuition is that this is a bad configuration, because kafka was not originally designed for the durability levels of RMDBS, but it can not point to a specific reason why this would be the case.