We are designing a distributed system capable of processing large amounts of BLOB data. Each node should load its own portion of the data at high speed and is responsible for processing only that portion.
Due to network limitations, each node should use as little network bandwidth as possible. In practice, it would be best for each processing node to also be a member of the database cluster and to load data from its own local database instance. Since the database would then share each machine with the processing workload, it should run with a minimal memory footprint.
The database must manage shards and replicas itself, and adding or removing nodes should be possible at minimal cost.
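To make the "minimal cost" requirement concrete, here is a minimal sketch of one common placement scheme, consistent hashing. This is only an illustration: it is an assumption that a candidate database would place BLOBs this way, and the names `HashRing`, `node-a`, and `blob-…` are hypothetical. The point is that when a node joins, only the keys that the new node takes over need to move, and every other key stays on its current node.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a 64-bit integer position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring; not the API of any real database."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes   # virtual points per node, smooths the distribution
        self._points = []      # sorted list of (ring position, node name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if p[1] != node]

    def owner(self, key: str) -> str:
        """Return the node that owns `key`: the first point clockwise from its hash."""
        if not self._points:
            raise ValueError("ring is empty")
        idx = bisect.bisect_left(self._points, (_hash(key),))
        return self._points[idx % len(self._points)][1]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    keys = [f"blob-{i}" for i in range(1000)]
    before = {k: ring.owner(k) for k in keys}

    ring.add_node("node-d")
    moved = [k for k in keys if ring.owner(k) != before[k]]
    print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

In a design like this, each processing node would ask `owner(key)` and process only the keys it owns locally, which matches the goal of keeping BLOB reads off the network.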
Which database system would be suitable for this design?