architecture – Enforce collocation of data in the region where it is used


How can I make sure that data specific to a particular region stays in the closest data center, to ensure low latency?

Let's consider Amazon e-commerce as an example. It sells products all over the world, and not every product or seller is available in every region. So there is no point in showing, let's say, ABC speakers, which are not sold in Australia, to customers in Australia.

So if a user in Australia wants to list all the speakers, a simple query with a filter such as country='AUSTRALIA' will work (in the simplest case).
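To make the simplest case concrete, here is a minimal sketch of that filter. It uses an in-memory SQLite table as a stand-in for the real catalog; the table name, column names, and sample rows are assumptions made up for illustration, not anything Amazon actually uses.

```python
import sqlite3

# Hypothetical in-memory catalog; table, columns and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, category TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("ABC speakers", "speaker", "GERMANY"),
        ("XYZ speakers", "speaker", "AUSTRALIA"),
        ("XYZ headphones", "headphones", "AUSTRALIA"),
    ],
)


def list_products(category, country):
    """Return the names of products in a category that are sold in a country."""
    rows = conn.execute(
        "SELECT name FROM products WHERE category = ? AND country = ? ORDER BY name",
        (category, country),
    )
    return [name for (name,) in rows]


# Only products sold in Australia come back for an Australian user.
australian_speakers = list_products("speaker", "AUSTRALIA")
```

The filter decides which rows are returned, but it says nothing about which machine or data center those rows are stored on.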

Question 1:

Next comes the latency part, which is where my question lies. How do we ensure that the products sold in Australia are the only ones present in the Australian data center’s database? If we fire the above query, the partition (or even a replica of the partition) that carries the data for product = Speaker and country = Australia might be located in Japan.
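The concern can be illustrated with a small sketch that contrasts two ways of deciding where a country's partition lives. The data center names and both functions are hypothetical, and the hash scheme is only a generic stand-in for key-based partitioning, not the behaviour of any particular database.

```python
import hashlib

# Hypothetical data center names, for illustration only.
DATA_CENTERS = ["sydney", "tokyo", "frankfurt", "virginia"]


def hash_placement(country):
    """Pick a data center from a hash of the partition key.

    The result depends only on the hash value, not on where the country is,
    so the Australian partition can land in any of the data centers.
    """
    digest = hashlib.sha256(country.encode("utf-8")).digest()
    return DATA_CENTERS[int.from_bytes(digest[:8], "big") % len(DATA_CENTERS)]


# Explicit placement: each country's partition is pinned to a chosen data center.
PINNED = {"AUSTRALIA": "sydney", "JAPAN": "tokyo", "GERMANY": "frankfurt"}


def pinned_placement(country, default="virginia"):
    """Return the data center the country's partition was explicitly assigned to."""
    return PINNED.get(country, default)
```

With `hash_placement` the key groups the Australian rows together but gives no control over their location; with `pinned_placement` the location is a deliberate choice. What I am asking is how real systems get the second behaviour.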

As per my understanding, Amazon or a similar e-commerce site will probably have an Elasticsearch cluster that is geographically distributed, and partitioning on key = country will not by itself answer the question, because it does not say where each partition is physically placed.

Question 2:

Is it a good idea to maintain a separate database for each country to solve the above issue?
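To show what I mean by a separate database per country, here is a minimal sketch. Each country gets its own in-memory SQLite database as a stand-in for a database hosted in that country's data center; the class name, methods, and sample data are all hypothetical.

```python
import sqlite3


class CountryDatabases:
    """One independent database per country (in-memory SQLite as a stand-in)."""

    def __init__(self, countries):
        self._dbs = {}
        for country in countries:
            db = sqlite3.connect(":memory:")
            db.execute("CREATE TABLE products (name TEXT, category TEXT)")
            self._dbs[country] = db

    def add_product(self, country, name, category):
        """Store a product only in the database of the country where it is sold."""
        self._dbs[country].execute(
            "INSERT INTO products VALUES (?, ?)", (name, category)
        )

    def list_products(self, country, category):
        """Query only the database that belongs to the user's country."""
        rows = self._dbs[country].execute(
            "SELECT name FROM products WHERE category = ? ORDER BY name",
            (category,),
        )
        return [name for (name,) in rows]


stores = CountryDatabases(["AUSTRALIA", "JAPAN"])
stores.add_product("AUSTRALIA", "XYZ speakers", "speaker")
stores.add_product("JAPAN", "ABC speakers", "speaker")
```

In this layout an Australian query never touches the Japanese database, but a product sold in both countries has to be written to both, which is part of why I am unsure whether this is a good idea.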

This question extends to Uber as well. Uber keeps track of all the rides available in every region where it operates in its Redis cluster. When a user searches for a ride in region-1 (say, Australia), it would not be a good idea to send the request to the USA just because the partition handling Australia happens to be located there.

Can you please give me some idea of how to make sure data is collocated with the region in which it is used?