I am designing a system that handles a large number of objects. Several workers each produce some result for every object. Finally, a single special worker (a "sink", in graph-theory terms) takes all the results for an object and processes them into a final object that is written to a database.
It is possible that a worker depends on the result of other workers.
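To make the setup concrete, here is a minimal single-process sketch of the shape I have in mind, using `CompletableFuture` from the JDK. The worker names (`workerA`, `workerB`, `workerC`), the string results, and the `sink` method are placeholders I made up for illustration; in the real system the workers would be separate components and the sink would write to the database.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PipelineSketch {

    // Hypothetical workers: each derives one result from the input object.
    static String workerA(String obj) { return "A(" + obj + ")"; }

    // W_B depends on the result of W_A.
    static String workerB(String obj, String resultA) { return "B(" + obj + "," + resultA + ")"; }

    static String workerC(String obj) { return "C(" + obj + ")"; }

    // The sink combines all results into the final object
    // (here just a string; the real sink would write it to the DB).
    static String sink(String a, String b, String c) { return a + "|" + b + "|" + c; }

    static String process(String obj, ExecutorService pool) {
        CompletableFuture<String> a = CompletableFuture.supplyAsync(() -> workerA(obj), pool);
        // B starts only once A has produced its result.
        CompletableFuture<String> b = a.thenApplyAsync(ra -> workerB(obj, ra), pool);
        // C is independent and runs in parallel with A and B.
        CompletableFuture<String> c = CompletableFuture.supplyAsync(() -> workerC(obj), pool);

        // The sink runs only after every worker has finished.
        return CompletableFuture.allOf(a, b, c)
                .thenApply(v -> sink(a.join(), b.join(), c.join()))
                .join();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(process("obj-1", pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that in this sketch the sink blocks until all three workers are done, which is exactly where the problems below come from.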
Now, I am facing several problems:
- One worker may be much slower than another. How do you deal with that? By adding more workers of the slower type (i.e., scaling it out), perhaps dynamically?
- Suppose that W_B depends on W_A. If W_B is inactive for some reason, the flow stops and the whole system stops working. I would like the system to be able to bypass such a worker in some way.
- How does the final worker decide when to operate on the result set? Suppose it has the results of A and B but not the result of C, because C is down or very slow at the moment. How should it decide whether to proceed?
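To illustrate the last problem, the only naive approach I can think of is for the sink to wait up to some deadline and then proceed with whatever results have arrived. The sketch below shows that idea; the `collect` method, the timeout value, and the worker names are assumptions for illustration only, and I am not sure this is the right policy.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SinkTimeoutSketch {

    // Collects the results that are ready before a shared deadline.
    // Results from workers that are too slow or that failed are simply absent.
    static Map<String, String> collect(Map<String, CompletableFuture<String>> pending,
                                       long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        Map<String, String> ready = new LinkedHashMap<>();
        for (Map.Entry<String, CompletableFuture<String>> e : pending.entrySet()) {
            // Time left until the shared deadline (never negative).
            long remaining = Math.max(0, deadline - System.nanoTime());
            try {
                ready.put(e.getKey(), e.getValue().get(remaining, TimeUnit.NANOSECONDS));
            } catch (TimeoutException | ExecutionException ex) {
                // Worker was too slow or threw: skip its result.
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ready;
    }

    public static void main(String[] args) {
        Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
        pending.put("A", CompletableFuture.completedFuture("resultA"));
        pending.put("B", CompletableFuture.completedFuture("resultB"));
        pending.put("C", new CompletableFuture<>()); // simulates a worker that never answers

        // Only A and B make it into the result set; C is missing.
        System.out.println(collect(pending, 200).keySet());
    }
}
```

This still leaves open what the sink should do with a partial result set: write it anyway, retry later, or mark the record as incomplete.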
It is worth mentioning that this is not a real-time application but an offline processing system (that is, you can go back to the database and modify a record). At the same time, it has to handle a relatively large number of objects at a high rate.
In terms of technologies, I am developing the system in Java, but I am not limited to a specific technology.
I would be glad if you could help me with:
- General design of the system.
- Understanding how the design addresses the problems I described above.
- What technologies can be used to implement this design.