We had been doing some event processing with a cron job that called a web API, but it turns out we need to scale this out, way out: our event message volume is growing, and the web API we have been using is reserved for realtime execution use cases. So we set up an Amazon SQS queue feeding an AWS Lambda function, which scales the processing out. The problem we’re running into is that the processing is not small: it entails seven or so tasks, including integration with S3 and Elasticsearch, parsing and mutating data, custom web API invocations, SMTP/email, and so on. Because these integration points pull in their own dependencies, the deployment .zip alone is more than 50 MB, which exceeds the maximum package size allowed for Lambdas. Even if we could deploy it, each instance would take about 40 seconds to initialize and about as long again to process each message. Lambdas, we are reminded, are intended for really small, lightweight, “do it and quit quick” tasks, not moderately heavyweight processing, so I suppose Lambda was not the right choice for us.
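For context, the handler is shaped roughly like this sketch. The stage function names are hypothetical placeholders; the real ones pull in heavy SDK dependencies (boto3/S3, an Elasticsearch client, SMTP, etc.), which is what pushes the package past the size limit:

```python
import json

# Hypothetical stage functions standing in for our seven-odd tasks.
# The real implementations carry the heavy dependencies.
def fetch_blob_from_s3(key):       return {"key": key}
def parse_and_mutate(doc):         doc["parsed"] = True; return doc
def index_in_elasticsearch(doc):   return doc
def call_custom_web_api(doc):      return doc
def send_notification_email(doc):  return doc

def handler(event, context=None):
    """SQS-triggered entry point: one invocation, a batch of messages."""
    results = []
    for record in event["Records"]:
        msg = json.loads(record["body"])
        doc = fetch_blob_from_s3(msg["s3_key"])
        doc = parse_and_mutate(doc)
        doc = index_in_elasticsearch(doc)
        doc = call_custom_web_api(doc)
        doc = send_notification_email(doc)
        results.append(doc)
    return results
```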
It occurred to me that I could have the Lambda call the original web API again, or call a new web API app created just for this purpose. But then that web API would end up needing to scale up, not out. We want to scale out automatically, which is exactly what Lambda gave us.
We know we could also chunk the processing into multiple stages so that each Lambda stays simple; there would just be several small Lambdas instead of one big one. But that would also mean more SQS/SNS queues and topics, and I am trying to avoid building a Rube Goldberg machine.
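To make the trade-off concrete, the staged alternative would look something like this sketch, where in-memory deques stand in for the extra SQS queues and each stage function would become its own Lambda with its own small dependency set (all names here are hypothetical):

```python
from collections import deque

# Deques stand in for the additional SQS queues this design requires;
# in AWS each append would be an sqs.send_message(...) call and each
# stage function would be a separately deployed Lambda.
index_queue = deque()
notify_queue = deque()

def parse_stage(msg):
    # Stage 1: parse/mutate, then hand off to the indexing queue.
    index_queue.append(dict(msg, parsed=True))

def index_stage(msg):
    # Stage 2: index, then hand off to the notification queue.
    notify_queue.append(dict(msg, indexed=True))

def notify_stage(msg):
    # Stage 3: email/notify, terminal stage.
    return dict(msg, emailed=True)

# One message now traverses two extra queues and three functions:
parse_stage({"s3_key": "k1"})
index_stage(index_queue.popleft())
final = notify_stage(notify_queue.popleft())
```

Each stage stays under the package size limit, but every hop adds a queue, a deployment, and another place for messages to get stuck, which is the Rube Goldberg concern.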
There are so many options out there for running this processor, but unfortunately most of them seem to involve a great deal of hand-holding just for infrastructure and architecture setup: Kubernetes (containerized packages), EC2 (VMs), and the like. What is the best strategy for this problem?