I have a dilemma and I'm not sure which approach is better. I'd like to hear your advice.
Here's the flow:
1. A directory that is polled for incoming files.
2. A multi-step process that validates and manipulates the data read from those files.
3. Validated files (those with no errors) are queued using a message store.
4. A poller consumes the queue from (3) and uploads the files to a third-party system.
The upload process (4) is much slower than the directory polling (1).
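For context, here is roughly what the pipeline looks like as Spring Integration XML (a sketch only; channel names, bean names, and intervals are illustrative, not my real config):

```xml
<!-- (1) poll a directory for files -->
<int-file:inbound-channel-adapter id="filePoller"
        directory="/data/in" channel="files">
    <int:poller fixed-delay="1000"/>
</int-file:inbound-channel-adapter>

<!-- (2) multi-step validation/manipulation, collapsed into one endpoint here -->
<int:transformer input-channel="files" ref="validator"
        output-channel="validated"/>

<!-- (3) message-store-backed queue for validated payloads -->
<int:channel id="validated">
    <int:queue message-store="messageStore"/>
</int:channel>

<!-- (4) polling consumer that uploads to the third-party system -->
<int:outbound-channel-adapter channel="validated" ref="uploader" method="upload">
    <int:poller fixed-delay="1000"/>
</int:outbound-channel-adapter>
```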
To make the system resilient to outages, I decided to use message-store-backed queues. Unfortunately, such queues require polling consumers. That's acceptable in step 4, but I don't like this approach in step 2. I'd rather have a queue after step 1 and an event-driven consumer in step 2. Is this feasible?
On the other hand, maybe a queue after step 1 isn't worth it at all? What concerns me is that the file-to-bytes transformer removes files after transformation. So in case of any error or outage in step 2, the file will already have been removed by the transformer, and its byte content will then be lost when the validation process fails.
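The step that worries me is this transformer (a sketch, assuming Spring Integration's file namespace; channel names are illustrative):

```xml
<!-- converts the File payload to byte[]; with delete-files="true" the
     source file is removed as soon as the bytes are in the message, so a
     later validation failure loses both the file and its content -->
<int-file:file-to-bytes-transformer input-channel="files"
        output-channel="bytes" delete-files="true"/>
```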
Currently I queue the byte contents (so they are safe in case of an outage), but then I have to poll that queue. I'm afraid that polling every second isn't sensible given the traffic pattern (batches of thousands of files every hour).
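My current workaround looks roughly like this (again a sketch with illustrative names and intervals): the byte[] payloads go into a persistent queue, and the downstream consumer polls it every second even though traffic arrives in hourly bursts:

```xml
<!-- byte contents queued persistently, so an outage doesn't lose them -->
<int:channel id="bytes">
    <int:queue message-store="messageStore"/>
</int:channel>

<!-- ...but consumed by a poller firing every second, even though files
     arrive in batches of thousands roughly once an hour -->
<int:service-activator input-channel="bytes" ref="validator"
        output-channel="validated">
    <int:poller fixed-delay="1000" max-messages-per-poll="100"/>
</int:service-activator>
```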