Thread: Safe pause/resume file inbound channel adapter

  1. #1
    Join Date
    Jul 2010
    Location
    Midland, NC, USA
    Posts
    22

    Safe pause/resume file inbound channel adapter

    This has been discussed in this forum before, but only briefly and some time ago, and I still have concerns.

    I went part way down this road in an "orderly shutdown" discussion here:

    http://forum.springsource.org/showth...ecutor-is-idle

    My concern is the caveat noted in that thread by Gary Russell. Namely, that just stopping the adapter could lose in-flight messages. For a shutdown, there was quite a bit of code to do this safely.

    Does that same caveat apply to a simpler pause/resume?

    More about my use case: I need to implement daily processing windows. For example, the service will accept incoming files from noon to 2pm, but at 2:01, we need to pause until tomorrow at noon, even if there are a thousand files in the inbound folder. And we must not lose files.

    I know how to call stop/start on the inbound channel adapter. The question is: is this safe? Can I be certain that in-flight messages will complete and the context will go idle until my next start()? Or does Gary's caveat apply here as well, such that the safe approach is to shut down schedulers and executors before stopping the adapter?

    And, if so, is this then complex to start back up?

    One possibility is that I shut down the entire spring context and restart it the next day. But if there's a better way...

    Thanks,

    Fred

  2. #2
    Join Date
    Jan 2008
    Location
    Mohnton, PA USA (that's near Philadelphia)
    Posts
    2,148

    If you only need to poll for the files at 2PM every day, why do you need to pause a poller? Why not just set a cron trigger to poll every day at 2PM?

  3. #3
    Join Date
    Mar 2010
    Location
    Gtr Philadelphia, PA
    Posts
    2,028

    Yes, you can use a cron trigger to run, say every n seconds between noon and 2pm.

    However, let me clarify my earlier points about orderly shutdown of a polled message source.

    When you stop() a SourcePollingChannelAdapter (such as a file:inbound-channel-adapter), we cancel the scheduled task (ScheduledFuture) with "mayInterruptIfRunning" set to true. This means that any interruptible code in the message source (including custom filters) will be interrupted. This doesn't mean any messages will be lost (as long as you use an external task executor), because they haven't been dispatched yet, but it could cause issues, depending on the message source.

    To avoid this, you can use your own TaskScheduler (e.g. ThreadPoolTaskScheduler) with 'waitForTasksToCompleteOnShutdown' set to true.

    Calling shutdown on this scheduler will also cancel the ScheduledFuture, but it won't interrupt the thread if it's running.

    You can then wait for its 'scheduledExecutor' to stop (test its isShutdown() property).

    Finally, you can then safely call the adapter's stop() method because there is no longer a ScheduledFuture to cancel.
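
    The difference between the two ways of stopping can be sketched with plain java.util.concurrent, without any Spring classes. This is only a model under stated assumptions: GracefulStopSketch and pollWasInterrupted are hypothetical names, the 300 ms sleep stands in for interruptible work in a message source, and the JDK scheduler's shutdown() plays the role of a scheduler with 'waitForTasksToCompleteOnShutdown' set to true.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration class; not part of Spring Integration.
class GracefulStopSketch {

    /**
     * Runs one slow "poll" and stops it either with cancel(true) or with a
     * plain shutdown(). Returns true if the running poll saw an interrupt.
     */
    static boolean pollWasInterrupted(boolean cancelWithInterrupt) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        Future<?> poll = scheduler.schedule(() -> {
            started.countDown();
            try {
                Thread.sleep(300); // stand-in for interruptible work in the message source
            } catch (InterruptedException e) {
                interrupted.set(true);
            } finally {
                finished.countDown();
            }
        }, 0, TimeUnit.MILLISECONDS);

        try {
            started.await(); // make sure the poll is actually running
            if (cancelWithInterrupt) {
                poll.cancel(true); // mayInterruptIfRunning = true
            }
            scheduler.shutdown(); // no new polls; a running poll is left alone
            scheduler.awaitTermination(5, TimeUnit.SECONDS);
            finished.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return interrupted.get();
    }

    public static void main(String[] args) {
        System.out.println("cancel(true) interrupted the poll: " + pollWasInterrupted(true));
        System.out.println("shutdown() interrupted the poll: " + pollWasInterrupted(false));
    }
}
```

    In this model only the cancel(true) path interrupts the in-progress poll; the shutdown() path lets it run to completion before the scheduler terminates.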


    To clarify further, there are two Task* entities involved: the TaskScheduler, which schedules a poll (based on the trigger), and a TaskExecutor, which sends any messages resulting from the poll.

    By default, we use an automatically created TaskScheduler (bean name "taskScheduler") and a SyncTaskExecutor (which means the message send occurs on the scheduler's thread).

    Although it's not supported by the namespace, you can call setTaskScheduler() on the channel adapter to use a custom scheduler instead of the default (or simply replace the default "taskScheduler" bean - if you don't mind stopping all scheduled tasks). If you inject your own scheduler, it's best to set auto-startup to false and start() it manually after injecting the scheduler.

    You can also inject your own task executor on the poller (that is supported by the namespace). The scheduler then hands off to that executor, so in-flight messages won't be affected and stopping the scheduler won't have to wait for all in-flight messages to complete processing.

    I hope that makes things clearer.

    The 2.2 work was specifically to provide an orderly shutdown of the entire context but, after writing this post, I can see that it would be nice to provide a built-in option to stop individual adapters in an orderly fashion. Please feel free to open a JIRA.
    Gary P. Russell
    Spring Integration Team
    SpringSource, a division of VMware

  4. #4
    Join Date
    Jul 2010
    Location
    Midland, NC, USA
    Posts
    22

    First, to answer Oleg's question: We have thousands of files appearing continuously during a 12 hour period, not just all at once at 2pm. So we have to process continuously during that 12 hour period, and then explicitly stop processing for the next 12 hours. Again, safely, without loss.

    I'll read up on cron triggers. Can I manage my processing window and my blackout window completely with cron triggers? How would I handle overlaps? Say I set a cron trigger to run every 5 minutes. At poll 1 there are 1000 files and I start processing them. At poll 2, 5 minutes later, there are 2000 files. Does my application know or care that the first 1000 may or may not be complete? Or do I have to manage that myself? For example, at poll 1 I immediately move the first 1000 files somewhere so that I'm sure the next poll gets only new files? Or is a poller triggered by cron smart enough to message a file once and only once?

    Gary, I think I understand the complexity of getting the stop/shutdown right (though I'll read your reply another 10 times or so...), but you didn't mention starting back up. I'm guessing that's equally complex? If so, then I am leaning more and more to complete shutdowns and restarts of the entire application (or at least the context), rather than trying to manage pause/resume within a running context.

    As always, thanks to you both for your advice.

    Fred

  5. #5
    Join Date
    Mar 2010
    Location
    Gtr Philadelphia, PA
    Posts
    2,028

    Good point - you can't restart a scheduler, so replacing the global 'taskScheduler' bean won't work.

    So, to start...

    Inject a new task scheduler (with waitForTasksToCompleteOnShutdown = true)
    adapter.start()

    to stop...

    shut down the scheduler
    wait for its scheduledExecutor to stop
    adapter.stop()

    Not really as complex as it sounded above (just be sure to set auto-startup to false).
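
    The start/stop sequence above can be modelled in plain Java. This is a sketch, not Spring Integration code: PausablePollerSketch and its methods are hypothetical, and a JDK ScheduledExecutorService stands in for the task scheduler. The point it illustrates is that a scheduler that has been shut down cannot be restarted, so every start() creates a fresh one, and stop() shuts it down and waits for it to terminate.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class; models the lifecycle only.
class PausablePollerSketch {

    private final Runnable poll;
    private final long delayMillis;
    private ScheduledExecutorService scheduler; // null while paused

    PausablePollerSketch(Runnable poll, long delayMillis) {
        this.poll = poll;
        this.delayMillis = delayMillis;
    }

    /** Starts polling with a brand-new scheduler. */
    synchronized void start() {
        if (scheduler != null) {
            return; // already running
        }
        scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(poll, 0, delayMillis, TimeUnit.MILLISECONDS);
    }

    /** Shuts the scheduler down and waits; a poll in progress is allowed to finish. */
    synchronized boolean stop(long timeoutMillis) {
        if (scheduler == null) {
            return true; // already paused
        }
        scheduler.shutdown();
        boolean terminated = false;
        try {
            terminated = scheduler.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        scheduler = null; // the old scheduler is discarded, never reused
        return terminated;
    }

    /** Returns poll counts after the first window, after the pause, and after the second window. */
    static int[] runDemo() {
        AtomicInteger polls = new AtomicInteger();
        PausablePollerSketch poller = new PausablePollerSketch(polls::incrementAndGet, 10);
        poller.start();
        pause(100);
        poller.stop(1000);
        int afterFirstWindow = polls.get();
        pause(100); // paused: no polls should happen here
        int afterPause = polls.get();
        poller.start();
        pause(100);
        poller.stop(1000);
        return new int[] {afterFirstWindow, afterPause, polls.get()};
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int[] counts = runDemo();
        System.out.println("first window: " + counts[0]
                + ", after pause: " + counts[1]
                + ", second window: " + counts[2]);
    }
}
```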


    Regarding overlapping cron polls - simply use a scheduler with a pool size of 1. Yes, you can create a cron expression that will run every 5 minutes between noon and midnight every day (or Mon-Fri, or whatever).
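
    As a rough illustration of such a schedule, assuming Spring's six-field cron format (second, minute, hour, day-of-month, month, day-of-week), "every 5 minutes between noon and midnight" could be written as the expression in the constant below. The class and the firesAt helper are hypothetical; the helper simply spells out by hand which wall-clock times that one expression matches, it is not a cron parser.

```java
import java.time.LocalTime;

// Hypothetical illustration class; hand-expands one cron expression.
class PollingWindowSketch {

    // Assumed Spring-style cron: second 0, every 5th minute, hours 12 through 23, every day.
    static final String EVERY_5_MIN_NOON_TO_MIDNIGHT = "0 0/5 12-23 * * *";

    /** True if the expression above would fire at the given time of day. */
    static boolean firesAt(LocalTime t) {
        return t.getSecond() == 0
                && t.getMinute() % 5 == 0
                && t.getHour() >= 12
                && t.getHour() <= 23;
    }

    public static void main(String[] args) {
        System.out.println("12:00 -> " + firesAt(LocalTime.of(12, 0)));
        System.out.println("23:55 -> " + firesAt(LocalTime.of(23, 55)));
        System.out.println("00:00 -> " + firesAt(LocalTime.of(0, 0)));
    }
}
```

    Under this assumed expression the first poll of the day is at 12:00:00 and the last at 23:55:00; nothing fires during the blackout window.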
    Gary P. Russell
    Spring Integration Team
    SpringSource, a division of VMware

  6. #6
    Join Date
    Jul 2010
    Location
    Midland, NC, USA
    Posts
    22

    Okay, I think I understand this well enough to experiment (the stop/start, not the cron approach). Maybe I can post some code and config.

    But I'm a bit confused as to what can be done in config and what has to be in code.

    For example, I can inject, in config, the appropriate task scheduler (without auto start), and then start it in code.

    I stop everything in code.

    But when I want to start back up, I'm guessing I need to create another new scheduler in code and inject?

    And following on your cron point: Do I understand this correctly? By using a pool size of 1, we ensure that only a single thread is polling. This means that polls will never overlap because poll #2 will never happen because poll #1 is still active?

    Thanks again,

    Fred

  7. #7
    Join Date
    Mar 2010
    Location
    Gtr Philadelphia, PA
    Posts
    2,028

    There is no way to inject a scheduler in config, so you will need to set auto-startup to false and inject a new scheduler each time you start.

    However, using a cron expression you don't need to stop/start at all, just create the appropriate cron expression.

    I was mistaken about the thread pooling and overlapping polls; the cron expression will only be evaluated when the current poll ends, so you don't need to worry about polls overlapping. If your cron expression says run once a minute (within your schedule) and a poll takes 59 seconds, the next poll will be 1 second later. If the poll takes 70 seconds, the next poll will run 50 seconds later (you'll miss a poll).
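
    The arithmetic behind those two cases can be checked with a few lines of Java. This is a sketch with hypothetical names, assuming a once-a-minute expression that fires on whole-minute boundaries, times measured in non-negative seconds, and a poll that starts exactly on a boundary.

```java
// Hypothetical illustration class; models "next fire time is computed when the poll ends".
class CronAfterPollSketch {

    /** Next whole-minute boundary strictly after the given second. */
    static long nextFireAfter(long second) {
        return (second / 60 + 1) * 60;
    }

    /** Idle time between the end of a poll and the start of the next one. */
    static long gapAfterPoll(long startSecond, long pollDurationSeconds) {
        long end = startSecond + pollDurationSeconds;
        return nextFireAfter(end) - end;
    }

    public static void main(String[] args) {
        // A 59 s poll ends at second 59, so the next fire is at 60.
        System.out.println("59 s poll, gap = " + gapAfterPoll(0, 59));
        // A 70 s poll ends at second 70; the fire at 60 was skipped, the next is at 120.
        System.out.println("70 s poll, gap = " + gapAfterPoll(0, 70));
    }
}
```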
    Gary P. Russell
    Spring Integration Team
    SpringSource, a division of VMware

  8. #8
    Join Date
    Jul 2010
    Location
    Midland, NC, USA
    Posts
    22

    Do you happen to know if stopping and starting the file inbound adapter will reset the Set(?) that supports the 'prevent-duplicates' option? We do expect duplicates so we need prevent-duplicates="true"

    To summarize, we've been discussing these 3 approaches to my problem:

    1. - Start and stop the entire application as needed. Pros: It's simple. Getting "orderly shutdown" right is the only real concern. No cons that I can see.

    2. - Start and stop the adapter as needed. Possible problem: the prevent duplicate set may not reset.

    3. - Cron trigger - In this case the prevent duplicate set would grow without bounds unless we did periodic restarts (which puts us back to #1).

    For now I'm working toward #1. But I have more questions brewing! I'll start another thread.

    Thanks again.

    Fred

  9. #9
    Join Date
    Mar 2010
    Location
    Gtr Philadelphia, PA
    Posts
    2,028

    Stopping and starting the adapter will have no impact on the filters (which is where prevent-duplicates is implemented). Note that the prevent-duplicates state doesn't survive a JVM (or context) restart.
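
    A minimal model of that behaviour, in plain Java rather than the framework's own filter class: AcceptOnceSketch is a hypothetical name, and keying on file names is an assumption made for brevity. The filter's memory lives in an in-memory set that belongs to the filter instance, so it persists for as long as the same instance is used and is lost when a new instance is created, as happens on a context restart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not the framework's filter implementation.
class AcceptOnceSketch {

    private final Set<String> seen = new HashSet<>(); // grows with every new name

    /** Returns only the names this instance has not passed through before. */
    synchronized List<String> filter(List<String> names) {
        List<String> fresh = new ArrayList<>();
        for (String name : names) {
            if (seen.add(name)) { // add() returns false for a duplicate
                fresh.add(name);
            }
        }
        return fresh;
    }

    synchronized int size() {
        return seen.size();
    }

    public static void main(String[] args) {
        AcceptOnceSketch filter = new AcceptOnceSketch();
        System.out.println(filter.filter(Arrays.asList("a.txt", "b.txt")));
        System.out.println(filter.filter(Arrays.asList("a.txt", "c.txt")));
        // A new instance, as after a context restart, has forgotten everything.
        System.out.println(new AcceptOnceSketch().filter(Arrays.asList("a.txt")));
    }
}
```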

    2.2 has a new feature whereby you can specify on-success-expression or on-failure-expression, which can be used to manipulate the file after processing, for example "payload.delete()" or "payload.renameTo('/processed/' + payload.name)" or "payload.renameTo('/failed/' + payload.name)".
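
    For comparison, here is roughly what the rename expression does, written as plain Java. This is a sketch with hypothetical names (ProcessedFileSketch, moveInto) and temporary directories standing in for the real inbound and processed folders. It uses Files.move, which throws on failure, whereas File.renameTo only returns false.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration class; directory names are made up for the demo.
class ProcessedFileSketch {

    /** Moves a file into targetDir, keeping its name and replacing any existing copy. */
    static Path moveInto(Path file, Path targetDir) {
        try {
            Files.createDirectories(targetDir);
            return Files.move(file, targetDir.resolve(file.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a temp file, moves it, and reports whether the move took effect. */
    static boolean demo() {
        try {
            Path inbound = Files.createTempDirectory("inbound");
            Path file = Files.createFile(inbound.resolve("order-1.txt"));
            Path processed = inbound.resolveSibling(inbound.getFileName() + "-processed");
            Path moved = moveInto(file, processed);
            return Files.exists(moved) && Files.notExists(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("file moved out of the inbound folder: " + demo());
    }
}
```

    Moving a file out of the inbound folder once it has been handled means a later poll cannot pick it up again, independently of any in-memory duplicate filter.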
    Gary P. Russell
    Spring Integration Team
    SpringSource, a division of VMware

  10. #10
    Join Date
    Jul 2010
    Location
    Midland, NC, USA
    Posts
    22

    Understood.

    This reinforces my current thinking. I need duplicate prevention on small time scales only, as the upstream delivery mechanism has a bad habit of delivering duplicates. So I'm not concerned about a duplicate between Monday and Tuesday (I can detect this elsewhere), but I am concerned about a duplicate between 8am and 8:05 am. The daily application restart gives me the best of both worlds. I get duplicate prevention during the processing window, and I reset the duplicate prevention filter on restart so that it doesn't grow forever.

    Thanks again, and I'll look forward to 2.2.
