
Thread: Anyone using Spring-Batch & Grids (Terracotta, OpenSpaces, Memcache, Gigaspaces)

  1. #11

    Thanks ikarzali!

    Yes - this is along the lines of what I wanted.

    A couple of questions:
    1. What did the database tier look like? i.e. With an 80% hit ratio, how busy was the system of record?

    2. Was a partitioning strategy used? i.e. One state-of-the-union per JVM, and some sort of routing happening in front for all requests.

    Thanks.

  2. #12
    Join Date
    May 2007
    Posts
    15

    More about the use case

    Originally posted by ndefreitas:
        Yes - this is along the lines of what I wanted.

        A couple of questions:
        1. What did the database tier look like? i.e. With an 80% hit ratio, how busy was the system of record?

    The data tier did not include a database. It was a data service. The system of record can handle 100% of the workload in this use case but the customer didn't want any hits to the system of record for other reasons.

    To answer your question more indirectly, I just got off the phone with a customer who said they just tested Terracotta 2.5.2 against their next release. Their DB utilization (95th percentile) was 70% without us and spiked to 95%. With Terracotta, utilization was 6% and never passed 8% at spikes. This is a 12-CPU Oracle box underneath a 17-node Java cluster.

    Originally posted by ndefreitas:
        2. Was a partitioning strategy used? i.e. One state-of-the-union per JVM, and some sort of routing happening in front for all requests.

        Thanks.

    Good question. The Java nodes all get random cache lookups from the network and then do a map.get(). That map is partitioned transparently underneath. The customer decides how many partitions they want and TC partitions the key-space in the map transparently. In this use case, they are debating right now but production will be somewhere between 2 and 4 partitions. This means that there will be 2 - 4 Terracotta servers, one for each transparent partition.
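A rough sketch of what "partitioning the key-space of a map" can look like, for readers who want something concrete. This is a generic hash-partitioned map written for illustration, not Terracotta's implementation or API: `PartitionedMap`, `partitionFor`, and the in-process `ConcurrentHashMap` partitions are all assumptions. In the setup described above each partition would be backed by its own server; here each one is just a local map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration: callers see one map, keys are spread over N partitions.
final class PartitionedMap<K, V> {
    private final List<Map<K, V>> partitions = new ArrayList<>();

    PartitionedMap(int partitionCount) {
        if (partitionCount < 1) {
            throw new IllegalArgumentException("partitionCount must be >= 1");
        }
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ConcurrentHashMap<>());
        }
    }

    // floorMod keeps the index non-negative even when hashCode() is negative.
    int partitionFor(K key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // The caller just does get()/put(); routing to a partition happens underneath.
    V get(K key) {
        return partitions.get(partitionFor(key)).get(key);
    }

    V put(K key, V value) {
        return partitions.get(partitionFor(key)).put(key, value);
    }

    int partitionCount() {
        return partitions.size();
    }

    public static void main(String[] args) {
        PartitionedMap<String, String> map = new PartitionedMap<>(4);
        map.put("CA", "California metadata");
        System.out.println(map.get("CA") + " lives in partition " + map.partitionFor("CA"));
    }
}
```

The point of the sketch is only that the same key always routes to the same partition, so the number of partitions (2 to 4 in the post) can be chosen without changing the calling code.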

  3. #13
    Join Date
    Feb 2008
    Posts
    5

    Hi Nigel

    I think the best approach here would be to test things for yourself and decide. I wouldn't take any vendor's word when it comes to performance.
    I'm not sure which "nearest competitor" ikarzali is referring to, but in any case I suspect that what is good for one application will not necessarily be good for another.
    It would be helpful if you could shed some more light on your application.
    As a side note, GigaSpaces is bundled with an open source embedded benchmarking tool which you can easily use to test performance and see for yourself (it's documented here - http://www.gigaspaces.com/wiki/displ...Spaces+Browser).
    We feel pretty confident about our performance and are certainly willing to work with you to make sure to get the maximum performance for your use case.

    Uri
    Last edited by uri1803; Mar 10th, 2008 at 04:21 PM.

  4. #14

    Thanks Uri,

    Originally posted by uri1803:
        It would be helpful if you can shed some more light about your application.

    The application is a simple Java JAR that executes some SQL against a database containing data from ~50 states. Each request returns metadata relevant to an address (the request). There are approx 1.3 to 1.5 million records per state that equate to ~300 MB of space on disk.

    The goal is to send at least 1 million addresses through the system in a 14 hr batch window. This equates to ~20 transactions per second.

    Nigel
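The arithmetic behind the "~20 transactions per second" figure in this post can be checked with a few lines. The 1 million addresses, the 14-hour window, and the 50 states with 1.3 to 1.5 million records each come from the post; the class and method names are made up for illustration.

```java
import java.util.Locale;

// Hypothetical helper for sanity-checking the batch-window numbers quoted in the post.
final class BatchWindowMath {

    // Sustained rate needed to push `records` requests through in `windowHours`.
    static double requiredTps(long records, double windowHours) {
        return records / (windowHours * 3600.0);
    }

    // Total rows across all states, given a per-state row count.
    static long totalRecords(int states, long recordsPerState) {
        return states * recordsPerState;
    }

    public static void main(String[] args) {
        // 1,000,000 / (14 * 3600) = 1,000,000 / 50,400
        System.out.println(String.format(Locale.ROOT, "%.2f tps",
                requiredTps(1_000_000L, 14.0))); // prints 19.84 tps

        // 50 states at 1.3 to 1.5 million rows each
        System.out.println(totalRecords(50, 1_300_000L)); // prints 65000000
        System.out.println(totalRecords(50, 1_500_000L)); // prints 75000000
    }
}
```

So the target is just under 20 lookups per second, sustained, against a data set of roughly 65 to 75 million rows.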

  5. #15
    Join Date
    May 2007
    Posts
    15

    Should work

    I agree w/ Uri...try anything and everything you have time for. Get a feel. Not trying to preempt your decision in any way.

    But FWIW, I am pretty sure Terracotta can deliver that throughput you need.

    One other metric that would be interesting is your object graph shape / read/write ratio. Meaning do you have 10K, 100K, or 10MB objects? And do those objects change? At what rate (50% write, 25% write, etc.)?

    Anyways, good luck to you! And make sure to use our forums at http://forums.terracotta.org/ if you need help (not to suggest you should stop using Spring's forums for whatever you need...but you will get better response times regarding Terracotta questions if you use Terracotta's forums.)

    Cheers,

    --Ari

  6. #16

    Thanks ikarzali.

    Originally posted by ikarzali:
        One other metric that would be interesting is your object graph shape / read/write ratio. Meaning do you have 10K, 100K, or 10MB objects? And do those objects change? At what rate (50% write, 25% write, etc.)?

    I'm not too worried about the writes on this application - they're all file I/O writes, and on z/OS those go pretty fast. It's the reading of the database and cache priming I'd like to know more about.

    I suspect that each unique initial request will fetch data from the DB, and TC will then store it in its cache. Subsequent requests for the same data will be pulled from cache. Is this correct? Does TC handle the caching transparently based on service requests?

    I think the objects will all be less than 1k each - it's really just an address with two or three more attributes. That's the extent of the transfer objects. On response there's a bit more data, but still less than 2k each.

    Thanks.
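For readers following along, the read path asked about in this post (first request for a key goes to the database, later requests for the same key come from memory) can be sketched with a plain `ConcurrentHashMap`. This only illustrates the pattern; it is not Terracotta's API, and it does not answer whether Terracotta does this transparently. `ReadThroughCache`, the loader function, and `loadCount` are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: load from the system of record on a miss, serve from memory on a hit.
final class ReadThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;          // stands in for the SQL lookup
    private final AtomicInteger loads = new AtomicInteger();

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // computeIfAbsent calls the loader only when the key is not cached yet.
        // Note: a null result from the loader is not stored, so it would be re-fetched.
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // How many times the backing store was actually hit.
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        ReadThroughCache<String, String> cache =
                new ReadThroughCache<>(address -> "metadata for " + address);
        cache.get("1 Main St");   // miss: goes to the loader
        cache.get("1 Main St");   // hit: served from the map
        System.out.println("loader calls: " + cache.loadCount()); // prints loader calls: 1
    }
}
```

With objects under 1k to 2k each, as described above, the interesting question for cache priming is how many distinct addresses appear in a batch, since each distinct key costs one trip to the database.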

  7. #17

    Any update on Spring Batch with GridGain?
    Please share more on this.

    Thanks

  8. #18
    Join Date
    Jun 2005
    Posts
    4,232

    GridGain is an adequate platform for implementing the PartitionHandler SPI (see the user guide for more details). The nice thing about it is the network classloading, so there is no need to deploy jar files to remote workers.

  9. #19

    GigaSpaces Spring Batch PU

    You may find a running example here:
    http://www.gigaspaces.com/wiki/displ...pring+Batch+PU

    The GigaSpaces Spring Batch PU provides:
    Enhanced performance:
    - Distributed parallel processing.
    - Distributed Task execution partitioning.
    - In-memory distributed state management.

    Management and Monitoring:
    - Task execution queuing.
    - Distributed Deployment environment.
    - Continuous High-Availability.

    Scalability:
    - Elastic and Dynamic scalability of the Spring batch PU instances.

    Shay Hassidim
    GigaSpaces
