
Thread: Anyone using Spring-Batch & Grids (Terracotta, OpenSpaces, Memcache, Gigaspaces)

  1. #1

    Anyone using Spring-Batch & Grids (Terracotta, OpenSpaces, Memcache, Gigaspaces)

    Has anyone tried to increase their Spring-Batch processing throughput using any of these:
    1. Terracotta: http://www.terracotta.org/
    2. OpenSpaces: http://www.openspaces.org/
    3. Memcached: http://www.danga.com/memcached/
    4. Gigaspaces: http://gigaspaces.com/pr_com.html

  2. #2
    Join Date
    Feb 2008
    Posts
    5

    OpenSpaces, GigaSpaces and Spring Batch

    Hi,

    It seems that there's a slight confusion here regarding the difference or relationship between GigaSpaces, OpenSpaces and OpenSpaces.org, so let me try to clarify it a bit:


    • The primary API (but not the only one) for the GigaSpaces XAP platform is called OpenSpaces. It is designed to enable scaling out of stateful applications in a simple way using Spring.

    • While GigaSpaces XAP core runtime is closed source, OpenSpaces is open source (running under the Apache 2.0 license).
      OpenSpaces.org is a community website, sponsored by GigaSpaces, with the objective of providing the GigaSpaces user community with:

    • A mechanism for adding new features, functions, best practices and solutions on top of the core runtime with no dependency on the GigaSpaces R&D team.
    • A central location for sharing these additions, and hopefully facilitating exchange of ideas (and code) among GigaSpaces users.

    As for Spring Batch integration, we have started an OpenSpaces.org project called GigaSpaces Implementation for Spring Batch. It's still in the Concept phase, since I decided to wait until version 1.0 of Spring Batch is released on March 20th (when the APIs are finalized and documentation becomes fully available).

    Feel free to contact me directly via email (uri@gigaspaces.com) for more information, or register to OpenSpaces.org and put a watch on the project homepage and Forum to get notified on changes.

    HTH,
    Uri
    Last edited by uri1803; Feb 28th, 2008 at 05:41 AM.

  3. #3

    Thanks Uri

    Thanks for clearing this up, Uri!

    So, is the core API (OpenSpaces) usable (can I just install it, and create a grid) without the other XAP components?

    Also, the download link on the OpenSpaces.org site points you back to a page for downloading one of the three GigaSpaces products (but no OpenSpaces binaries): http://www.gigaspaces.com/os_downloads.html

    Which are:
    * GigaSpaces XAP 6.0.3 (Build 2040) - FREE full featured evaluation license
    * GigaSpaces XAP-EDG 6.0.3 (Build 2040) - FREE full featured evaluation license
    * GigaSpaces XAP-Community 6.0.3 (Build 2040) - FREE - limited functionality, unlimited license time

    Nigel

  4. #4
    Join Date
    Feb 2008
    Posts
    5

    The OpenSpaces API requires the XAP runtime. It's simply an abstraction layer on top of the core XAP runtime and not a runtime engine on its own (it is actually very similar to what Spring MVC or Struts are to a servlet container).

    OpenSpaces binaries and sources are part of the XAP distribution, so when you download any of the packages you mentioned you will get them (they're located under the distribution's lib/OpenSpaces directory).

    In the next few months we will also post the OpenSpaces code itself on OpenSpaces.org and enable community members to contribute to it. However, this is not the case currently.

    Hope this clarifies things.

    Uri

  5. #5

    Thanks Uri

    Thanks - that clarifies things.

    While I have you on the thread: what separates GigaSpaces from other solutions like Terracotta and Memcached?

    I'm just getting my toes wet with grids, and am not sure which stack is the best fit for different types of problems.

  6. #6
    Join Date
    Feb 2008
    Posts
    5

    It's a bit complicated to answer this in reply to a post, so it obviously won't be a comprehensive comparison. But I'll try to provide a high-level overview (and please be aware that I am, at the end of the day, a GigaSpaces employee, so I might be biased to an extent).

    memcached is a very simple solution for distributed data caching. It provides a Map-like API and is not based on Java (although it has a Java API).
    Therefore it will always require your Java code to communicate with a separate process (the memcached daemon).

    GigaSpaces and Terracotta are both pure Java solutions that provide much more than caching, although they take different approaches to it.

    Terracotta is about clustering your application at the JVM level, i.e. taking any Java application and making it run on multiple JVMs with little effort or code change. So you get distributed data caching (the Terracotta guys call it Network Attached Memory) and distributed processing via the JVM clustering. I must say I like this approach a lot, but at the end of the day, assuming every piece of your code is "clusterable", you're still left with the basic JDK APIs and need to implement a lot of stuff on your own (e.g. messaging, querying capabilities for your in-memory data, etc.).

    The GigaSpaces product provides you with a comprehensive runtime platform for implementing highly scalable distributed applications. So the approach is to develop your application on top of a scalable platform from day 1, rather than to cluster it ad hoc later for more scalability.
    The product integrates a very rich distributed caching implementation, messaging capabilities and a unique SLA-driven, self-healing deployment platform to give your enterprise application all it needs to be grid-enabled.
    The OpenSpaces development framework utilizes Spring's dependency injection and its powerful abstractions such as remoting and transaction management to allow you to do all of this in an easy and battle-tested fashion, and also isolate your code as much as possible from the product-specific APIs.

    Obviously this is just the tip of the iceberg; I would advise you to have a look at each product's web site and give it a shot to see if it meets your needs.
    If you have a specific project in mind, we'd be more than happy to help you test-drive the product for it. You can download a fully functional evaluation version at http://www.gigaspaces.com/os_downloads.html.

    Uri

  7. #7
    Join Date
    May 2007
    Posts
    15

    The differences IMO

    I wrote a blog to answer your original question about the difference with Terracotta. Check it out. Let me know what you think:
    http://blog.terracottatech.com/2008/...dor_gob_1.html

  8. #8

    Please share...

    Originally posted by ikarzali:
    I wrote a blog to answer your original question about the difference with Terracotta. Check it out. Let me know what you think:
    http://blog.terracottatech.com/2008/...dor_gob_1.html

    Thanks ikarzali for putting that blog entry together.

    Can you share with us the results of this test case:

    Originally posted by ikarzali:
    A customer last year tested for themselves and found that Terracotta-based clustering delivered 300 requests / second per app instance whereas the nearest competitor delivered 100 requests / second per instance given a certain load generating script. What was key, however is that the application under Terracotta was using 30% CPU whereas the competitor-based version was using 95%. Terracotta could be driven a full 3X faster leading to nearly 10X the throughput. Serialization / deserialization was the other vendor's bottleneck. What happened to "moving the compute to the data?"
    I don't want any of the other guys to think you're just spitting more jargon and gobbledygook.

    We don't mind if you blot out the sensitive pieces.

  9. #9
    Join Date
    May 2007
    Posts
    15

    Test Case results

    Well, we are under NDA with the now-customer.

    I think we are working on documenting their use case, but for now I would be happy to scrub the results. Not sure what you mean by results but here's a bit more detail:

    1. 50 JVMs
    2. Receiving lookup requests from the network
    3. Looking up in the system of record and then caching the result on a MISS
    4. Returning from cache on a HIT
    5. Cache hit rate about 80%
    6. Object sizes about 100KB
    7. Desired transaction volume: 20K lookups / sec, cluster-wide
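
The lookup flow in steps 2 to 4 is the usual cache-aside pattern. Here is a minimal single-JVM sketch of it. This is not the customer's code or any vendor's API: the class name `CacheAsideLookup`, the loader function, and the plain local `HashMap` are assumptions for illustration, and in the setup described a clustered store would take the place of the local map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Cache-aside lookup: serve hits from memory, load misses from the system of record. */
final class CacheAsideLookup<K, V> {
    // Local stand-in for the shared cache (assumption; a grid product would replace this).
    private final Map<K, V> cache = new HashMap<>();
    // Hypothetical loader for the system of record, e.g. a database query.
    private final Function<K, V> systemOfRecord;
    private long hits;
    private long misses;

    CacheAsideLookup(Function<K, V> systemOfRecord) {
        this.systemOfRecord = systemOfRecord;
    }

    synchronized V lookup(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            hits++;            // HIT: return straight from the cache
            return cached;
        }
        misses++;              // MISS: go to the system of record, then cache the result
        V loaded = systemOfRecord.apply(key);
        if (loaded != null) {
            cache.put(key, loaded);
        }
        return loaded;
    }

    /** Fraction of lookups served from the cache so far (0.0 if none were made). */
    synchronized double hitRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        CacheAsideLookup<String, String> lookups =
                new CacheAsideLookup<>(key -> "record-for-" + key);
        lookups.lookup("a"); // miss, loads from the system of record
        lookups.lookup("a"); // hit
        System.out.println("hit rate: " + lookups.hitRate());
    }
}
```

At the stated 80% hit rate, about one lookup in five falls through to the system of record; at the target of 20K lookups / sec that would be roughly 4K reads / sec, if the hit rate held at that volume.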

    The end result:
    + Terracotta's lightweightness helped the customer uncover that the test-harness was eating most of the CPU. The competitor's framework was hiding that fact by eating most of the CPU itself. So, eventually the customer was able to drive the test to high CPU utilization with Terracotta.
    + 4 servers, each running 4 JVMs on top of a single Terracotta server, delivered 10K transactions per second at 90% utilization (625 tps per JVM).
    + Competitor: 8 servers, each running 8 JVMs, running peer-to-peer delivered 8K transactions per second at 95% utilization (125 tps per JVM).

    So if 20K tps was the goal, TC would require 32 machines (plus safety cushion) whereas the competitor would require 160 machines. (This is my recollection, at least)
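
For anyone who wants to check the arithmetic, the figures above can be reproduced with simple division, assuming throughput scales linearly with the number of JVMs (an assumption; the post does not claim it). The class and method names below are made up for illustration. Note that 32 and 160 come out as JVM counts; at 4 and 8 JVMs per server respectively, that would be 8 and 20 servers.

```java
/** Back-of-the-envelope capacity estimate from the figures quoted in the post. */
final class CapacityEstimate {

    /** Throughput per JVM, given a measured cluster-wide throughput. */
    static int perJvmTps(int servers, int jvmsPerServer, int clusterTps) {
        return clusterTps / (servers * jvmsPerServer);
    }

    /** Units needed to cover a total, rounded up (assumes linear scaling). */
    static int unitsNeeded(int total, int perUnit) {
        return (total + perUnit - 1) / perUnit;
    }

    public static void main(String[] args) {
        int targetTps = 20_000;

        int terracottaPerJvm = perJvmTps(4, 4, 10_000);   // 625
        int competitorPerJvm = perJvmTps(8, 8, 8_000);    // 125

        int terracottaJvms = unitsNeeded(targetTps, terracottaPerJvm);  // 32
        int competitorJvms = unitsNeeded(targetTps, competitorPerJvm);  // 160

        System.out.println("Terracotta: " + terracottaJvms + " JVMs, "
                + unitsNeeded(terracottaJvms, 4) + " servers");
        System.out.println("Competitor: " + competitorJvms + " JVMs, "
                + unitsNeeded(competitorJvms, 8) + " servers");
    }
}
```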

    The entire test took about 1 week. Customer will be in production by end of the month.

    Does that help?

  10. #10

    Thanks ikarzali!

    Yes - this is along the lines of what I wanted.

    A couple of questions:
    1. What did the database tier look like? i.e. With an 80% hit ratio, how busy was the system of record?

    2. Was a partitioning strategy used? i.e. One state-of-the-union per JVM, and some sort of routing happening in front for all requests.
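
To make question 2 concrete, this is the kind of routing I have in mind: a front-end step that maps each lookup key to one partition (one JVM), so that a given key is always served by the same cache. It is only a sketch under my own assumptions; `PartitionRouter` is a hypothetical name, and simple hash-modulo routing is just one possible strategy, not something the test is known to have used.

```java
/** Routes each key to one of a fixed number of partitions (e.g. one per JVM). */
final class PartitionRouter {
    private final int partitions;

    PartitionRouter(int partitions) {
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        this.partitions = partitions;
    }

    /** Same key always maps to the same partition index in [0, partitions). */
    int partitionFor(Object key) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), partitions);
    }

    public static void main(String[] args) {
        PartitionRouter router = new PartitionRouter(50); // 50 JVMs, as in the test setup
        System.out.println("customer-42 -> JVM " + router.partitionFor("customer-42"));
    }
}
```

One caveat with plain modulo routing is that changing the number of partitions remaps most keys, which matters if JVMs are added or removed while the cache is warm.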

    Thanks.
