Thread: Clustering with Spring - best practices?

  1. #11
    Join Date
    Aug 2004
    Posts
    1,905

    Default

    Quote Originally Posted by Patrick Angeles View Post
    Sorry, should have been clearer. I meant cluster in the same sense that you would cluster Stateless EJBs. There is no bean state to replicate, but you have load balancing and client-failover across multiple nodes.
    But *why* would you want to cluster stateless beans? You can get failover and load balancing without clustering....clustering is really only for sharing state across nodes.

    Think about it
    Colin Yates
    SpringSource - http://www.springsource.com - Spring Training, Consulting, and Support - "From the Source"
    Please read http://www.springframework.org/documentation
    Co-Author of Expert Spring MVC + Web Flow.

  2. #12
    Join Date
    Aug 2006
    Posts
    382

    Default What is your definition of cluster?

    Quote Originally Posted by yatesco View Post
    But *why* would you want to cluster stateless beans? You can get failover and load balancing without clustering....clustering is really only for sharing state across nodes.

    Think about it
    I read this on Wikipedia: "A computer cluster is a group of tightly coupled computers that work together closely so that in many respects they can be viewed as though they are a single computer."

    I agree with this definition. I don't agree that clustering is only for sharing state across nodes. To me, if I need high availability of servers, that means clustering. Now, everyone does clustering a little differently, so you have to evaluate what the product you are considering actually offers. Does it provide the most efficient and, hopefully, least invasive solution? I think Spring makes it easy to plug in such a solution without changing your POJOs. http://www.javaworld.com/javaworld/j...31-spring.html demonstrates this for a stateless service. You are not going to tell me that having multiple servers, providing the same stateless service, is bad, are you?
    Greg L. Turnquist (@gregturn), SpringSource/VMware
    Project Lead: Spring Python and author of Spring Python 1.1 and Python Testing Cookbook.
    Listen to Pond Jumpers, the international podcast for open source developers.
    These comments are my own personal opinions, and do not reflect those of my company.

  3. #13
    Join Date
    Aug 2004
    Posts
    1,905

    Default

    Yes, you are of course correct. I wasn't being clear.

    My point was that the load balancing can happen outside of the application, i.e. with Apache or a hardware load balancer. Each "node" can work independently from the others as there is no state being replicated, hence no server affinity.

    If the web tier state is replicated, then making the middle tier cluster aware has little or no advantage.


    From what you say, I see no problem deploying N instances (with the web tier replicating state) and then having Apache distribute the calls (round robin, etc.).

    The question I was asking is: exactly what benefit do you get from introducing clustering into the middle tier itself?

    You can of course introduce a Proxy instead of the POJO which is cluster aware (using RMI maybe) but I see no benefit, only complexity....
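
For illustration, a cluster-aware proxy of the kind mentioned above could look roughly like the following Java sketch. It uses a JDK dynamic proxy that tries each node in turn and moves on when a call fails. `GreetingService`, `FailoverHandler`, and `FailoverProxies` are hypothetical names, and the "nodes" here are plain local objects standing in for remote stubs (RMI or otherwise). This is an assumption-laden sketch, not code from the thread or from Spring.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical service interface, used only for illustration.
interface GreetingService {
    String greet(String name);
}

// Tries each target in order and falls through to the next when a call throws.
final class FailoverHandler implements InvocationHandler {
    private final List<Object> targets;

    FailoverHandler(List<?> targets) {
        this.targets = new ArrayList<Object>(targets);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        Throwable last = null;
        for (Object target : targets) {
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                // The node failed; remember why and try the next one.
                last = e.getCause();
            }
        }
        throw last != null ? last : new IllegalStateException("no targets configured");
    }
}

final class FailoverProxies {
    private FailoverProxies() {}

    // Wraps the given nodes behind a single object that implements the service interface.
    @SuppressWarnings("unchecked")
    static <T> T create(Class<T> type, List<? extends T> targets) {
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(),
                new Class<?>[] {type},
                new FailoverHandler(targets));
    }
}
```

The caller only ever sees a `GreetingService`, so the POJO that depends on it does not change. Note that the sketch retries on any exception, which is only safe for idempotent, stateless calls, and that is exactly the kind of extra consideration the post is warning about.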
    Colin Yates
    SpringSource - http://www.springsource.com - Spring Training, Consulting, and Support - "From the Source"
    Please read http://www.springframework.org/documentation
    Co-Author of Expert Spring MVC + Web Flow.

  4. #14
    Join Date
    Feb 2007
    Location
    Moscow, Russia
    Posts
    56

    Default

    Quote Originally Posted by gregturn View Post
    You are not going to tell me that having multiple servers, providing the same stateless service, is bad, are you?
    Except that the overhead of remote calls will very quickly eliminate any benefit you can achieve from this solution.

  5. #15
    Join Date
    Feb 2006
    Posts
    26

    Default

    Quote Originally Posted by gregturn View Post
    You are not going to tell me that having multiple servers, providing the same stateless service, is bad, are you?
    You guys are dancing around a few different concepts I think... I don't think that going through a load balancer would decrease the performance of an application and of a stateless business layer (assuming that there's already a physical business layer), nevertheless the first rule of distributed computing applies: Don't distribute your objects (unless you have to)! Obviously failover is a good enough reason to have a cold or hot backup server but that doesn't mean that a physical business layer is always the best thing.

    - Yagiz -
    http://blog.decaresystems.ie (Shameless "company blog link" )

  6. #16
    Join Date
    Dec 2005
    Posts
    16

    Default

    To bring several posts together...

    For HA, use a load balancer in front of your incoming requests (be it web services or browsers). Co-locate your spring services with your web server (tomcat, resin, whatever), keep them stateless, and cluster the pair. That way you can easily survive the loss of a machine.

    When you get serious about HA (5 9's), dual-path everything. Your load balancer gets a hot spare. Your internet connection becomes 2+ through different companies with as different traceroutes as possible. You dual nic everything through different switches. Your database gets clustered as well (Oracle RAC or DB2 EEE). Leave nothing to be a single point of failure.

  7. #17

    Default

    Having your Spring services be stateless is absolutely the way to go. There are always other issues that crop up with clustering, though. For example, if I'm using Hibernate I want to take advantage of the 2nd level cache as much as possible to avoid pounding the db server with the same request over and over again. But if each machine has its own cache, how do I handle updates to objects?

    One solution might be to have entries in the cache invalidated after a set amount of time so they will be refetched the next time they're used. If you don't need to see real-time updates as they occur, this could work. Or you could use JMS to communicate among the nodes and publish events. When an event is caught saying an object was updated, you could invalidate its entry in the local cache. You could also use Tangosol or Terracotta, as they provide distributed caches.
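
The first two ideas can be combined in one small local cache: entries expire after a fixed time, and an `invalidate` method can be called from whatever listener receives "object updated" events. The following is a minimal Java sketch under those assumptions. `ExpiringCache` is a hypothetical class, not Hibernate's 2nd level cache API, and the JMS wiring is left out entirely (a message listener would simply call `invalidate`).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Local per-node cache with time-based expiry and explicit invalidation.
final class ExpiringCache<K, V> {

    // A cached value together with the time it was loaded.
    private static final class Entry<T> {
        final T value;
        final long loadedAt;

        Entry(T value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;      // injected so expiry can be tested without sleeping
    private final Function<K, V> loader;   // stands in for the database lookup

    ExpiringCache(long ttlMillis, LongSupplier clock, Function<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.loader = loader;
    }

    // Returns the cached value, reloading it if it is missing or older than the TTL.
    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now - entry.loadedAt >= ttlMillis) {
            entry = new Entry<>(loader.apply(key), now);
            entries.put(key, entry);
        }
        return entry.value;
    }

    // Called when another node announces that the object behind this key changed.
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

With only the TTL, a node can serve stale data for up to `ttlMillis`. Adding the event-driven `invalidate` shortens that window to roughly the message delivery time, at the cost of running a messaging channel between the nodes.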

    Another problem is if you're using Quartz for jobs. If you've got a job you want to run every night at midnight, how do you make sure that only one of the nodes does the job? In the same vein, you might also want to have scheduled jobs that run on all the nodes. How do you accomplish that? I think Quartz can use a database for storing job information, and I know I've seen references to using Tangosol or Terracotta for accomplishing this sort of thing.

    One last problem that I can think of is if you're using Lucene to provide search capabilities in your application. You could use a database backed Directory, but this has performance problems. This is another area where you could have indexes on each machine and use some communication method (like JMS, Tangosol, or Terracotta) to tell the others when an object is added/updated and needs to be (re)indexed.

    These are all problems and some ideas I've seen floated around before. I've never actually heard any "we wanted to implement a cluster and here's how we did it" case studies. I'd be very interested to see someone talk about how they actually solved these problems and any additional hurdles they had to overcome.

    Rich

    PS: When I say cluster, I just mean a group of servers all running the same application with minimal communication between them. Ideally they wouldn't have to have any communication, but because of the issues I've highlighted above that doesn't seem to always be possible. I guess it would be better to call it a "farm" rather than a cluster.

  8. #18
    Join Date
    Jun 2007
    Location
    Kharkov, Ukraine
    Posts
    8

    Default

    Rich,
    Yes, you are correct - the problem with scheduled tasks really does occur if there are several servers in a cluster (for example: email sending, scheduled generation of some documents, file system monitoring, etc.).

    We solved that problem in quite a straightforward way - all servers (in the system where the cluster was used) communicated with a shared database. To eliminate task duplication, each task locked the appropriate record before execution (by writing the ID of the server and the time of locking), so other servers were able to check that lock and execute the task only if no lock existed.
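
A rough Java sketch of that lock-before-run idea follows. It is not the original implementation: `TaskLockTable` is a hypothetical in-memory stand-in for the shared database table, so that the example runs on its own. In a real database the same "first writer wins" step would have to be made atomic by the database itself, for example with a unique key on the task ID.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// In-memory stand-in for the shared lock table (assumption: one row per task run).
final class TaskLockTable {

    // What each lock record stores: who took it and when.
    static final class Lock {
        final String serverId;
        final long lockedAt;

        Lock(String serverId, long lockedAt) {
            this.serverId = serverId;
            this.lockedAt = lockedAt;
        }
    }

    private final ConcurrentMap<String, Lock> rows = new ConcurrentHashMap<>();

    // Atomically records the lock; returns false if another server already holds it.
    boolean tryLock(String taskId, String serverId, long now) {
        return rows.putIfAbsent(taskId, new Lock(serverId, now)) == null;
    }

    // Returns the ID of the server holding the lock, or null if the task is unlocked.
    String owner(String taskId) {
        Lock lock = rows.get(taskId);
        return lock == null ? null : lock.serverId;
    }
}

// One instance per server; all instances share the same lock table.
final class ClusteredTaskRunner {
    private final String serverId;
    private final TaskLockTable locks;

    ClusteredTaskRunner(String serverId, TaskLockTable locks) {
        this.serverId = serverId;
        this.locks = locks;
    }

    // Runs the task only if this server wins the lock; returns whether it ran.
    boolean runIfUnlocked(String taskId, Runnable task) {
        if (!locks.tryLock(taskId, serverId, System.currentTimeMillis())) {
            return false; // another server already took this run
        }
        task.run();
        return true;
    }
}
```

The sketch never releases a lock, so the task ID is assumed to identify a single scheduled run (for example, a task name plus the scheduled date). The stored lock time could also be used to detect and take over locks left behind by a crashed server, which is not shown here.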

    Caching - yes, another issue. We used some caching on the servers, but kept the caches synchronized via a distributed cache.

    Actually, I developed the Cluster4Spring project because at that moment there was no ready-to-use clustering solution for Spring and we needed to cluster an existing system. Yes, in most cases we used stateless services, and therefore clustering the system we developed was mostly a matter of correcting the appropriate XML mapping.

    Of course, not every system can be easily clustered, since there are specific requirements that must be supported by the system architecture (like stateless services). However, if the system is designed from scratch, it's possible to satisfy them.

    As for figures - the configuration of the real-life cluster we developed included servers of several types (based on the server's purpose): web servers, application servers, image processors, image generation servers, pdf generation servers, and upload processors (not counting database servers).

    In production, we have:
    8 web servers
    5 upload processing servers
    8 image processing servers
    3 pdf generation servers
    7 image generation servers

    At the time of writing, the uptime of the system is more than 3 months.

    Regards,
    Andrew Sazonov

  9. #19
    Join Date
    Jul 2007
    Posts
    3

    Default

    I think I'm in this situation right now!

    We have one web server and one db server, but our client wants to scale up the site to get more performance out of the software. The proposed solution is to set up a new webapp server and improve the db server.

    Could you please suggest the best way of implementing this?

    Of course I'm thinking about keeping the applications in sync, because data is stored in a local cache. Just a few words about the app: WebWork - Spring - Hibernate.
    A prediction game with a lot of rankings and ways of grouping players.

    As far as I understand, the options are either to use a distributed cache or to send messages as a signal to clear the cache.
    Is that correct? What would you suggest?

  10. #20
    Join Date
    Feb 2006
    Posts
    26

    Default

    My first question would be:
    - Currently, is there a performance problem?

    I guess the answer to this question is "yes".

    Then, the following question is "do you know where the bottleneck is?"
