Jan 24th, 2007, 08:59 AM
Rules and Caching...
What is the best way to architect a solution based on these requirements:
We need to process 10-20 events per second, delivered via JMS infrastructure to an application server (WebLogic or WebSphere). Each event needs to be scrutinized by some type of rules engine to find 'matches' based on rules defined in a database. These rules will be dynamic: customers update them regularly throughout the day via a web-based GUI. As the messages flow, we need to inspect their content and apply the current set of rules to find matches. If there is a match, we invoke some type of workflow by sending a message to another queue and go from there...
The key here is to apply the rules and find possible matches super fast. Doing a DB query for each event and then spinning through some type of structure is probably too slow... We would like the rules to be cached, with the cache refreshed each time the rules change or at a fixed time interval.
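A minimal sketch of the kind of cache described above, assuming the rules are loaded in bulk by some loader (the `RuleCache` class and its `loader` are hypothetical names, not part of any framework). It keeps an immutable snapshot that the event-processing path reads without locking, and replaces the snapshot either on demand (when the GUI saves a rule) or on a timer:

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical rule cache: holds an immutable snapshot and swaps it atomically. */
final class RuleCache<R> {
    private final Supplier<List<R>> loader;          // assumed: e.g. a DAO call that reads all rules
    private final AtomicReference<List<R>> snapshot;

    RuleCache(Supplier<List<R>> loader) {
        this.loader = loader;
        this.snapshot = new AtomicReference<>(List.copyOf(loader.get()));
    }

    /** Called for every event; never touches the database. */
    List<R> current() {
        return snapshot.get();
    }

    /** Reloads all rules and publishes the new snapshot in one atomic step. */
    void refresh() {
        snapshot.set(List.copyOf(loader.get()));
    }

    /** Time-based refresh; refresh() can also be called directly when a rule changes. */
    ScheduledExecutorService refreshEvery(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "rule-cache-refresh");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleAtFixedRate(() -> {
            try {
                refresh();
            } catch (RuntimeException e) {
                // Keep serving the old snapshot; an uncaught exception would cancel the schedule.
            }
        }, period, period, unit);
        return scheduler;
    }
}
```

Readers always see either the old or the new rule set in full, never a half-updated one, which is the main reason for swapping whole snapshots instead of mutating a shared collection.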
Jan 24th, 2007, 09:03 AM
You can obviously introduce a cache to prevent having to reload from the database. As for the actual rules, how complex are they? Are you planning to use a rules engine, or are you doing your own thing?
Jan 24th, 2007, 09:30 AM
Jan 24th, 2007, 10:42 AM
The rules themselves are NOT too complicated, they are combinations of several attributes and operands - for instance: IF business partner ID and Location A or Location B and not Location C and container Serial number range, and delivery date range, and EventType=(GateIn, or GateOut), then MATCH -> do work...
Originally Posted by karldmoore
But there will be LOTS of these rules (1800 partners, average of 8 rules each roughly about 15000 rules).
We'll need to make an architectural decision on whether or not we need a rules engine. Again, the foremost consideration is SPEED; that's of paramount importance. The current mainframe system that performs this process takes 20ms to determine whether an event has a match, and our solution needs to meet or beat that response time.
What would you recommend?
Jan 26th, 2007, 03:23 AM
I couldn't really recommend anything. Regardless of the chosen path I'm sure you could build a system that performs very well or very badly. I would do some research and some prototyping. I'd mock up the rules and data and do some profiling and testing. Unless you play with some technologies it's really hard to say if they would do what you want.
Jan 28th, 2007, 03:51 PM
Do you need only the first matching rule, all matching rules, or any one matching rule with no precedence between the rules?
Originally Posted by jdepaul
Will the number of rules grow without limit, or will it stay about the same?
Obviously, the rules can be cached. You could also build aggregate data structures (indexes) for certain kinds of checks so that matching rules can be found very quickly, and make a cost-based decision at each step: either keep using the indexes, or, once so few candidates remain that it is cheaper, evaluate the remaining candidates one by one.
E.g. a reverse mapping from checked values to rule ids (for instance, a bitmap index) can help with equality checks.
An ordered version of the reverse mapping (where the Map is a SortedMap) can help with range or greater/less than checks.
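The two reverse mappings could be sketched as follows. This is only an illustration under assumptions: rule ids are small integers used as bit positions, and the `AttributeIndex` class and its method names are made up for the example. The equality index maps a value to the rules requiring it; the sorted index maps a lower bound to the rules requiring "attribute >= bound", so all satisfied rules are found with one `headMap` call:

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical reverse indexes for one attribute; rule ids double as bit positions. */
final class AttributeIndex {
    // value -> rules requiring attribute == value
    private final Map<String, BitSet> equalsIndex = new HashMap<>();
    // bound -> rules requiring attribute >= bound
    private final TreeMap<Integer, BitSet> atLeastIndex = new TreeMap<>();

    void addEquals(int ruleId, String value) {
        equalsIndex.computeIfAbsent(value, k -> new BitSet()).set(ruleId);
    }

    void addAtLeast(int ruleId, int bound) {
        atLeastIndex.computeIfAbsent(bound, k -> new BitSet()).set(ruleId);
    }

    /** Rules whose equality check is satisfied by the event value. */
    BitSet matchingEquals(String eventValue) {
        BitSet hit = equalsIndex.get(eventValue);
        return hit == null ? new BitSet() : (BitSet) hit.clone();
    }

    /** Rules whose ">= bound" check is satisfied, i.e. every bound <= eventValue. */
    BitSet matchingAtLeast(int eventValue) {
        BitSet result = new BitSet();
        for (BitSet ids : atLeastIndex.headMap(eventValue, true).values()) {
            result.or(ids);
        }
        return result;
    }
}
```

The other comparison kinds would work the same way with `tailMap` or exclusive bounds, and the resulting bit sets can be combined with `and`/`andNot` to narrow the candidate set.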
The key thing is to be able to drop rules which do NOT match as efficiently as possible. For this, a good approach is to treat the event as the rule, and the existing rules as the data.
You also should maintain a list of existing rule kinds for each attribute.
A very simple example.
Let's assume, an event arrives for which attribute A = 'X', and attribute B='200'. For attribute A, there are only equality checks. For attribute B, there are greater than or less than or range checks or equality.
Let's also assume for the sake of simplicity, that there is only AND relationship allowed between checks in a rule.
From the rule candidates that filter on attribute A, only those requiring A='X' can remain. The rest of those filtering on attribute A should be dropped.
From those of the remaining rule candidates that filter on attribute B, only those can remain which
- require equality to 200
- require a value greater than or equal to some bound that is less than or equal to 200
- require a value greater than some bound that is less than 200
- require a value less than or equal to some bound that is greater than or equal to 200
- require a value less than some bound that is greater than 200
- range checks should be treated as an AND of the two boundary checks, split when the rule is parsed
The rest of those filtering on attribute B should be dropped.
The rule candidates which remained after dropping away those described in the previous steps are those which have conditions the new event satisfies.
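The walkthrough above (event with A = 'X' and B = 200, AND-only rules) can be sketched without any index, just to make the elimination logic concrete. All type names here (`Op`, `Check`, `Rule`, `CandidateFilter`) are hypothetical, and numeric operands are assumed to be integers:

```java
import java.util.ArrayList;
import java.util.List;

enum Op { EQ, GE, GT, LE, LT }

/** One check of an AND-only rule; a range is stored as a GE check plus an LE check. */
record Check(String attribute, Op op, Object operand) {
    boolean satisfiedBy(Object eventValue) {
        if (op == Op.EQ) return operand.equals(eventValue);
        int cmp = Integer.compare((Integer) eventValue, (Integer) operand);
        switch (op) {
            case GE: return cmp >= 0;
            case GT: return cmp > 0;
            case LE: return cmp <= 0;
            default: return cmp < 0;   // LT
        }
    }
}

record Rule(int id, List<Check> checks) {}

final class CandidateFilter {
    /** Drops every candidate that has a check on `attribute` the event value violates. */
    static List<Rule> dropNonMatching(List<Rule> candidates, String attribute, Object eventValue) {
        List<Rule> remaining = new ArrayList<>();
        for (Rule rule : candidates) {
            boolean violated = false;
            for (Check c : rule.checks()) {
                if (c.attribute().equals(attribute) && !c.satisfiedBy(eventValue)) {
                    violated = true;
                    break;
                }
            }
            if (!violated) remaining.add(rule);   // rules not filtering on the attribute stay
        }
        return remaining;
    }
}
```

Calling `dropNonMatching(rules, "A", "X")` and then `dropNonMatching(result, "B", 200)` leaves exactly the rules whose conditions the event satisfies. An indexed version would reach the same result set without visiting every rule.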
Of course, the real task is much more complex, considering you do not have only an AND relationship between checks in a rule, but dropping away rules with top-level checks in the expression AND-ed to the rest of the expression is a good first step to reduce the rules you need to fully evaluate.
And naturally, you might come up with some clever solutions for OR-ed expression parts.
Also, if you can come up with a good order in which to check the reverse indexes of the AND-ed expressions, you can finish much faster. E.g. equality checks first, IN checks second, greater/less than third, etc. You might also arrive at a different order based on your experience and your data, or order the attributes by the number of rules checking each attribute (more, and preferably more diverse, checks on an attribute reduce the candidate set much more effectively).
Jan 28th, 2007, 04:32 PM
Using JMS message selectors...
Thanks for the exhaustive reply and for ideas on a possible approach - this technique may be a way to go when 'rolling my own solution' and as you point out it may be tricky at best...
What do you all think of the following alternative approach to this problem:
As each rule is created via the GUI, it is converted into a unique JMS message listener with a message selector that is equal to the rule in effect. Thus each rule has its own listener that is ONLY concerned with those messages that EXACTLY match the rule criteria defined by the user. This way the JMS provider does the message brokering and delivers each message ONLY to the listeners whose rule it matches, and nothing else... I have yet to test it out, but what do you think of the approach? Any limitations you'd warn about? Any gotchas?
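To make the idea concrete, here is a sketch of turning one rule into a selector string. JMS selectors use a subset of SQL-92 conditional syntax and can only reference message header fields and properties, not the message body, so this assumes the producer copies the relevant attributes into message properties. The property names (`partnerId`, `location`, `containerSerial`, `eventType`), the numeric serial number, and the `SelectorBuilder` class are all assumptions made for the example:

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that renders one rule as a JMS message selector string. */
final class SelectorBuilder {
    /** String literals use single quotes; an embedded quote is doubled. */
    static String quote(String s) {
        return "'" + s.replace("'", "''") + "'";
    }

    static String in(String property, List<String> values) {
        return property + " IN ("
                + values.stream().map(SelectorBuilder::quote).collect(Collectors.joining(", "))
                + ")";
    }

    static String between(String property, long low, long high) {
        return property + " BETWEEN " + low + " AND " + high;
    }

    /** All conditions are AND-ed, mirroring the rule shape described in the thread. */
    static String selectorFor(String partnerId, List<String> locations,
                              long serialLow, long serialHigh, List<String> eventTypes) {
        return String.join(" AND ",
                "partnerId = " + quote(partnerId),
                in("location", locations),
                between("containerSerial", serialLow, serialHigh),
                in("eventType", eventTypes));
    }
}
```

The resulting string would then be passed as the selector argument when creating the consumer, e.g. `session.createConsumer(destination, selector)`. Whether a provider copes well with roughly 15000 such consumers is exactly what would need testing.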
Thanks in advance -
Jan 28th, 2007, 04:45 PM
You might even try evaluating all rules as boolean expression trees to see what performance a naive brute-force approach gives on the expected platform.
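A brute-force baseline of this kind is quick to prototype. The sketch below is one possible shape, with hypothetical names (`Expr`, `BruteForceMatcher`) and events modelled as a plain attribute map; each rule is a tree of AND/OR/NOT nodes over equality and range leaves, and every rule is evaluated for every event:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** A node of a boolean expression tree, evaluated against an event's attributes. */
interface Expr {
    boolean eval(Map<String, Object> event);

    static Expr eq(String attr, Object value) {
        return e -> value.equals(e.get(attr));
    }

    static Expr between(String attr, int low, int high) {
        return e -> e.get(attr) instanceof Integer v && v >= low && v <= high;
    }

    static Expr and(Expr... parts) {
        return e -> {
            for (Expr p : parts) if (!p.eval(e)) return false;   // short-circuit on first failure
            return true;
        };
    }

    static Expr or(Expr... parts) {
        return e -> {
            for (Expr p : parts) if (p.eval(e)) return true;     // short-circuit on first success
            return false;
        };
    }

    static Expr not(Expr inner) {
        return e -> !inner.eval(e);
    }
}

final class BruteForceMatcher {
    /** Evaluates every rule and returns the positions of all matching rules. */
    static List<Integer> matchAll(List<Expr> rules, Map<String, Object> event) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < rules.size(); i++) {
            if (rules.get(i).eval(event)) hits.add(i);
        }
        return hits;
    }
}
```

With about 15000 rules and up to 20 events per second, this amounts to roughly 300,000 rule evaluations per second; timing `matchAll` against mocked-up rules would show whether that fits the 20ms budget before any indexing is built.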
The solution you suggested would mean that you depend on the JMS provider to have an optimized rule engine which does the exact same things which I mentioned, or that it has the sheer power to evaluate the rules in time.
Also, depending on how well the JMS provider lets you define a precedence order among the listeners, the number and identity of the rules that actually process a message may or may not be what you expect.
It can be tried, it may be fast enough. But then, don't forget about your need to durably subscribe those listeners to your JMS destination, because otherwise you would lose the rules when the listener session terminates.
I am actually a bit wary of this, but on the other hand, message brokers have been doing similar things for decades now. They might well be properly optimized for this many listeners.
Last edited by robvarga; Jan 28th, 2007 at 04:49 PM.
Jan 28th, 2007, 05:05 PM
As an independent line of thought, it might be a good approach for the rule processing to just forward the message to another destination (or destinations), with information about the relevant rules added to the new message or its headers, and to register listeners for the different actions that filter on the newly added information.
This way it might be easier to scale the processing power and licenses required to process the rules independently from the processing power and licenses required for actually handling the events after the proper action is determined.
Of course it really depends on your situation.
Last edited by robvarga; Jan 28th, 2007 at 05:11 PM.
Jan 29th, 2007, 07:56 PM
Think I'm going to give this one a try: http://www.jessrules.com/
Originally Posted by robvarga