Musings on SOA Design

17 February 2010 Steve Núñez No comments

There’s a good reason that many software developers and system administrators prefer Unix (and Unix-like operating systems like Linux) over the alternatives. I recently came across a copy of Eric Steven Raymond’s book, The Art of Unix Programming, now available online at no cost. I used this book years ago when I was programming equity trading systems on Solaris workstations and found it very useful. While browsing the online version of the book, I happened upon the chapter Basics of the Unix Philosophy and was reminded of how well the Unix design philosophy, 20 years later, maps to good SOA design.

It’s worth repeating Eric’s opening explanation of what the Unix philosophy is and where it came from:

“The ‘Unix philosophy’ originated with Ken Thompson’s early meditations on how to design a small but capable operating system with a clean service interface. … The Unix philosophy is not a formal design method. It wasn’t handed down from the high fastnesses of theoretical computer science as a way to produce theoretically perfect software. Nor is it that perennial executive’s mirage, some way to magically extract innovative but reliable software on too short a deadline from unmotivated, badly managed, and underpaid programmers.”

How many large enterprise SOA (or software development) projects have come unglued after wasting untold millions? Quite a few in my personal experience, as well as others (see Tim Bray’s Doing it Wrong). In most of these cases, a huge up-front investment was made in an attempt to codify, document, over-design and cater to competing interests, while basic design principles (a ‘design philosophy’) were overlooked. I’d like to propose the following high level design principles as the basis for good SOA design, borrowed and modified from Doug McIlroy:

  • Make each service do one thing well. To do a new job, build afresh rather than complicate old services by adding new features.
  • Expect the output of every service to become the input to another, as yet unknown, service. Don’t clutter output with extraneous information. Avoid stringently columnar or binary input formats.
  • Design and build services to be tried early, ideally within weeks. Don’t hesitate to throw away the clumsy parts and rebuild them.
  • Use tools in preference to unskilled help to lighten a programming task, even if you have to detour to build the tools and expect to throw some of them out after you’ve finished using them.

Basics of the Unix Philosophy is a great read for anyone designing services in a SOA. Nearly every rule is directly applicable to the process; in fact many require no ‘translation’ from the world of software design, where they were created, to the distributed world of SOA. Let’s look at each of the 17 rules and how they apply to good SOA design.

Rule of Modularity

Write simple parts connected by clean interfaces. This is probably the most important of the rules. It can be difficult, or require extra work, to create clean interfaces, especially when dealing with legacy systems, but it’s essential for ongoing maintenance. Eric writes:

“The only way to write complex software that won’t fall on its face is to hold its global complexity down — to build it out of simple parts connected by well-defined interfaces, so that most problems are local and you can have some hope of upgrading a part without breaking the whole.”

and what is a SOA except a very large, complex, distributed piece of software? In this light, consider RESTful interfaces over SOAP/WSDL, unless WS-* is a requirement and the functionality can’t easily be built without it. There are many cases where the choice of one vs. the other is a requirement, in which case the choice is made for you, but when designing, favor the simpler.

It’s easy enough to design simple, clean components when in a greenfield project, but as soon as legacy systems enter the picture (and they almost certainly will), it gets a bit tougher. The orchestration layer is where this will happen, and it’s important to focus properly on this layer when designing: simple clean interfaces exposed to the enterprise, and the proprietary, ugly interfaces localised in legacy applications. This is where problems are most often seen, especially since SOA projects are chronically under- or incrementally funded. This is also where trial-and-error and throw-away prototypes are critical. Equally important is to have an end in mind: how/when/where will those legacy systems be replaced? If no one has an answer, make an assumption and plan accordingly.
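To make the idea of localising the ugly interface concrete, here is a minimal sketch. Everything in it is hypothetical: `LegacyLedger`, its `FETCH_ACCT_REC` call and the fixed-width record layout are invented stand-ins for whatever a real legacy system exposes, and `AccountService` is just one possible shape for the clean side.

```python
from dataclasses import dataclass


class LegacyLedger:
    """Hypothetical legacy system with an awkward, fixed-width interface."""

    def FETCH_ACCT_REC(self, acct_no: str) -> str:
        # Record layout (assumed): 10-char account number, 12-digit balance in cents.
        balances = {"0000000042": 125000}
        return f"{acct_no:>10}{balances[acct_no]:012d}"


@dataclass(frozen=True)
class Account:
    account_id: str
    balance: float


class AccountService:
    """Clean interface exposed to the enterprise; legacy details stay inside."""

    def __init__(self, ledger: LegacyLedger) -> None:
        self._ledger = ledger

    def get_account(self, account_id: str) -> Account:
        # All knowledge of padding and record offsets is confined to this method.
        record = self._ledger.FETCH_ACCT_REC(account_id.zfill(10))
        cents = int(record[10:22])
        return Account(account_id=record[:10].lstrip("0"), balance=cents / 100)
```

Callers only ever see `get_account("42")` and an `Account`; when the legacy ledger is eventually replaced, only the body of `get_account` should need to change.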

Rule of Clarity

Clarity is better than cleverness. Code is written once and read many times. While not specifically related to SOA, as a general principle, making interfaces clear and simple reduces problems later. Does a service in the registry with the name eligibleCounterparty() determine if the counterparty is eligible, or make it eligible? Does REQUIRE-SSL=1 in a config file mean that it does or does not require SSL? Coding standards and guidelines don’t have to be draconian; get developer buy-in early.
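One way to remove the ambiguity in the eligibleCounterparty() example is to split the query from the command and name each for what it does. The names, the rating set and the registry dictionary below are illustrative assumptions, not anything taken from a real service registry.

```python
ELIGIBLE_RATINGS = {"AAA", "AA", "A"}  # assumed for illustration


def is_counterparty_eligible(rating: str) -> bool:
    """Query only: reports eligibility and changes nothing."""
    return rating in ELIGIBLE_RATINGS


def mark_counterparty_eligible(registry: dict, name: str) -> None:
    """Command only: records that a counterparty has been made eligible."""
    registry[name] = True
```

The same applies to configuration: a key such as `ssl_required = true` leaves less room for doubt than `REQUIRE-SSL=1`.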

Rule of Service (de)-Composition

Design services to be connected to other services. This is probably the most radical shift in thinking. Many people think of services as producers/consumers for applications only. This form of design limits the possibilities for reuse, and reduces the agility with which services can be recombined into new services or applications. If services are easy to connect to one another, new custom applications can be quickly created out of existing services. Imagine being able to string enterprise services together the same way you do Unix commands connected by pipes. Make I/O formats simple, and create services that provide simple data augmentation, transformation or filtering, as well as serving as ultimate sources/sinks of data.
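The pipe analogy can be sketched in a few lines. This is a toy, in-process model under stated assumptions: each "service" is a function from a stream of records to a stream of records, and `trades_source`, `only_large`, `add_notional` and the price table are all invented for the example. Real services would sit behind network interfaces, but the composition idea is the same.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator

Service = Callable[[Iterable[dict]], Iterator[dict]]

PRICES = {"ABC": 10.0, "XYZ": 2.5}  # assumed reference data


def trades_source(_: Iterable[dict]) -> Iterator[dict]:
    """Source: ignores its input and emits trade records."""
    yield {"id": 1, "symbol": "ABC", "qty": 100}
    yield {"id": 2, "symbol": "XYZ", "qty": 400}
    yield {"id": 3, "symbol": "ABC", "qty": 250}


def only_large(records: Iterable[dict]) -> Iterator[dict]:
    """Filter: passes through trades of 200 units or more."""
    return (r for r in records if r["qty"] >= 200)


def add_notional(records: Iterable[dict]) -> Iterator[dict]:
    """Augmenter: adds one field, leaves the rest of the record alone."""
    return ({**r, "notional": r["qty"] * PRICES[r["symbol"]]} for r in records)


def pipe(*services: Service) -> Callable[[], Iterator[dict]]:
    """Connect services left to right, like `a | b | c` in a shell."""
    def run() -> Iterator[dict]:
        return reduce(lambda stream, svc: svc(stream), services, iter(()))
    return run


large_trades = pipe(trades_source, only_large, add_notional)
```

Because every stage shares the same simple record format, a new stage can be dropped into the chain without touching the others.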

Rule of Separation

Separate policy from mechanism; separate interfaces from engines. Service policies such as authentication, access control, etc. need to be implemented as a separate framework and not integrated into services. Policies change far more frequently than mechanisms, therefore service mechanisms should be implemented as first class objects in the enterprise. This can often be combined in a registry.
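A minimal sketch of keeping policy out of the service body follows. It assumes a role-based access policy held in a plain dictionary (`ACCESS_POLICY`) and a hypothetical `place_order` service; a real deployment would more likely pull the policy from a registry or policy server.

```python
import functools

# Policy lives in data, outside the services, so it can change on its own schedule.
ACCESS_POLICY = {
    "get_quote": {"trader", "analyst"},
    "place_order": {"trader"},
}


class PolicyError(PermissionError):
    """Raised when the caller's role is not allowed to use a service."""


def enforce(policy: dict):
    """Wrap a service so the access check happens before the mechanism runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(role: str, *args, **kwargs):
            # Looked up at call time, so policy edits take effect immediately.
            if role not in policy.get(func.__name__, set()):
                raise PolicyError(f"{role!r} may not call {func.__name__}")
            return func(*args, **kwargs)
        return wrapper
    return decorator


@enforce(ACCESS_POLICY)
def place_order(symbol: str, qty: int) -> str:
    # Mechanism only: no knowledge of who is allowed to call it.
    return f"ordered {qty} {symbol}"
```

Granting analysts the right to place orders is then a one-line change to `ACCESS_POLICY`, with no change to `place_order` itself.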

Rule of Simplicity & Rule of Parsimony

Design for simplicity; add complexity only where you must.

“Even more often (at least in the commercial software world) excessive complexity comes from project requirements that are based on the marketing fad of the month rather than the reality of what customers want or software can actually deliver. Many a good design has been smothered under marketing’s pile of “checklist features” — features that, often, no customer will ever use. And a vicious circle operates; the competition thinks it has to compete with chrome by adding more chrome. Pretty soon, massive bloat is the industry standard and everyone is using huge, buggy programs not even their developers can love.” — Eric Steven Raymond

It’s difficult to find many architects who would disagree with the goal of simple service design, until they’re all in a room arguing for their pet interests. This is an aspect of SOA design that isn’t (usually) technical, but more often than not organizational and political. At an enterprise level, having a strong, motivated team of leading (enterprise) architects who have internalised good design principles is essential.

This rule is often violated by ‘creeping featureitis’, where the temptation is strong to add just a little bit more here or there to satisfy some requirement. One good way to counter these tendencies is by the rule of parsimony: Write a big program only when it is clear by demonstration that nothing else will do. If a solution architect is proposing a large, complicated service, the burden of proof is on them to demonstrate that this is the only way to meet the requirement. Minimize this tendency by making it easy to create new services; lower the overheads in administration, approval, and deployment of services for example. Clear examples and templates that can be used off-the-shelf help as well.

Rule of Transparency

Design for visibility to make inspection and debugging easier. It’s hard to overestimate the importance of this rule in minimizing costs, frustration and project delays. Best practice these days says that a Common Alerting, Logging & Exception (CALE) framework should be used for this purpose. Several vendors and consulting companies offer these in various forms. Some you get ‘free’ with consulting, some are offered as products in their own right, though usually unsupported except by more consulting. A CALE framework, whether purchased or bespoke, will repay the price many times over. Consider writing application specific tests that utilize this framework too. A CALE will find many uses in a SOA.

Rule of Robustness & Rule of Repair

Postel’s Prescription says: “Be liberal in what you accept, and conservative in what you send”. He wrote this in the context of traditional Unix network service programs (e.g. sendmail), but it’s a good strategy for SOA services too. In an environment with many services connected together and in use by unknown clients, be tolerant in handling inputs. Ensure that input processing is simple and that you conform strictly to the output specified for the service. If sending/receiving XML, it’s worth considering Doug McIlroy’s warning:

“The original HTML documents recommended “be generous in what you accept”, and it has bedeviled us ever since because each browser accepts a different superset of the specifications. It is the specifications that should be generous, not their interpretation.”

Repair what you can, but when you must fail, fail noisily and as soon as possible, which is simple enough not to warrant further comment.
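For readers who want to see both rules side by side, here is a small sketch. The order message, its field names and the `handle_order` function are assumptions made up for the example. Input handling tolerates key case, stray whitespace and thousands separators; the output is always the same strict shape; anything that cannot be repaired raises immediately.

```python
import json


def parse_quantity(raw) -> int:
    """Liberal input: accept ints or numeric strings like ' 1,000 '."""
    if isinstance(raw, bool) or not isinstance(raw, (int, str)):
        raise ValueError(f"unsupported quantity: {raw!r}")
    # int() raises ValueError on unrepairable text, which is the noisy failure we want.
    value = raw if isinstance(raw, int) else int(raw.strip().replace(",", ""))
    if value <= 0:
        raise ValueError(f"quantity must be positive: {raw!r}")
    return value


def handle_order(message: dict) -> str:
    # Tolerate differences in key case and ignore fields we do not know about.
    fields = {key.lower(): value for key, value in message.items()}
    symbol = str(fields["symbol"]).strip().upper()  # KeyError if missing: fail early
    qty = parse_quantity(fields["qty"])
    # Conservative output: fixed keys, fixed types, stable ordering.
    return json.dumps({"qty": qty, "symbol": symbol}, sort_keys=True)
```

Note that the repair is limited to harmless normalisation; the sketch does not try to guess at a quantity such as "lots".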

Rule of Representation

Fold knowledge into data so program logic can be stupid and robust. Data representation and transformation are major challenges facing any SOA designer today, with plenty of literature on the topic. From the viewpoint of Unix Philosophy & SOA, consider simple data representations like JSON over XML when possible. They’re easier to change and are mostly understandable by humans (see the Rule of Economy).
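"Fold knowledge into data" is easiest to see with a table-driven example. The fee schedule below is invented for illustration (the tiers and basis-point rates are assumptions); the point is that the business knowledge sits in a data structure while the logic is a dumb loop.

```python
# Knowledge as data: (minimum notional, fee in basis points), highest tier first.
FEE_SCHEDULE_BPS = [
    (1_000_000, 5),
    (100_000, 10),
    (0, 20),
]


def fee(notional: int) -> float:
    """Stupid, robust logic: take the first tier the notional qualifies for."""
    for minimum, bps in FEE_SCHEDULE_BPS:
        if notional >= minimum:
            return notional * bps / 10_000
    raise ValueError("notional must be non-negative")
```

Adding a tier or changing a rate is an edit to `FEE_SCHEDULE_BPS`, which could just as easily be loaded from a JSON document as hard-coded.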

Rule of Least Surprise

In interface design, always do the least surprising thing. Does your organisation have conventions about the location, format or type of configuration file? If so, use them. It’s very unlikely that you’re going to come up with a better solution because your config file ends in ‘;’ instead of ‘\n’. In the same vein, if BO refers to ‘back office’, stick with the convention! The goal here is to ease the learning curve for others and minimise surprises.

Rule of Silence

When a program has nothing surprising to say, it should say nothing. If services are designed to work together, to create/consume data, silence is golden. If there’s nothing to report, don’t report anything. If there’s no filter/augment/transform to perform, do nothing; just pass the data along unchanged with nothing extra.
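As a small illustration of the pass-through case, consider a hypothetical augmenting service that adds a country name when it can. The `enrich` function and its lookup table are assumptions for the example.

```python
def enrich(record: dict, country_names: dict) -> dict:
    """Add a country name when one is known; otherwise stay silent."""
    code = record.get("country")
    if code not in country_names:
        # Nothing to add: return the record untouched, with no placeholder
        # fields, warnings or status flags for downstream services to parse.
        return record
    return {**record, "country_name": country_names[code]}
```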

Rule of Economy

Programmer time is expensive; conserve it in preference to machine time. Although the rule is dated, it’s worth remembering that programmer time is still expensive, probably even more so today. Anything that increases programmer productivity, as many of these rules do, is probably worth implementing.

Rule of Generation

Avoid hand-hacking; write programs to write programs when you can. CORBA has the IDL (Interface Definition Language) generator. While CORBA has generally fallen out of favour today, there are still many organisations using it, and IDL is one example of where complexity can be removed by code generation. In the Java world, JAXB is an example of where code generation can save many hours of programmer time. Standardizing these technologies across the enterprise (as opposed to ad-hoc usage) can yield many benefits.
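The principle can be shown with a toy generator. This is not how IDL compilers or JAXB work internally; it is a deliberately tiny sketch in which a hypothetical field specification (name to type name) is turned into the source of a Python data class.

```python
def generate_class(name: str, fields: dict) -> str:
    """Return Python source for a data class described by `fields`.

    `fields` maps attribute names to type names, e.g. {"qty": "int"}.
    """
    lines = [
        "from dataclasses import dataclass",
        "",
        "",
        "@dataclass",
        f"class {name}:",
    ]
    lines += [f"    {field}: {type_name}" for field, type_name in fields.items()]
    return "\n".join(lines) + "\n"


# The generated text would normally be written to a file and imported.
source = generate_class("Trade", {"symbol": "str", "qty": "int"})
```

The specification, not the generated class, becomes the thing people maintain, which is the saving the rule is after.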

Rule of Optimization

Prototype before polishing. Get it working before you optimize it. This is one of my favourites. I can’t tell you how many times I’ve seen companies spend countless hours discussing perceived performance issues before a single line of code was written — all attempting to address problems that are unlikely to ever be seen. Often the design work for the case encountered 0.0001% of the time wastes so much effort to sign off that the rest of the process is short-changed on resources. Eric writes:

‘Donald Knuth (author of The Art Of Computer Programming, one of the field’s few true classics) popularized the observation that “Premature optimization is the root of all evil”.’

This is one of the most overlooked, yet most important of the SOA design rules. If you really think there’s a performance problem, build a tool to test your hypothesis.
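A tool for testing a performance hypothesis does not have to be elaborate. The sketch below uses Python's standard `timeit` module; the `measure` helper and its default repeat counts are assumptions chosen for the example.

```python
import timeit


def measure(func, *args, repeat: int = 5, number: int = 1000) -> float:
    """Return the best observed time per call, in seconds.

    Taking the minimum over several repeats reduces noise from other
    activity on the machine.
    """
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number
```

Calling `measure(candidate, sample_input)` for each design under discussion replaces an argument about perceived performance with a number.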

Rule of Diversity

Distrust all claims for “one true way”. (Unless you’re talking about emacs vs. vi). Make it easy to access services from multiple operating systems, languages, locations. Consider adding scripting languages to all services, via standardised interfaces (Guile is one of my favourites for this).

Rule of Extensibility

Design for the future, because it will be here sooner than you think. Plan on services being used in ways you haven’t imagined. Make data formats easy to change. Make configuration files extensible. The only constant today is change; services had better be ready for it.

Categories: soa Tags:

The Year of Living Asynchronously

13 January 2010 Saul Caganoff No comments

Happy New Year! Asynchronicity is busting out all over the web and my prediction is that 2010 will be the year of “events”:

  • Of course Twitter has brought the concept of publish/subscribe messaging to the masses and we enjoyed their journey of discovery to the heights of scalability in 2009.
  • XMPP has been embraced by the real-time web crowd, most publicly in Google Wave but also in other “back-web” contexts such as Gnip.
  • Web sockets is an experimental feature of HTML 5 which enables push messages directly to web pages.
  • New frameworks for event-driven programming are emerging such as EventMachine, Twisted and Node.js.
  • In 2009 every major software vendor had a CEP product.

In the meantime, SOA has become so damn synchronous. But it doesn’t have to be!

One of the fundamental tenets of SOA is that reducing coupling between systems makes them more scalable, reliable and agile (easier to change). SOA goes a long way to reducing coupling by providing a contract-based, platform independent mechanism for service providers and consumers to cooperate. However, I still think we can improve on current SOA practices in further reducing coupling.

Coupling still intrudes into many aspects of how SOA is practiced today:

  • HTTP transports tie us to a regimen of synchronous request-reply with timeouts which creates tight couplings between provider and consumer. Even though one-way MEPs were an original feature of SOAP, message-oriented transports remain the forgotten orphan of web-services standards.
  • Many SOA services are conceived, implemented and maintained as point-to-point entities…providers and consumers forced into lock-step due to inadequate versioning and lifecycle management.
  • Process orchestration layers often form a bridge between service providers and consumers, which on the face of it provides some level of indirection. But in many cases orchestration provides limited value and may actually serve to increase the overall system coupling.

In many cases we can achieve the benefits of service orientation to much greater effect by exercising a little scepticism toward some of these shibboleths of the web services world and embracing a more asynchronous, event-oriented way of building processes. So this year, embrace your asynchronous side and do something to reduce your system coupling: build some pub/sub services, learn about Event Processing or Event-Driven Architecture, try one of the technologies I pointed to above.
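For anyone who has not built a pub/sub service before, the core idea fits in a few lines. This is an in-memory toy under stated assumptions: the `Broker` class, the topic names and the explicit `dispatch` step are all invented for illustration, and a real system would use a message broker or XMPP rather than a Python queue.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory broker: publishers and subscribers never meet."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)
        self._pending = deque()

    def subscribe(self, topic: str, handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Returns immediately; the publisher does not wait for any consumer.
        self._pending.append((topic, event))

    def dispatch(self) -> int:
        """Deliver queued events; returns the number of deliveries made."""
        delivered = 0
        while self._pending:
            topic, event = self._pending.popleft()
            for handler in self._subscribers.get(topic, ()):
                handler(event)
                delivered += 1
        return delivered
```

The publisher knows only a topic name, so consumers can be added, removed or versioned without the provider changing, which is the reduction in coupling argued for above.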

Just as developers should embrace multiple languages to broaden their skills, so should architects embrace and be fluent in multiple architectural styles.

Categories: Uncategorized

WSO2 Business Activity Monitoring & Integrated Middleware Stack

16 December 2009 Steve Núñez No comments

WSO2 announced today the release of an integrated middleware stack, a business activity monitor and something they call a ‘Gadget Server’. Gadget server is a bit difficult to characterize, but it looks to be yet another way of building customised portals utilising the Google Gadget specification. The Gadget server is intended to work closely with the Business Activity Monitoring (BAM) product, simply called BAM.

BAM is interesting from an EDM perspective, and the product sheet lists some useful functionality for building a decisioning dashboard:

  • Data visualisation (via the Gadgets)
  • Analytics
  • KPI monitoring

If Drools were added into the mix, which is possible but non-trivial due to the lack of an OSGi interface, you’d have all the components needed for an open source decisioning stack.

WSO2 isn’t alone in offering these technologies as open source, but as I mentioned in a previous review of ESB 1.0, they have a minimalist and modular philosophy that I really like. I’m looking forward to a full review of these releases.

Categories: Decision Management Tags:

EDM & Real-time Analytics

2 October 2009 Steve Núñez No comments

Recently Nati Shalom wrote about real time analytics using map/reduce. Nati usually writes about middleware, SOA and the like (which are important in the context of providing enterprise wide decision services), but in this entry he discusses mechanisms to perform real-time number crunching from a nuts-and-bolts viewpoint (mostly).

From an EDM perspective, this is interesting, especially in finance. While the majority of the models used as input for EDM solutions today, built by companies like FICO, SAS and SPSS are predictive models (usually of consumer behaviour), technologies such as map/reduce might allow us to incorporate things like VAR and performance metrics into our decision models.

Map/reduce techniques, especially combined with the ease of gathering, manipulating and storing extremely large datasets that ‘the cloud’ provides, present some interesting practical applications in the analytic space. I especially like this quote of Nati’s:

“For example it is possible to outsource the entire analytic process to someone else and use it as a service.”

for which we already have some examples, and which reminds me of Friedman’s ‘Flat World’.

Rule Engine Benchmarking: What does it mean?

29 September 2009 Steve Núñez 1 comment

Nothing in this small corner of the IT universe (rules engines) seems to spark more interest or attention than benchmarks. It’s nice to be able to engage in some technical debate again. These days most of my time is spent working with business users, explaining value propositions and other less challenging topics. So, for those technically minded, I hope to show that despite exaggerated vendor claims, trivial patents and fancy marketing terms like ‘linear inferencing’, ‘rete II’, ‘FastRete’, and others, no engine has a fundamental algorithmic advantage over the others.

Let me make a few comments about rules engines, performance and benchmarking, based on some of the recent discussion I’ve had so far.

1. Micro-benchmarks do not reflect real-world problems.
2. New benchmarks that more closely match real-world business problems would be welcome. Many vendors have these benchmarks already written, and there have been hints some might be released, but to date we’ve received nothing (hint, hint: you know who you are).

With that out of the way, let’s talk about the fundamental performance characteristics of rules engines. If there are no fundamental algorithmic advantages of one engine over the other, what exactly are we measuring with benchmarks? Simple: implementation efficiency. There are a number of ways of implementing Rete (and derivatives), and each implementation will be more or less efficient depending on the resources spent, talent of the programmers and design goals.

Rules Engines and NP Class Problems

The idea for this post came when a friend of mine sent me a link to a recent ACM article on the P vs. NP problem. This problem has been known for years, and is the fundamental issue in rule engine performance, or any other computational performance problem. The article, in describing an NP-complete example, poses the question of devising an algorithm that could “sit students around a large round table with no incompatible students sitting next to each other”. Does this problem sound familiar? To anyone who’s read the definition of the Miss Manners benchmark, it should.
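The seating problem quoted from the article can be attacked by brute force, which makes the cost visible. The sketch below is not the Miss Manners benchmark itself (which has its own rules about guests and hobbies); it is a naive solver for the article's version, with invented student names, and it assumes at least one student.

```python
from itertools import permutations


def seat(students: list, incompatible: set):
    """Return a round-table order with no incompatible neighbours, or None.

    `incompatible` is a set of frozenset pairs. The first student is fixed
    in place (rotations of a round table are equivalent), and every ordering
    of the rest is tried: (n - 1)! candidates in the worst case.
    """
    first, *rest = students
    for order in permutations(rest):
        table = [first, *order]
        neighbours = zip(table, table[1:] + table[:1])
        if all(frozenset(pair) not in incompatible for pair in neighbours):
            return table
    return None
```

With four students the search is instant; each extra student multiplies the worst-case work, which is why naive approaches stop being practical very quickly.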

Rules engines are implementations of algorithms for the Boolean Satisfiability Problem on a given data set (leaving aside for a moment existential operators). The Wikipedia definition is repeated below:

In complexity theory, the Boolean Satisfiability Problem (SAT) is a decision problem, whose instance is a Boolean expression written using only AND, OR, NOT, variables, and parentheses. The question is: given the expression, is there some assignment of TRUE and FALSE values to the variables that will make the entire expression true? (e.g. a rule eligible to fire) A formula of propositional logic is said to be satisfiable if logical values can be assigned to its variables in a way that makes the formula true. The boolean satisfiability problem is NP-complete. (Emphasis mine)

NP-complete is the very hardest class of the NP problems.
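To see why satisfiability gets expensive, here is the naive approach written out. It is only a sketch of exhaustive search, not how a Rete-based engine evaluates rules; the variable names and the sample condition are made up for the example.

```python
from itertools import product


def satisfying_assignments(variables: list, formula):
    """Yield every TRUE/FALSE assignment that makes `formula` true.

    `formula` takes a dict of variable name to bool. All 2 ** len(variables)
    assignments are tried, so the work doubles with each added variable.
    """
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            yield assignment


def condition(v: dict) -> bool:
    # A rule condition such as "(a OR b) AND NOT c".
    return (v["a"] or v["b"]) and not v["c"]


solutions = list(satisfying_assignments(["a", "b", "c"], condition))
```

Three variables mean 8 candidate assignments; thirty mean over a billion.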

Solving the Hardest Problems

To cut a long story short: There is no known way to solve all NP-complete problems in less than exponential time. Some problems, using specialised algorithms, can do better but in the general case, it’s going to be exponential. The article gives an example of using a ‘cutting plane’ algorithm to solve the traveling salesman problem for up to 10,000 cities, and several other specialised cases where we can solve NP-complete problems in less than exponential time.

I haven’t formally examined all of the benchmarks we run, but at least two of them, Sudoku and Manners, are NP-complete. I’ll go out on a limb here and assume that the others are too. So we have the situation where our benchmarks are trying to solve problems whose worst case will lead to exponential performance (this means that the time required to solve the problem will increase exponentially with the size of the dataset).

Many of the special ‘modes’ offered in rules engines are attempts, like the ‘cutting plane’, to lower the time to solve a particular type of NP-complete problem. They are efficient algorithms for certain inputs (data sets), but not general ones. To state the obvious, they’re only going to work well in particular problem domains.

What’s that mean for Rules Engines?

Writing rules which perform well requires knowledge of the inputs and the algorithm used.

There has been a great deal of discussion around the need for users, especially business users, to select an algorithm or deal in messy technical details. I agree that, in an ideal world, the technique used to solve a problem would be irrelevant to the user. The reality is that it’s too easy to write a rule (i.e. construct a NP-complete algorithm) which, upon receiving inputs for which it was not optimised, will revert to exponential time.

The other lesson is this: despite all the patented-this, proprietary-that marketing hype being put out by the vendors, *all* engines suffer from the same problem. Unless the domain is severely restricted, a general-purpose rules engine can easily be forced into the only way we know of to solve a NP-complete problem: exhaustive search in exponential time.

Then why are some engines faster than others?

As hinted above, writing rules is really constructing an algorithm for solving a NP-complete problem. These algorithms attempt to be as efficient as possible in processing the search space to reduce the time to something less than exponential. Just how much the processing time can be reduced depends on the efficiency of the data structures used and the construction of the network. A good example is ‘sequential mode’, which gives up flexibility and expressiveness (thus solving a smaller class of problems) for speed. Mark Proctor gave a good technical explanation of Drools sequential mode in a ‘blog post describing the trade-offs.

Benchmarking is a valid test of the efficiency of these implementations on a general class of problem. Clearly not all business problems are of the general sort, and many can be solved by specialised implementations. Benchmarks, even micro-benchmarks are useful because they can identify weak implementations of the algorithm(s) and suggest areas for improvement.

It is these variations in implementation efficiency that account for the incredible difference in performance seen in, for example, OPSJ vs. Jess. As we complete data collection from some of the other engines even more startling differences in efficiency can be observed.

Categories: Benchmarks, Rules Engine Tags:

Miss Manners Benchmark Performance: OPSJ vs Drools

17 September 2009 Ralph Jeffery No comments

We have arrived at the last of the OPSJ and Drools comparisons, this one for the Miss Manners benchmark.

Platform

Attribute          Value
Server Type        Amazon 32-bit virtual machine (EC2)
Server Memory      1.7 GB
OS                 Sun-OS Version 5.11
Java Version       1.6.0_13
JVM                Java HotSpot(TM) Client VM (build 11.3-b02, mixed mode)
JVM Memory (min)   -Xms1024m
JVM Memory (max)   -Xmx1024m

Process

For each data size we run 250 iterations and average the results to obtain a single data point. Averaging smooths out the fluctuations found in the initial few runs or variances in the machine load. For anyone wishing to repeat these tests, the rules, object model and data sets for OPSJ are available for download. The rules for JBoss Rules (Drools) are available from links in prior posts.
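The measurement loop described above can be sketched as follows. This is not the actual benchmark framework (which is Java based and not shown in these posts); it is a minimal Python illustration in which `run` is a hypothetical callable that executes one benchmark pass for a given data size.

```python
import statistics
import time


def average_run_time(run, data_size: int, iterations: int = 250) -> float:
    """Time `run(data_size)` repeatedly and return the mean, in seconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run(data_size)
        samples.append(time.perf_counter() - start)
    # One data point per data size: the mean over all iterations.
    return statistics.mean(samples)
```

A plain mean is what the process describes; anyone repeating the tests might also record the median, which is less sensitive to slow warm-up runs.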

Rule Firing Time

This chart uses a logarithmic scale to show the difference in rule firing time between OPSJ and Drools. OPSJ is seen to be consistently faster, by a few orders of magnitude.

[Chart: Miss Manners Benchmark - Drools v OPSJ 6 - Rule Firing Time]

Data Load Time

A logarithmic axis is again used to make the chart more readable. OPSJ is well ahead of Drools in data loading time.

[Chart: Miss Manners Benchmark - Drools v OPSJ 6 - Data Load Time]

Memory Usage

Well, this is interesting. For the first time we’re seeing a situation where OPSJ does not outperform Drools. Because OPSJ is commercial, we don’t have access to the source, and therefore can’t look under the covers to see why this might be so. It will be interesting to see what results anyone repeating our tests gets.

[Chart: Miss Manners Benchmark - Drools v OPSJ 6 - Post-Run Memory Used]

Conclusion

OPSJ is out in front on rule firing and data loading times. As far as memory usage is concerned, Drools 4/5 are the winners.

Categories: Benchmarks, Rules Engine

WaltzDB Benchmark Performance: OPSJ vs Drools

9 September 2009 Ralph Jeffery No comments

According to some fans of microbenchmarks, WaltzDB is the gold standard by which to measure. We agree that microbenchmarks have value, but believe them to measure only certain aspects of rule engine performance, aspects that might not matter to a customer. That said, here are the results comparing the leading performer, OPSJ, with Drools.

Platform

Attribute          Value
Server Type        Amazon 32-bit virtual machine (EC2)
Server Memory      1.7 GB
OS                 Sun-OS Version 5.11
Java Version       1.6.0_13
JVM                Java HotSpot(TM) Client VM (build 11.3-b02, mixed mode)
JVM Memory (min)   -Xms1024m
JVM Memory (max)   -Xmx1024m

Process

Like previous WaltzDB tests, we run from 1 to 50 regions. For each data size we run 250 iterations and average the results to obtain a single data point. Averaging smooths out the fluctuations found in the initial few runs. For anyone wishing to repeat these tests, the rules, object model and datasets are available for download. The versions for Drools are available from links in prior posts.

Rule Firing Time

Given the Waltz results, the results for WaltzDB are not surprising, with OPSJ in a commanding lead.

[Chart: WaltzDB Benchmark - Drools v OPSJ 6 - Rule Firing Time]

Data Load Time

The chart below shows a very similar pattern to that in the Banking benchmark, with a logarithmic axis used to highlight the differences. OPSJ is clearly ahead of the others.

[Chart: WaltzDB Benchmark - Drools v OPSJ 6 - Data Load Time]

Memory Usage

Similar memory usage pattern for OPSJ. We wonder why all these engines have huge dips at various points in their run. Probably garbage collection. It would be interesting to graph this out to 250 regions and observe the results. As soon as JRules completes its run and we get the machine back we’ll do that.

[Chart: WaltzDB Benchmark - Drools v OPSJ 6 - Post-Run Memory Used]

Conclusion

For this WaltzDB benchmark, OPSJ is ahead of Drools in terms of run times and memory usage.

Categories: Benchmarks, Rules Engine

Waltz Benchmark Performance: OPSJ vs Drools

2 September 2009 Ralph Jeffery No comments

Continuing to compare OPSJ with the other rules engines, I’m reminded of the phrase “shooting fish in a barrel”. Almost doesn’t seem fair to the other engines. CLIPS is the only one faster, but because the new framework doesn’t yet include a JNI interface to CLIPS, we can’t easily compare them.

Platform

Attribute          Value
Server Type        Amazon 32-bit virtual machine (EC2)
Server Memory      1.7 GB
OS                 Sun-OS Version 5.11
Java Version       1.6.0_13
JVM                Java HotSpot(TM) Client VM (build 11.3-b02, mixed mode)
JVM Memory (min)   -Xms1024m
JVM Memory (max)   -Xmx1024m

Process

For Waltz we run from 1 to 50 regions. For each region size we run 250 iterations and average the results to obtain a single data point.  Averaging smooths out the fluctuations found in the initial few runs. For anyone wishing to repeat these tests, the rules, object model and datasets are available for download. The rules for JBoss Rules (Drools) are available from links in prior posts.

Rule Firing Time

A logarithmic scale isn’t (quite) necessary here. The speeds diverge quickly, with OPSJ in a commanding lead early on.

[Chart: Waltz Benchmark - Drools v OPSJ 6 - Rule Firing Time]

Data Load Time

The chart below shows a very similar pattern to that in the Banking benchmark, with a logarithmic axis used to make the chart more readable. OPSJ is clearly ahead of the others.

[Chart: Waltz Benchmark - Drools v OPSJ 6 - Data Load Time]

Memory Usage

Similar smoothly scaling memory usage pattern for OPSJ.

[Chart: Waltz Benchmark - Drools v OPSJ 6 - Post-Run Memory Used]

Conclusion

Not a great deal of drama when you’re this far out ahead. Among the Java based rules engines, OPSJ is in the lead and has handily beaten Drools in the first 2 benchmarks.

Categories: Benchmarks, Rules Engine

Banking Benchmark Performance: OPSJ vs Jess vs Drools

26 August 2009 Ralph Jeffery 2 comments

The OPSJ run finished in the blink of an eye compared to the previous runs. Here’s the side by side comparison of OPSJ 6.0.0 with Jess 7 and Drools 4/5 on the Banking micro-benchmark.

Platform

Attribute          Value
Server Type        Amazon 32-bit virtual machine (EC2)
Server Memory      1.7 GB
OS                 Sun-OS Version 5.11
Java Version       1.6.0_13
JVM                Java HotSpot(TM) Client VM (build 11.3-b02, mixed mode)
JVM Memory (min)   -Xms1024m
JVM Memory (max)   -Xmx1024m

Process

We run from 5,000 to 125,000 transactions in steps of 5,000. For each transaction count we run 250 iterations and average the results. The rules and object model for OPSJ are available for download. The versions for the other engines are available from links in prior posts.
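
To make the size of this sweep concrete, the short sketch below enumerates the transaction counts and works out how many runs the schedule implies per engine. The class and method names are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class TransactionSchedule {
    /** Lists the sizes from first to last inclusive, advancing by step. */
    static List<Integer> sizes(int first, int last, int step) {
        List<Integer> result = new ArrayList<>();
        for (int n = first; n <= last; n += step) {
            result.add(n);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> sizes = sizes(5_000, 125_000, 5_000);
        int iterationsPerSize = 250;

        // 25 transaction counts, each averaged over 250 iterations.
        System.out.println("data points: " + sizes.size());
        System.out.println("total runs per engine: " + sizes.size() * iterationsPerSize);
    }
}
```

The schedule yields 25 data points per chart line, or 6,250 individual runs for each engine.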

Rule Firing Time

This race is not even close; OPSJ is well out in front for rule firing time.

Banking Benchmark - Drools v Jess v OPSJ 6 - Rule Firing Time

Data Load Time

We had to use a logarithmic axis for this chart to highlight the difference in data load time between the engines. Again, OPSJ is clearly ahead of the others.

Banking Benchmark - Drools v Jess v OPSJ 6 - Data Load Time

Memory Usage

OPSJ shows a nice smooth increase in memory usage. Apparently it scales its memory consumption directly with the number of objects.

Banking Benchmark - Drools v Jess v OPSJ 6 - Post-Run Memory Used

Conclusion

That was easy; no contest. OPSJ is the clear winner in all respects, which is not surprising given its design philosophy as an extension of the Java virtual machine to support rule processing.

Categories: Benchmarks, Rules Engine

Banking Benchmark Performance: Jess vs. Drools

19 August 2009 Ralph Jeffery 5 comments

We’re starting to receive data from the other engines and wanted to give a preview of the head-to-head competition by comparing Jess 7.1.2 with Drools 4/5 in the Banking benchmark. To compare apples to apples, this version of Banking uses POJOs with Jess. Sadly, we didn’t have time to convert Jess to use POJOs for the other benchmarks, but Ernest has promised us some help here, so hopefully we’ll be able to test Jess with Waltz, WaltzDB and Manners soon.

Platform

Attribute Value
Server Type Amazon 32-bit virtual machine (EC2)
Server Memory 1.7 GB
OS Sun-OS Version 5.11
Java Version 1.6.0_13
JVM Java HotSpot(TM) Client VM (build 11.3-b02, mixed mode)
JVM Memory (min) -Xms1024m
JVM Memory (max) -Xmx1024m

Process

We run from 5,000 to 125,000 transactions in steps of 5,000. For each transaction count we run 250 iterations and average the results. The rules and object model for Jess are available for download. The Drools versions are available from links in our first review. It is worth noting that the object models used for all micro-benchmarks (Banking, Waltz, WaltzDB and Miss Manners) are the same for each rule engine. This has not been the case in the past; previously each engine used a different object model, making this the first true apples-to-apples comparison of rules engine performance we’re aware of.

Rule Firing Time

It’s a close race: Jess/7 is consistently, though only slightly, slower than Drools.

Banking Benchmark - Drools v Jess 7 - Rule Firing Time

Data Load Time

Jess/7 shows a very linear but angular data load graph. The angles might be caused by internal data structures resizing themselves. In any event, it’s clear that loading POJOs into Jess takes much more time than loading them into Drools.

Banking Benchmark - Drools v Jess 7 - Data Loading Time

Memory Usage

Previous benchmark results have shown that there is little difference between pre-run and post-run memory, so we now collapse memory statistics into a single graph (post-run memory) and display the results for each engine side by side. As you can see from this chart, Jess/7 appears to use slightly less memory overall than Drools/4 or Drools/5.
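
The post does not say how the post-run memory figure is collected. One common way to approximate it from inside the JVM, shown here purely as an assumption, is to subtract the free heap from the currently reserved heap using the standard `Runtime` API. The class name `PostRunMemory` and the stand-in allocation are hypothetical.

```java
final class PostRunMemory {
    /** Approximates heap in use: currently reserved heap minus the unused part of it. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        // With -Xmx1024m this reports a value close to 1 GB.
        System.out.println("max heap (bytes): " + Runtime.getRuntime().maxMemory());

        // Hypothetical stand-in for a benchmark run that leaves objects live
        // in working memory: sixteen 1 MB arrays.
        byte[][] retained = new byte[16][];
        for (int i = 0; i < retained.length; i++) {
            retained[i] = new byte[1024 * 1024];
        }

        // System.gc() is only a hint to the JVM, so the reading is approximate.
        System.gc();
        long usedBytes = usedHeapBytes();
        System.out.println("post-run used heap (MB): " + usedBytes / (1024 * 1024));

        // Referencing the arrays here keeps them reachable until after the measurement.
        System.out.println("retained arrays: " + retained.length);
    }
}
```

Setting -Xms and -Xmx to the same value, as in the platform table, fixes the heap size, so heap resizing does not affect readings taken this way.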

Banking Benchmark - Drools v Jess 7 - Post-Run Memory Used

Conclusion

In this benchmark, Jess/7 is slightly slower in rule firing time and significantly slower in data load time. For memory usage, Jess/7 appears to have a slight advantage.

Categories: Benchmarks, Rules Engine