Monday, December 29, 2008

Object Oriented Rule Engine

Last week I discussed the desire for a rule engine to allow object oriented domain logic to be browsed, searched, reviewed, and altered outside of an IDE.

Earlier this year Zepheira, LLC needed an object oriented rule engine to enable different participants to manage continually changing rules. After developing a solution with Drools, we found that it placed too many restrictions on the accessible data and that the rules were becoming too complicated to be effective with multiple participants, each with different interests. Together we combined the modelling language OWL (Web Ontology Language) with the object oriented language Groovy to create a rule engine that was effective over large datasets and included many of the features of OOP, including the ability to override selected rules for particular sub-domains within the model.

We developed the model using OWL and extended it by adding the ability to define message types (LHS rule declarations) and method bodies (RHS rule execution). By separating the two and associating every rule with a concept class, we were able to apply OOP features to the rules. The syntax we used can now be seen here:

By developing Groovy domain logic in an RDF database we enabled many stakeholders access to review and alter the rules to facilitate their own what-if scenarios. It also made it possible to manage much more complex situations by utilizing the strong vocabulary of OWL to describe the concepts that could not easily be represented in a simple object model.

We achieved the flexibility we needed and reduced the complexity of the rules by using an object oriented rule engine. The rule engine has since been licensed under the BSD licence as part of the OpenRDF Elmo project at

About Zepheira
Zepheira is a US-based professional services firm with expertise in semantic technologies and Enterprise Data Integration. For more information, visit:


Monday, December 22, 2008

Object Oriented Rule Engine

A rule engine should be used when a domain model has many rules that may change over time and need to be closely managed by domain experts.

By using a rule engine, rules become much more tangible. They can be browsed, searched, reviewed, and altered in a way that facilitates their management without the complications and restrictions of a full-blown IDE.

Rule engines are often avoided by DDD-ers (Domain Driven Designers) and OOP-ers (Object Oriented Programmers) because they pull domain logic out of the model and into a rule structure, disregarding many of the core features of OOP. (Separating the rules prevents classes from encapsulating their own behaviour, and doesn't allow inheritance or any form of overriding super class behaviour.)

Drools, a popular Java rule engine, uses an OOP language (Java-like) to write the RHS (rule execution) and, to some degree, the LHS (rule condition). However, the rules themselves are still very much detached from the model. A rule is more like an /if/ condition with a /then/ procedure. If the code does not belong to an extendable class hierarchy, it is anything but OO.
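
To make that concrete, here is a plain-Java sketch (my own illustration, not Drools code) of what such a detached rule reduces to: a condition over facts and a procedure to run, with no class hierarchy to extend. The Order, Rule, and fireAll names are all hypothetical.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

public class RuleSketch {
    // A hypothetical fact class. In a rule engine, the rule below lives in a
    // separate rule file, not as a method on Order itself.
    static class Order {
        double total;
        double discount;
        Order(double total) { this.total = total; }
    }

    // A "rule" reduced to its essence: an if-condition (LHS) and a then-procedure (RHS).
    static class Rule {
        final Predicate<Order> when;  // LHS: rule condition
        final Consumer<Order> then;   // RHS: rule execution
        Rule(Predicate<Order> when, Consumer<Order> then) {
            this.when = when;
            this.then = then;
        }
    }

    // The "engine" scans facts and fires matching rules. Nothing here belongs
    // to an extendable class hierarchy, which is the point being made above.
    static void fireAll(List<Rule> rules, List<Order> facts) {
        for (Order fact : facts)
            for (Rule rule : rules)
                if (rule.when.test(fact))
                    rule.then.accept(fact);
    }

    public static void main(String[] args) {
        Order order = new Order(250);
        Rule bigOrderDiscount = new Rule(o -> o.total > 100, o -> o.discount = 10);
        fireAll(List.of(bigOrderDiscount), List.of(order));
        System.out.println(order.discount); // prints 10.0
    }
}
```

The condition and the action can live anywhere, encapsulate nothing, and cannot be overridden by a subclass - which is exactly the complaint.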

Tools could be written to index an OO model, written in a language like Java, so that it could be browsed and searched - many IDEs already do this to a degree. However, the challenge is that the Java language (like many other languages) does not distinguish between the concepts (the classes and properties) and the behaviours (the rules). In Java, it is particularly difficult to distinguish between a property that retrieves and stores values and a method that applies logic to the state of the model.

To combine the features of OOP with the advantages of a rule engine, the domain model needs to be stored in a more formal structure to facilitate its management. It needs to be in a structure that clearly distinguishes the concepts and their properties from the behaviour rules, while providing OOP features to them. Such a structure would need to be indexable to allow it to be browsed and searched, and straightforward enough to allow alteration of individual rules without the risk of data loss.

Does such a rule engine exist? Check back next week.


Thursday, December 18, 2008

Repository Entity Pattern

The repository pattern is too often underused when developing a domain model. I like to take a spin on the Domain Driven Design repository pattern and apply it to all collections within the domain.

Whenever the model needs a collection of entities, I use a repository. It is basically a glorified collection, controlling access to the underlying entities in a type-specific way. The main advantage of using a souped-up collection is that, as needs change, the implementation can be changed while maintaining the same interface. This abstraction also co-locates similar queries and query-building logic into a single class structure, minimizing duplication.

Writing access and bulk update queries involves a significant investment for new models. The abstraction of a repository interface allows you to focus on the business logic early on, while working with small datasets and in-memory collections. As the model interfaces begin to stabilize, more focus can be put on optimizing entity access.

For complex models that require uniquely optimized data access and updates, the repository pattern allows integrated query-building logic to be separated and shared within its own class structure.

One anti-pattern to be aware of when using the repository pattern is combining aggregates and repositories. A repository should not have any properties of its own, only a collection of entities. Violating this significantly complicates the implementation and leads to role confusion, exacerbating the problem.

A good definition of the pattern by Martin Fowler can be found on his website.

There aren't many good examples of the repository pattern around, so here is an example of what the interfaces might look like.

interface Repository<E> extends Iterable<E> {
    void add(E entity);
    void remove(E entity);
    void clear();
    E findById(String id);
}

interface SequentialRepository<E> extends Repository<E> {
    E get(int index);
    int indexOf(E entity);
}

interface ScoreRepository extends SequentialRepository<Score> {
    List<Score> findScoresByCategory(Category cat);
    void setScoreLimit(int limit);
    Score getMax();
}

Monday, December 15, 2008

Friday, December 5, 2008

Rich Internet Applications

Silverlight, Flex, and JavaFx are all trying to capture a new market of rich Internet applications that promise to be the best of both desktop applications and web applications. However, none of them are accessible to agents. Like desktop applications, they can only be used for the purpose they were designed for and cannot be used for indexing, harvesting, or in mashups. This is not surprising, as none of their businesses depend on open interoperability. The only big web players that depend on open interoperability are Google and Yahoo!, which is why Google funds Mozilla and develops its own Chrome browser.

The web's biggest success is the inter-connectivity and vast amount of information available in the same format (HTML). It will be interesting to see if these new RIA platforms will have any effect on the trend of moving applications to the web.


Monday, December 1, 2008


When Spring MVC (Model View Controller) was first released in 2003, it helped clarify the roles and brought a clearer separation between the model, the view, and the controller to Java web development. Complicated HTML pages became easier to maintain and changes were straightforward to implement. However, web applications have changed significantly since then.

With the recent explosion of new JavaScript libraries, the separation between model, view, and controller has once again become blurred. Many HTML pages today are loaded in stages (a la AJAX), one entity at a time. Traditional Spring MVC seems overly complicated for small single-entity results.

Creating rich AJAX web applications can still follow the MVC design pattern, although it might require stepping out of the Java/JavaScript comfort zone.

Consider a typical asynchronous request:
1) User's activity triggers an HTTP request.
2) The server processes the request and may invoke changes to the model and/or return part of the model (an entity) to the client.
3) The script that sent the request manipulates the response and displays it to the user.

In the above we can still see the MVC pattern: the model is the server's response, the view is the manipulation of the response for display, and the controller is the server. However, this differs from the original Spring MVC of 2003 - the view has moved to the client and the model is (in part) serialized in an HTTP message body.

Part of the confusion with AJAX development is around the role of the "view". It gets blurred between the serialization of an entity model and how JavaScript displays it. Inconsistencies in this area can cause many maintainability issues as the interaction between the client and server becomes confusing.

By viewing dynamic HTTP responses as serializations of entities from a data source (the model), and leaving the "view" to the client, clarity and maintainability can be achieved. The only standard display technology that works equally well for large entities and entity fragments is XSLT/HTML. Today's modern browsers all support XSLT 1.0 transformations using JavaScript. By using XML for the model interchange and XSLT/HTML for the view display, JavaScript usage can be limited to what it does best: filling in missing functionality of the browser.
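
As an illustration of the transformation step, here is the same model-to-view split sketched standalone with Java's built-in javax.xml.transform API (rather than in a browser); the entity XML and the stylesheet are made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltView {
    // The serialized model: an entity fragment as it might arrive in an AJAX response.
    static final String MODEL = "<person><name>Alice</name></person>";

    // The "view": an XSLT 1.0 stylesheet that renders the entity as an HTML fragment.
    static final String VIEW =
        "<xsl:stylesheet version='1.0' "
      + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html'/>"
      + "<xsl:template match='/person'>"
      + "<li><xsl:value-of select='name'/></li>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply the view (XSLT) to the model (XML); a browser does the same step
    // client-side via its JavaScript XSLT processor.
    public static String render(String modelXml, String viewXslt) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(viewXslt)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(modelXml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render(MODEL, VIEW)); // e.g. <li>Alice</li>
    }
}
```

The same stylesheet works for a full page of entities or for a single fragment, which is what makes XSLT attractive for staged AJAX loading.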

By limiting the role of JavaScript, its reusability is maximized. In the future the amount of JavaScript required by web applications should be significantly reduced. In addition, projects like webforms2 (at google code) promise to bring tomorrow's HTML5 and Web forms 2.0 to today's browsers.

XSLT brings more flexibility and reusability to the view layer (vs JSP-like technologies). By using client-side JavaScript/XSLT with other AJAX technologies, modern web applications can achieve the richness of desktop applications while still following the best practices of the MVC design pattern.


Monday, November 24, 2008

Open Concepts

Most examples of open classes are extensions of core system classes (string/float), such as adding a "to_cents" method on Float. Often such extensions are not needed, as a new wrapper class, such as "Money", would serve the purpose better.

Core classes may not need to be open for extension, although that can sometimes be useful (remember how long it took Sun to add java.lang.String#replace(String,String)). It is important, however, for model classes to be open to downstream extensions. For this reason I use the term "Open Concepts" to indicate that the concepts in the model are extendable.

The biggest advantage of having "Open Concepts" is that the model can be modified without modifying the original source. This creates many opportunities for multi-team system development. By enabling the model to permit dynamic integration of independent and decentralized designs (open), the code becomes simpler and more useful to a wide array of use-cases. Consider the productivity advantages of allowing downstream development more influence to extend the supplied model without necessarily involving upstream in the changes. These are many of the advantages that allowed the Ruby language to gain so much popularity.

There are many ways to enable this, even in more mainstream languages. One popular approach is to implement a plug-in system for key points in the model (although this quickly becomes complicated with many "key points").

A more generic way is to implement the role class model in all the concepts. This allows every individual object to expose itself through multiple concepts. However, anyone who has used Mozilla's XPCOM knows first hand how the syntax of this pattern can get out of hand.

Today, with the popularity of new languages that support the run-time mixing of classes, new opportunities for creating an extendable model exist. Such an approach allows both plugin-style and role-style extensions, but still uses the syntax of the host language. This allows the model to be used in the same way as if no extensions were present.
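
Even in Java, which has no open classes, role-style mixing can be approximated at run-time with a dynamic proxy. This is only a sketch: the Person and Employee roles and the mix helper are hypothetical, and a real system would need to handle Object methods and unregistered roles.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

public class RoleMixin {
    // Two independent "concepts" (roles) a single individual may expose.
    interface Person { String name(); }
    interface Employee { String employer(); }

    // Mix role implementations into one object at run-time: each interface
    // method is dispatched to the object registered for that role.
    @SuppressWarnings("unchecked")
    static <T> T mix(Class<T> primary, Map<Class<?>, Object> roles) {
        Class<?>[] faces = roles.keySet().toArray(new Class<?>[0]);
        InvocationHandler h = (proxy, method, args) -> {
            Object target = roles.get(method.getDeclaringClass());
            return method.invoke(target, args);
        };
        return (T) Proxy.newProxyInstance(RoleMixin.class.getClassLoader(), faces, h);
    }

    public static void main(String[] args) {
        Map<Class<?>, Object> roles = new HashMap<>();
        roles.put(Person.class, (Person) () -> "Alice");
        roles.put(Employee.class, (Employee) () -> "Zepheira");

        // One individual, browsable through multiple concepts.
        Person p = mix(Person.class, roles);
        System.out.println(p.name());                  // Alice
        System.out.println(((Employee) p).employer()); // Zepheira
    }
}
```

The calling code uses ordinary host-language syntax, which is the property the post is after; the XPCOM-style QueryInterface ceremony is reduced to a plain cast.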

Any large system will need some type of extendability, be it a plugin-style, a role-style, or an open-style model. It is worth considering the approach early in the development cycle, as many of these approaches are intrusive, by syntax or language.


Monday, November 17, 2008

Storing Business Rules

Every application has some domain logic or business rules, but how much thought goes into how they are stored? Try asking yourself the following questions to help identify possible ways to serialize the domain logic.

How often will it change? Will each deployment need customized rules? Will the rules change at run-time?

Who will be managing the domain logic? Will it be managed by people with a particular skill set? Does the domain logic need to have its own version control and distribution channel? Will the domain logic be reviewed by independent parties?

What form will the domain logic take? Can it be defined in a formal structure? Is it data driven?

There are many ways that domain logic can be stored, and the most common is not always the best. Consider the following ways to serialize and store domain logic. You might use the same language as the application logic, or you might encode it in a domain-specific language.

Domain logic might be stored in flat files that are compiled or in a form more suitable for a rule engine. It could be embedded in one or more XML files, for indexing and quick access. Domain logic might also be stored in a formal structure, for user-initiated rules or dynamic version control.
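
As one illustration of data-driven domain logic stored outside the application language, here is a sketch that parses rules from a flat-file format invented for this example; the "field > threshold : label" syntax and all the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class RuleStore {
    // A hypothetical flat-file rule format, one rule per line. In a real
    // system these lines would live in a versioned file or a rule store,
    // editable by domain experts without recompiling the application.
    static final String RULES_FILE =
        "total > 100 : bulk-discount\n" +
        "weight > 50 : freight-surcharge\n";

    record Rule(String field, double threshold, String label) {}

    static List<Rule> parse(String text) {
        List<Rule> rules = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (line.isBlank()) continue;
            String[] parts = line.split("[>:]");   // field > threshold : label
            rules.add(new Rule(parts[0].trim(),
                               Double.parseDouble(parts[1].trim()),
                               parts[2].trim()));
        }
        return rules;
    }

    // Apply the data-driven rules to an entity represented as a property map.
    static List<String> labelsFor(Map<String, Double> entity, List<Rule> rules) {
        List<String> labels = new ArrayList<>();
        for (Rule r : rules)
            if (entity.getOrDefault(r.field(), 0.0) > r.threshold())
                labels.add(r.label());
        return labels;
    }

    public static void main(String[] args) {
        List<Rule> rules = parse(RULES_FILE);
        System.out.println(labelsFor(Map.of("total", 250.0, "weight", 10.0), rules));
        // [bulk-discount]
    }
}
```

Because the rules are plain data, they can be diffed, reviewed, and version-controlled independently of the application code.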

Domain logic can take on many forms, and how it will be stored should be considered carefully. Depending on the makeup of the team or other external influences, there may be a need to make a greater distinction between domain and application logic.


Thursday, November 13, 2008

Utilizing a Multi-Core System with the Actor Model

Demand for multi-core/multi-processor applications is growing, but developing a multi-threaded application does not have to involve a steep learning curve or an understanding of complicated edge cases. Learn how to develop efficient multi-threaded applications without using synchronized blocks.
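
A minimal sketch of the actor idea in Java: mutable state is confined to one thread, other threads communicate only by sending messages to a mailbox, and there are no synchronized blocks anywhere. The CounterActor is my own illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public class CounterActor {
    // The mailbox: the only channel into the actor.
    private final BlockingQueue<Long> mailbox = new LinkedBlockingQueue<>();
    private final AtomicLong total = new AtomicLong();
    private final CountDownLatch done;
    private final Thread worker;

    CounterActor(int expectedMessages) {
        done = new CountDownLatch(expectedMessages);
        worker = new Thread(() -> {
            try {
                while (true) {
                    long msg = mailbox.take(); // blocks, no busy waiting
                    total.addAndGet(msg);      // only this thread mutates state
                    done.countDown();
                }
            } catch (InterruptedException e) {
                // interrupted: shut down the actor loop
            }
        });
        worker.start();
    }

    void send(long msg) { mailbox.add(msg); }

    long result() throws InterruptedException {
        done.await();        // wait until every message has been processed
        worker.interrupt();
        return total.get();
    }

    public static void main(String[] args) throws InterruptedException {
        CounterActor actor = new CounterActor(1000);
        // Many producer threads, zero synchronized blocks.
        for (int t = 0; t < 4; t++) {
            new Thread(() -> {
                for (int i = 0; i < 250; i++) actor.send(1);
            }).start();
        }
        System.out.println(actor.result()); // 1000
    }
}
```

The thread-safety comes entirely from the queue and the single consumer, which is the core of the actor model.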


Monday, November 10, 2008

What a Modelling Language Should Look Like

InfoQ gave a summary yesterday of a few blog posts on Modelling languages.
Briefly, Steven Kelly and Juha-Pekka Tolvanen say such a language should:
1) Map to the domain problem concepts and not implementation details.
2) Be formalized and helpful.
3) Have stand alone tooling support.

Unfortunately, the discussion focused on a visual modelling language (UML) and did not address any alternative modelling languages, such as the standard Web Ontology Language (OWL), which addresses many of these issues. Although OWL was not designed as a visual language or for arbitrary code generation, it has grown to support such usage.

TopBraid Composer, for example, is a mature editor and visualization tool for OWL.


Thursday, November 6, 2008

Module Dependencies

I was reminded recently of a domain model I used to work with that did not separate interfaces from their implementations. They never considered dependencies to be an issue because javac could compile the entire source tree at once.

When I came on board, their product was nearing its first release and their domain model's class dependency chart looked like a bowl of spaghetti. Eventually the lack of dependency management became apparent: each team was developing at a different pace, but everyone was forced into monolithic releases.

Many of the circular dependencies were addressed by introducing interfaces or abstract classes. However, the process was slow and full of challenges. Hibernate only supports a single inheritance model, and with so many different perspectives such a restriction proved very limiting.
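
The interface-extraction step can be sketched like this; the billing/crm modules, Invoice, and Billable are hypothetical stand-ins, not the real model.

```java
// Before (hypothetically): billing.Invoice referenced crm.Customer, and
// crm.Customer kept a list of billing.Invoice, so the two modules could
// only be compiled and released together. After: the billing module
// depends only on an interface it owns, crm implements it, and the
// cycle is broken.
public class ModuleSketch {
    // Owned by the billing module: the only view of a customer it needs.
    interface Billable {
        String billingAddress();
    }

    static class Invoice {
        final Billable customer;   // depends on the interface, not on crm.Customer
        Invoice(Billable customer) { this.customer = customer; }
        String shipTo() { return customer.billingAddress(); }
    }

    // Lives in the crm module and implements billing's interface,
    // so the dependency now points in one direction only.
    static class Customer implements Billable {
        public String billingAddress() { return "12 Main St"; }
    }

    public static void main(String[] args) {
        System.out.println(new Invoice(new Customer()).shipTo()); // 12 Main St
    }
}
```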

Many of these problems could have been avoided had more thought gone into the conceptual class hierarchy earlier, and had distinct modules been created to force developers into observing and thinking about class dependencies early in the development cycle.


Monday, November 3, 2008

Where are programming languages going?

Anders Hejlsberg gave a keynote last month at JAOO (published two weeks ago) entitled "Where are programming languages going?". He said that with a little more restriction in languages, compilers could do a much better job of optimizing the code. I think this is an interesting discussion and agree with many of his ideas.

Towards the end of the talk he addressed concurrency. He suggested that a for loop, for example, could automatically execute all iterations in parallel - if few mutable variables existed. This is an interesting idea, although a very difficult task. Unless the code fits the map-reduce pattern, it will likely modify some shared memory structure or external state (such as a database) and could not safely be executed in parallel. I think we will have to see more integration between persistence frameworks and optimizers before such an idea could be realized.
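
A sketch of the map-reduce shape that does parallelize safely: each iteration produces an independent partial result, and state is only combined after the loop. The ParallelSum name and the use of ExecutorService are my own choices for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSum {
    // Each task reads its own slice and returns a value: no shared mutable
    // state inside the loop body (map), partial results combined after (reduce).
    public static long sumOfSquares(int[] data, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            int chunk = (data.length + workers - 1) / workers;
            for (int w = 0; w < workers; w++) {
                final int from = w * chunk;
                final int to = Math.min(data.length, from + chunk);
                tasks.add(() -> {
                    long partial = 0;
                    for (int i = from; i < to; i++)
                        partial += (long) data[i] * data[i];
                    return partial;                // map: independent partial result
                });
            }
            long total = 0;
            for (Future<Long> f : pool.invokeAll(tasks))
                total += f.get();                  // reduce: combine after the loop
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumOfSquares(new int[]{1, 2, 3, 4}, 2)); // 30
    }
}
```

The moment an iteration writes to a shared structure or a database, this decomposition stops being safe, which is the objection raised above.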


Thursday, October 30, 2008

What is the Semantic Web?

I get this question a lot and below is what the semantic web means to me. What do you view as the semantic web?

Software is built around models: logical representations of a system of entities. Models are an important part of software development, as the user's experience is often a reflection of the underlying model. Any flaws in the model will cascade up all the way to the user's interaction with the system.

Today's software programs have many different users with many different perspectives. Attempting to satisfy everyone with a single model leads to complexity that makes the model difficult to understand and use.

The semantic web provides standards to express interconnected models by capturing and contextualizing the knowledge about them. Semantic web technologies allow one to access the larger conceptual data model from more abstract points of view. This enables both people *and* software to navigate between high level and detailed views, so information can easily be examined and modified from different perspectives.


Monday, October 27, 2008

Keeping Classes Small

Most people would agree that smaller classes are better. It is generally accepted that smaller classes are associated with robustness, reliability, reusability, and understandability. However, what causes a class to become too large?

Many people instinctively try to implement an interface or concept as a single class. However, sometimes the implementation is too complicated to be understood quickly and the class grows too large. If you think this might be the case, take a look at the class variables. If the number of variables is large, if some of the variables could be "grouped" together, or if some of the variables are only used some of the time, try splitting the implementation details into their own classes and allow the original class to delegate to or compose them.
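
A before/after sketch of the grouping step; the Invoice and Address classes are hypothetical.

```java
// Before (hypothetically): Invoice held street, city, amountCents, and more,
// and kept growing. Grouping the address variables into their own class
// keeps each class small and focused.
public class SmallClasses {
    // The grouped variables become a concept of their own.
    static class Address {
        final String street, city;
        Address(String street, String city) { this.street = street; this.city = city; }
        String label() { return street + ", " + city; }
    }

    // The original class now delegates instead of carrying every detail itself.
    static class Invoice {
        final Address billing;
        final long amountCents;
        Invoice(Address billing, long amountCents) {
            this.billing = billing;
            this.amountCents = amountCents;
        }
        String summary() { return amountCents + "c to " + billing.label(); }
    }

    public static void main(String[] args) {
        Invoice i = new Invoice(new Address("12 Main St", "Springfield"), 995);
        System.out.println(i.summary()); // 995c to 12 Main St, Springfield
    }
}
```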

Every time you are creating or adding to a class, ask yourself whether this could be better explained in a separate class. This will help you or a co-developer find what they are looking for faster later on.


Friday, October 24, 2008

Sesame 2.2.1 Released

This marks the first stable release of Mulgara's Sesame interface, creating a unified API to access many specialized RDF stores.

Other RDF stores that support the Sesame API include:

For more information about Sesame see:

Thursday, October 23, 2008

Extracting Meaning from Text with OpenCalais R3

This article shows how to convert unstructured written text into structured data using OpenCalais, which is a public general-purpose text-extraction service that uses a combination of statistical and grammatical analysis to extract meaning.


Monday, October 20, 2008

Smaller Class Sizes

We hear this a lot with regard to education, but can it also help programming? I think so!

Code should be written in a way that explains what the code "should" be doing. This can only be done by continually giving the reader a summary of the operations that need to be performed. Think of each class as a cheat sheet: it contains what is important to understand its behaviour. A class should not go into too much detail, but instead delegate to other classes for further details, leading to smaller classes. Every computer operation can be divided into sub-operations. These sub-operations may not be as important to a reader and should not distract or confuse the reader when trying to understand the purpose of the class.


Tuesday, October 14, 2008

Easy To Read Code

If we wanted to write code for a computer, we would be using only 0s and 1s, but we don't. Code is written for human consumption. It is more important that a co-developer can easily "compile" the code than any machine. If any code contains a bug, a machine isn't going to fix it - only a human will. Therefore we should take extra care to make the reader's job as easy as possible by making the code as easy as possible to read. Any code worth writing is worth writing well.

Unlike a computer, we can't read one line at a time. Our natural field of focus has a limited width, and this needs to be considered when writing code and creating APIs. This courtesy leads coders to limit the length of each line. The easiest way to correct long lines of code is to use more local variables, allowing operations to be separated onto their own lines. However, too often this is not enough.
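
For example, a made-up method sketched in Java:

```java
import java.util.List;
import java.util.Locale;

public class ShortLines {
    // One long line forces the reader's eye past its natural field of focus:
    //   return List.of(name.trim().toLowerCase(Locale.ROOT), name.trim().toUpperCase(Locale.ROOT));
    // Naming the intermediate steps puts one operation on each line.
    static List<String> variants(String name) {
        String trimmed = name.trim();
        String lower = trimmed.toLowerCase(Locale.ROOT);
        String upper = trimmed.toUpperCase(Locale.ROOT);
        return List.of(lower, upper);
    }

    public static void main(String[] args) {
        System.out.println(variants("  Alice ")); // [alice, ALICE]
    }
}
```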

Often libraries use overly verbose and repetitive names. This is done in the name of clarity, but at the expense of the readability of their users' code. API developers need to keep not only the readability of their own code in mind, but also the readability of their users' code. Here are four examples I have seen recently of APIs that force their users into writing difficult-to-read code:
  • ServletConfig#getServletContext() - The word Servlet is redundant in the method and could be removed to shorten calling code.
  • org.openrdf.repository.RepositoryConnection - Unless it is common to work with multiple connections from different packages, there is no need to repeat the word "repository" in the package and class prefix.
  • IReadableBinaryStreamRepresentation - Here is an example of a repeating suffix that adds nothing to clarify its use.
  • context.getKernelContext().getThisKernelRequest().getRequestScope() - If the API forces users into this type of repetitive message chaining, not only is the API forcing code that is hard to read, it also couples the client to the structure of the navigation, and any changes to the intermediate relationships force the client to change as well.
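
One way to ease the last example is a convenience method on the root object that hides the navigation; the classes below are stand-ins mimicking that chain, not the real API.

```java
public class ChainFacade {
    // Hypothetical nested structure behind a context object.
    static class RequestScope { String id() { return "scope-1"; } }
    static class KernelRequest { RequestScope getRequestScope() { return new RequestScope(); } }
    static class KernelContext { KernelRequest getThisKernelRequest() { return new KernelRequest(); } }

    static class Context {
        KernelContext getKernelContext() { return new KernelContext(); }

        // Facade: callers ask for what they want, not how to navigate to it,
        // so the intermediate relationships can change without breaking them.
        RequestScope requestScope() {
            return getKernelContext().getThisKernelRequest().getRequestScope();
        }
    }

    public static void main(String[] args) {
        Context context = new Context();
        // Instead of context.getKernelContext().getThisKernelRequest().getRequestScope():
        System.out.println(context.requestScope().id()); // scope-1
    }
}
```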

To any API designers (or would-be API designers) out there: please spend some time thinking about how you can reduce the repetition in your API, and strive to use short, concise names. It will go a long way toward making code more readable.
