Thursday, October 30, 2008

What is the Semantic Web?

I get this question a lot. Below is what the semantic web means to me. What do you view as the semantic web?

Software is built around models: logical representations of a system of entities. Models are an important part of software development, as the user's experience is often a reflection of the underlying model. Any flaws in the model will cascade up all the way to the user's interaction with the system.

Today's software programs have many different users with many different perspectives. Attempting to satisfy everyone with a single model leads to complexity that makes the model difficult to understand and use.

The semantic web provides standards to express interconnected models by capturing and contextualizing the knowledge about them. Semantic web technologies allow one to access the larger conceptual data model from more abstract points of view. This enables both people *and* software to navigate between high-level and detailed views, so information can easily be examined and modified from different perspectives.


Monday, October 27, 2008

Keeping Classes Small

Most people would agree that smaller classes are better. It is generally accepted that smaller classes are associated with robustness, reliability, reusability, and understandability. However, what causes a class to become too large?

Many people instinctively try to implement an interface or concept as a single class. However, sometimes the implementation is too complicated to be understood quickly and the class grows too large. If you think this might be the case, take a look at the class variables. If the number of variables is large, if some of the variables could be "grouped" together, or if some of the variables are only used some of the time, try to split the implementation details into their own classes and let the original class delegate to or compose them.
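As a minimal sketch of the "grouped variables" case, assume a hypothetical Order class that used to hold street, city, and postal code fields directly. The class and method names below are invented for illustration only. The three related variables move into their own Address class, and Order delegates to it:

```java
// Hypothetical example: all names here are illustrative.

// The three address variables belong together, so they get their own class.
final class Address {
    private final String street;
    private final String city;
    private final String postalCode;

    Address(String street, String city, String postalCode) {
        this.street = street;
        this.city = city;
        this.postalCode = postalCode;
    }

    // The formatting detail lives next to the data it formats.
    String label() {
        return street + ", " + city + " " + postalCode;
    }
}

final class Order {
    private final String id;
    private final Address shipTo; // one field replaces three

    Order(String id, Address shipTo) {
        this.id = id;
        this.shipTo = shipTo;
    }

    // Order delegates instead of knowing how an address is laid out.
    String shippingLabel() {
        return "Order " + id + " -> " + shipTo.label();
    }
}
```

Order now has two variables instead of four, and anyone looking for address formatting knows to open Address.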

Every time you create or add to a class, ask yourself whether this could be better explained in a separate class. This will help you or a co-developer find what you are looking for faster later on.


Friday, October 24, 2008

Sesame 2.2.1 Released

This marks the first stable release of Mulgara's Sesame interface, creating a unified API to access many specialized RDF stores.

Other RDF stores that support the Sesame API include:
OWLIM
Virtuoso
BigData
AllegroGraph

For more information about Sesame see:
http://www.openrdf.org/

Thursday, October 23, 2008

Extracting Meaning from Text with OpenCalais R3

This article shows how to convert unstructured written text into structured data using OpenCalais, which is a public general-purpose text-extraction service that uses a combination of statistical and grammatical analysis to extract meaning.

http://www.devx.com/semantic/Article/39550


Monday, October 20, 2008

Smaller Class Sizes

We hear this a lot with respect to education, but can it also help in programming? I think so!

Code should be written in a way that explains what the code "should" be doing. This can only be done by continually giving the reader a summary of the operations that need to be performed. Think of each class as a cheat sheet: it contains what is important to understand its behaviour. A class should not go into too much detail, but should instead delegate to other classes for further details, leading to smaller classes. Every computer operation can be divided into sub-operations. These sub-operations may not be as important to a reader and should not distract or confuse the reader when trying to understand the purpose of the class.
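Here is a rough sketch of the cheat-sheet idea. The Checkout, DiscountPolicy, and TaxPolicy classes are hypothetical, and the 10% discount and 5% tax rules are assumptions made up for the example. The top-level class reads as a summary, while the sub-operations live elsewhere:

```java
// Hypothetical example: names and business rules are invented.

// The "cheat sheet": says what happens, not how.
final class Checkout {
    private final DiscountPolicy discounts = new DiscountPolicy();
    private final TaxPolicy taxes = new TaxPolicy();

    long totalCents(long subtotalCents) {
        long discounted = discounts.apply(subtotalCents);
        long tax = taxes.on(discounted);
        return discounted + tax;
    }
}

// Sub-operation: how a discount is decided.
final class DiscountPolicy {
    // Assumed rule: 10% off orders of 100.00 or more.
    long apply(long cents) {
        return cents >= 100_00 ? cents - cents / 10 : cents;
    }
}

// Sub-operation: how tax is computed.
final class TaxPolicy {
    // Assumed flat 5% rate, rounded down to the cent.
    long on(long cents) {
        return cents * 5 / 100;
    }
}
```

A reader who only wants to know what checkout does can stop at the first class. The rounding and threshold details are there for whoever needs them.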


Tuesday, October 14, 2008

Easy To Read Code

If we wanted to write code for a computer, we would be using only 0s and 1s, but we don't. Code is written for human consumption. It is more important that a co-developer can easily "compile" the code than any machine. If any code contains a bug, a machine isn't going to fix it - only a human will. Therefore we should take extra care to make the reader's job as easy as possible by making the code as easy as possible to read. Any code worth writing is worth writing well.

Unlike a computer, we can't read one line at a time. Our natural field of focus has a limited width, and this needs to be considered when writing code and creating APIs. This courtesy leads coders to limit the length of each line. The easiest way to correct long lines of code is to use more local variables, so that operations can be split across separate lines. However, too often this is not enough.
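To illustrate the local-variable technique, here is a small sketch. The LineLengthDemo class and its methods are hypothetical. Both methods return the same string, but the second names each step:

```java
// Hypothetical example: names are illustrative only.
final class LineLengthDemo {
    // One long line: the reader must take in the whole expression at once.
    static String longLine(String first, String last, int year) {
        return last.trim().toUpperCase() + ", " + first.trim() + " (" + (year % 100 < 10 ? "0" : "") + (year % 100) + ")";
    }

    // Same result, with local variables naming each operation.
    static String shortLines(String first, String last, int year) {
        String surname = last.trim().toUpperCase();
        String given = first.trim();
        int shortYear = year % 100;
        String padding = shortYear < 10 ? "0" : "";
        return surname + ", " + given + " (" + padding + shortYear + ")";
    }
}
```

The second version is longer, but each line fits comfortably in view and the variable names document the intent.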

Often libraries use overly verbose and repetitive names. This is done in the name of clarity, but at the expense of the readability of their users' code. API developers need to keep in mind not only the readability of their own code, but also the readability of their users' code. Here are four examples I have seen recently of APIs that force their users into writing difficult-to-read code:
  • ServletConfig#getServletContext() - The word Servlet is redundant in the method and could be removed to shorten calling code.
  • org.openrdf.repository.RepositoryConnection - Unless it is common to work with multiple connections from different packages, there is no need to repeat the word "repository" in the package and class prefix.
  • IReadableBinaryStreamRepresentation - Here is an example of a repeating suffix that adds nothing to clarify its use.
  • context.getKernelContext().getThisKernelRequest().getRequestScope() - If the API forces users into this type of repetitive message chaining, not only does it force code that is hard to read, but it also couples the client to the structure of the navigation, so any change to the intermediate relationships forces the client to change as well.
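One way an API could avoid the message chain in the last example is to offer a single short method that hides the navigation. The sketch below uses made-up stand-in classes, not the real library types, and the requestScope() method is an assumed addition:

```java
// Hypothetical stand-ins for the chained API above; not the real library.
final class KernelRequest {
    private final String requestScope;

    KernelRequest(String requestScope) {
        this.requestScope = requestScope;
    }

    String getRequestScope() {
        return requestScope;
    }
}

final class KernelContext {
    private final KernelRequest request;

    KernelContext(KernelRequest request) {
        this.request = request;
    }

    KernelRequest getThisKernelRequest() {
        return request;
    }
}

final class Context {
    private final KernelContext kernel;

    Context(KernelContext kernel) {
        this.kernel = kernel;
    }

    KernelContext getKernelContext() {
        return kernel;
    }

    // Assumed addition: callers ask for what they want in one short call
    // and no longer depend on the intermediate objects.
    String requestScope() {
        return kernel.getThisKernelRequest().getRequestScope();
    }
}
```

Client code shrinks to context.requestScope(), and if the intermediate relationships change later, only Context needs to be updated.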

To any API designers (or would-be API designers) out there: please spend some time thinking about how you can reduce the repetition in your API, and strive to use short, concise names. It will go a long way toward making code more readable.
