Tuesday, May 20, 2008

Ontologies Applied to Business Intelligence

After a full day of seminars at the Semantic Technology conference, I ran into an interesting tool that is "new to me": the ontology.

An ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts.

OK, I have to admit this is a pretty high-level and cerebral topic, so let's put it in the context of an example to make it real. Let's say we have a business problem (i.e., the domain) that involves displaying a customer's invoice online. If you think about it, this problem typically involves the following entities:
  • Customer
  • Product
  • Marketing Strategy
  • Person
  • Invoice

Does this seem like an exercise in Entity Relationship (ER) modelling? Well, to a certain extent it is, but the value an ontology adds on top of this is in describing how these entities are related. This still seems like ER modelling, but let's see how this plays out...

  • A Customer subscribes to Products
  • A Customer is a Person
  • A Marketing Strategy acquires Customers
  • A Marketing Strategy sells Products
  • An Invoice belongs to a Customer
  • An Invoice has Products
  • An Invoice includes Marketing Strategies (Bill Messages)
  • A Person creates a Marketing Strategy

Are all you data modellers out there feeling confused by all these relationships? 8) Real-world relationships seldom fall into the typical hierarchical relationships so common in ER modelling. To fully describe the richness of all relationships, we have to step back from the physical data structure and build out a separate meta data store that does not care about the structure, but does care about the context of the data and its relationships. This is typically stored within a database as a triplestore, which breaks each relationship down into a "subject", a "predicate" and an "object".

A Person (subject) creates (predicate) a Marketing Strategy (object)

If each of these three pieces has a Uniform Resource Identifier (URI) that uniquely identifies it, a deceptively simple data model can be created to handle any object and any relationship. In this way, any object in the system can be related to any other object using a single data model.
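
As a rough sketch of that single data model, the Python below stores the relationships from the invoice example as (subject, predicate, object) tuples. The http://example.org/ namespace, the predicate names and the match helper are all made up for illustration; a real triplestore would add indexing, persistence and a query language on top.

```python
# Hypothetical namespace; any scheme that identifies each piece uniquely would do.
EX = "http://example.org/ontology#"


def uri(name):
    """Build an illustrative URI for a concept or a predicate."""
    return EX + name


# Each relationship from the invoice example is one (subject, predicate, object) triple.
TRIPLES = {
    (uri("Customer"), uri("subscribesTo"), uri("Product")),
    (uri("Customer"), uri("isA"), uri("Person")),
    (uri("MarketingStrategy"), uri("acquires"), uri("Customer")),
    (uri("MarketingStrategy"), uri("sells"), uri("Product")),
    (uri("Invoice"), uri("belongsTo"), uri("Customer")),
    (uri("Invoice"), uri("has"), uri("Product")),
    (uri("Invoice"), uri("includes"), uri("MarketingStrategy")),
    (uri("Person"), uri("creates"), uri("MarketingStrategy")),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching the given parts; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Everything this toy ontology says about an Invoice.
invoice_facts = match(TRIPLES, subject=uri("Invoice"))
```

Adding a new kind of relationship here means adding a row, not changing the structure, which is the point of the one-model approach.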

Once we have this ontology, the next logical step is to map it to the physical data. This not only helps business users navigate data, it is also a great tool to facilitate data integration efforts, as we can map any data source against the business-focused ontology.
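
To make the mapping idea a little more concrete, here is a minimal sketch in Python. The data source, table and concept-to-table assignments are invented for illustration and do not come from any real system; the only point is that the business concept is the key and the physical locations hang off it.

```python
# Hypothetical mapping from business concepts to physical locations.
# All source and table names below are invented for illustration.
CONCEPT_TO_SOURCES = {
    "Customer": [("crm_db", "cust_master"), ("billing_db", "acct")],
    "Invoice": [("billing_db", "inv_hdr")],
    "Product": [("billing_db", "prod_catalog")],
}


def sources_for(concept):
    """List the (data source, table) pairs mapped to a business concept."""
    return CONCEPT_TO_SOURCES.get(concept, [])


def concepts_in(source):
    """List the business concepts held in a given data source, sorted by name."""
    return sorted(
        concept
        for concept, locations in CONCEPT_TO_SOURCES.items()
        if any(src == source for src, _table in locations)
    )
```

A business user can ask "where does Customer live?" with sources_for("Customer"), while an integration effort can ask the reverse question with concepts_in("billing_db").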

What is the value of this for Business Intelligence? In the BI field we are constantly striving to take data and turn it into actionable intelligence. If we had a rich meta data layer that contained a validated ontology, we could use it to uncover correlations in data that would not be uncovered through a simple ER model. As the ontology is in business language, it serves as a great tool to bridge the gap between physical data and business entities. This facilitates communication of the organization's data assets between business and IT to ensure no bit or byte goes unleveraged.
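
One way to picture how such a layer could surface non-obvious connections is to walk the relationships as a graph. The sketch below is only an illustration under my own assumptions: it uses plain names instead of URIs, follows each relationship from subject to object only, and uses a breadth-first search (my choice, not something prescribed by any ontology tool) to find a shortest chain between two entities.

```python
from collections import deque

# The relationships from the invoice example, with plain names instead of URIs.
TRIPLES = [
    ("Customer", "subscribes to", "Product"),
    ("Customer", "is a", "Person"),
    ("Marketing Strategy", "acquires", "Customer"),
    ("Marketing Strategy", "sells", "Product"),
    ("Invoice", "belongs to", "Customer"),
    ("Invoice", "has", "Product"),
    ("Invoice", "includes", "Marketing Strategy"),
    ("Person", "creates", "Marketing Strategy"),
]


def find_path(triples, start, goal):
    """Breadth-first search for a shortest chain of triples from start to goal.

    Each triple is followed only from subject to object.
    Returns the chain as a list of triples, or None if no chain exists.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for s, p, o in triples:
            if s == node and o not in seen:
                seen.add(o)
                queue.append((o, path + [(s, p, o)]))
    return None


# How is a Person connected to a Product?
chain = find_path(TRIPLES, "Person", "Product")
```

Here the chain runs through Marketing Strategy: a Person creates a Marketing Strategy, and a Marketing Strategy sells Products. That link is not stated directly anywhere in the list of relationships.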

This is but one tool that allows us to bring our users into the development of their applications. Better yet, let's get the business to own this "layer", since they know it best!

Mark

3 comments:

Jagan said...

Hi Mark,
I am sure you are having a great time at SJ. Thanks a lot for the well-explained section about Ontology, though I have one question.
How different is this from the way the Unified Process of Systems Analysis works? In that methodology, we have Use Case Diagrams that show actors and the associations between actors and use cases. There are also sequence diagrams that show interactions between objects arranged in time sequence, object lifelines and even activation periods. So are they really different when it comes to applying these methodologies? (Sorry, my ability to make the comparison may be wrong.) Something new, though, is the TripleStore, and where you mention that the Ontology is mapped to the real data store, which is never talked about in the Unified Process.

Hope to see more valuable information.

Thanks, Jagan

Mark Cudmore said...

Hi Jagan,

Thank you for taking the time to ask some questions! You are right that this idea of relating objects in a process is certainly not new. The difference here is that you are not just building documentation to support the System Development Life Cycle; you are actually taking this documentation (we still need to do this) and storing the results in the database. We can then use this meta data to enhance the user experience with BI applications.

Another point is that it is not feasible for IT to model every single relationship in a system. What we need is to provide tools to our end users to maintain this meta data so that it is constantly being enriched. How sweet it is to have users keeping our documentation up to date! 8)

Mark

Jagan said...

Hi Mark,
Thanks for your reply. Now I can see the difference and the purpose behind this. Very cool stuff.

Jagan

This is a personal weblog, and does not represent the thoughts, intentions, plans or strategies of my employer.