Milenko Petrović

Middleware for the Semantic Web

The Semantic Web is the next generation of the World Wide Web, aimed at computer rather than human users. Its main idea is to annotate digital resources using special knowledge representation languages called ontologies. Information from ontologies can help computers identify what the digital resources represent and, hence, make computers easier to program.

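The main benefit of the publish/subscribe (pub/sub) communication model lies in the abstraction of addressing. In essence, pub/sub replaces host-centric addressing with data-centric addressing: a publisher does not name the receiving hosts, and a message is delivered to whoever has subscribed to matching data. Pub/sub systems are mainly differentiated by the expressiveness of this data-centric abstraction.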
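To make data-centric addressing concrete, here is a minimal sketch of a content-based broker in Python. It is an illustration only, not the interface of any system described on this page: the `Broker` class, the predicate format, and the attribute names are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical representations, assumed for this sketch:
# a publication is a set of attribute-value pairs, and a subscription
# is a conjunction of (attribute, operator, value) predicates.
Publication = Dict[str, object]
Predicate = Tuple[str, str, object]

OPERATORS = {
    "=": lambda a, b: a == b,
    "<": lambda a, b: a < b,
    ">": lambda a, b: a > b,
}


def matches(predicates: List[Predicate], publication: Publication) -> bool:
    """A publication matches when every predicate holds on it."""
    for attribute, op, value in predicates:
        if attribute not in publication:
            return False
        if not OPERATORS[op](publication[attribute], value):
            return False
    return True


class Broker:
    """Routes publications by content; publishers never name a receiver."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[str, List[Predicate]]] = []

    def subscribe(self, subscriber_id: str, predicates: List[Predicate]) -> None:
        self._subscriptions.append((subscriber_id, predicates))

    def publish(self, publication: Publication) -> List[str]:
        """Return the ids of all subscribers whose subscription matches."""
        return [
            subscriber_id
            for subscriber_id, predicates in self._subscriptions
            if matches(predicates, publication)
        ]


broker = Broker()
broker.subscribe("alice", [("topic", "=", "rdf"), ("year", ">", 2003)])
broker.subscribe("bob", [("topic", "=", "xml")])
recipients = broker.publish({"topic": "rdf", "year": 2005})
```

The publisher supplies only the data. The set of recipients follows from the subscriptions, and a richer predicate language gives a more expressive data-centric abstraction.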

The goal of this project is to examine data-centric abstractions that are likely to be useful for future application development. We used emerging areas of distributed application development, such as service integration and collaboration, as a guide to which abstractions would be useful across a range of applications.

Existing publish/subscribe systems do not allow the use of ontologies for publication or subscription annotations. We have developed S-ToPSS, a technique that allows any existing pub/sub system that represents publications as attribute-value pairs to use externally specified ontology information. The main benefit of S-ToPSS is that it performs semantic filtering very fast by mapping semantic matching to syntactic matching.
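The idea of mapping semantic matching to syntactic matching can be sketched as follows. This is not the S-ToPSS algorithm itself, which is not specified here. It is one plausible reading, assuming that the external ontology supplies synonyms and an is-a hierarchy, and that publications are rewritten before they reach an unmodified exact-match engine. The tables `SYNONYMS` and `IS_A` and all function names are hypothetical.

```python
from itertools import product
from typing import Dict, List

# Hypothetical ontology fragments, assumed for this sketch.
SYNONYMS = {"school": "university", "college": "university"}  # term -> canonical term
IS_A = {"phd": "graduate_degree", "graduate_degree": "degree"}  # term -> more general term


def canonical(term: str) -> str:
    """Replace a term by its canonical synonym, if it has one."""
    return SYNONYMS.get(term, term)


def ancestors(term: str) -> List[str]:
    """All generalizations of a term, nearest first."""
    result: List[str] = []
    seen = {term}
    while term in IS_A and IS_A[term] not in seen:
        term = IS_A[term]
        seen.add(term)
        result.append(term)
    return result


def expand(publication: Dict[str, str]) -> List[Dict[str, str]]:
    """Rewrite one publication into all of its semantic variants."""
    attributes = [canonical(a) for a in publication]
    choices = []
    for value in publication.values():
        value = canonical(value)
        choices.append([value] + ancestors(value))
    return [dict(zip(attributes, combo)) for combo in product(*choices)]


def syntactic_match(subscription: Dict[str, str], publication: Dict[str, str]) -> bool:
    """Stand-in for an existing engine: plain equality on attribute-value pairs."""
    return all(publication.get(a) == v for a, v in subscription.items())


def semantic_match(subscription: Dict[str, str], publication: Dict[str, str]) -> bool:
    """Semantic matching done entirely through the syntactic matcher."""
    normalized = {canonical(a): canonical(v) for a, v in subscription.items()}
    return any(syntactic_match(normalized, p) for p in expand(publication))


publication = {"school": "Toronto", "degree": "phd"}
subscription = {"university": "Toronto", "degree": "graduate_degree"}
result = semantic_match(subscription, publication)
```

In this sketch the matching engine is unchanged and only its input is rewritten, which is what lets an existing attribute-value system reuse its fast syntactic index. Expanding every combination is exponential in the worst case, so a practical system would need to bound or index the expansion.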

While very efficient, S-ToPSS suffers from the constraints of the attribute-value pair publication representation. Popular Semantic Web applications are expected to use RDF/XML for data representation, which requires more expressive subscription and data representation languages.

Consequently, we have developed G-ToPSS, a pub/sub system for filtering graph-based metadata in general and RDF in particular. G-ToPSS uses GQL, a powerful query language, to express subscriptions over publications, which are expressed as RDF graphs. Despite the expressiveness of GQL, G-ToPSS is able to filter hundreds of thousands of subscriptions in a matter of milliseconds, even for very large publications.
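As a rough illustration of what filtering graph-based publications involves, the sketch below matches a subscription, written as a list of triple patterns with variables, against a publication, written as a list of RDF-style triples. It does not use GQL syntax and is not the G-ToPSS matching algorithm. It is a naive backtracking matcher, and the `?name` variable convention and all identifiers are assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def is_variable(term: str) -> bool:
    """In this sketch, terms starting with '?' are variables."""
    return term.startswith("?")


def match_graph(
    patterns: List[Triple],
    graph: List[Triple],
    binding: Optional[Dict[str, str]] = None,
) -> Optional[Dict[str, str]]:
    """Return one consistent variable binding, or None if there is no match."""
    binding = {} if binding is None else binding
    if not patterns:
        return binding
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(binding)
        consistent = True
        for pattern_term, graph_term in zip(first, triple):
            if is_variable(pattern_term):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(pattern_term, graph_term) != graph_term:
                    consistent = False
                    break
            elif pattern_term != graph_term:
                consistent = False
                break
        if consistent:
            result = match_graph(rest, graph, candidate)
            if result is not None:
                return result
    return None


publication = [
    ("doc1", "author", "alice"),
    ("doc1", "topic", "rdf"),
    ("alice", "worksAt", "uoft"),
]
subscription = [("?doc", "author", "?who"), ("?who", "worksAt", "uoft")]
binding = match_graph(subscription, publication)
```

This naive matcher evaluates one subscription at a time. Filtering a very large number of subscriptions quickly would require sharing work across them, which this sketch does not attempt.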

Publications

Liu, H., Petrovic, M., Jacobsen, H.-A., Efficient and Scalable Filtering of Graph-based Metadata. To appear in Journal of Web Semantics. January 2006.
Petrovic, M., Liu, H., Jacobsen, H.-A., CMS-ToPSS: Efficient Dissemination of RSS Documents. In Proceedings of 31st International Conference on Very Large Data Bases (VLDB). September 2005. (System demonstration)
Petrovic, M., Liu, H., Jacobsen, H.-A., G-ToPSS: Fast Filtering of Graph-based Metadata. In Proceedings of 14th International World Wide Web Conference (WWW). May 2005. Best Paper Finalist.
Burcea, I., Petrovic, M., Jacobsen, H.-A., I know what you mean: semantic issues in Internet-scale publish/subscribe systems. In Proceedings of the First International Workshop on Semantic Web and Databases. September 2003.
Petrovic, M., Burcea, I., Jacobsen, H.-A., S-ToPSS: Semantic Toronto Publish/Subscribe System. In Proceedings of 29th International Conference on Very Large Data Bases (VLDB). September 2003. (System demonstration)