DERI Galway
National University of Ireland, Galway   Science Foundation Ireland
 

This page lists past reading groups.

Meronymy-Based Aggregation of Activities in Business Process Models

based on Meronymy-Based Aggregation of Activities in Business Process Models by Sergey Smirnov, Remco Dijkman, Jan Mendling, Mathias Weske

Wassim Derguech

Abstract

More and more enterprises use process models as knowledge artefacts that preserve the behavioural knowledge needed to achieve goals (or to get things done). Such models tend to be very large and varied, which makes them difficult to understand. A possible solution is business process model abstraction, which reduces the complexity of process models (in terms of the number of nodes). Such an operation is time-consuming, error-prone and costly, so automated support is a pressing need. In this reading group I will present a business process model abstraction technique that uses the meronymy relation between task labels.

This paper was chosen because it is closely related to my PhD topic. I am currently working on managing the variability of business process models at different abstraction levels.
The conclusion drawn from this work is that business process model abstraction (BPMA) is a feasible yet challenging task. The results of BPMA can be used for different purposes, such as understanding business process models, reducing the number of process tasks, and modelling variability.

In the paper, the authors propose using meronymy relations between task labels. Other techniques have also been tried, and attendees are invited to propose other kinds of relations that could be used in this context, opening up other perspectives.



Date:

07th of December 2011


Original Paper:

www.springerlink...


Translating SPARQL to XQuery

based on Embedding SPARQL into XQuery/XSLT; Translating SPARQL and SQL to XQuery by Sven Groppe, Jinghua Groppe, Volker Linnemann, Dirk Kukulenz, Nils Hoeller, Christoph Reinke; Peter Fischer, Dana Florescu, Martin Kaufmann, Donald Kossmann

Stefan Bischof

Abstract

XML query languages have been around for much longer than RDF query languages. The most prominent query languages for XML and RDF are XQuery and SPARQL, respectively. With regard to query optimisation, XQuery and its predecessors have been studied more extensively than SPARQL, leaving a severe performance gap between the corresponding query engines. I will present two approaches to closing this gap by mapping SPARQL queries to XQuery queries. The paper by Groppe et al. focuses on the actual query rewriting, modelled after common database theory. The paper by Fischer et al. aims at providing a common query engine for SPARQL, SQL, and XQuery by mapping both SPARQL and SQL to XQuery. I will compare the two approaches with regard to rewriting, supported language features, and benchmarks.

You will learn a bit about XML and RDF query languages and about their relation to each other. You will also hear about potential problems when benchmarking these systems. Since we tested both systems in a benchmark setup for XSPARQL, I can also give you some insights into the weaknesses and advantages of the actual implementations.

We chose the papers because of their relevance to my PhD topic, XSPARQL optimisation: partly implementing these approaches can help us improve the performance of the XSPARQL prototype. We learned from the papers that even heavy operations on the query can be very helpful for overall query performance and for the query author.



Date:

23rd of November 2011


Original Paper:

dl.acm.org/citat...


Phrase-based statistical Machine Translation

based on Improving Statistical Machine Translation using Word Sense Disambiguation; Reassessment of the Role of Phrase Extraction in PBSMT by Marine Carpuat, Dekai Wu; Francisco Guzman, Qin Gao, Stephan Vogel

Mihael Arcan

Abstract

In this talk I will present two interesting topics in machine translation. Since statistical machine translation works with parallel resources, the issue of aligning words or "phrases" is still present. In the first paper, Guzman et al. show how to reassess the statistical data to improve the translations. In the second paper, Carpuat and Wu present new approaches that add semantics to the probability and decision pipeline of the translation process.

I will give a brief overview of machine translation and the parallel data this process requires. Both papers show the problems we have to deal with when trying to get better translations. Besides the phrase tables, semantics plays an important role: surrounding words are used as context to disambiguate words or phrases.



Date:

16th of November 2011


Original Paper:

acl.ldc.upenn.ed...


Subqueries in SPARQL

based on Subqueries in SPARQL by Renzo Angles, Claudio Gutierrez

Nuno Lopes

Abstract

The Subqueries functionality is a powerful feature which allows to enforce reuse, composition, rewriting and optimization in a query language. In this paper we perform a comprehensive study of the incorporation of subqueries into SPARQL. We consider several possible choices as suggested by the experience of similar languages, as well as features that developers are incorporating and/or experimenting with. Based on this study, we present an extension of SPARQL, with syntax and formal semantics, which incorporates all known types of subqueries in a modular fashion and preserves the original semantics.

Do you want to learn what you can do with nested queries in SPARQL 1.1? Do you want to find out what you can't do with these queries?
I'll give an overview of the current proposal for nested queries in the SPARQL 1.1 standard and then compare it to the approach presented in the paper showing what kind of queries are not allowed in SPARQL 1.1.



Date:

09th of November 2011


Original Paper:

ceur-ws.org/Vol-...


A Survey of RDF Stream Processing

based on See abstract by A. Bolles, M. Grawunder, J. Jacobi; D. F. Barbieri, D. Braga, S. Ceri, M. Grossniklaus; J. P. Calbimonte, O. Corcho, A. J. G. Gray; D. Anicic, P. Fodor, S. Rudolph, N. Stojanovic; D. Le Phuoc, M. Dao-Tran, J. Xavier Parreira, M. Hauswirth

Danh Le Phuoc

Abstract

There has been an increasing number of publications on processing RDF
stream data. This talk surveys the most relevant papers published at
popular venues such as ESWC, EDBT, ISWC and WWW from 2008 until now. If
you are interested in processing highly dynamic data such as sensor
readings or feeds from Twitter and Facebook, this talk will give you an
overview of why RDF streams matter, what has been done, and what to
expect in years to come.

This talk considers the following approaches:

1. Andre Bolles, Marco Grawunder and Jonas Jacobi.
Streaming SPARQL - Extending SPARQL to Process Data Streams.
http://www.springerlink.com/content/04g46m7344213016/ . ESWC 2008

2. Davide Francesco Barbieri, Daniele Braga, Stefano Ceri and Michael
Grossniklaus. An execution environment for C-SPARQL queries.
http://dl.acm.org/citation.cfm?id=1739095 . EDBT 2010

3. Jean Paul Calbimonte, Oscar Corcho, Alasdair J. G. Gray. Enabling
Ontology-based Access to Streaming Data Sources.
http://dl.acm.org/citation.cfm?id=1940289 . ISWC 2010

4. Darko Anicic, Paul Fodor, Sebastian Rudolph, Nenad Stojanovic.
EP-SPARQL: a unified language for event processing and stream reasoning.
http://dl.acm.org/citation.cfm?id=1963495 . WWW 2011

5. Danh Le Phuoc, Minh Dao-Tran, Josiane Xavier Parreira, Manfred
Hauswirth. A Native and Adaptive Approach for Unified Processing of Linked
Streams and Linked Data.
http://www.springerlink.com/content/965ru6u811275241/ . ISWC 2011



Date:

02nd of November 2011


DBpedia Spotlight: Shedding Light on the Web of Documents

based on DBpedia Spotlight: Shedding Light on the Web of Documents by Pablo N. Mendes, Max Jakob, Andres Garcia-Silva, Christian Bizer

Laura Dragan

Abstract

Interlinking text documents with Linked Open Data enables the Web of Data to be used as background knowledge within document-oriented applications such as search and faceted browsing. As a step towards interconnecting the Web of Documents with the Web of Data, we developed DBpedia Spotlight, a system for automatically annotating text documents with DBpedia URIs. DBpedia Spotlight allows users to configure the annotations to their specific needs through the DBpedia Ontology and quality measures such as prominence, topical pertinence, contextual ambiguity and disambiguation confidence. We compare our approach with the state of the art in disambiguation, and evaluate our results in light of three baselines and six publicly available annotation systems, demonstrating the competitiveness of our system. DBpedia Spotlight is shared as open source and deployed as a Web Service freely available for public use.

This paper won the best paper award at I-SEMANTICS. I attended the talk and
found it very interesting and entertaining (I hope to do it justice in my
presentation). It is very much related to my work on SemNotes, where I link
pieces of text to desktop resources in the same way that DBpedia Spotlight
links text to DBpedia resources. It is also related to another direction of
my work, where I connect desktop resources to the Web of Data using Sindice
(a paper on this will appear at the upcoming ISWC).



Date:

19th of October 2011


Original Paper:

www.wiwiss.fu-be...


Socialization in an Open Source Software Community: A Socio-Technical Analysis

based on Socialization in an Open Source Software Community: A Socio-Technical Analysis by Nicolas Ducheneaut

Aftab Iqbal

Abstract

Open Source Software (OSS) development is often characterized as a fundamentally
new way to develop software. Past analyses and discussions, however, have treated OSS projects and their organization mostly as a static phenomenon. Consequently, we do not know how these communities of software developers are sustained and reproduced over time through the progressive integration of new members. To shed light on this issue I report on my analyses of socialization in a particular OSS community. In particular, I document the relationships OSS newcomers develop over time with both the social and material aspects of a project. To do so, I combine two mutually informing activities: ethnography and the use of software specially designed to visualize and explore the interacting networks of human and material resources incorporated in the email and code databases of OSS. Socialization in this
community is analyzed from two perspectives: as an individual learning process and as a political process. From these analyses it appears that successful participants progressively construct identities as software craftsmen, and that this process is punctuated by specific rites of passage. Successful participants also understand the political nature of software development and progressively enroll a network of human and material allies to support their efforts. I conclude by discussing how these results could inform the design of software to support socialization in OSS projects, as well as practical implications for the future of these projects.

This talk will give an insight into how software developers join an OSS project and strengthen their relationships with community members over time, and into how communities develop and evolve in the context of OSS projects. It should be a good talk to attend, especially for those researching how communities evolve over time.



Date:

12th of October 2011


Original Paper:

www.springerlink...


Analyzing User Modeling on Twitter for Personalized News Recommendations

based on Analyzing User Modeling on Twitter for Personalized News Recommendations / Semantic Enrichment of Twitter Posts for User Profile Construction on the Social Web by Fabian Abel, Qi Gao, Geert-Jan Houben, Ke Tao

Fabrizio Orlandi

Abstract

In this talk we will present an analysis of some of the main challenges and solutions for user modelling on Twitter. The aim is to create accurate and complete user profiles of interests and activities of users by analysing their tweets posted on a microblogging site. The generated profiles can then be used for different personalisation purposes such as recommendations, personalised search, and adaptive systems.
In this presentation we focus on a specific use-case: news recommendations.
We showcase a framework for user modelling on Twitter that implements semantic enrichment of tweets and different user profiling strategies (hashtag-, entity- or topic-based, with different temporal dynamics).
Then we describe an analysis and evaluation of the performance of the user modelling strategies in the context of a personalised news recommendation system.

The work we present received a best paper award at UMAP 2011, and the publications by Abel et al. on the same topic have been quite successful this year (ESWC2011, UMAP2011, WebSci2011 and ISWC2011).
The authors present the first large-scale study on user modelling based on Twitter activities and moreover explore how different user models impact the accuracy of recommendations.
This talk will provide an interesting overview of user modelling techniques and of semantic enrichment of posts on the Social Web. This talk is particularly relevant to students and researchers interested in personalisation, recommendations and the Social Semantic Web in general.



Date:

05th of October 2011


Original Paper:

www.springerlink...


Reactive Answer Set Programming

based on Reactive Answer Set Programming by Martin Gebser, Torsten Grote, Roland Kaminski, Torsten Schaub

Philipp Obermeier

Abstract

In this talk I will introduce an approach for Reactive
Answer Set Programming. Answer Set Programming is a logic
programming fragment that provides a model theoretic underpinning for
non-monotonic Datalog. Reactive Answer Set Programming is a special
field of Answer Set Programming that aims for online reasoning about
dynamic systems running in changing environments. Firstly, I will talk
about the foundations of this approach, specifically, Incremental
Answer Set Programming which provides the means for cost-effective
reasoning on modular incremental theories. Based on that I will later
discuss how this concept is extended towards online processing in
terms of stream-driven grounding and solving. To illustrate the
intuition behind those theories I will provide supporting examples and
showcase oclingo, an implementation of this proposal.

I learned about this work when I was investigating formal foundations
for Stream Reasoning with Answer Set Programming. The authors provide
not only a theoretical description, but also a well-documented
open-source implementation, called oclingo. The latter allowed me to
gain valuable hands-on experience for reasoning upon streaming
data. As semantically augmented data streams are becoming more and more
prevalent on the Web and in sensor networks, this talk provides
valuable insights for PhD students in DERI.



Date:

28th of September 2011


Original Paper:

www.cs.uni-potsd...


Tag Based Social Interest Discovery

based on Tag Based Social Interest Discovery by Li, X., Guo, L., Zhao, Y.E.

Vinod Hedge

Abstract

This paper analyzes tag usage on the web and its relevance for inferring user interests. It discusses how effectively tags abstract the concepts and cover the top keywords of documents viewed by users on the web. It proposes a technique for modelling web users in the absence of social network information about them.
If you are interested in User Modelling or Recommender Systems, this is a good paper as a reference.


Date:

21st of September 2011


A User-Oriented Model for Expert Finding

based on A User-Oriented Model for Expert Finding by Elena Smirnova, Krisztian Balog

Georgeta Bordea

Abstract

Typically, expert finding systems focus on building models of expertise from the content produced by experts and retrieve the best experts for a given topic without considering any information about the person doing the search. This paper argues that in a realistic expertise seeking scenario the knowledge of the user and their relation with the experts play an important role.

This paper won the best paper award at ECIR this year and is co-authored by one of the most cited authors in Expert Finding and the organiser of several TREC Entity tracks, Krisztian Balog. The talk is of interest not only for researchers in Expert Finding but also for people working on Personalisation and Social Networks.



Date:

14th of September 2011


Using Controlled Natural Language and First Order Logic to Improve E-Consultation Discussion Forums

based on A Framework for Enriched, Controlled On-line Discussion Forums for e-Government Policy-making by Adam Wyner, Tom van Engers

Jodi Schneider

Abstract

Online consultation has become a common way to increase
participation in policy-making, yet ensuring that citizens' input is
adequately understood remains challenging, since standard discussion
forums offer few possibilities for structuring posts based on their
meaning. This reading group will discuss efforts to structure the forums
while retaining ease-of-use, based on a trio of papers from the IMPACT
project [1]. The underlying idea is to construct possible policies by
finding the maximal set of consistent viewpoints, since a policy
statement must avoid internal contradictions (i.e. be consistent).

The first paper is a requirements analysis, providing a justification
and overview of the work they propose: adding structure with controlled
natural language, argumentation frameworks, and user input of term
relationships (e.g. "contradicts", "is premise of", or "is an exception
to"). It also introduces a running example: 16 propositional statements,
extracted from an online debate "Should people be paid to recycle?" [2],
such as "Paying tax for garbage is unfair." or "No householder should
pay tax for the garbage which the householder throws away."  

The statements from the running example are analyzed in the second
paper using a controlled natural language, Attempto Controlled English
(ACE) [3]. The statements are modified in order to ensure that ACE can
parse them and that they yield the intended interpretation. In order
to derive possible policies, based on the sentences, the authors use a
first-order inference engine to analyze the consistency of sets of
sentences. They explore some of the problems they encountered.

Extending that work, the third paper envisions using ACEWiki [4] to
construct a consistent knowledge base incrementally, based on users'
input and using the Pellet inference engine to test for consistency. It
presents an argumentation framework for the recycling debate--a graph
showing the relationships between the 16 sentences of the running
example--in order to find the maximal set of consistent propositions,
i.e. a possible policy.

This is a very practical approach to overcoming the knowledge
acquisition bottleneck while retaining ease-of-use, allowing information
to be added by a wide range of non-experts. Controlled natural
languages, first order logic reasoners, semantic wikis, and
argumentation frameworks will be briefly introduced.
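
As a rough illustration of the idea of finding a largest set of mutually
consistent statements, here is a minimal Python sketch. It is a
simplification, not the papers' method: it assumes consistency can be
checked pairwise from an explicit list of "contradicts" relations, whereas
the papers use a first-order inference engine. The function name and the
statement labels A, B, C are hypothetical.

```python
from itertools import combinations


def largest_consistent_sets(statements, contradictions):
    """Return every largest subset of statements with no contradicting pair.

    Assumption: consistency is purely pairwise, given by `contradictions`
    (an iterable of 2-tuples). Brute force, so only for small examples.
    """
    conflicts = {frozenset(pair) for pair in contradictions}

    def consistent(subset):
        # A subset is consistent if none of its pairs is a known conflict.
        return all(frozenset(p) not in conflicts
                   for p in combinations(subset, 2))

    # Try the biggest subsets first and stop at the first size that works.
    for size in range(len(statements), -1, -1):
        found = [set(s) for s in combinations(statements, size)
                 if consistent(s)]
        if found:
            return found
    return []


# Hypothetical toy debate: A contradicts B, C is compatible with both.
candidate_policies = largest_consistent_sets(["A", "B", "C"], [("A", "B")])
```

Each returned set would correspond to one candidate policy in the sense
described above.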



Date:

07th of September 2011


Semantic Network Access Control Policy Model and Enforcement Framework

based on Access Control Policies for Semantic Networks by Tatyana Ryutov, Tatiana Kichkaylo, Robert Neches

Sabrina Kirrane

Abstract

Over the past two decades much research has gone into the use of semantic technology and linked data for integration, search and, more recently, information analysis and knowledge discovery. However, very little research has been done on the application of access control to semantic networks. This presentation, which examines recent work in this area, is based primarily on the paper “Access Control Policies for Semantic Networks” by Ryutov et al.; it also touches on the previous related work “SFINKS: Secure Focused Information, News, and Knowledge Sharing”, which is cited in the main paper. The authors identify a number of challenges with respect to the specification and enforcement of access control policies in semantic networks. They highlight the need to consider the semantic relationships between the various nodes, links and permissions in the semantic network access control model and demonstrate how access control management can be simplified through propagation of policies. They propose an access control model for a semantic network, which they represent formally in many-sorted first-order logic. They devise a policy retrieval and evaluation algorithm which handles conflicts arising from multiple policies. Finally, they evaluate their approach using a set of Java libraries and a methodology for constructing distributed semantic network based applications in which users may restrict visibility of resources they control.


Date:

27th of July 2011


Architecture for integrating BPM with sensor networks through Complex Event Processing

based on Leveraging Business Process Management through Complex Event Processing for RFID and Sensor Networks by Pablo Rosales, Kyuhyup Oh, Kyuri Kim, Jae-Yoon Jung

Feng Gao

Abstract

Today’s dynamic and competitive business environment urges companies to transform themselves into real-time enterprises. The realization of enterprise-wide real-time monitoring and rapid decision making requires not only a network of information systems, but also a network of physical objects, the so-called Internet of Things. This paper proposes an architecture for achieving the real-time enterprise through a CEP technique that resolves the granularity mismatch between business events and sensor events.

Researchers working in domains related to CEP, BPM and WSN may find this paper helpful. It provides an overview of an architecture that combines these techniques, as well as an application scenario that demonstrates its feasibility.



Date:

20th of July 2011


Original Paper:

ieeexplore.ieee....


SOFIE - A Unified Approach To Ontology-Based Information Extraction Using Reasoning

based on SOFIE: A Self-Organizing Framework for Information Extraction by Fabian M. Suchanek, Mauro Sozio, Gerhard Weikum

Tobias Wunner

Abstract

The creation of new knowledge in the Semantic Web increasingly depends on automatic knowledge enrichment processes, such as semi-structured Information Extraction (IE), for example the creation of DBpedia from Wikipedia. To further improve knowledge coverage, IE must also consider unstructured plain natural language text resources. Here SOFIE offers a novel approach to IE which can consistently enrich semantic models from text sources by unifying pattern matching, entity disambiguation and reasoning in the IE process.
 


Date:

22nd of June 2011


Original Paper:

suchanek.name/wo...


A Similarity Measure based on Semantic and Linguistic information

based on "A Feature and Information Theoretic Framework for Semantic Similarity and Relatedness" and "A syntax-based measure for short-text semantic similarity" by Giuseppe Pirro, Jerome Euzenat; Jesús Oliva, José Ignacio Serrano, María Dolores del Castillo, Ángel Iglesias

Nitish Aggarwal

Abstract

Similarity measures between two text values are becoming a key element in semantic search, semantic annotation, ontology alignment and other related fields. In this reading group, two papers will be discussed: one on semantic similarity between two ontology concepts, and one on syntax-based semantic similarity for short texts. In the first paper, the authors present a framework that maps the feature-based model of similarity into the information-theoretic model, where the whole set of semantic relations defined in an ontology is taken into account as features. They also propose a new model, called Extended Information Content, which considers not only the "isa" relation but all relations beyond inheritance (e.g., that a car has an engine as a part, or that a bicycle has a sprocket as a part).

The second paper presents a syntax-based similarity measure that uses syntactic information obtained through a deep parsing process.

If you are interested in semantic search, semantic annotation, semantic model matching or related research, these two papers offer a good basis for discussing semantic similarity and relatedness, which are key to these fields.



Date:

15th of June 2011


Original Paper:

simlibrary.files...


How Useful are Your Comments? - Analyzing and Predicting YouTube Comments and Comment Ratings

based on How Useful are Your Comments? - Analyzing and Predicting YouTube Comments and Comment Ratings by S. Siersdorfer, S. Chelaru, W. Nejdl, J.S. Pedro

Smitashree Choudhury

Abstract

An analysis of the social video sharing platform YouTube reveals a high amount of community feedback through comments for published videos as well as through meta ratings for these comments. In this paper, we present an in-depth study of commenting and comment rating behavior on a sample of more than 6 million comments on 67,000 YouTube videos for which we analyzed dependencies between comments, views, comment ratings and topic categories. In addition, we studied the influence of sentiment expressed in comments on the ratings for these comments using the SentiWordNet thesaurus, a lexical WordNet-based resource containing sentiment annotations. Finally, to predict community acceptance for comments not yet rated, we built different classifiers for the estimation of ratings for these comments. The results of our large-scale evaluations are promising and indicate that community feedback on already rated comments can help to filter new unrated comments or suggest particularly useful but still unrated comments.


Date:

08th of June 2011


Original Paper:

portal.ac...


Disassortative mixing in online social networks

based on Disassortative mixing in online social networks by Hai-Bo Hu, Xiao-Fan Wang

Samantha Lam

Abstract

Assortative mixing describes the tendency of nodes in a network to connect to nodes of similar degree. Recent research has commonly found that this pattern differs between biological/ecological networks (often disassortative) and social networks such as coauthorship networks (often assortative). This paper shows that assortativity changes when moving from conventional 'real' social networks to an online social network, and attempts to model and explain this phenomenon.
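
As a rough illustration of what is being measured, degree assortativity is commonly computed as the Pearson correlation between the degrees of the nodes at the two ends of each edge. The Python sketch below assumes an undirected network given as a plain edge list; the function name and the toy graphs are hypothetical and are not taken from the paper.

```python
from math import sqrt


def degree_assortativity(edges):
    """Pearson correlation of the degrees at either end of each edge.

    Assumption: `edges` is a list of (u, v) pairs of an undirected graph.
    Returns None when all degrees are equal (correlation is undefined).
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Count every undirected edge in both directions so the measure
    # is symmetric in the two endpoints.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    mean = sum(xs) / len(xs)  # xs and ys hold the same values, reordered
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    var_x = sum((x - mean) ** 2 for x in xs)
    var_y = sum((y - mean) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Toy graphs: a star (one hub, three leaves) and a four-node path.
star = [(0, 1), (0, 2), (0, 3)]
path = [(0, 1), (1, 2), (2, 3)]
r_star = degree_assortativity(star)
r_path = degree_assortativity(path)
```

Negative values indicate disassortative mixing (high-degree nodes tend to link to low-degree nodes), positive values assortative mixing.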


Date:

01st of June 2011


Original Paper:

iopscience.iop.o...


Semantics, Sensors and the Social Web: the Live Social Semantic experiment

based on Social dynamics in conferences: analyses of data from the Live Social Semantics application by Alain Barrat, Ciro Cattuto, Martin Szomszor, Wouter Van den Broeck, Harith Alani

Myriam Leggieri

Abstract

There are strong interdependencies between the dynamics and social interactions of the online world and those taking place in the real world. Until recently, however, there has been a lack of real data spanning both online and offline realities. The Live Social Semantics application that I will present bridges this gap. It integrates data about people from (a) their online social networks and tagging activities, (b) their publications and co-authorship networks from semantic repositories, and (c) their real-world face-to-face contacts collected via a network of wearable active sensors. The two papers that I will present explain the architecture of the Live Social Semantics application and investigate the data collected during its deployment at three major conferences. In particular, the analysis stresses the influence of various personal properties (e.g. seniority, conference attendance) on social networking patterns.



Date:

25th of May 2011


Original Paper:

www.pdfdownload....


An Evaluation of Approaches to Federated Query Processing over Linked Data

based on An Evaluation of Approaches to Federated Query Processing over Linked Data by Peter Haase, Tobias Mathäß, Michael Ziller

Nur Aini Rakhmawati

Abstract

The Web has evolved from a global information space of linked documents to a web of linked data. The Web of Data enables answering complex, structured queries that could not be answered by a single data source alone. While the current procedure to work with multiple, distributed linked data sources is to load the desired data into a single RDF store and process queries in a centralized way against the merged data set, such an approach may not always be practically feasible or desired. 
In this paper, we analyze alternative approaches to federated query
processing over linked data and how different design alternatives affect
the performance and practicality of query processing. To this end, we
define a benchmark for federated query processing, comprising a
selection of data sources in various domains and representative queries. Using the benchmark, we perform experiments with different federation alternatives and provide insights about their advantages and disadvantages.


Date:

18th of May 2011


Original Paper:

portal.acm.org/c...


Semantic Privacy Preferences for the Social Web and SPoX: Skype Policy Extension

based on Guarding a Walled Garden - Semantic Privacy Preferences for the Social Web by Philipp Karger, Wolf Siberski

Owen Sacco

Abstract

In this reading group two papers will be discussed: a formal privacy policy model based on the Protune policy engine, and a tool for Skype for defining Protune policies. In the first paper, the authors claim that privacy preferences consist of mappings between groups of objects (such as "a wall post") or actions (such as "view a wall post") and groups of people that are allowed to access these objects. The authors use this claim to define a formal model of privacy settings and implement this model using the Protune policy framework. Moreover, the authors extend the OpenSocial Platform so that privacy settings are defined according to their model. Although in this model user groups have to be specifically defined, and objects and actions are combined in the same category, this work reveals interesting methods for semantically defining privacy settings. The second paper presents how privacy policies defined using Protune can be enforced in Skype. This paper describes interesting work on how Semantic Web data can be used to create policies that control behaviour in Skype.


Date:

11th of May 2011


Original Paper:

www.l3s.de/~kaer...


Information Credibility on Twitter

based on Information Credibility on Twitter by Carlos Castillo, Marcelo Mendoza, Barbara Poblete

David Crowley

Abstract

We analyze the information credibility of news propagated through Twitter, a popular microblogging service. Previous research has shown that most of the messages posted on Twitter are truthful, but the service is also used to spread misinformation and false rumors, often unintentionally. In this paper we focus on automatic methods for assessing the credibility of a given set of tweets. Specifically, we analyze microblog postings related to "trending" topics, and classify them as credible or not credible, based on features extracted from them. We use features from users' posting and re-posting ("re-tweeting") behavior, from the text of the posts, and from citations to external sources. We evaluate our methods using a significant number of human assessments about the credibility of items on a recent sample of Twitter postings. Our results show that there are measurable differences in the way messages propagate, that can be used to classify them automatically as credible or not credible, with precision and recall in the range of 70% to 80%.


Date:

04th of May 2011


Original Paper:

research.yahoo.c...


Topic-dependent sentiment analysis of financial blogs

based on Topic-dependent sentiment analysis of financial blogs by Neil O'Hare, Michael Davy, Adam Bermingham, Paul Ferguson, Padraic Sheridan, Cathal Gurrin, Alan F. Smeaton

Brian Davis

Abstract

While most work in sentiment analysis in the financial domain has focused on the use of content from traditional finance news, in this work we concentrate on more subjective sources of information, blogs. We aim to automatically determine the sentiment of financial bloggers towards companies and their stocks. To do this we develop a corpus of financial blogs, annotated with polarity of sentiment with respect to a number of companies. We conduct an analysis of the annotated corpus, from which we show there is a significant level of topic shift within this collection, and also illustrate the difficulty that human annotators have when annotating certain sentiment categories. To deal with the problem of topic shift within blog articles, we propose text extraction techniques to create topic-specific sub-documents, which we use to train a sentiment classifier. We show that such approaches provide a substantial improvement over full document classification and that word-based approaches perform better than sentence-based or paragraph-based approaches.


Date:

20th of April 2011


Material (Slides):

portal.acm.org/c...


Pay-as-you-go user feedback for dataspace systems

based on Pay-as-you-go user feedback for dataspace systems by Shawn R. Jeffery, Michael J. Franklin, Alon Y. Halevy

Umair ul Hassan

Abstract

Dataspaces provide a useful abstraction layer over a group of
heterogeneous data sources, providing best-effort services for the
management and use of data. Integration, being one of the key elements
of a dataspace support platform, requires user involvement for defining
or checking complex semantic relationships. Maximizing the utility of
user feedback while keeping user involvement at a minimal level is a
fundamental requirement. In this talk we will study a pay-as-you-go
approach for soliciting user feedback for schema and entity matching in
data integration.

Automated data integration from different sources, with overlaps,
generates uncertain results. This paper demonstrates the application of
decision theory for ordering user feedback on both schema and entity
matching candidates. Similar methods can be applied to other types of
uncertainty in integration results. Participants will learn about an
important dimension of dataspaces, i.e. user feedback, and its
application to data integration tasks.



Date:

13th of April 2011


Original Paper:

portal.acm.org/c...


The Power and Limits of Relational Technology In the Age of Information Ecosystems

Prof. Michael Brodie

Abstract

For over three decades relational technology has been the most efficient data management solution on the planet and the data management bedrock of business information processing. Due to increasingly sophisticated and unavoidable data modelling and integration requirements of today’s Information Ecosystems that exceed the modelling power of the relational data model, relational technology is less than optimal. Data modelers have known this for decades yet continue to design and integrate databases under assumptions that underlie the relational data model. Making the relational assumptions explicit assists in resolving inherent data integration challenges and can lead to more efficient design, development, and execution of relational data modelling and integration solutions.

The dramatic success of relational technology has propelled data modelling and management requirements beyond the modelling and processing capabilities of the relational technology. Our Digital Universe is no longer a semantically homogeneous set of a few databases but Information Ecosystems of 100s or 1,000s of semantically heterogeneous databases. As a result relational data integration solutions required for apparently inconsistent data descriptions of the same entity are developed manually with much effort and little guidance. Over a decade of experience with these issues arising in extremely large-scale applications reveals that the chief problems arise from the conflict between the inherent semantic heterogeneity of data to be integrated and the inherent semantic homogeneity of relational modelling tools. As semantic heterogeneity increases so do the consequent data integration challenges that decrease the efficiency and increase the cost of designing, developing, and executing relational integration solutions.

This talk focuses on inherent data integration challenges that arise from inconsistencies between relational data descriptions of the same entity that arise from differences in their corresponding ontologies. We investigate the power and limitations of the relational data model to model and execute the resulting relational data integration solutions and the logical fallacies of the underlying relational assumptions. We close by providing a simple conceptual basis for identifying and resolving inherent and complex data integration challenges. The conceptual basis, called the Shadow Approach, arises from the truths underlying Plato’s Cave.

Verizon Communications


Date:

05th of April 2011


The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems

based on The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems by David Luckham

Souleiman Hasan

Abstract

This book is considered the seminal work on complex event
processing (CEP). The term might be new to some of you, but it builds
on event correlation, a technology that has been around for quite a
long time. However, the book introduces new applications and visions
for the technology and tries to provide the medium for connecting the
whole enterprise in an event-driven architecture where aggregated,
high-level, and more abstract events are formed upon the occurrence of
an event pattern. Such high-level events make sense to stakeholders in
upper management. I will try to introduce the key concepts of CEP such
as events, event cloud, streams, temporal relationships and causality.
I will talk about some of the potential applications of CEP.

Whatever you are working with, you are likely to have seen the
information gap between different levels of granularity of your data
and the differing interest of stakeholders in these levels. You are
likely to have thought about event-driven architectures and wondered
what applications they can serve. CEP has emerged almost in parallel
with the Semantic Web, so I think it will be useful for all of us to
think of the potential overlap that can be researched between the two
technologies.



Date:

30th of March 2011


Original Paper:

www.amazon.com/P...


Community Detection in Graphs

based on Community Detection in Graphs by Santo Fortunato

Václav Belák

Abstract

With a plethora of available network data, the crucial task is to develop techniques for accurate and scalable analysis. Empirical analysis of complex networks, ranging over a wide landscape of social, bibliographical, knowledge (e.g. RDF graphs), or biological networks, has shown that they typically have a community structure, i.e. it is possible to divide the nodes of a network into sets with many links inside the same set and few links among different sets. Therefore, detection of communities is a crucial task of network analysis. Many traditional methods as well as newly developed community detection methods have been used, but an overview of such methods, sorting them into classes and discussing their pros and cons, was missing. The paper I will present addresses this issue and provides an excellent overview of the current state of the art in the field. I will briefly introduce the main classes of community detection methods as presented in the paper, discuss their strengths and weaknesses regarding usability, scalability and accuracy, and finally I will conclude with my own experience with some of them.

Whether you work with RDF, social, sensor, or other graphs, you may find it useful to detect communities in them in order to, e.g., classify nodes, discover hidden structure, or get a simplified picture of a large network. I will present the main classes of algorithms for community detection, identifying some of their pros and cons with respect to usability, accuracy, and scalability, and finally I will conclude with my own experience with some of them. You will take home an idea of what methods for detecting communities are out there and which one to apply to your data.
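
As a small illustration of the idea of community structure (many links inside a set, few links between sets), the sketch below scores a partition of a toy graph with Newman's modularity Q, one common quality measure for partitions. The graph, the two partitions and the function name are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints in the same community
    degree = defaultdict(int)  # sum of node degrees per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    # Q = sum over communities of (fraction of intra edges - expected fraction)
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical toy graph: two triangles joined by one bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
by_triangle = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
alternating = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}

print(modularity(edges, by_triangle))
print(modularity(edges, alternating))
```

The partition that follows the two triangles scores higher than the alternating one, which matches the intuition that a good division keeps most links inside the sets.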



Date:

23rd of March 2011


Original Paper:

arxiv.org/abs/09...


Spreading activation in semantic networks

based on Information retrieval by constrained spreading activation in semantic networks by Paul R. Cohen, Rick Kjeldsen

Benjamin Heitmann

Abstract

GRANT is an expert system for finding sources of funding given research proposals. Its search method, constrained spreading activation, makes inferences about the goals of the user and thus finds information that the user did not explicitly request but that is likely to be useful. The architecture of GRANT and the implementation of constrained spreading activation are described, and GRANT's performance is evaluated.

This paper nicely demonstrates that most of the concepts which form the foundation for the Semantic Web, Linked Data and the Web of Data are based on a very long heritage of AI research.
While we try to sell Semantic Web technologies as bleeding edge technology of the future, the core concepts are quite old. On the other hand, this shows the potential of applying and evaluating old techniques on new data which was not available until a few years ago.
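
The general idea of constrained spreading activation can be sketched in a few lines. This is a minimal illustration, not GRANT's implementation: the toy network, the decay factor and the two constraints used here (a hop limit and a fan-out limit) are assumptions chosen for the example.

```python
def spread(graph, seeds, decay=0.5, max_hops=3, max_fanout=3):
    """Spread activation from seed nodes through a directed network.

    Assumed constraints (illustrative only):
    - distance: activation travels at most max_hops links
    - fan-out: nodes with more than max_fanout neighbours do not pass it on
    """
    activation = {s: 1.0 for s in seeds}
    frontier = dict(activation)
    for _ in range(max_hops):
        received = {}
        for node, energy in frontier.items():
            neighbours = graph.get(node, [])
            if len(neighbours) > max_fanout:
                continue  # too general a concept, stop spreading here
            for nb in neighbours:
                received[nb] = received.get(nb, 0.0) + energy * decay
        for nb, energy in received.items():
            activation[nb] = activation.get(nb, 0.0) + energy
        frontier = received
    return activation

# Hypothetical semantic network around a research proposal.
graph = {
    "proposal": ["machine learning", "health"],
    "machine learning": ["AI funding agency"],
    "health": ["medical research fund", "science"],
    "science": ["topic 1", "topic 2", "topic 3", "topic 4"],
}
result = spread(graph, ["proposal"])
print(sorted(result.items(), key=lambda kv: -kv[1]))
```

In this toy run the two funding sources receive activation although the proposal does not mention them directly, while the very general node "science" is blocked by the fan-out limit and activates nothing further.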



Date:

16th of March 2011


Original Paper:

linkinghub.elsev...


Personalized Web Exploration with Task Models

based on Personalized Web Exploration with Task Models by Jae-wook Ahn, Peter Brusilovsky, Daqing He, Jonathan Grady, Qi Li

Ioana Hulpus

Abstract

Personalized Web search has emerged as one of the hottest topics for both the Web industry and academic researchers. However, the majority of studies on personalized search focused on a rather simple type of search, which leaves an important research topic – the personalization in exploratory searches – as an under-studied area.
In this paper, we present a study of personalization in task-based information exploration using a system called TaskSieve.
TaskSieve is a Web search system that utilizes a relevance feedback based profile, called a “task model”, for personalization. Its innovations include flexible and user controlled integration of queries and task models, task-infused text snippet generation, and on-screen visualization of task models. Through an empirical study using human subjects conducting task-based exploration searches, we demonstrate that TaskSieve pushes significantly more relevant documents to the top of search result lists as compared to a traditional search system. TaskSieve helps users select significantly more accurate information for their tasks, allows the users to do so with higher productivity, and is viewed more favorably by subjects under several usability related characteristics.


Date:

09th of March 2011


Original Paper:

www2008.org/pape...


E-Participation: Success or Failure?

based on Characterizing E-Participation in Policy Making by Ann Macintosh

Lukasz Porwol

Abstract

This paper argues the urgent need to better understand the e-democracy pilots that have taken place so far and that are currently being developed. It addresses the issues of what should be characterized in e-democracy pilots so as to better identify types of citizen participation exercises and the appropriate technology to support them; as such it offers an analytical framework for electronic participation. Over the last decade there has been a gradual awareness of the need to consider the innovative application of ICTs for participation that enables a wider audience to contribute to democratic debate and where contributions themselves are broader and deeper. This awareness has resulted in a number of isolated e-democracy pilots and research studies.
It is important to consolidate this work and characterize the level of participation, the technology used, the stage in the policy-making process and various issues and constraints, including the potential benefits.


Date:

02nd of March 2011


From databases to dataspaces: wearing the linked data goggles

based on From databases to dataspaces: a new abstraction for information management by Michael Franklin, Alon Halevy, David Maier

Jürgen Umbrich

Abstract

The development of relational database management systems served to focus the data management community for decades, with spectacular results. In recent years, however, the rapidly-expanding demands of "data everywhere" have led to a field comprised of interesting and productive efforts, but without a central focus or coordinated agenda. The most acute information management challenges today stem from organizations (e.g., enterprises, government agencies, libraries, "smart" homes) relying on a large number of diverse, interrelated data sources, but having no way to manage their dataspaces in a convenient, integrated, or principled fashion. This paper proposes dataspaces and their support systems as a new agenda for data management. This agenda encompasses much of the work going on in data management today, while posing additional research objectives.


Date:

23rd of February 2011


Original Paper:

portal.acm.org/c...


Real-time image deconvolution on the GPU

based on Real-time image deconvolution on the GPU by James T. Klosowski, Shankar Krishnan

Michael Sherry

Abstract

Two-dimensional image deconvolution is an important and well-studied problem with applications to image deblurring and restoration. Most of the best deconvolution algorithms use natural image statistics that act as priors to regularize the problem. Recently, Krishnan and Fergus provide a fast deconvolution algorithm that yields results comparable to the current state of the art. They use a hyper-Laplacian image prior to regularize the problem. The resulting optimization problem is solved using alternating minimization in conjunction with a half-quadratic penalty function. In this paper, we provide an efficient CUDA implementation of their algorithm on the GPU. Our implementation leverages many well-known CUDA optimization techniques, as well as several others that have a significant impact on this particular algorithm. We discuss each of these, as well as make a few observations regarding the CUFFT library. Our experiments were run on an Nvidia GeForce GTX 260. For a single channel image of size 710 x 470, we obtain over 40 fps, while on a larger image of size 1900 x 1266, we get almost 6 fps (without counting disk I/O). In addition to linear performance, we believe ours is the first implementation to perform deconvolution at video rates. Our running times also demonstrate that our GPU implementation is over 27 times faster than the original CPU implementation.


Date:

16th of February 2011


Original Paper:

spiedigitallibra...


Static, adaptive and adaptable user interfaces: lessons from menu research

based on several papers from conferences and journals such as CHI, B&IT, TOCHI, IJCAI by Greenberg S. et al., Sears A. et al., Findlater L. et al., Kaptelinin V. et al.

Krystian Samp

Abstract

I will present and discuss some empirical findings regarding static, adaptive and adaptable menus. These findings shed some light on the topic of customizable user interfaces. The promise of a tailored user experience seems to be difficult to fulfil.

References:

Greenberg, S., and Witten, I. Adaptive personalized interfaces: A question of viability. Behaviour and Information Technology, 4(1) (1985), 31-45.

Sears, A., and Shneiderman, B. (1994). Split menus: effectively using selection frequency to organize menus. ACM TOCHI, 1(1), 27 - 51.

Findlater, L., & McGrenere, J. (2004). A comparison of static, adaptive, and adaptable menus. CHI (Vol. 6, pp. 89-96). New York, New York, USA: ACM Press. doi: 10.1145/985692.985704.

Kaptelinin, V. (1993). Item recognition in menu selection. CHI (pp. 183-184). New York, New York, USA: ACM Press. doi: 10.1145/259964.260196.



Date:

09th of February 2011


How we use the Web - Empirical Studies

based on a collection of several papers, including Not quite the average: An empirical study of web use by Eelco Herder, Harald Weinreich, Hartmut Obendorf, Matthias Mayer

Jacek Jankowski

Abstract

Empirical studies are valuable sources of knowledge on how users interact with Web-based systems. During the reading group I will therefore provide an overview and discussion of several field studies that provide insights on how the Web is used in daily life, the tasks that users carry out and the usability issues that users perceive. I will describe field studies from two categories: observational short-term studies, which provide data on the users and their tasks as collected through screen captures, video recording, diaries or questionnaires [1,2], and click-through long-term studies, which provide quantitative data on the actual user navigation behavior [3,4,5,6]. For a more complete and detailed description of how users interact with the Web see [7].

[1] Michael D. Byrne, Bonnie E. John, Neil S. Wehrle, and David C. Crow. The tangled web we wove: a taskonomy of www use. In CHI. ACM, 1999.
[2] Abigail J. Sellen, Rachel Murphy, and Kate L. Shaw. How knowledge workers use the web. In CHI. ACM, 2002.
[3] Lara D. Catledge and James E. Pitkow. Characterizing browsing strategies in the world-wide web. Comput. Netw. ISDN Syst., 1995.
[4] Linda Tauscher and Saul Greenberg. How people revisit web pages: empirical findings and implications for the design of history systems. Int. J. Hum.-Comput. Stud., 1997.
[5] Andy Cockburn and Bruce McKenzie. What do web users do? an empirical analysis of web use. Int. J. Hum.-Comput. Stud., 2001.
[6] Harald Weinreich, Hartmut Obendorf, Eelco Herder, and Matthias Mayer. Not quite the average: An empirical study of web use. ACM Trans. Web, 2008.
[7] Eelco Herder. Forward, back and home again: analyzing user behavior on the web. PhD thesis, University of Twente, Twente, The Netherlands, 2006.



Date:

02nd of February 2011


Evidence of Quality of Textual Features on the Web 2.0

based on Evidence of Quality of Textual Features on the Web 2.0 by Flavio Figueiredo, Fabiano Belém, Henrique Pinto, Jussara Almeida, Marcos Gonçalves, David Fernandes, Edleno Moura, Marco Cristo

Sheila Kinsella

Abstract

The growth of popularity of Web 2.0 applications greatly increased the amount of social media content available on the Internet. However, the unsupervised, user-oriented nature of this source of information, and thus, its potential lack of quality, have posed a challenge to information retrieval (IR) services. Previous work focuses mostly on tags, although a consensus about their effectiveness as supporting information for IR services has not yet been reached. This paper aims at assessing the relative quality of distinct textual features available on the Web 2.0. Towards this goal, the authors analyzed four features (title, tags, description and comments) in four popular applications (CiteULike, Last.FM, Yahoo! Video, and YouTube) to assess their effectiveness in improving classification.


Date:

24th of November 2010


Original Paper:

portal.acm.org/c...


Ontology-based information extraction - An Overview

based on Ontology-Based Information Extraction: An Introduction and a Survey of Current Approaches by Daya Wimalasuriya, Dejing Dou

Tobias Wunner

Abstract

Information Extraction aims to retrieve certain types of information from natural language text by processing it automatically. For example, an information extraction system might retrieve information about geopolitical indicators of countries from a set of web pages while ignoring other types of information. Ontology-based information extraction has recently emerged as a subfield of information extraction. Here, ontologies - which provide formal and explicit specifications of conceptualizations - play a crucial role in the information extraction process. Because of the use of ontologies, this field is related to knowledge representation and has the potential to assist the development of the Semantic Web. In this paper, we provide an introduction to ontology-based information extraction and review the details of different ontology-based information extraction systems developed so far. We attempt to identify a common architecture among these systems and classify them based on different factors, which leads to a better understanding of their operation. We also discuss the implementation details of these systems including the tools used by them and the metrics used to measure their performance. In addition, we attempt to identify the possible future directions for this field.




Date:

10th of November 2010


Material (Slides):

www.slideshare.n...


Original Paper:

citeseerx.ist.ps...


Evaluating recommendation systems

based on Evaluating Recommendation Systems by Guy Shani, Asela Gunawardana

Benjamin Heitmann


Date:

03rd of November 2010


Material (Slides):

research.microso...


Evaluations of Semantic Desktops

based on Evaluating Long-Term Use of the Gnowsis Semantic Desktop for PIM [1], Are semantic desktops better?: summative evaluation comparing a semantic against a conventional desktop[2] by Leo Sauermann, Dominik Heim [1], Thomas Franz, Ansgar Scherp, Steffen Staab [2]

Laura Dragan

Abstract

This presentation is about two evaluations of Semantic Desktops. The evaluated Semantic Desktops are different, one being Gnowsis and the other X-COSIM, but they are similar in characteristics and their goals are the same. The evaluation approaches are different: Gnowsis is evaluated with a long-term user study, while the X-COSIM tools are compared to a conventional desktop to see if they add benefits to the users. Here are the abstracts of the two papers:

The Semantic Desktop is a means to support users in Personal Information Management (PIM). Using the open source software prototype Gnowsis, we evaluated the approach in a two month case study in 2006 with eight participants. Two participants continued using the prototype and were interviewed after two years in 2008 to show their long-term usage patterns. This allows us to analyse how the system was used for PIM. Contextual interviews gave insights on behaviour, while questionnaires and event logging did not. We discovered that in the personal environment, simple has-Part and is-related relations are sufficient for users to file and re-find information, and that the personal semantic wiki was used creatively to note information.

Semantic desktop environments aim at improving the effectiveness and efficiency of users carrying out daily tasks within their personal information management (PIM) infrastructure. They support the user by transferring and exploiting the explicit semantics of data items across different PIM applications. Whether such an approach does indeed reach its aim of facilitating users' life and--if so--to which extent, however, remains an open question. In this paper we address this question with the first summative evaluation of a semantic desktop. We have developed a test environment to evaluate two semantic PIM applications against standard PIM tools. As a result, we have found significant efficiency and satisfaction improvements for typical PIM tasks.


Date:

27th of October 2010


Material (Slides):

[1] portal.acm.o...


Rush: repeated recommendations on mobile devices.

based on Rush: repeated recommendations on mobile devices. by Dominikus Baur, Sebastian Boring, Andreas Butz

VinhTuan Thai

Abstract

We present rush as a recommendation-based interaction and visualization technique for repeated item selection from large data sets on mobile touch screen devices. Proposals and choices are intertwined in a continuous finger gesture navigating a two-dimensional canvas of recommended items. This provides users with more flexibility for the resulting selections. Our design is based on a formative user study regarding orientation and occlusion aspects. Subsequently, we implemented a version of rush for music playlist creation. In an experimental evaluation we compared different types of recommendations based on similarity, namely the top 5 most similar items, five random selections from the list of similar items and a hybrid version of the two. Participants had to create playlists using each condition. Our results show that top 5 was too restricting, while random and hybrid suggestions had comparable results.



Date:

20th of October 2010


Interacting with the SOA-Based Internet of Things: Discovery, Query, Selection, and On-Demand Provisioning of Web Services.

based on Interacting with the SOA-Based Internet of Things: Discovery, Query, Selection, and On-Demand Provisioning of Web Services. by Dominique Guinard, Vlad Trifa, Stamatis Karnouskos, Patrick Spiess, Domnic Savio

Raluca Zaharia

Abstract

The increasing usage of smart embedded devices in business blurs the line between the virtual and real worlds. This creates new opportunities to build applications that better integrate real-time state of the physical world, and hence, provides enterprise services that are highly dynamic, more diverse, and efficient. Service-oriented Architecture approaches, traditionally used to couple functionality of heavyweight corporate IT systems, are becoming applicable to embedded real-world devices, i.e., objects of the physical world that feature embedded processing and communication. In such infrastructures, composed of large numbers of networked, resource-limited devices, the discovery of services and on-demand provisioning of missing functionality is a significant challenge. We propose a process and a suitable system architecture that enables developers and business process designers to dynamically query, select, and use running instances of real-world services (i.e., services running on physical devices) or even deploy new ones on-demand, all in the context of composite, real-world business applications.



Date:

13th of October 2010


Merging Business Process Models

based on Merging Business Process Models by M. La Rosa, M. Dumas, R. Kaarik, R. Dijkman

Wassim Derguech

Abstract

This paper addresses the following problem: given two or more business process models, create a process model that is the union of the process models given as input. In other words, the behavior of the produced process model should encompass that of the input models. The paper describes an algorithm that produces a single configurable process model from an arbitrary collection of process models. The algorithm works by extracting the common parts of the input process models, creating a single copy of them, and appending the differences as branches of configurable connectors. This way, the merged process model is kept as small as possible, while still capturing all the behavior of the input models. Moreover, analysts are able to trace back which original model(s) a given element in the merged model comes from. The algorithm has been prototyped and tested against process models taken from several application domains.



Date:

06th of October 2010


The Structure of Information Pathways in a Social Communication Network

based on The Structure of Information Pathways in a Social Communication Network by Gueorgi Kossinets, Jon Kleinberg, Duncan Watts

Tara Hennessy

Abstract

Social networks are of interest to researchers in part because they are thought to mediate the flow of information in communities and organizations. Here we study the temporal dynamics of communication using on-line data, including e-mail communication among the faculty and staff of a large university over a two-year period. We formulate a temporal notion of "distance" in the underlying social network by measuring the minimum time required for information to spread from one node to another - a concept that draws on the notion of vector-clocks from the study of distributed computing systems. We find that such temporal measures provide structural insights that are not apparent from analyses of the pure social network topology. In particular, we define the network backbone to be the subgraph consisting of edges on which information has the potential to flow the quickest. We find that the backbone is a sparse graph with a concentration of both highly embedded edges and long-range bridges - a finding that sheds new light on the relationship between tie strength and connectivity in social networks.
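
The temporal notion of "distance" can be illustrated with a simplified sketch: given a time-stamped message log, compute the earliest time at which information starting at one person could have reached each other person. This is an earliest-arrival computation, not the authors' vector-clock procedure; the message log, the names and the assumption of instant delivery are hypothetical.

```python
def earliest_arrival(messages, source, start=0):
    """Earliest time information from `source` can reach each node.

    messages: iterable of (time, sender, receiver) tuples.
    Assumes instant delivery; messages with equal timestamps are
    processed in sort order, which is a simplification.
    """
    reached = {source: start}
    for t, sender, receiver in sorted(messages):
        # A message only carries the information if the sender already had it.
        if sender in reached and reached[sender] <= t and receiver not in reached:
            reached[receiver] = t
    return reached

# Hypothetical e-mail log: (time, sender, receiver)
log = [
    (1, "ann", "bob"),
    (1, "cat", "eve"),  # sent before cat could have heard from ann
    (2, "bob", "cat"),
    (3, "cat", "dan"),
    (5, "ann", "dan"),
]
times = earliest_arrival(log, "ann")
print(times)
```

In the static topology there is a path from ann to eve (via bob and cat), yet eve is never reached, because the only message from cat to eve was sent before cat could have received anything from ann. The indirect route also reaches dan earlier than ann's direct message does, which is the kind of structure a purely topological analysis does not show.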


Date:

22nd of September 2010


Original Paper:

portal.acm.org/f...


Agile Software Development: Adaptive Systems Principles and Best Practices

based on Agile Software Development: Adaptive Systems Principles and Best Practices by Peter Meso, Radhika Jain

Anna Dabrowska

Abstract

Today's environments of increasing business change require software development methodologies that are more adaptable. This article examines how complex adaptive systems (CAS) theory can be used to increase our understanding of how agile software development practices can be used to develop this capability. A mapping of agile practices to CAS principles and three dimensions (product, process, and people) results in several recommendations for “best practices” in systems development.


Date:

08th of September 2010


Original Paper:

teaching.fec.anu...


Supporting Collaborative Software Development through the Visualization of Socio-Technical Dependencies

based on Supporting Collaborative Software Development through the Visualization of Socio-Technical Dependencies by Cleidson R. B. de Souza, Stephen Quirk, Erik Trainer, David F. Redmiles

Aftab Iqbal

Abstract

One of the reasons large-scale software development is difficult is the number of dependencies that software engineers face. These dependencies create a need for communication and coordination that requires continuous effort by developers. Empirical studies, including our own, suggest that technical dependencies among software components create social dependencies among the software developers implementing those components. Based on this observation, we developed Ariadne, a plug-in for Eclipse. Ariadne analyzes software projects for dependencies and collects authorship information about projects relying on configuration management repositories. Ariadne can "translate" technical dependencies among components into social dependencies among developers. We have created visualizations to convey dependency information and the presence of coordination problems identified in our previous work. We believe the information conveyed in the visualizations will prove useful for software developers.



Date:

25th of August 2010


Original Paper:

portal.acm.org/c...


Tractable Query Answering over Ontologies with Datalog±

based on Tractable Query Answering over Ontologies with Datalog± by Andrea Cali, Georg Gottlob, Thomas Lukasiewicz

Philipp Obermeier

Abstract

We present a family of expressive extensions of Datalog, called Datalog±, as a new paradigm for query answering over ontologies. The Datalog± family admits existentially quantified variables in rule heads, and has suitable restrictions to ensure highly efficient ontology querying. In particular, we show that query answering under so-called guarded Datalog± is PTIME-complete in data complexity, and that query answering under so-called linear Datalog± is in AC0 in data complexity. We also show how negative constraints and a general class of key constraints can be added to Datalog± while keeping ontology querying tractable. We then show that linear Datalog±, enriched with a special class of key constraints, generalizes the well-known DL-Lite family of tractable description logics. Furthermore, the Datalog± family is of interest in its own right and can, moreover, be used in various contexts such as data integration and data exchange. This work is a short version of [A. Cali, G. Gottlob, and T. Lukasiewicz. A general Datalog-based framework for tractable query answering over ontologies. In Proc. PODS-2009. ACM Press, 2009.]


Date:

11th of August 2010


Original Paper:

ceur-ws.org/Vol-...


Identifying Features of Influential Blogs

based on Identifying the Influential Bloggers in a Community by Nitin Agarwal, Huan Liu, Lei Tang, Philip S. Yu

Sean Byrne

Abstract

The number of blogs on the internet is astronomical. Within the Blogosphere there are several sub-communities of blogs, all largely reporting on the same or similar subject matter. But not all blogs are equal: several blogs simply echo other blogs, adding little to the discussion. Using several bloggers that publish on a particular blog site as subjects, the authors attempt to develop a model which quantifies how "influential" each blogger is, using a set of intuitive properties.
My talk will give an overview of the paper, focusing on the blog properties the authors used as the basis for their model, their interpretation of 'influence' and future work arising from the paper.


Date:

28th of July 2010


Original Paper:

citeseerx.ist.ps...


Model-based approximate querying in sensor networks

based on Model-based approximate querying in sensor networks by Amol Deshpande, Carlos Guestrin, Samuel R. Madden, Joseph M. Hellerstein, Wei Hong

Danh Le Phuoc

Abstract

Declarative queries are proving to be an attractive paradigm for interacting with networks of wireless sensors. The metaphor that "the sensornet is a database" is problematic, however, because sensors do not exhaustively represent the data in the real world. In order to map the raw sensor readings onto physical reality, a model of that reality is required to complement the readings. In this article, we enrich interactive sensor querying with statistical modeling techniques. We demonstrate that such models can help provide answers that are both more meaningful, and, by introducing approximations with probabilistic confidences, significantly more efficient to compute in both time and energy. Utilizing the combination of a model and live data acquisition raises the challenging optimization problem of selecting the best sensor readings to acquire, balancing the increase in the confidence of our answer against the communication and data acquisition costs in the network. We describe an exponential time algorithm for finding the optimal solution to this optimization problem, and a polynomial-time heuristic for identifying solutions that perform well in practice. We evaluate our approach on several real-world sensor-network datasets, taking into account the real measured data and communication quality, demonstrating that our model-based approach provides a high-fidelity representation of the real phenomena and leads to significant performance gains versus traditional data acquisition techniques.


Date:

14th of July 2010


Original Paper:

www.springerlink...


Challenges in Building Large-Scale Information Retrieval Systems

based on Challenges in Building Large-Scale Information Retrieval Systems by Jeffrey Dean

Renaud Delbru

Abstract

Building and operating large-scale information retrieval systems used by hundreds of millions of people around the world provides a number of interesting challenges. Designing such systems requires making complex design tradeoffs in a number of dimensions, including (a) the number of user queries that must be handled per second and the response latency to these requests, (b) the number and size of various corpora that are searched, (c) the latency and frequency with which documents are updated or added to the corpora, and (d) the quality and cost of the ranking algorithms that are used for retrieval.
In this talk I'll discuss the evolution of Google's hardware infrastructure and information retrieval systems and some of the design challenges that arise from ever-increasing demands in all of these dimensions. I'll also describe how we use various pieces of distributed systems infrastructure when building these retrieval systems.
Finally, I'll describe some future challenges and open research problems in this area.


Date:

30th of June 2010


Material (Slides):

research.google....


Hybrid System Using Symbolic and Numeric Knowledge for Semantic annotation of MRI Images

based on A Hybrid System Using Symbolic and Numeric Knowledge for the Semantic Annotation of Sulco-Gyral Anatomy in Brain MRI Images by Ammar Mechouche, Xavier Morandi, Christine Golbreich, Bernard Gibaud

Michael Sherry

Abstract

This paper describes an interactive system for the semantic annotation of brain MRI images. The system uses both a numerical atlas and symbolic knowledge of brain anatomical structures depicted using the Semantic Web Standards. An evaluation of the system was done using normal (healthy) and pathological cases. The results obtained so far demonstrate that the system produces annotations with high precision and quality.


Date:

23rd of June 2010


Original Paper:

ieeexplore.ieee....


Using BM25F for Semantic Search

based on Using BM25F for Semantic Search by José R. Pérez-Agüera, Javier Arroyo, Jane Greenberg, Joaquin Perez-Iglesias, Victor Fresno

Nur Aini Rakhmawati Gunawan

Abstract

Information Retrieval (IR) approaches for Semantic Web search engines have become very popular in recent years. The popularization of IR libraries such as Lucene, which allow IR implementations almost out of the box, has made IR integration in Semantic Web search engines easier. However, one of the most important features of Semantic Web documents is their structure, since this structure allows us to represent semantics in a machine-readable format. In this paper we analyze the specific problems of structured IR and how to adapt weighting schemes for semantic document retrieval.
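
As a rough illustration of what a field-aware weighting scheme looks like, the following sketch scores a document with a BM25F-style formula: per-field term frequencies are length-normalised, boosted by a field weight, summed into one pseudo-frequency, and only then saturated. This is a minimal sketch of the commonly cited formulation, not the exact variant evaluated in the paper; the field names, weights, `b` values and `idf` values are all made-up assumptions.

```python
def bm25f_score(query_terms, doc_fields, avg_field_len, weights, b, idf, k1=1.2):
    """Score one document for a query with a BM25F-style formula.

    doc_fields:    {field_name: list of tokens}
    avg_field_len: {field_name: average length of that field in the collection}
    weights:       {field_name: boost w_f}
    b:             {field_name: length-normalisation strength b_f in [0, 1]}
    idf:           {term: inverse document frequency}
    """
    score = 0.0
    for term in query_terms:
        # Combine the per-field term frequencies into one pseudo-frequency.
        pseudo_tf = 0.0
        for field, tokens in doc_fields.items():
            tf = tokens.count(term)
            if tf == 0:
                continue
            norm = 1.0 - b[field] + b[field] * len(tokens) / avg_field_len[field]
            pseudo_tf += weights[field] * tf / norm
        # Saturate once, after the fields have been combined.
        score += idf.get(term, 0.0) * pseudo_tf / (k1 + pseudo_tf)
    return score


# Hypothetical collection statistics and two toy documents.
weights = {"title": 3.0, "body": 1.0}
b = {"title": 0.5, "body": 0.75}
avg_len = {"title": 4.0, "body": 8.0}
idf = {"rdf": 1.5}

doc_a = {"title": ["annotated", "rdf"], "body": ["a", "paper", "about", "triples"]}
doc_b = {"title": ["query", "answering"], "body": ["a", "paper", "about", "rdf"]}

score_a = bm25f_score(["rdf"], doc_a, avg_len, weights, b, idf)
score_b = bm25f_score(["rdf"], doc_b, avg_len, weights, b, idf)
```

With these assumed weights, the document that mentions the term in its title scores higher than the one that mentions it only in its body, which is the kind of structural signal a flat bag-of-words score cannot express.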



Date:

12th of May 2010


Annotated RDF

based on Annotated RDF by O. Udrea, D. R. Recupero, V. S. Subrahmanian

Nuno Lopes

Abstract

There are numerous extensions of RDF that support temporal reasoning, reasoning about pedigree, reasoning about uncertainty, and so on. In this paper, we present Annotated RDF (or aRDF for short) in which RDF triples are annotated by members of a partially ordered set (with bottom element) that can be selected in any way desired by the user. We present a formal declarative semantics (model theory) for annotated RDF and develop algorithms to check consistency of aRDF theories and to answer queries to aRDF theories. We show that annotated RDF captures versions of all the forms of reasoning mentioned above within a single unified framework. We develop a prototype aRDF implementation and show that our algorithms work very fast indeed - in fact, in just a matter of seconds for theories with over 100,000 nodes.



Date:

05th of May 2010


Information Visualisation, GPUs and 3-D applications

based on Semantic Web and Information Visualization by Riccardo Albertoni, Alessio Bertone, Monica De Martino

Ronan Rochford

Abstract

The paper augments the results achieved in the research of Semantic Web technologies for semantic search application, underlining the importance of integrating Information Visualization into semantic web technologies. We present the results of a preliminary investigation of user needs in the activity of information search and an overview of the most used visualization tools in the web and tools for the ontology visualization. Our goal is to identify different types of information search problems and to demonstrate the advantages given by the adoption of information visualization.



Date:

21st of April 2010


Dynamics of Web documents and implication for incremental crawling and future change prediction

based on Estimating frequency of change; The Evolution of the Web and Implications for an Incremental Crawler by Junghoo Cho, Hector Garcia-Molina

Jürgen Umbrich

Abstract

We will summarise and discuss the work of Cho and Garcia-Molina on change frequencies of web documents and estimators for their average change rates [1,2]. We will see that many applications make use of knowledge about change frequencies, and we will discuss in detail how this knowledge can improve incremental crawling. In particular, we will present common and established stochastic models/processes for Web document changes, together with estimators that make it possible to model the average change rate of documents based on "complete" or "incomplete" change histories. These estimators are used to predict the likelihood of a change at a certain time in the future.
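
The idea of estimating a change rate from an "incomplete" history can be sketched as follows, assuming that changes follow a homogeneous Poisson process with rate λ and that the page is visited at a fixed interval I. Each visit only reveals whether the page changed at least once since the last visit, so simply counting detected changes underestimates λ. Since the probability of detecting a change in one interval is 1 - exp(-λI), solving for λ gives a log-based estimator. The function names and the 0.5 smoothing constant (which keeps the logarithm finite when every visit detects a change) are choices made for this sketch, not a reproduction of the papers' exact estimators.

```python
import math


def naive_change_rate(changes_detected, visits, interval):
    """Changes per time unit, counting at most one change per visit."""
    return changes_detected / (visits * interval)


def corrected_change_rate(changes_detected, visits, interval, smoothing=0.5):
    """Log-based estimate that accounts for changes missed between visits."""
    unchanged = visits - changes_detected
    ratio = (unchanged + smoothing) / (visits + smoothing)
    return -math.log(ratio) / interval


def change_probability(rate, horizon):
    """Probability of at least one change within `horizon` time units."""
    return 1.0 - math.exp(-rate * horizon)


# Hypothetical history: 10 daily visits, 5 of which detected a change.
naive = naive_change_rate(5, 10, 1.0)
corrected = corrected_change_rate(5, 10, 1.0)
p_change_tomorrow = change_probability(corrected, 1.0)
```

The corrected estimate is larger than the naive one because several changes may hide behind a single detected change; the last function turns an estimated rate into the likelihood of a change within a given time horizon.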


Date:

31st of March 2010


Development of a Controlled Natural Language Interface for Semantic MediaWiki

based on Development of a Controlled Natural Language Interface for Semantic MediaWiki by Smart, P., Bao, J., Braines, D., Shadbolt, N.

Brian Davis

Abstract

Semantic wikis support the collaborative creation, editing and utilization of semantically-enriched content, and they may therefore be well-suited to addressing problems associated with the limited availability of high-quality online semantic content. Unfortunately, however, many popular semantic wikis, such as Semantic MediaWiki (SMW), are not sufficiently expressive to support full-scale ontology authoring. Furthermore, the grounding of the Semantic Web in formal logic makes both the comprehension and production of ontological content difficult for many end-users. In order to address these issues, the expressivity of SMW was extended using a combination of semantic templates and a Web Ontology Language (OWL) meta-model. Semantic templates were also used to provide an ontology verbalization capability for SMW using the Rabbit Controlled Natural Language (CNL). The resulting system demonstrates how CNL interfaces can be implemented on top of SMW. The proposed solution introduces no changes to the underlying functionality of the SMW system, and the use of semantic templates as an ontology verbalization solution means that end-users can exploit all the usual features of conventional wiki systems to collaboratively create new CNL verbalization capabilities.



Date:

24th of March 2010


Original Paper:

eprints.ecs.soto...


Managing Information in Online Product Review Communities: Two Approaches

based on Managing Information in Online Product Review Communities: A Comparison of Two Approaches by Jahna Otterbacher

Maciej Dabrowski

Abstract

Virtual communities often suffer from a number of problems, including questionable information quality and information overload, which threaten their utility and stability. To address this, social filtering techniques may be used, in which users rate the postings, guiding others to the important ones. This method is contrasted to information retrieval techniques, in which intrinsic properties of texts, such as length or keywords, are used to rank them by perceived relevance to a topic. Each approach has advantages and disadvantages. Social navigation assumes that users actively rate messages, however, soliciting sufficient participation is a known challenge. Additionally, what is interesting for one user may not be for others. Currently, we compare these approaches in the context of an e-commerce product review forum at Amazon.com. We find that while a significant proportion of reviews go unrated, these reviews are typically of low quality. Interestingly, we also find that the rankings produced using reader-assigned “helpful votes” are correlated to the rankings assigned by some simple information retrieval algorithms. The conclusion is that a number of approaches for filtering product reviews could effectively be used in such online communities in order to accommodate user preferences, and thus, in reinforcing the utility of the community.


Date:

10th of March 2010


Original Paper:

is2.lse.ac.uk/as...


Sketching User Experiences - getting the design right and the right design

based on Sketching User Experiences - getting the design right and the right design by Bill Buxton

Krystian Samp

Abstract

There is almost a fervor in the way that new products, with their rich and dynamic interfaces, are being released to the public, typically promising to make lives easier, solve the most difficult of problems, and maybe even make the world a better place. The reality is that few of these products survive, much less deliver on their promise. The folly? An absence of design, and an overreliance on just technology and/or traditional practice.
We need design. But design as described here depends on the skills of a number of different communities, each essential but none sufficient on its own. In this rich ecology, designers are faced with new challenges: challenges that build on, rather than replace, existing skills and practice.
Sketching User Experiences approaches design and design thinking as something distinct that needs to be better understood by both designers and the people with whom they need to work in order to achieve success with these new types of products and systems. So while the focus is on design, the approach is holistic. Hence, the book speaks to designers, usability specialists, people from HCI, product managers and business executives. There is an emphasis on balancing the back-end concern with usability and engineering excellence (getting the design right) with an up-front investment in sketching and ideation (getting the right design). Overall, the objective is building the notion of informed design, molding emerging technology into a form that serves our society and reflects its values. Grounded in both practice and scientific research, Bill Buxton's engaging work aims to spark the imagination while encouraging the use of new techniques, breathing new life into user experience design.


Date:

24th of February 2010


Crowd-Coordination: 2 Studies of Wikipedia Talk Pages

based on 2 papers by several authors

Jodi Schneider

Abstract

Wikipedia, the 6th most visited website in the world according to Alexa, is a success story in massive collaboration. Despite tremendous growth and high traffic, Wikipedia is resilient to malicious editing. The fastest growing areas of Wikipedia are devoted to coordination and organization. Based on two qualitative and quantitative studies of Wikipedia, we describe the coordination work happening in Talk pages, and how volunteer editors use these pages to plan editing, enforce standards, and ensure the quality of information in this online encyclopedia.

Based on:

[1] F.B. Viegas, M. Wattenberg, J. Kriss, and F.V. Ham, “Talk Before You Type: Coordination in Wikipedia,” 40th Annual Hawaii International Conference on System Sciences, 2007. HICSS 2007. pp. 78-87.
http://doi.ieeecomputersociety.org/10.1109/HICSS.2007.511

[2] B. Stvilia, M.B. Twidale, L.C. Smith, and L. Gasser, “Information Quality Work Organization in Wikipedia,” Journal of the American Society for Information Science and Technology, vol. 59, 2008, pp. 983-1001.
http://dx.doi.org/10.1002/asi.20813



Date:

10th of February 2010


Populating the Semantic Web with Relations from Text

based on "Populating the Semantic Web with Relations from Text" and "Populating the Semantic Web—Combining Text and Relational Databases as RDF Graphs" by Kate Byrne, Claire Grover, Ewan Klein

Behrang Qasemizadeh

Abstract

Great hopes are cherished for the Semantic Web. It is intended to make linking data as easy as HTML makes linking Web documents. The basic information is in easily searchable databases, but huge amounts of content are locked in text that is difficult to query systematically. Extracting coherent facts from related but separate collections is another difficulty. The papers discuss how the Semantic Web can offer a solution to both of these problems, and suggest the RDF graph as a unified format for combining all relevant content. The focus will be on the process of converting text to RDF.


Date:

03rd of February 2010


Mining Web Data for Competency Management

based on Mining Web Data for Competency Management by J.Zhu, A.L.Gonçalves, V.S.Uren, E.Motta, R.Pacheco

Georgeta Bordea

Abstract

We present CORDER (COmmunity Relation Discovery by named Entity Recognition), an unsupervised machine learning algorithm that exploits named entity recognition and co-occurrence data to associate individuals in an organization with their expertise and associates. We discuss the problems associated with evaluating unsupervised learners and report our initial evaluation experiments.


Date:

27th of January 2010


Building the Santa Fe Artificial Stock Market

based on Building the Santa Fe Artificial Stock Market by LeBaron, B.

Daniel Paraschiv

Abstract

This short summary presents an insider's look at the construction of the Santa Fe artificial stock market. The perspective considers the many design questions that went into building the market from the perspective of a decade of experience with agent-based financial markets. The market is assessed based on its overall strengths and weaknesses.



Date:

20th of January 2010


Original Paper:

www.citeulike.or...


Pioneering Theories for Diffusion of Information in Networks

based on 2 papers by various authors

Partha Basuchowdhuri

Abstract

based on:
[1] Frank M. Bass. A New Product Growth for Model Consumer Durables. In Management Science, Vol. 15, No. 5, Theory Series (Jan., 1969), pp. 215-227.
[2] Mark S. Granovetter. The Strength of Weak Ties. In The American Journal of Sociology, Vol. 78, No. 6 (May, 1973), pp. 1360-1380.

Abstract:
The talk presents two key theories [1][2] that have made significant contributions to the understanding of how information or influence spreads within a network.

Abstract of [1]:
A growth model for the timing of initial purchase of new products is developed and tested empirically against data for eleven consumer durables. The basic assumption of the model is that the timing of a consumer's initial purchase is related to the number of previous buyers. A behavioral rationale for the model is offered in terms of innovative and imitative behavior. The model yields good predictions of the sales peak and the timing of the peak when applied to historical data. A long-range forecast is developed for the sales of color television sets.
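
The growth model of [1] is usually written with a coefficient of innovation p and a coefficient of imitation q: the adoption rate at time t is (p + q F(t)) (1 - F(t)), where F(t) is the fraction of eventual buyers who have already bought. The sketch below uses the standard closed-form solution and the resulting time of the sales peak. The parameter values are illustrative assumptions, not the values fitted in the paper.

```python
import math


def bass_cumulative(t, p, q):
    """Fraction F(t) of eventual adopters who have adopted by time t."""
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def bass_adoption_rate(t, p, q):
    """Adoption rate f(t) = (p + q F(t)) (1 - F(t))."""
    f_t = bass_cumulative(t, p, q)
    return (p + q * f_t) * (1.0 - f_t)


def bass_peak_time(p, q):
    """Time of peak adoption; positive only when imitation dominates (q > p)."""
    return math.log(q / p) / (p + q)


# Illustrative coefficients (assumed for this sketch).
p, q = 0.03, 0.38
t_peak = bass_peak_time(p, q)
share_at_peak = bass_cumulative(t_peak, p, q)
```

At the peak, the adopted share equals 1/2 - p/(2q), so the stronger imitation is relative to innovation, the closer the sales peak lies to the point where half of the eventual buyers have bought.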

Abstract of [2]:
Analysis of social networks is suggested as a tool for linking micro and macro levels of sociological theory. The procedure is illustrated by elaboration of the macro implications of one aspect of small-scale interaction: the strength of dyadic ties. It is argued that the degree of overlap of two individuals' friendship networks varies directly with the strength of their tie to one another. The impact of this principle on diffusion of influence and information, mobility opportunity, and community organization is explored. Stress is laid on the cohesive power of weak ties. Most network models deal, implicitly, with strong ties, thus confining their applicability to small, well-defined groups. Emphasis on weak ties lends itself to discussion of relations between groups and to analysis of segments of social structure not easily defined in terms of primary groups.

Papers available at:
[1] http://math.asu.edu/~dieter/courses/mat451/fall08/References/bass_1969.pdf
[2] http://www.sna.pl/teksty/Granovetter73.pdf


Date:

06th of January 2010


Mapping the World's Photos

based on Mapping the World's Photos by David Crandall, Lars Backstrom, Daniel Huttenlocher, Jon Kleinberg

Gabriela Vulcu

Abstract

We investigate how to organize a large collection of geotagged photos, working with a dataset of about 35 million images collected from Flickr. Our approach combines content analysis based on text tags and image data with structural analysis based on geospatial data. We use the spatial distribution of where people take photos to define a relational structure between the photos that are taken at popular places. We then study the interplay between this structure and the content, using classification methods for predicting such locations from visual, textual and temporal features of the photos. We find that visual and temporal features improve the ability to estimate the location of a photo, compared to using just textual features. We illustrate using these techniques to organize a large photo collection, while also revealing various interesting properties about popular cities and landmarks at a global scale.



Date:

16th of December 2009


Original Paper:

www.cs.cornell.e...


Social phishing

based on Social phishing by Tom N. Jagatic, Nathaniel A. Johnson, Markus Jakobsson, Filippo Menczer

Slawomir Grzonkowski

Abstract

Phishing is a form of deception in which an attacker attempts to fraudulently acquire sensitive information from a victim by impersonating a trustworthy entity. Phishing attacks typically employ generic “lures.” For instance, a phisher misrepresenting himself as a large banking corporation or popular online auction site will have a reasonable yield, despite knowing little to nothing about the recipient. In a study by Gartner Group [9], about 19% of all those surveyed reported having clicked on a link in a phishing email message, and 3% admitted to giving up financial or personal information. The research project described here was designed to provide us with a baseline success rate for individual phishing attacks, and was, when it was performed in 2005, the first study to achieve this goal.


Date:

09th of December 2009


Original Paper:

doi.acm.org/10.1...


Thinking inside the box: System-level failures of Tamper Proofing

based on Thinking inside the box: System-level failures of Tamper Proofing by Saar Drimer, Steven J. Murdoch, Ross Anderson

Oana-Elena Ureche

Abstract

PIN entry devices (PEDs) are critical security components in EMV smartcard payment systems as they receive a customer's card and PIN. Their approval is subject to an extensive suite of evaluation and certification procedures. In this paper, we demonstrate that the tamper proofing of PEDs is unsatisfactory, as is the certification process. We have implemented practical low-cost attacks on two certified, widely-deployed PEDs -- the Ingenico i3300 and the Dione Xtreme. By tapping inadequately protected smartcard communications, an attacker with basic technical skills can expose card details and PINs, leaving cardholders open to fraud. We analyze the anti-tampering mechanisms of the two PEDs and show that, while the specific protection measures mostly work as intended, critical vulnerabilities arise because of the poor integration of cryptographic, physical and procedural protection. As these vulnerabilities illustrate a systematic failure in the design process, we propose a methodology for doing it better in the future. These failures also demonstrate a serious problem with the Common Criteria. So we discuss the incentive structures of the certification process, and show how they can lead to problems of the kind we identified. Finally, we recommend changes to the Common Criteria framework in light of the lessons learned.


Date:

02nd of December 2009


Original Paper:

ieeexplore.ieee....


The role of architecture in solving problems

based on The end of an architectural era: (it's time for a complete rewrite) by Michael Stonebraker, Samuel Madden, Daniel J. Abadi, Stavros Harizopoulos, Nabil Hachem, Pat Helland

Benjamin Heitmann

Abstract

In previous papers [SC05, SBC+07], some of us predicted the end of "one size fits all" as a commercial relational DBMS paradigm. These papers presented reasons and experimental evidence that showed that the major RDBMS vendors can be outperformed by 1--2 orders of magnitude by specialized engines in the data warehouse, stream processing, text, and scientific database markets.
Assuming that specialized engines dominate these markets over time, the current relational DBMS code lines will be left with the business data processing (OLTP) market and hybrid markets where more than one kind of capability is required. In this paper we show that current RDBMSs can be beaten by nearly two orders of magnitude in the OLTP market as well. The experimental evidence comes from comparing a new OLTP prototype, H-Store, which we have built at M.I.T. to a popular RDBMS on the standard transactional benchmark, TPC-C.

We conclude that the current RDBMS code lines, while attempting to be a "one size fits all" solution, in fact, excel at nothing. Hence, they are 25 year old legacy code lines that should be retired in favor of a collection of "from scratch" specialized engines. The DBMS vendors (and the research community) should start with a clean sheet of paper and design systems for tomorrow's requirements, not continue to push code lines and architectures designed for yesterday's needs.


Date:

01st of December 2009


Original Paper:

portal.acm.org/c...


Designing experiments and performing statistical analysis using ANOVA - Introduction

based on Comparing the effects of text size and format on the readability of computer-displayed Times New Roman and Arial text by Bernard, M.L., Chaparro, B.S., Mills, M.M., Halcomb, C.G.

Jacek Jankowski

Abstract

Jacek will give a short and simple tutorial on how to design an experiment and how to analyze evaluation results with analysis of variance (ANOVA). He will also show how to evaluate pairwise comparisons using the Bonferroni procedure. The presentation will be based on the evaluation described in the related paper.
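
For orientation, the two steps mentioned above can be sketched in a few lines. The example assumes a simple between-subjects, one-way design (the related paper's actual design may differ) and uses invented measurements. It computes the F statistic as the ratio of between-group to within-group mean squares, and the Bonferroni-adjusted significance level for all pairwise comparisons. Turning F into a p-value additionally requires the F distribution, which is left out here.

```python
from itertools import combinations


def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni_alpha(num_groups, alpha=0.05):
    """Per-comparison significance level when testing all pairs of groups."""
    num_pairs = len(list(combinations(range(num_groups), 2)))
    return alpha / num_pairs


# Invented scores for three conditions (e.g. three text formats).
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(groups)
alpha_per_pair = bonferroni_alpha(len(groups))
```

With three conditions there are three pairwise comparisons, so each one is tested at 0.05 / 3 to keep the overall chance of a false positive at or below 5%.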



Date:

25th of November 2009


Original Paper:

portal.acm.org/c...


Evaluation in Human Computer Interaction

based on 2 papers by various authors

VinhTuan Thai

Abstract

The talk will be based on two papers from the CHI conference series:

Greenberg, S. and Buxton, B. Usability evaluation considered harmful (some of the time). In Proceedings of CHI '08.

Abstract:
Current practice in Human Computer Interaction as encouraged by educational institutes, academic review processes, and institutions with usability groups advocates usability evaluation as a critical part of every design process. This is for good reason: usability evaluation has a significant role to play when conditions warrant it. Yet evaluation can be ineffective and even harmful if naively done 'by rule' rather than 'by thought'. If done during early stage design, it can mute creative ideas that do not conform to current interface norms. If done to test radical innovations, the many interface issues that would likely arise from an immature technology can quash what could have been an inspired vision. If done to validate an academic prototype, it may incorrectly suggest a design's scientific worthiness rather than offer a meaningful critique of how it would be adopted and used in everyday practice. If done without regard to how cultures adopt technology over time, then today's reluctant reactions by users will forestall tomorrow's eager acceptance. The choice of evaluation methodology - if any - must arise from and be appropriate for the actual problem or research question under consideration.

Barkhuus, L. and Rode, J.A., From Mice to Men - 24 years of Evaluation in CHI. In Extended Proceedings of CHI '07

Abstract:
This paper analyzes trends in the approach to evaluation taken by CHI papers in the last 24 years. A set of papers was analyzed according to our schema for classifying type of evaluation. Our analysis traces papers’ trend in type and scope of evaluation. Findings include an increase in the proportion of papers that include evaluation, and a decrease in the median number of subjects in quantitative studies. We also critique the types of subjects, in particular an over reliance on students, and lack of appropriately gender balanced samples. We contextualize these findings in historical trends as we move from machines intended for the technical elite in laboratories to computers integrated into the daily life of everyone.



Date:

11th of November 2009


To Buy or Not to Buy: Mining Airfare Data to Minimize Ticket Purchase Price

based on To Buy or Not to Buy: Mining Airfare Data to Minimize Ticket Purchase Price by Oren Etzioni, Rattapoom Tuchinda, Craig A. Knoblock, Alexander Yates

Maciej Zaremba

Abstract

As product prices become increasingly available on the World Wide Web, consumers attempt to understand how corporations vary these prices over time. However, corporations change prices based on proprietary algorithms and hidden variables (e.g., the number of unsold seats on a flight). Is it possible to develop data mining techniques that will enable consumers to predict price changes under these conditions? This paper reports on a pilot study in the domain of airline ticket prices where we recorded over 12,000 price observations over a 41 day period. When trained on this data, Hamlet - our multi-strategy data mining algorithm - generated a predictive model that saved 341 simulated passengers $198,074 by advising them when to buy and when to postpone ticket purchases. Remarkably, a clairvoyant algorithm with complete knowledge of future prices could save at most $320,572 in our simulation, thus Hamlet's savings were 61.8% of optimal. The algorithm's savings of $198,074 represents an average savings of 23.8% for the 341 passengers for whom savings are possible. Overall, Hamlet saved 4.4% of the ticket price averaged over the entire set of 4,488 simulated passengers. Our pilot study suggests that mining of price data available over the web has the potential to save consumers substantial sums of money per annum.
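
The "61.8% of optimal" figure follows directly from the two dollar amounts quoted in the abstract. A quick check, using only numbers stated above (the variable names are chosen for this sketch):

```python
hamlet_savings = 198_074         # dollars saved by Hamlet in the simulation
optimal_savings = 320_572        # upper bound from the clairvoyant algorithm
passengers_with_savings = 341    # passengers for whom savings were possible

# Hamlet's savings as a percentage of the best achievable savings.
share_of_optimal = 100 * hamlet_savings / optimal_savings

# Average dollar amount saved per passenger in that group.
savings_per_passenger = hamlet_savings / passengers_with_savings
```

The first value rounds to 61.8, matching the abstract; the second works out to roughly $581 per passenger among the 341 passengers for whom savings were possible.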

This paper is an excellent example of how interesting research can be turned into a practical and valuable business. The research paper was published in 2003 [1]; in the same year, Oren Etzioni established an airline fare prediction company called Hamlet, later rebranded as Farecast. Over the years, the price prediction methods offered by Farecast became more mature, and a number of different US routes and airlines were supported. Farecast was acquired by Microsoft for $115 million in 2008, and airline ticket price prediction is now part of Bing Travel [2]. The price prediction functionality is currently in beta, and support for many more international routes is planned. The online price prediction proposed by Etzioni et al. is protected by an international patent filed in 2006 [3].

[1] http://portal.acm.org/citation.cfm?doid=956750.956767
[2] http://www.bing.com/travel/about/howAirPredictions.do
[3] http://www.freepatentsonline.com/7010494.pdf


Date:

04th of November 2009


Original Paper:

portal.acm.org/c...


Data Integration in Mashups

based on Data Integration in Mashups by Giusy Di Lorenzo, Hakim Hacid, Hye-young Paik, Boualem Benatallah

Wassim Derguech

Abstract

Mashup is a new application development approach that allows users to aggregate multiple services to create a service for a new purpose. Even if the Mashup approach opens new and broader opportunities for data/service consumers, the development process still requires the users to know not only how to write code using programming languages, but also how to use the different Web APIs from different services. In order to solve this problem, there is increasing effort put into developing tools which are designed to support users with little programming knowledge in Mashup applications development. The objective of this study is to analyze the strengths and weaknesses of the Mashup tools with respect to the data integration aspect.


Date:

21st of October 2009


Original Paper:

portal.acm.org/c...


Evaluation of Software Systems

based on Evaluation of Software Systems by Günther Gediga, Kai-Christoph Hamborg

Simon Scerri

Abstract

Evaluation as an aid for software development has been applied since the last decade, when the comprehension of the role of evaluation within Human-Computer Interaction had changed. Software can be evaluated with respect to different aspects, for example, functionality, reliability, usability, efficiency, maintainability and portability.
In this survey, we concentrate on the aspect of usability from an ergonomic point of view. This aspect has gained particular importance during the last decade with the increasing use of interactive software. Whereas in earlier times evaluation of software took place at the end of the developing phase, using experimental designs and statistical analysis, evaluation is nowadays used as a tool for information gathering within iterative design.
Within this context, instruments for evaluation are not primarily used for global evaluation of an accomplished product, but these instruments are applied during the development of a product. Indeed, most experts agree nowadays that the development of usable software can only be done by a systematic consideration of usability aspects within the life-cycle model. One prominent part is the evaluation of prototypes with respect to usability aspects, employing suitable evaluation techniques in order to find usability errors and weaknesses of the software at an early stage.


Date:

07th of October 2009


Material (Slides):

fileadmin/docume...


Representing, Querying and Transforming Social Networks with RDF/SPARQL

based on Representing, Querying and Transforming Social Networks with RDF/SPARQL by Mauro San Martín, Claudio Gutierrez

Sheila Kinsella

Abstract

As social networks are becoming ubiquitous on the Web, the Semantic Web goals indicate that it is critical to have a standard model allowing exchange, interoperability, transformation, and querying of social network data. In this paper we show that RDF/SPARQL meet these desiderata. Building on developments of social network analysis, graph databases and Semantic Web, we present a social networks data model based on RDF, and a query and transformation language based on SPARQL meeting the above requirements. We study its expressive power and complexity showing that it behaves well, and present an illustrative prototype.


Date:

30th of September 2009


Original Paper:

www.dcc.uchile.c...


Opinion Mining and Sentiment Analysis - Two relevant directions

based on Various papers by various authors

Tudor Groza

Abstract

Argumentative discussions can be found in a variety of domains from traditional scientific publishing to today's modern social software. An interactive argumentative discussion usually consists of an initial proposition stated by a single creator, followed by supporting propositions or counter-propositions from other contributors. These propositions intrinsically externalize the contributors' sentiments and opinions, with the current Web 2.0 technologies enabling more and more their free expression. Consequently, this transforms the Web into an extremely valuable source for mining people's positions and sentiments. Lately, the topic of opinion mining and sentiment analysis gained a lot of interest and several research directions have been followed. In this presentation I will focus, in particular, on two such directions: document level and sentence level opinion mining and sentiment analysis.


Based on the following collection of papers:

(Liu 2007): Bing Liu, Web Data Mining - Exploring Hyperlinks, Contents and Usage Data, Springer, December 2006

(Turney, 2002): Peter D. Turney - Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews, ACL 2002

(Pang et al., 2002): Bo Pang, Lillian Lee and Shivakumar Vaithyanathan - Thumbs up? Sentiment Classification using Machine Learning Techniques, EMNLP 2002

(Dave et al., 2003): Kushal Dave, Steve Lawrence, David M. Pennock - Mining the peanut gallery: opinion extraction and semantic classification of product reviews, WWW 2003

(Mullen and Collier, 2004): Tony Mullen and Nigel Collier - Sentiment Analysis using Support Vector Machines with Diverse Information Sources, EMNLP 2004

(Pang and Lee, 2005): Bo Pang and Lillian Lee - Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales, ACL 2005

(Chaovalit and Zhou, 2005): Pimwadee Chaovalit, Lina Zhou - Movie Review Mining: a Comparison between Supervised and Unsupervised Classification Approaches, HICSS 2005

(Wiebe et al., 1999): Janyce Wiebe, Rebecca F. Bruce, Thomas P. O'Hara - Development and Use of a Gold-Standard Data Set for Subjectivity Classifications

(Yu and Hatzivassiloglou, 2003): Hong Yu and Vasileios Hatzivassiloglou - Towards Answering Opinion Questions: Separating Facts from Opinions and Identifying the Polarity of Opinion Sentences, EMNLP 2003

(Riloff and Wiebe, 2003): Ellen Riloff and Janyce Wiebe - Learning Extraction Patterns for Subjective Expressions, EMNLP 2003

(Hatzivassiloglou and Wiebe, 2000): Vasileios Hatzivassiloglou and Janyce Wiebe - Effects of Adjective Orientation and Gradability on Sentence Subjectivity, Coling 2000

(Wiebe and Riloff, 2005): Janyce Wiebe and Ellen Riloff - Creating Subjective and Objective Sentence Classifiers from Unannotated Texts, CICLing 2005

(Kim and Hovy, 2004): Soo-Min Kim and Eduard Hovy - Determining the sentiment of opinions, COLING 2004



Date:

23rd of September 2009


Tesseract: Interactive Visual Exploration of Socio-Technical Relationships in Software Development

based on Tesseract: Interactive Visual Exploration of Socio-Technical Relationships in Software Development by Anita Sarma, Larry Maccherone, Patrick Wagstrom, Jim Herbsleb

Aftab Iqbal

Abstract

Software developers have long known that project success requires a robust understanding of both technical and social linkages. However, research has largely considered these independently. Research on networks of technical artifacts focuses on techniques like code analysis or mining project archives. Social network analysis has been used to capture information about relations among people. Yet, each type of information is often far more useful when combined, as when the “goodness” of social networks is judged by the patterns of dependencies in the technical artifacts. To bring such information together, we have developed Tesseract, a socio-technical dependency browser that utilizes cross-linked displays to enable exploration of the myriad relationships between artifacts, developers, bugs, and communications. We evaluated Tesseract by (1) demonstrating its feasibility with GNOME project data, (2) assessing its usability via informal user evaluations, and (3) verifying its suitability for the open source community via semi-structured interviews.



Date:

16th of September 2009


Original Paper:

portal.acm.org/c...


Optimizing Peer Relationships in a Super-Peer Network

based on Optimizing Peer Relationships in a Super-Peer Network by Paweł Garbacki, Dick H. J. Epema, Maarten van Steen

Sanaullah Nazir

Abstract

Super-peer architectures exploit the heterogeneity of nodes in a P2P network by assigning additional responsibilities to higher-capacity nodes. In the design of a super-peer network for file sharing, several issues have to be addressed: how client peers are related to super-peers, how super-peers locate files, how the load is balanced among the super-peers, and how the system deals with node failures. In this paper we introduce a self-organizing super-peer network architecture (SOSPNET) that solves these issues in a fully decentralized manner. SOSPNET maintains a super-peer network topology that reflects the semantic similarity of peers sharing content interests. Super-peers maintain semantic caches of pointers to files which are requested by peers with similar interests. Client peers, on the other hand, dynamically select super-peers offering the best search performance. We show how this simple approach can be employed not only to optimize searching, but also to solve generally difficult problems encountered in P2P architectures such as load balancing and fault tolerance. We evaluate SOSPNET using a model of the semantic structure derived from the 8-month traces of two large file-sharing communities. The obtained results indicate that SOSPNET achieves close-to-optimal file search performance, quickly adjusts to changes in the environment (node joins and leaves), survives even catastrophic node failures, and efficiently distributes the system load taking into account peer capacities.



Date:

09th of September 2009


Case-based reasoning - short introduction

based on Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches by A. Aamodt, E. Plaza

Adam Gzella

Abstract

Case-based reasoning is a recent approach to problem solving and learning that has got a lot of attention over the last few years. Originating in the US, the basic idea and underlying theories have spread to other continents, and we are now within a period of highly active research in case-based reasoning in Europe, as well. This paper gives an overview of the foundational issues related to case-based reasoning, describes some of the leading methodological approaches within the field, and exemplifies the current state through pointers to some systems. Initially, a general framework is defined, to which the subsequent descriptions and discussions will refer. The framework is influenced by recent methodologies for knowledge level descriptions of intelligent systems. The methods for case retrieval, reuse, solution testing, and learning are summarized, and their actual realization is discussed in the light of a few example systems that represent different CBR approaches. We also discuss the role of case-based methods as one type of reasoning and learning method within an integrated system architecture.


Date:

02nd of September 2009


Original Paper:

www2.iiia.csic.e...


An overview of RDB2RDF techniques and tools

based on several papers by various authors

Nuno Lopes

Abstract

Many science archive centres publish very large volumes of image, simulation, and experiment data. In order to integrate and analyse the available data, scientists need to be able to (i) identify and locate all the data relevant to their work; (ii) understand the multiple heterogeneous data models in which the data is published; and (iii) interpret and process the data they retrieve. RDF has been shown to be a generally successful framework within which to perform such data integration work. It can be equally successful in the context of scientific data, if it is demonstrably practical to expose that data as RDF. In this paper we investigate the capabilities of RDF to enable the integration of scientific data sources. Specifically, we discuss the suitability of SPARQL for expressing scientific queries, and the performance of several triple stores and RDB2RDF tools for executing queries over a moderately sized sample of a large astronomical data set. We found that more research and improvements are required into SPARQL and RDB2RDF tools to efficiently expose existing science archives for data integration.

Based on:

A. J. G. Gray, N. Gray, and I. Ounis. Can RDB2RDF Tools Feasibly Expose Large Science Archives for Data Integration? In L. Aroyo, P. Traverso, F. Ciravegna, P. Cimiano, T. Heath, E. Hyvönen, R. Mizoguchi, E. Oren, M. Sabou, and E. P. B. Simperl, editors, ESWC, volume 5554 of Lecture Notes in Computer Science, pages 491–505. Springer, 2009.

A. Malhotra. W3C RDB2RDF Incubator Group Report. http://www.w3.org/2005/Incubator/rdb2rdf/XGR-rdb2rdf/, January 2009.

S. S. Sahoo, W. Halb, S. Hellmann, K. Idehen, T. T. Jr, S. Auer, J. Sequeda, and A. Ezzat. A Survey of Current Approaches for Mapping of Relational Databases to RDF. W3C RDB2RDF XG Report, W3C, 2009.



Date:

26th of August 2009


Original Paper:

www.w3.org/2005/...


Bokode: Imperceptible Visual tags for Camera Based Interaction from a Distance

based on Bokode: Imperceptible Visual tags for Camera Based Interaction from a Distance by Ankit Mohan, Grace Woo, Shinsaku Hiura, Quinn Smithwick, Ramesh Raskar

Gearoid Hynes

Abstract

We show a new camera-based interaction solution where an ordinary camera can detect small optical tags from a relatively large distance. Current optical tags, such as barcodes, must be read within a short range and the codes occupy valuable physical space on products. We present a new low-cost optical design so that the tags can be shrunk to 3mm visible diameter, and unmodified ordinary cameras several meters away can be set up to decode the identity plus the relative distance and angle. The design exploits the bokeh effect of ordinary camera lenses, which maps rays exiting from an out-of-focus scene point into a disk-like blur on the camera sensor. This bokeh-code or Bokode is a barcode design with a simple lenslet over the pattern. We show that a code with 15μm features can be read using an off-the-shelf camera from distances of up to 2 meters. We use intelligent binary coding to estimate the relative distance and angle to the camera, and show potential for applications in augmented reality and motion capture. We analyze the constraints and performance of the optical system, and discuss several plausible application scenarios.


Date:

19th of August 2009


Original Paper:

delivery.acm.org...


RDF-3X: a RISC-style engine for RDF

based on RDF-3X: a RISC-style engine for RDF by Thomas Neumann, Gerhard Weikum

Aidan Hogan

Abstract

RDF is a data representation format for schema-free structured information that is gaining momentum in the context of Semantic-Web corpora, life sciences, and also Web 2.0 platforms. The "pay-as-you-go" nature of RDF and the flexible pattern-matching capabilities of its query language SPARQL entail efficiency and scalability challenges for complex queries including long join paths. This paper presents the RDF-3X engine, an implementation of SPARQL that achieves excellent performance by pursuing a RISC-style architecture with a streamlined architecture and carefully designed, puristic data structures and operations. The salient points of RDF-3X are: 1) a generic solution for storing and indexing RDF triples that completely eliminates the need for physical-design tuning, 2) a powerful yet simple query processor that leverages fast merge joins to the largest possible extent, and 3) a query optimizer for choosing optimal join orders using a cost model based on statistical synopses for entire join paths. The performance of RDF-3X, in comparison to the previously best state-of-the-art systems, has been measured on several large-scale datasets with more than 50 million RDF triples and benchmark queries that include pattern matching and long join paths in the underlying data graphs.



Date:

12th of August 2009


Material (Slides):

sw.deri.org/~aid...


Original Paper:

www.vldb.org/pvl...


Inferences under time pressure: how opportunity costs affect strategy selection

based on Inferences under time pressure: how opportunity costs affect strategy selection by Rieskamp, J., Hoffrage, U.

Anna Dabrowska

Abstract

Do the inference strategies people select depend on the magnitude of time pressure? Is this dependency modified by the type of time pressure? These questions are addressed in three experimental studies in which participants made inferences after having searched for information on a computerized information board. In Study 1, time pressure was induced indirectly by imposing opportunity costs of being slow, a form of time pressure that is common in daily life but that has rarely been examined in the literature. A simple lexicographic heuristic (LEX) achieved the best fit in predicting participants' inferences. Studies 2 and 3 induced high time pressure either indirectly by imposing opportunity costs in terms of time or directly by limiting the time for each choice. Regardless of how time pressure was induced, under high time pressure the inferences could be best predicted with LEX, whereas under low time pressure a weighted linear model that integrates all available information predicted the inferences best. We conclude that people select strategies adaptively depending on characteristics of the situation.
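
The contrast between the two strategies named in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the cue values, the weights, and the function names lex_choice and weighted_linear_choice are invented here and are not taken from the study's materials.

```python
def lex_choice(a, b, cue_order):
    """Lexicographic heuristic (LEX): look at cues from most to least
    important and decide on the first cue that discriminates."""
    for cue in cue_order:
        if a[cue] != b[cue]:
            return "A" if a[cue] > b[cue] else "B"
    return "tie"


def weighted_linear_choice(a, b, weights):
    """Weighted linear model: integrate all cues into one weighted sum
    per option and pick the option with the higher score."""
    score_a = sum(w * x for w, x in zip(weights, a))
    score_b = sum(w * x for w, x in zip(weights, b))
    if score_a == score_b:
        return "tie"
    return "A" if score_a > score_b else "B"


# Hypothetical binary cue profiles (1 = cue favours the option).
option_a = (1, 0, 0, 0)
option_b = (0, 1, 1, 1)
# Hypothetical cue weights, highest first.
weights = (0.9, 0.8, 0.7, 0.6)
# LEX inspects cues in descending order of weight.
cue_order = sorted(range(len(weights)), key=lambda i: -weights[i])

lex_result = lex_choice(option_a, option_b, cue_order)
linear_result = weighted_linear_choice(option_a, option_b, weights)
```

With these made-up values the two strategies disagree: LEX stops at the first cue and picks option A, while the weighted linear model sums all four cues and picks option B. Inspecting fewer cues is what makes LEX cheaper when time is costly.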


Date:

05th of August 2009


Original Paper:

www.sciencedirec...


Human computation and the ESP Game story

Ilko Grigorov

Abstract

Based on three papers.

“We introduce a new interactive system: a game that is fun and can be used to create valuable output. When people play the game they help determine the contents of images by providing meaningful labels for them.” [1]

“In this paper, we propose a competitive human computation game, KissKissBan (KKB), for image annotation. KKB is different from other human computation games since it integrates both collaborative and competitive elements in the game design.” [2]

“The ESP Game was designed to harvest human intelligence to assign labels to images - a task which is still difficult for even the most advanced systems in image processing. However, the ESP Game as it is currently implemented encourages players to assign “obvious” labels, which are most likely to lead to an agreement with the partner.” “We present a language model which, given enough instances of labeled images as training data, can assign probabilities to the next label to be added. This model is then used in a program, which plays the ESP game without looking at the image. Even without any understanding of the actual image, the program manages to agree with the randomly assigned human partner on a label for 69% of all images, and for 81% of images which have at least one “off-limits” term assigned to them.” [3]


[1]. Luis von Ahn and Laura Dabbish (Carnegie Mellon University, USA), Labeling images with a computer game. In Conference on Human Factors in Computing Systems (CHI’04), pages 319–326, 2004.

[2]. Chien-Ju Ho, Tao-Hsuan Chang, Jong-Chuan Lee, Jane Yung-Jen Hsu, Kuan-Ta Chen, KissKissBan: A Competitive Human Computation Game for Image Annotation, In Human Computation Workshop KDD2009

[3]. Ingmar Weber (EPFL - Lausanne, Switzerland), Stephen Robertson and Milan Vojnovic (Microsoft Research, Cambridge UK), Rethinking the ESP Game, Technical Report MSR-TR-2008-132



Date:

29th of July 2009


Semantic Grounding of Tag Relatedness in Social Bookmarking Systems

based on Semantic Grounding of Tag Relatedness in Social Bookmarking Systems by Ciro Cattuto, Dominik Benz, Andreas Hotho, Gerd Stumme

Smitashree Choudhury

Abstract

Collaborative tagging systems have nowadays become important data sources for populating semantic web applications. For tasks like synonym detection and discovery of concept hierarchies, many researchers introduced measures of tag similarity. Even though most of these measures appear very natural, their design often seems to be rather ad hoc, and the underlying assumptions on the notion of similarity are not made explicit. A more systematic characterization and validation of tag similarity in terms of formal representations of knowledge is still lacking. Here we address this issue and analyze several measures of tag similarity: Each measure is computed on data from the social bookmarking system del.icio.us and a semantic grounding is provided by mapping pairs of similar tags in the folksonomy to pairs of synsets in Wordnet, where we use validated measures of semantic distance to characterize the semantic relation between the mapped tags. This exposes important features of the investigated similarity measures and indicates which ones are better suited in the context of a given semantic application.


Date:

15th of July 2009


Material (Slides):

www.deri.ie/file...


Original Paper:

isiosf.isi.it/~c...


Fresnel: a Browser-Independent Presentation Vocabulary for RDF

based on Fresnel: a Browser-Independent Presentation Vocabulary for RDF by Christian Bizer, Ryan Lee, Emmanuel Pietriga

Stéphane Corlosquet

Abstract

Semantic Web browsers and other tools aimed at displaying RDF data to end users are all concerned with the same problem: presenting content primarily intended for machine consumption in a human-readable way. Their solutions differ but in the end address the same two high-level issues, no matter the underlying representation paradigm: specifying (i) what information contained in RDF models should be presented (content selection) and (ii) how this information should be presented (content formatting and styling). However, each tool currently relies on its own ad hoc mechanisms and vocabulary for specifying RDF presentation knowledge, making it difficult to share and reuse such knowledge across applications. Recognizing the general need for presenting RDF content to users and wanting to promote the exchange of presentation knowledge, we designed Fresnel as a browser-independent vocabulary of core RDF display concepts. In this paper we describe Fresnel’s main concepts and present several RDF browsers and visualization tools that have adopted the vocabulary so far.


Date:

24th of June 2009


Original Paper:

iswc2006.semanti...


A survey on wireless multimedia sensor networks

based on A survey on wireless multimedia sensor networks by Ian F. Akyildiz, Tommaso Melodia, Kaushik R. Chowdhury

Lei Shu

Abstract

The availability of low-cost hardware such as CMOS cameras and microphones has fostered the development of Wireless Multimedia Sensor Networks (WMSNs), i.e., networks of wirelessly interconnected devices that are able to ubiquitously retrieve multimedia content such as video and audio streams, still images, and scalar sensor data from the environment. In this paper, the state of the art in algorithms, protocols, and hardware for wireless multimedia sensor networks is surveyed, and open research issues are discussed in detail. Architectures for WMSNs are explored, along with their advantages and drawbacks. Currently off-the-shelf hardware as well as available research prototypes for WMSNs are listed and classified. Existing solutions and open research issues at the application, transport, network, link, and physical layers of the communication protocol stack are investigated, along with possible cross-layer synergies and optimizations.


Date:

17th of June 2009


Original Paper:

www.eng.buffalo....


IRLbot: scaling to 6 billion pages and beyond

based on IRLbot: scaling to 6 billion pages and beyond by Hsin-Tsang Lee, Derek Leonard, Xiaoming Wang, Dmitri Loguinov

Jürgen Umbrich

Abstract

This paper shares our experience in designing a web crawler that can download billions of pages using a single-server implementation and models its performance. We show that with the quadratically increasing complexity of verifying URL uniqueness, BFS crawl order, and fixed per-host rate-limiting, current crawling algorithms cannot effectively cope with the sheer volume of URLs generated in large crawls, highly-branching spam, legitimate multi-million-page blog sites, and infinite loops created by server-side scripts. We offer a set of techniques for dealing with these issues and test their performance in an implementation we call IRLbot. In our recent experiment that lasted 41 days, IRLbot running on a single server successfully crawled 6.3 billion valid HTML pages (7.6 billion connection requests) and sustained an average download rate of 319 mb/s (1,789 pages/s). Unlike our prior experiments with algorithms proposed in related work, this version of IRLbot did not experience any bottlenecks and successfully handled content from over 117 million hosts, parsed out 394 billion links, and discovered a subset of the web graph with 41 billion unique nodes.


Date:

10th of June 2009


Material (Slides):

umbrich.net/irlb...


Original Paper:

www2008.org/pape...


Generality in AI (and its Relation to the Semantic Web)

based on Several papers by Various authors

Vit Novacek

Abstract

Vit will present selected parts from the research and visions of six seminal thinkers active in fields ranging from AI and the Semantic Web through cognitive science to psychology and economics. In particular, the talk will be based on four papers that may seem to deal with rather disparate topics but are all related in one way or another to generality, a fundamental feature assumed to be inherent to any truly intelligent system. First, Vit will recall McCarthy's take on generality in AI, based on his ACM Turing Award lecture article [1]. Then he will present an alternative and competing position, as given in Hofstadter's Analogical Mind book chapter [2]. He will show that Hofstadter's heretical non-formalist visions may have had at least partial rigorous support for a long time, according to Tversky and Kahneman's Science article on judgment under uncertainty [3]. Following the cognitive science detour, he will return to more familiar ground and provide a summary of the major Semantic Web modelling paradigms based on Patel-Schneider and Horrocks' position paper from WWW 2006 [4]. After presenting the core content of the talk as objectively as possible, he will provide a slightly personal synthesis and ask a few questions we should try to answer in case he is right.

[1] John McCarthy. Generality in Artificial Intelligence. Communications
of ACM, 30 (12). ACM Press, 1987.
URL: http://www-formal.stanford.edu/jmc/generality.pdf

[2] Douglas R. Hofstadter. Analogy as the Core of Cognition. The Analogical
Mind: Perspectives from Cognitive Science, Dedre Gentner, Keith J. Holyoak, and
Boicho N. Kokinov (eds.). Cambridge MA: The MIT Press/Bradford Book, 2001.
URL: http://prelectur.stanford.edu/lecturers/hofstadter/analogy.html

[3] Amos Tversky, Daniel Kahneman. Judgment under Uncertainty:
Heuristics and Biases. Science, 185 (4157). AAAS, 1974.
URL: http://psiexp.ss.uci.edu/research/teaching/Tversky_Kahneman_1974.pdf

[4] Peter F. Patel-Schneider, Ian Horrocks. Position Paper: A Comparison
of Two Modelling Paradigms in the Semantic Web. In Proceedings of
WWW'06. ACM Press, 2006.
URL: http://portal.acm.org/citation.cfm?doid=1135777.1135784


Date:

27th of May 2009


Original Paper:

140.203.154.209/...


Presentation of POWDER

Jürgen Umbrich, Michael Hausenblas


Date:

20th of May 2009


Material (Slides):

fileadmin/docume...


Original Paper:

www.w3.org/2007/...


Personal Information Management by Note-taking

based on Information scraps: How and why information eludes our personal information management tools by Michael S. Bernstein, Max Van Kleek, David R. Karger, M. C. Schraefel

Laura Dragan


Date:

20th of May 2009


Perceptually grounded meaning creation

based on Perceptually grounded meaning creation by Luc Steels

Antonio Aguilare

Abstract

The paper proposes a mechanism for the spontaneous formation of perceptually grounded meanings under the selectionist pressure of a discrimination task. The mechanism is defined formally and the results of some simulation experiments are reported.

Keywords: origins of meanings, self-organization, distributed agents, open systems.


Date:

13th of May 2009


Material (Slides):

www.antonio-agui...


Original Paper:

Perceptually gro...


Consumer decision making in online shopping environments: The effects of interactive decision aids

based on Consumer decision making in online shopping environments: The effects of interactive decision aids by Häubl, G., Trifts, V.

Maciej Dabrowski

Abstract

Despite the explosive growth of electronic commerce and the rapidly increasing number of consumers who use interactive media (such as the World Wide Web) for prepurchase information search and online shopping, very little is known about how consumers make purchase decisions in such settings. The availability of interactive decision tools for consumers may lead to a transformation of the way in which shoppers search for product information and make purchase decisions. The primary objective of this paper is to investigate the nature of the effects that interactive decision aids may have on consumer decision making in online shopping environments. This paper examines the effects of two decision aids on purchase decision making in an online store. The first interactive tool, a recommendation agent (RA), allows consumers to more efficiently screen the (potentially very large) set of alternatives available in an online shopping environment. Based on self-explicated information about a consumer's own utility function (attribute importance weights and minimum acceptable attribute levels), the RA generates a personalized list of recommended alternatives. The second decision aid, a comparison matrix (CM), is designed to help consumers make in-depth comparisons among selected alternatives.
Based on theoretical and empirical work in marketing, judgment and decision making, psychology, and decision support systems, we develop a set of hypotheses pertaining to the effects of these two decision aids on various aspects of consumer decision making. In particular, we focus on how use of the RA and CM affects consumers' search for product information, the size and quality of their consideration sets, and the quality of their purchase decisions in an online shopping environment. In sum, our findings suggest that interactive tools designed to assist consumers in the initial screening of available alternatives and to facilitate in-depth comparisons among selected alternatives in an online shopping environment may have strong favorable effects on both the quality and the efficiency of purchase decisions.
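
The screening step the abstract attributes to the recommendation agent (importance weights plus minimum acceptable attribute levels) can be illustrated roughly as follows. This is a sketch, not the authors' implementation: the function name recommend, the product data, and the attribute names are all hypothetical.

```python
def recommend(products, weights, minimums):
    """Drop products below any minimum acceptable attribute level,
    then rank the rest by importance-weighted attribute score."""
    acceptable = [
        p for p in products
        if all(p["attrs"][name] >= level for name, level in minimums.items())
    ]

    def score(product):
        return sum(w * product["attrs"][name] for name, w in weights.items())

    return sorted(acceptable, key=score, reverse=True)


# Hypothetical catalogue with attributes rated on a 0-10 scale.
products = [
    {"name": "X", "attrs": {"quality": 8, "value": 5}},
    {"name": "Y", "attrs": {"quality": 6, "value": 9}},
    {"name": "Z", "attrs": {"quality": 9, "value": 2}},
]
# Hypothetical consumer input: importance weights and minimum levels.
weights = {"quality": 0.6, "value": 0.4}
minimums = {"value": 4}

ranked_names = [p["name"] for p in recommend(products, weights, minimums)]
```

Here product Z is screened out by the minimum level on value even though it has the highest quality, and Y ranks ahead of X on the weighted score. A comparison matrix would then show the surviving alternatives side by side, attribute by attribute.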


Date:

23rd of April 2009


Original Paper:

www.business.ual...


OWL2: New Features of the Web Ontology Language

based on OWL 2 Web Ontology Language by Various authors

Ratnesh Sahay

Abstract

The Web Ontology Language OWL has been a W3C recommendation since January 2004. However, there have been many requests from both practitioners and theoreticians to improve and enhance the features of this language. This led to a reconsideration of the specification which is currently being worked on. The result of this work will be a new recommendation for the second version of the Web Ontology Language, called OWL 2. The specifications are almost in their final state and will be officially recommended by the end of 2009.



Date:

22nd of April 2009


Original Paper:

www.w3.org/2007/...


Hybrid NLG in a Generic Dialog System

based on Hybrid NLG in a Generic Dialog System by Martin Klarner

Brian Davis

Abstract

Natural Language Generation (NLG) systems are increasingly becoming available as "market-ready" products, mainly due to the now-removed boundary between shallow and deep generation and the emergence of hybrid systems as a de-facto standard. In this paper, we present Hyperbug, a novel approach towards hybrid NLG, coupling shallow and deep processing not only with respect to the resources used for parsing and generation, but also on the architectural level to increase the generative power of the shallow generation branch and the processing efficiency of the whole generation system. The architecture is discussed both in theory and in practice, using a comprehensive example spanning the complete output part of our dialog system.


Date:

08th of April 2009


Introduction to Description Logics

Antoine Zimmermann

Abstract

Description Logics (DLs) are considered one of the most important knowledge representation formalisms. Their flexibility and modularity, together with the amount of theoretical results and practical work on them, have made DLs the formalism of choice for the Web Ontology Language (OWL), which is used to provide meaning to Linked Data on the Semantic Web.


Date:

01st of April 2009


Flickr Tag Recommendation based on Collective Knowledge

based on Flickr Tag Recommendation based on Collective Knowledge by Börkur Sigurbjörnsson, Roelof van Zwol

Peyman Nasirifard

Abstract

Online photo services such as Flickr and Zooomr allow users to share their photos with family, friends, and the online community at large. An important facet of these services is that users manually annotate their photos using so-called tags, which describe the contents of the photo or provide additional contextual and semantical information. In this paper we investigate how we can assist users in the tagging phase. The contribution of our research is twofold. We analyse a representative snapshot of Flickr and present the results by means of a tag characterisation focusing on how users tag photos and what information is contained in the tagging. Based on this analysis, we present and evaluate tag recommendation strategies to support the user in the photo annotation task by recommending a set of tags that can be added to the photo. The results of the empirical evaluation show that we can effectively recommend relevant tags for a variety of photos with different levels of exhaustiveness of original tagging.


Date:

25th of February 2009


Original Paper:

www2008.org/pape...


L2R: A Logical method for Reference Reconciliation

based on L2R: A Logical method for Reference Reconciliation by Fatiha Saïs, Nathalie Pernelle, Marie-Christine Rousset

Renaud Delbru

Abstract

The reference reconciliation problem consists in deciding whether different identifiers refer to the same data, i.e., correspond to the same world entity. The L2R system exploits the semantics of a rich data model, which extends RDFS by a fragment of OWL-DL and SWRL rules. In L2R, the semantics of the schema is translated into a set of logical rules of reconciliation, which are then used to infer correct decisions both of reconciliation and no reconciliation. In contrast with other approaches, the L2R method has a precision of 100% by construction. First experiments show promising results for recall, and most importantly significant increases when rules are added.



Date:

18th of February 2009


Original Paper:

www-lsr.imag.fr/...


Mixed Initiative Reasoning

Krystian Samp

Abstract

based on papers:
[1] J. F. Allen, Mixed-Initiative Interaction, IEEE Intelligent Systems, 1999
[2] E. Horvitz, Uncertainty, action, and interaction: in pursuit of mixed-initiative computing, IEEE Intelligent Systems, 1999
[3] C. I. Guinn, Evaluating mixed-initiative dialog, IEEE Intelligent Systems, 1999
[4] E. Horvitz, Principles of Mixed-Initiative User Interfaces, CHI 1999
[5] G. Tecuci, M. Boicu, M. T. Cox, Seven Aspects of Mixed-Initiative Reasoning, AAAI special issue on mixed-initiative assistants, 2007



Date:

11th of February 2009


Challenges on Distributed Web Retrieval

based on Challenges on Distributed Web Retrieval by R. Baeza-Yates, C. Castillo, F. Junqueira, V. Plachouras, F. Silvestri

Michele Catasta


Date:

17th of December 2008


Original Paper:

ieeexplore.ieee....


RDFa

based on RDFa W3C Recommendation by W3C

Benjamin Heitmann


Date:

03rd of December 2008


Original Paper:

www.w3.org/TR/rd...


SBVR Use cases

based on SBVR Use cases by Mark H. Linehan

Raluca Zaharia


Date:

19th of November 2008


Original Paper:

www.springerlink...


Index structures and algorithms for querying distributed RDF repositories

Brahmananda Sapkota

Abstract

based on the paper: "Index structures and algorithms for querying distributed RDF repositories". Proceedings of the 13th international conference on World Wide Web (WWW2004), 2004, by Stuckenschmidt, H., Vdovjak, R., Houben, G.J., Broekstra, J. http://portal.acm.org/citation.cfm?id=988672.988758


Date:

05th of November 2008


Index structures and algorithms for querying distributed RDF

based on Index structures and algorithms for querying distributed RDF by Heiner Stuckenschmidt, Richard Vdovjak, Geert-Jan Houben, Jeen Broekstra

Brahmananda Sapkota


Date:

05th of November 2008


Original Paper:

portal.acm.org/f...


Extended Workflow Patterns

based on Workflow Control-Flow Patterns: A Revised View by N. Russell, A.H.M. ter Hofstede, W.M.P. van der Aalst, N. Mulyar

Armin Haller


Date:

08th of October 2008


Original Paper:

www.workflowpatt...


Categorization and Optimization of Synchronization Dependencies in Business Processes

based on Categorization and Optimization of Synchronization Dependencies in Business Processes by Qinyi Wu, Calton Pu, Akhil Sahai, Roger Barga

Gabriela Vulcu


Date:

17th of September 2008


Material (Slides):

ieeexplore.ieee....


Supporting the dynamic evolution of Web service protocols in service-oriented architectures

Maria Jose

Abstract

This presentation is based on the paper "Supporting the dynamic evolution of Web service protocols in service-oriented architectures", ACM Transactions on the Web (TWEB), Volume 2, Issue 2 (April 2008), by Seung Hwan Ryu, Fabio Casati, Halvard Skogsrud, Boualem Benatallah, and Régis Saint-Paul.


Date:

10th of September 2008


Material (Slides):

portal.acm.org/c...


RESTful Web Services vs. "Big" Web Services: Making the Right Architectural Decision.

Maciej Zaremba

Abstract

based on the paper: Cesare Pautasso, Olaf Zimmermann, Frank Leymann. RESTful Web Services vs. "Big" Web Services: Making the Right Architectural Decision. WWW 2008.


Date:

27th of August 2008


Material (Slides):

www2008.org/pape...


DECLARE: Full Support for Loosely-Structured Processes

based on DECLARE: Full Support for Loosely-Structured Processes by Maja Pesic, Helen Schonenberg, Wil M.P. van der Aalst

ZhangBing Zhou


Date:

13th of August 2008


Original Paper:

is.tm.tue.nl/res...


Wishful Search: Interactive Composition of Data Mashups

based on Wishful Search: Interactive Composition of Data Mashups by Anton V. Riabov, Eric Bouillet, Mark D. Feblowitz, Zhen Liu, Anand Ranganathan

Sami Bhiri


Date:

30th of July 2008


Original Paper:

www2008.org/pape...


SA-REST and (S)mashups: Adding Semantics to RESTful Services

based on SA-REST and (S)mashups: Adding Semantics to RESTful Services by Amit Sheth, Karthik Gomadam, Jon Lathem

Nikos Loutas


Date:

09th of July 2008


Original Paper:

ieeexplore.ieee....


Process and Pitfalls in Writing Information Visualization Research Papers

based on Process and Pitfalls in Writing Information Visualization Research Papers by Tamara Munzner

Cosmin Basca


Date:

30th of April 2008


Original Paper:

www.cs.ubc.ca/la...


The Symbol Grounding Problem Has Been Solved. So What's next?

based on The Symbol Grounding Problem Has Been Solved. So What's next? by L. Steels

Antonio Aguilar


Date:

23rd of April 2008


Original Paper:

www.csl.sony.fr/...


Benchmarking Database Representations of RDF/S Stores

based on Benchmarking Database Representations of RDF/S Stores by Y. Theoharis, V. Christophides, G. Karvounarakis

Stéphane Corlosquet


Date:

16th of April 2008


Original Paper:

www.springerlink...


Empowering Software Maintainers with Semantic Web Technologies

based on Empowering Software Maintainers with Semantic Web Technologies by René Witte, Yonggang Zhang, Juergen Rilling

Tudor Groza


Date:

26th of March 2008


Original Paper:

www.eswc2007.org...


Efficient Search in Large Textual Collections with Redundancy

based on Efficient search in large textual collections with redundancy by Jiangong Zhang, Torsten Suel

Renaud Delbru


Date:

12th of March 2008


Original Paper:

portal.acm.org/c...


Minimal Deductive Systems for RDF

based on Minimal Deductive Systems for RDF by Sergio Munoz, Jorge Pérez, Claudio Gutierrez

Ratnesh Sahay


Date:

27th of February 2008


Original Paper:

www.eswc2007.org...


Semantics and Complexity of SPARQL

Richard Cyganiak

Abstract

Perez et al., Semantics and Complexity of SPARQL. SPARQL is the W3C candidate recommendation query language for RDF. In this paper we address systematically the formal study of SPARQL, concentrating on its graph pattern facility. We consider for this study simple RDF graphs without special semantics for literals and a simplified version of filters which encompasses all the main issues. We provide a compositional semantics, prove there are normal forms, prove complexity bounds, among others that the evaluation of SPARQL patterns is PSPACE-complete, compare our semantics to an alternative operational semantics, give simple and natural conditions when both semantics coincide and discuss optimization procedures.


Date:

30th of January 2008


Material (Slides):

iswc2006.semanti...


Introducing Time into RDF

Thomas Krennwallner

Abstract

The talk will investigate the following paper:

Claudio Gutierrez, Carlos A. Hurtado, Alejandro A. Vaisman: Introducing Time into RDF. IEEE Trans. Knowl. Data Eng. 19(2): 207-218. 2007

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4039284 



Date:

16th of January 2008


Ontology mining, evaluation and embedding in a larger modeling framework

Dr. Peter Spyns, Free University of Brussels (VUB)


Date:

17th of December 2007


Answer Set Programming for the Semantic Web

Axel Polleres

Abstract

This is a fast-forward version of the tutorial given at ESWC 2006 (see http://www.eswc2006.org/tutorials.html#tutorial1). From this original full-day tutorial, we plan to cover Unit 1, parts of Unit 2, and Unit 4. This will give a gentle introduction to Answer Set Programming and also clarify some differences and relations between Answer Set Programming (and Logic Programming in general) and SW languages like OWL and RDF.

DERI


Date:

21st of November 2007


Material (Slides):

asptut.gibbi.com


Semantic Web technologies in life science and health care

Matthias Samwald

Abstract

The life science and health care sector has the potential to become a driving force for the development of the Semantic Web. The problems with information integration in the vast field spanning basic research on cells and proteins up to the development of new clinical therapies are starting to put a noticeable halt to scientific progress, putting the lives of patients waiting for new treatments in danger. The Semantic Web technologies are seen by many experts as one of the best long-term solutions to this problem.

The W3C Semantic Web Health Care and Life Sciences Interest Group has made good progress in demonstrating the value of the Semantic Web for large-scale information integration of heterogeneous information resources, which will be presented in this talk.

My presentation will consist of three parts: First, I will showcase the "HCLS Demo", a large infrastructure of linked RDF/OWL data from the life sciences with hundreds of millions of RDF triples. Second, I will highlight some of the specific approaches, needs and problems encountered when dealing with Semantic Web technologies in the life science area. Third, I will point out similarities between current projects at DERI and projects in the life science community, with the goal of identifying overlaps and possibilities for future collaborations.

http://www.w3.org/2001/sw/hcls/
http://neuroscientific.net/curriculum

W3C Semantic Web Health Care and Life Sciences


Date:

09th of October 2007


Reward Oriented Packet Filtering Algorithm for Wireless Sensor Networks

Lei Shu


Date:

03rd of October 2007


VIP Bridge: Leading Ubiquitous Sensor Networks to the Next Generation

Lei Shu


Date:

26th of September 2007


Stream Data Gathering in Wireless Sensor Networks within Expected Lifetime

Lei Shu


Date:

05th of September 2007


Foster, ME, White, M (2004). Techniques for text planning with XSLT.

Brian Davis


Date:

25th of July 2007


Material (Slides):

www.hcrc.ed.ac.u...


Software Testing

Radu Ciora


Date:

11th of July 2007


Material (Slides):

books.google.com...


Towards European Patient Summaries based on Triple Space Computing

Doug Foxvog


Date:

16th of May 2007


Material (Slides):

www.tripcom.org/...


Exploiting Similarity for Multi-Source Downloads using File Handprints

Sanaullah Nazir


Date:

09th of May 2007


Material (Slides):

www.cs.cmu.edu/~...


Analysis, Dissemination, Visualization, Insight and Semantic Enhancement

Andreas Harth


Date:

18th of April 2007


Material (Slides):

portal.acm.org/c...


Context-Awareness in Health Care: A Review

Ratnesh Sahay


Date:

11th of April 2007


Material (Slides):

dx.doi.org


Web Service Mining and Verification of Properties: An Approach Based on Event Calculus

Walid Gaaloul


Date:

04th of April 2007


Material (Slides):

www.springerlink...


Introduction to Social Network Analysis

Sheila Kinsella


Date:

28th of March 2007


Material (Slides):

sw.deri.org/wiki...


Sensemaking - Information Understanding in Large Document Collections (Note: rescheduled to Thursday!)

Tomasz Woroniecki


Date:

22nd of March 2007


Material (Slides):

csdl.computer.or...


Human Computation: Play a Game to Develop an Ontology

Peyman Nasirifard


Date:

14th of March 2007


Material (Slides):

www.cs.cmu.edu/~...


The Scientific Method in Software Evaluation

Benjamin Heitmann, Eyal Oren


Date:

06th of March 2007


Material (Slides):

sw.deri.org/wiki...


On (proper) Annotation Methodology or How Linguistics Overcame the Arbitrary Age

Alexander Schutz


Date:

28th of February 2007


Material (Slides):

sw.deri.org/wiki...


Towards Semantic-driven, Flexible and Scalable Framework for Peering and Querying e-Catalog Communities (Pt. II)

Sami Bhiri


Date:

21st of February 2007


Spreading Activation

Smitashree Choudhury


Date:

07th of February 2007


Material (Slides):

sw.deri.org/wiki...


Speech Acts

Simon Scerri


Date:

31st of January 2007


Material (Slides):

sw.deri.org/wiki...


Introduction to eLearning

Siobhán Dervan


Date:

24th of January 2007


Material (Slides):

internettime.com...


Introduction to P2P Systems (Pt III)

Manfred Hauswirth


Date:

13th of December 2006


IT Competitive Strategy: A General Overview of Strategic Alignment

Tadhg Nagel


Date:

06th of December 2006


Material (Slides):

sw.deri.org/wiki...


Paper: "Toward Human-Level Machine Intelligence"

Laurentiu Vasiliu


Date:

29th of November 2006


Introduction to P2P Systems (Pt II) - starts at 13:00!!

Manfred Hauswirth


Date:

22nd of November 2006


Introduction to P2P Systems (Pt I)

Manfred Hauswirth


Date:

15th of November 2006


Material (Slides):

lsirpeople.epfl....


Paper: "A Survey of Approaches to Automatic Schema Matching"

Xia Wang


Date:

01st of November 2006


Material (Slides):

sw.deri.org/wiki...


Paper: Towards Ontologies for Formalizing Modularization and Communication in Large Software Systems

Matthew Moran


Date:

11th of October 2006


Material (Slides):

www.aifb.uni-kar...


Paper: The Semantic Web - The Roles of XML and RDF

Knud Möller


Date:

04th of October 2006


Material (Slides):

citeseer.ist.psu...


Paper: Content-Based Multimedia Information Retrieval: State of the Art and Challenges

Fergal Monaghan


Date:

20th of September 2006


Material (Slides):

portal.acm.org/f...


The Role of Myth in Information System Design

Bill McDaniel


Date:

13th of September 2006


Browsing information with TreeMaps

Sebastian Ryszard Kruk


Date:

06th of September 2006


Material (Slides):

del.icio.us/skru...


A user-interface framework for text searches

Kieran Hannon


Date:

30th of August 2006


Material (Slides):

www.dlib.org/dli...


Book: Designing Scalable Websites: Building, Scaling, and Optimizing the Next Generation of Web Applications; Cal Henderson

Gearoid Hynes


Date:

16th of August 2006


Material (Slides):

sw.deri.org/wiki...


Web Service Registries

Kashif Iqbal


Date:

26th of July 2006


Material (Slides):

www.infosys.tuwi...


CLIE - Controlled Language for Information Extraction

Brian Davis


Date:

19th of July 2006


Material (Slides):

gate.ac.uk/sale/...


Linking in the Wild

Ann Johnston


Date:

11th of July 2006


Material (Slides):

dx.doi.org/10.10...


Human Memory Augmentation

Cliodhna Hurst


Date:

07th of July 2006


Material (Slides):

sw.deri.org/wiki...


The Social Web: Creating An Open Social Network with XDI

Ina O'Murchu


Date:

03rd of July 2006


Material (Slides):

journal.planetwo...


Complexity and expressive power of logic programming

Andreas Harth


Date:

30th of June 2006


Business Process Management

Armin Haller


Date:

14th of June 2006


Ways of Capturing and Calculating Trust in On-line Communities

Slawomir Grzonkowski


Date:

31st of May 2006


The database research self-assessment

Sami Bhiri


Date:

24th of May 2006


Material (Slides):

portal.acm.org/c...


Expanding the Notion of Links

Pat Croke


Date:

17th of May 2006


Material (Slides):

portal.acm.org/c...


XUL - XML User Interface Language

Uldis Bojars


Date:

10th of May 2006


Material (Slides):

sw.deri.org/wiki...


PageRank and Related Methods

John Breslin


Date:

12th of April 2006


Material (Slides):

sw.deri.org/~jbr...


Ontologizing EDI Semantics

Doug Foxvog


Date:

04th of April 2006


History of Programming Languages

Tudor Groza


Date:

28th of March 2006


Material (Slides):

sw.deri.org/wiki...


Domain Modelling

Antonio Aguilar


Date:

21st of March 2006


History of the WWW

Eyal Oren


Date:

14th of March 2006


Material (Slides):

sw.deri.org/wiki...


Visualization Providers for ontologies elements, based on wikis. A practical approach with demo included

Mariano Rico


Date:

21st of February 2006


Material (Slides):

sw.deri.org/wiki...


Firefox Extensions

Kim Tighe


Date:

14th of February 2006


Paper: D. Maynard, M. Yankova, A. Kourakis, and A. Kokossis. Ontology-based information extraction for market monitoring and technology watch. In ESWC Workshop "End User Aspects of the Semantic Web", Heraklion, Crete, 2005.

VinhTuan Thai


Date:

07th of February 2006


Material (Slides):

kmi.open.ac.uk/e...


DEMO - Design & Engineering Methodology for Organizations

Eyal Oren


Date:

31st of January 2006


Semantic Peer-to-Peer Overlay Networks

Brahmananda Sapkota


Date:

24th of January 2006


Material (Slides):

sw.deri.org/wiki...


Metadata Discourse in Business Reports

Sean O'Riain


Date:

13th of December 2005


Context

Ann Johnston


Date:

06th of December 2005


Automated Business-to-Business Integration of a Logistics Supply Chain using Semantic Web Services Technology

Paavo Kotinurmi


Date:

29th of November 2005


Material (Slides):

sw.deri.org/wiki...


Metadata-enabled File Systems

Knud Möller


Date:

22nd of November 2005


Material (Slides):

sw.deri.org/wiki...


Annotea

Pat Croke


Date:

01st of November 2005


WSMO Ontology Management

Mick Kerrigan


Date:

25th of October 2005


Distributed Query Processing

Matteo Magni


Date:

27th of September 2005


Conversations with Web Services

Juan Miguel Gomez


Date:

20th of September 2005


Collaborative Ontology Management

Michal Wozniak, Pawel Szczecki, Piotr Piotrowski


Date:

13th of September 2005


Semantic Wikis

Eyal Oren


Date:

06th of September 2005


Jack

Ann Johnston


Date:

12th of July 2005


SWS Digital Libraries

Sebastian Ryszard Kruk


Date:

05th of July 2005


Spatial Predicates for Semantic Web Metadata

Marco Neumann


Date:

28th of June 2005


AJAX

Kim Tighe


Date:

16th of June 2005


SIOC

Uldis Bojars


Date:

17th of May 2005


Speech Acts and Semantic Communication

Lars Ludwig


Date:

10th of May 2005


Co-location

Cliodhna Hurst


Date:

03rd of May 2005


Social Networking and Collaboration Tools

Ina O'Murchu


Date:

26th of April 2005


RSS Blogging on the SW

John Breslin


Date:

19th of April 2005


WS Reliability

Aine Leddy


Date:

05th of April 2005


(C) Copyright 2004-2010 by the Digital Enterprise Research Institute (DERI). All rights reserved.