Continuous Query Optimization and Evaluation over Unified Linked Stream Data and Linked Open Data

Technical Report
In this report we address the problem of scalable query processing over Linked Stream Data integrated with Linked Open Data. Linked Stream Data consists of data generated by stream sources, e.g., sensors, enriched with semantic descriptions that follow the standards proposed for Linked Data. This enables easy integration of sensor data with the rapidly growing amount of Linked Open Data and makes it possible to reuse the large body of existing software in a wide range of novel applications. However, the highly dynamic nature of sensor data requires new approaches to data management and processing that existing systems do not support. To remedy this, we present our Continuous Query Evaluation over Linked Streams (CQELS) approach, which provides a scalable query processing model for unified Linked Stream Data and Linked Open Data. Scalability in CQELS is achieved by applying state-of-the-art techniques for efficient data storage and query pre-processing, combined with a new adaptive cost-based query optimization algorithm for dynamic data sources such as sensor streams. In traditional Database Management Systems (DBMS), query optimizers use pre-computed selectivity values for the data to decide on the best execution plan. With continuous queries over stream data, however, the data changes over time, and so do its selectivity values. As a result, the optimal execution plan can itself change during the execution of a query. To overcome this problem, the CQELS query optimizer retains a subset of the possible execution plans and, at query time, updates their respective costs and chooses the least expensive one for executing the query at that point in time. We have implemented CQELS, and our experimental results show that it can greatly reduce query response times while scaling to a realistically high number of parallel queries.
Research Unit: Sensor Middleware Unit
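
The adaptive plan selection summarized in the abstract (retain a few candidate plans, refresh their costs as the data changes, run the cheapest) can be illustrated with a minimal sketch. This is not the CQELS implementation: the `Plan` type, the toy left-deep join cost model, the fixed join selectivity of 0.1, and the operand names `S`, `A`, and `B` are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical statistics: operand name -> current cardinality estimate.
# For a stream operand this would be the current window size.
Stats = Dict[str, float]


@dataclass(frozen=True)
class Plan:
    """A retained candidate plan, described only by its join order (assumption)."""
    name: str
    join_order: Tuple[str, ...]


def estimate_cost(plan: Plan, card: Stats, selectivity: float = 0.1) -> float:
    """Toy cost model: sum of intermediate result sizes of a left-deep join."""
    intermediate = card[plan.join_order[0]]
    cost = 0.0
    for operand in plan.join_order[1:]:
        intermediate = intermediate * card[operand] * selectivity
        cost += intermediate
    return cost


def choose_plan(plans: List[Plan], card: Stats) -> Plan:
    """Re-cost every retained plan with fresh statistics and pick the cheapest."""
    return min(plans, key=lambda p: estimate_cost(p, card))


# Retained subset of plans for a query joining stream S with static datasets A and B.
PLANS = [
    Plan("stream-first-A", ("S", "A", "B")),
    Plan("stream-first-B", ("S", "B", "A")),
    Plan("static-first", ("A", "B", "S")),
]

if __name__ == "__main__":
    # The same query is evaluated twice as the stream window grows.
    for window_size in (10.0, 10_000.0):
        stats = {"S": window_size, "A": 1_000.0, "B": 100.0}
        print(window_size, choose_plan(PLANS, stats).name)
```

Under this toy cost model the chosen plan switches from joining the stream first to joining the static data first once the stream window becomes large, which is the kind of plan change over time that the abstract describes.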