Abstract
Skyline computation aims to find the points that are not dominated by any other point in the dataset. It has been becoming a hot topic due to its potential applications in real-time online services. Usually, such applications expect to return the first Skyline point quickly, without ran-sacking all the points. This paper focuses on the problem of progressive subspace Skyline queries in high dimensional space. To the best of our knowledge, the existing algorithms and their variations cannot be easily extended to support arbitrary subspace Skyline query efficiently. The BNL (Blocked Nested Loop) method can be used for subspace Skyline queries, but it is very inefficiently, and not progressive. A novel algorithm, called CSky(stands for Count the Skyline), is proposed in this paper to solve this problem. With CSky, the Skyline points can be rapidly obtained in any query subspace of a high dimensional space. It is an online algorithm based on a novel data structure called InvertS. The algorithm scans the dataset at most one pass, and the points are only compared with those detected Skyline points, thus resulting in a much smaller computation overhead in comparison with BNL. Furthermore, CSky not only can efficiently find the Skyline like other index-based algorithms, but also do it in any subspace of the whole query space. The theoretical analysis and experimental comparison show that the CSky algorithm outperforms BNL significantly in various kinds of application cases.
| Original language | English |
|---|---|
| Pages (from-to) | 1409-1417 |
| Number of pages | 9 |
| Journal | Jisuanji Xuebao/Chinese Journal of Computers |
| Volume | 30 |
| Issue number | 8 |
| State | Published - Aug 2007 |
| Externally published | Yes |
Keywords
- Online algorithm
- Progressive algorithm
- Skyline
- Subspace