PARA'04 State-of-the-Art
in Scientific Computing
June 20-23, 2004

Updated: 7 February 2004

Data Cube Query Involving Parallel Index in High Performance Databases

Rebecca Boon-Noi Tan & David Taniar
School of Business Systems
Monash University
Clayton, Victoria 3800
Australia
emails: {Rebecca.Tan,David.Taniar}@infotech.monash.edu.au

Data warehousing is a key technology in everyday activities, ranging from professional work to entertainment. The objective of a data warehouse is to provide analysts and managers with the strategic information underlying the business. Unfortunately, corporate data keeps growing exponentially, and analysis usually involves queries that aggregate, filter, and group data in a variety of ways. As a result, queries and analyses are becoming more complex and time consuming. Analysts and managers who use a data warehouse also work under time constraints. There is therefore a crucial need to improve query performance and to provide faster response times.

To meet these performance and response-time requirements, parallel database systems and parallel algorithms are being vigorously exploited [1, 2]. Why is parallelism appropriate for data warehouses? The main reason is that parallel systems can now be constructed at low cost, without any specialized technology, from existing sequential computers and relatively cheap interconnection networks.

The aim of this paper is to focus on parallel algorithms for bi-selection data cube queries in a data warehouse environment. These parallel bi-selection data cube queries make use of indexes that already exist in the database, which is a flexible and cost-effective way to further improve performance and response times in data warehouses. The work presented in this paper is part of a larger project on Parallel Indexing in Parallel Database Systems; some of the research results from this project have been reported in [3, 4]. In this paper, we present a theoretical and experimental study of parallel indexes in the bi-selection data cube query. Our experimental results indicate some promising indexing schemes for supporting parallel bi-selection data cube query processing.
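As a rough illustration of the kind of query discussed above, the following Python sketch evaluates a grouped aggregate with two selection predicates over a partitioned fact table, using a local index on each partition. It is not the authors' algorithm: the abstract does not define "bi-selection", so the sketch assumes it means a query with two range predicates, and the data, the round-robin partitioning, the sorted-list index, and all function names are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fact rows: (product_id, month, sales_amount).
ROWS = [
    (1, 1, 100.0), (2, 1, 50.0), (3, 2, 75.0), (4, 2, 20.0),
    (5, 3, 60.0), (6, 3, 10.0), (7, 1, 40.0), (8, 2, 30.0),
]


def build_index(rows, col):
    """Sorted (key, row position) list, a stand-in for a local tree index."""
    return sorted((row[col], pos) for pos, row in enumerate(rows))


def make_node(rows):
    """One simulated processor: its rows plus pre-built indexes on both columns."""
    return {
        "rows": rows,
        "by_product": build_index(rows, 0),
        "by_month": build_index(rows, 1),
    }


def partition(rows, n):
    """Round-robin partitioning across n nodes (an assumed placement scheme)."""
    return [make_node(rows[i::n]) for i in range(n)]


def index_range(index, lo, hi):
    """Row positions whose key lies in [lo, hi], located by binary search."""
    start = bisect_left(index, (lo, -1))
    end = bisect_right(index, (hi, float("inf")))
    return {pos for _, pos in index[start:end]}


def local_query(node, product_range, month_range):
    """Apply both selections via the local indexes, then sum sales per month."""
    hits = (index_range(node["by_product"], *product_range)
            & index_range(node["by_month"], *month_range))
    totals = defaultdict(float)
    for pos in hits:
        _, month, amount = node["rows"][pos]
        totals[month] += amount
    return dict(totals)


def parallel_query(rows, product_range, month_range, workers=3):
    """Run the local query on every node and merge the partial aggregates."""
    nodes = partition(rows, workers)
    # Threads only simulate independent processing elements here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(
            lambda node: local_query(node, product_range, month_range), nodes))
    merged = defaultdict(float)
    for partial in partials:
        for month, amount in partial.items():
            merged[month] += amount
    return dict(merged)
```

For example, `parallel_query(ROWS, (2, 7), (1, 2))` selects products 2 to 7 sold in months 1 to 2 and returns `{1: 90.0, 2: 95.0}`. Each node intersects the row positions returned by its two index lookups, so only rows satisfying both predicates are read before aggregation.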

References:
1. Dehne, F., Eavis, T., and Rau-Chaplin, A., "A Cluster Architecture for Parallel Data Warehousing", Proceedings of First IEEE/ACM International Symposium on Cluster Computing and the Grid, pp. 161-168, 2001.
2. Märtens, H., Rahm, E., and Stöhr, T., "Dynamic Query Scheduling in Parallel Data Warehouses", Proceedings of 8th International Conference on Euro-Par, LNCS, Paderborn, Germany, pp. 321-331, 2002.
3. Taniar, D. and Rahayu, J.W., "A Taxonomy of Indexing Schemes for Parallel Database Systems", Distributed and Parallel Databases: An International Journal, 12(1), pp. 73-106, 2002.
4. Taniar, D. and Rahayu, J.W., "Chapter 17: Parallel Join Query Algorithms Involving Index", Parallel and Distributed Computing Applications and Technologies, C.S.Leung, J.Sum, C.L.Wang, and G.H.Young (eds.), ISBN: 962-85887-1-0, The University of Hong Kong, pp. 133-140, 2000.
