CoGaDB–Column-oriented GPU-accelerated DBMS

CoGaDB is a column-oriented, coprocessor-accelerated database management system developed by the DIMA group at TU Berlin and the IAM group at the German Research Center for Artificial Intelligence (DFKI GmbH). With its new query compiler, Hawk, it focuses on generating specialized code for heterogeneous processors such as multi-core CPUs, GPUs, and MICs (Intel Xeon Phi coprocessors). The key idea of Hawk is to automatically adapt the code that the query compiler generates until it is near-optimal for a specific processor, query workload, and data set. At the same time, Hawk targets efficient and scalable collaborative query processing across multiple heterogeneous processors in the same machine.

Overview of Hawk–A Hardware Adaptive Query Compiler

The performance of modern processors is primarily bound by a fixed energy budget. This power wall forces processor vendors to specialize their processors for certain applications in order to provide the speedups users expect. This specialization leads to a landscape of heterogeneous processors, where each processor has unique properties and capabilities. Optimizing database systems for processors with different architectures is very costly, because expert programmers have to re-implement and tune the database operators for each new processor.

The goal of the Hawk project is to develop concepts that allow database systems to adapt themselves automatically to such heterogeneous, previously unknown processors. This adaptive optimization avoids manual per-processor tuning. The core idea of the project is to introduce variations to database operators (e.g., code optimizations, data structures, and parallelization strategies). These variations allow us to generate custom per-processor implementations of database operators. Because this variant generation creates a large search space, we will research concepts for a query optimizer that can automatically select efficient implementations of an operator for each processor, without prior knowledge of the processors used.
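
The idea of generating operator variants and picking one per processor can be illustrated with a minimal sketch. It is not Hawk's implementation: the variation dimensions, their values, and the functions `enumerate_variants`, `select_variant`, and `toy_cost` are hypothetical names chosen for illustration. The sketch also uses plain exhaustive search, which only works for a tiny search space; the large search space described above is exactly why a smarter optimizer is a research topic.

```python
import itertools

# Hypothetical variation dimensions (assumed for illustration only;
# these are not Hawk's actual options).
VARIATION_DIMENSIONS = {
    "predicate_evaluation": ["branching", "predicated"],
    "hash_table": ["linear_probing", "cuckoo"],
    "threads": [1, 4, 16],
}


def enumerate_variants(dimensions):
    """Yield every combination of variation choices as a dict."""
    names = list(dimensions)
    for values in itertools.product(*(dimensions[name] for name in names)):
        yield dict(zip(names, values))


def select_variant(dimensions, measure):
    """Return the (variant, cost) pair with the lowest measured cost.

    `measure` is any callable that maps a variant to a cost. In a real
    system it would generate code for the variant, run it on the target
    processor, and return the observed runtime, so no prior knowledge
    of the processor is needed.
    """
    measured = ((variant, measure(variant)) for variant in enumerate_variants(dimensions))
    return min(measured, key=lambda pair: pair[1])


def toy_cost(variant):
    """Stand-in for a real measurement on one imaginary processor."""
    cost = 10.0
    if variant["predicate_evaluation"] == "predicated":
        cost -= 2.0  # pretend this processor dislikes branches
    if variant["hash_table"] == "cuckoo":
        cost += 1.0  # pretend cuckoo hashing is slower here
    cost += 8.0 / variant["threads"]  # pretend more threads help
    return cost


if __name__ == "__main__":
    best_variant, best_cost = select_variant(VARIATION_DIMENSIONS, toy_cost)
    print(best_variant, best_cost)
```

Passing a different `measure` callable stands in for targeting a different processor: the variant space stays the same, while the selected implementation changes with the measured costs. Even these three small dimensions yield 12 variants, and each added dimension multiplies that number.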

The Hawk project is funded by the German Research Foundation (DFG) as part of the priority program on Scalable Data Management for Future Hardware.