Pipelined Query Processing in Coprocessor Environments
International Conference on Management of Data (SIGMOD 2018)
Query processing on GPU-style coprocessors is severely limited by the
movement of data. With teraflops of compute throughput in one device,
even high-bandwidth memory cannot provision enough data for a
reasonable utilization.
Query compilation is a proven technique to improve memory efficiency.
However, its inherent tuple-at-a-time processing style does not suit
the massively parallel execution model of GPU-style coprocessors. This
compromises the improvements in efficiency offered by query
compilation. In this paper, we show how query compilation and
GPU-style parallelism can be made to play in unison nevertheless. We
describe a compiler strategy that merges multiple operations into a
single GPU kernel, thereby significantly reducing bandwidth demand.
Compared to operator-at-a-time processing, we show reductions of memory
access volumes by factors of up to 7.5x, resulting in shorter kernel
execution times by factors of up to 9.5x.
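
To illustrate why merging operations reduces bandwidth demand, the following is a minimal CPU-side sketch in Python, not the paper's compiler and not a GPU kernel. It runs a selection, a projection, and a sum aggregation twice: once operator-at-a-time, with every intermediate result written out and read back, and once as a single fused loop that keeps intermediates in local variables. The function names, the column names (price, discount), and the traffic counter, which simply counts element reads and writes, are all assumptions made for this illustration.

```python
def operator_at_a_time(price, discount, threshold):
    """Run selection, projection, and aggregation as three separate passes."""
    traffic = 0  # hypothetical counter: element reads/writes to "memory"

    # Pass 1: selection materializes the positions of qualifying rows.
    selected = []
    for i in range(len(price)):
        traffic += 1                      # read price[i]
        if price[i] > threshold:
            selected.append(i)
            traffic += 1                  # write row position

    # Pass 2: projection materializes price * (100 - discount).
    revenue = []
    for i in selected:
        traffic += 3                      # read position, price[i], discount[i]
        revenue.append(price[i] * (100 - discount[i]))
        traffic += 1                      # write intermediate value

    # Pass 3: aggregation reads the intermediate column back.
    total = 0
    for value in revenue:
        traffic += 1                      # read intermediate value
        total += value
    return total, traffic


def fused_pipeline(price, discount, threshold):
    """Run the same three operations in one pass without materialization."""
    total = 0
    traffic = 0
    for i in range(len(price)):
        traffic += 1                      # read price[i]
        p = price[i]                      # stays in a local variable
        if p > threshold:
            traffic += 1                  # read discount[i]
            total += p * (100 - discount[i])
    return total, traffic


price = [10, 50, 30, 80]
discount = [0, 10, 20, 50]                # discount in percent
print(operator_at_a_time(price, discount, 20))  # (10900, 22)
print(fused_pipeline(price, discount, 20))      # (10900, 7)
```

Both variants compute the same result, but the fused loop touches each input value once and never writes an intermediate column. The counts come from this toy model only and are not comparable to the factors reported above.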