Latest News

Paper Accepted at VLDB

T. Vincon, C. Knoedler, L. Solis-Vasquez, A. Bernhardt, S. Tamimi, L. Weber, F. Stock, A. Koch, I. Petrov. Near-Data Processing in Database Systems on Native Computational Storage under HTAP Workloads. In Proc. VLDB 2022.

In this paper we show that Near-Data Processing (NDP) naturally fits in the HTAP design space. We propose an architecture for update-aware NDP, allowing transactionally consistent in-situ execution of analytical operations in the presence of concurrent updates in HTAP settings.

Paper Accepted at EDBT

B. Moessner, C. Riegger, A. Bernhardt, I. Petrov. bloomRF: On Performing Range-Queries in Bloom-Filters with Piecewise-Monotone Hash Functions and Prefix Hashing. In Proc. EDBT 2023.

We introduce bloomRF as a unified point-range filter that extends Bloom-filters with range-queries.

Abstract:

We introduce bloomRF as a unified point-range filter (PRF) that extends Bloom-filters (BFs) with range-lookups. We propose novel prefix hashing to encode range information in the hash-code of the key, and novel piecewise-monotone hash functions (PMHF) for fast lookups and fewer memory accesses. We describe the basic bloomRF, which is simple and tuning-free, and propose optimizations for handling larger ranges. bloomRF has near-optimal space-complexity and constant query-complexity, and outperforms existing PRFs by up to 4x.
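
For intuition, the prefix-hashing idea can be sketched in a few lines of C: every inserted key also sets bits for its dyadic prefixes, so a range lookup only probes the handful of prefix intervals that cover the queried range. The hash function, level encoding, and sizes below are illustrative assumptions, not bloomRF's actual PMHF design, and the greedy cover shown here is exactly what the paper's optimizations for larger ranges improve upon.

/* Minimal sketch of prefix hashing for range filtering, assuming one
 * plain Bloom-style bit array shared by all prefix levels. Hash, level
 * encoding, and sizes are illustrative, not bloomRF's PMHF design. */
#include <stdbool.h>
#include <stdint.h>

#define LEVELS 16          /* dyadic prefix levels 0..15 */
#define BITS (1u << 20)    /* filter size in bits (2^20) */

static uint8_t filter[BITS / 8];

/* Illustrative 64-bit mix hash (not the paper's PMHF). */
static uint64_t mix(uint64_t x) {
    x ^= x >> 33; x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33; x *= 0xc4ceb9fe1a85ec53ULL;
    return x ^ (x >> 33);
}

static void set_bit(uint64_t h) { filter[(h % BITS) / 8] |= 1u << (h % 8); }
static bool get_bit(uint64_t h) { return filter[(h % BITS) / 8] & (1u << (h % 8)); }

/* Insert a key along with all of its dyadic prefixes: level l stores
 * key >> l, so one membership bit covers an aligned range of size 2^l. */
void prf_insert(uint64_t key) {
    for (int l = 0; l < LEVELS; l++)
        set_bit(mix((key >> l) ^ ((uint64_t)l << 56)));
}

/* Probe one dyadic interval [p << l, (p + 1) << l) at level l. */
static bool probe(uint64_t p, int l) {
    return get_bit(mix(p ^ ((uint64_t)l << 56)));
}

/* Range lookup: greedily cover [lo, hi] with aligned dyadic intervals
 * and answer "maybe" as soon as any covering interval tests positive. */
bool prf_range_may_contain(uint64_t lo, uint64_t hi) {
    while (lo <= hi) {
        int l = 0;  /* grow to the largest aligned block that fits */
        while (l + 1 < LEVELS &&
               (lo & ((1ULL << (l + 1)) - 1)) == 0 &&
               hi - lo + 1 >= (1ULL << (l + 1)))
            l++;
        if (probe(lo >> l, l)) return true;
        if (lo + (1ULL << l) <= lo) break;  /* guard against wrap-around */
        lo += 1ULL << l;
    }
    return false;
}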

Paper Accepted at VLDBJ

T. Bang, N. May, I. Petrov, C. Binnig. The full story of 1000 cores. In VLDB Journal (2022).

In this paper, we further extend our analysis from DaMoN 2020, detailing the effect of hardware and workload characteristics via additional real hardware platforms (IBM Power8 and 9) and the full TPC-C transaction mix.

Paper Accepted at DaMoN

T. Vincon, C. Knoedler, A. Bernhardt, L. Solis-Vasquez, L. Weber, A. Koch, I. Petrov. Result-Set Management for NDP Operations on Smart Storage. In Proc. DaMoN 2022.

In this work, we introduce a set of in-situ NDP result-set management techniques, such as spilling, materialization, and reuse.

Paper Accepted at FCCM

S. Tamimi, F. Stock, A. Bernhardt, I. Petrov, A. Koch. An Evaluation of Using CCIX for Cache-Coherent Host-FPGA Interfacing. In Proc. FCCM 2022.

In this work, we compare and contrast the use of CCIX with PCIe when interfacing an ARM-based host with two generations of CCIX-enabled FPGAs. We provide both low-level throughput and latency measurements for accesses and address translation, and examine an application-level use-case employing CCIX for fine-grained synchronization in an FPGA-accelerated database system.

Abstract:

For a long time, most discrete accelerators have been attached to host systems using various generations of the PCI Express interface. However, with its lack of support for coherency between accelerator and host caches, fine-grained interactions require frequent cache flushes, or even the use of inefficient uncached memory regions. The Cache Coherent Interconnect for Accelerators (CCIX) was the first multi-vendor standard for enabling cache-coherent host-accelerator attachments, and is already indicative of the capabilities of upcoming standards such as Compute Express Link (CXL). In our work, we compare and contrast the use of CCIX with PCIe when interfacing an ARM-based host with two generations of CCIX-enabled FPGAs. We provide both low-level throughput and latency measurements for accesses and address translation, and examine an application-level use-case employing CCIX for fine-grained synchronization in an FPGA-accelerated database system. We show that especially smaller reads from the FPGA to the host can benefit from CCIX by having roughly 33% shorter latency than PCIe. Small writes to the host have a roughly 32% higher latency than PCIe, though, since they carry a higher coherency overhead. For the database use-case, CCIX allowed a constant synchronization latency to be maintained even under heavy host-FPGA parallelism.
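
To illustrate the kind of fine-grained synchronization a coherent attach enables, here is a hedged C sketch contrasting a host spin-wait on a coherently shared flag with polling an uncached MMIO mapping over plain PCIe. The mappings and names are hypothetical, not the paper's measurement setup.

/* Hedged sketch: host-side synchronization with a coherent attach vs.
 * plain PCIe. Pointers would come from hypothetical driver mappings. */
#include <stdatomic.h>
#include <stdint.h>

/* With CCIX/CXL, a device write to this word travels through the cache
 * coherence protocol: the host spins on its cached copy and gets
 * invalidated/updated automatically, with no bus crossing per poll. */
_Atomic uint64_t *coherent_flag;  /* hypothetical coherent shared word */

uint64_t wait_coherent(uint64_t expected) {
    uint64_t v;
    while ((v = atomic_load_explicit(coherent_flag,
                                     memory_order_acquire)) != expected)
        ;  /* spins on a cache line */
    return v;
}

/* Over non-coherent PCIe, the same handshake polls an uncached BAR
 * mapping (or flushes caches around every poll), paying an interconnect
 * round-trip on each read. */
volatile uint64_t *mmio_flag;     /* hypothetical uncached MMIO mapping */

uint64_t wait_mmio(uint64_t expected) {
    uint64_t v;
    while ((v = *mmio_flag) != expected)
        ;  /* each read crosses the interconnect */
    return v;
}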

Paper Accepted at ICDE

A. Bernhardt, S. Tamimi, T. Vincon, C. Knoedler, F. Stock, C. Heinz, A. Koch, I. Petrov. neoDBMS: In-situ Snapshots for Multi-Version DBMS on Native Computational Storage. In Proc. ICDE 2022.

In this paper, we showcase how neoDBMS performs snapshot computation in-situ.

Abstract:

Multi-versioning and MVCC are the foundations of many modern DBMSs. Under mixed workloads and large datasets, the creation of the transactional snapshot can become very expensive, as long-running analytical transactions may request old versions, residing on cold storage, for reasons of transactional consistency. Furthermore, analytical queries operate on cold data, stored on slow persistent storage. Due to the poor data locality, snapshot creation may cause massive data transfers and thus lower performance. Given the current trend towards computational storage and near-data processing, it has become viable to perform such operations in-storage to reduce data transfers and improve scalability. neoDBMS is a DBMS designed for near-data processing and computational storage. In this paper, we demonstrate how neoDBMS performs snapshot computation in-situ. We showcase different interactive scenarios, where neoDBMS outperforms PostgreSQL 12 by up to 5x.
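
As background, snapshot computation ultimately boils down to the classic MVCC visibility check sketched below in C: walk a version chain newest-to-oldest and return the first version whose creator is committed for the snapshot and which is not yet deleted for it. The structures and field names are illustrative assumptions, not neoDBMS's in-storage implementation.

/* Minimal MVCC visibility sketch; structures and names are illustrative,
 * not neoDBMS's in-storage implementation. Aborted transactions are
 * ignored for brevity. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct version {
    uint64_t xmin;          /* txn that created this version      */
    uint64_t xmax;          /* txn that invalidated it, 0 if none */
    struct version *older;  /* next-older version in the chain    */
    /* ... tuple payload ... */
} version_t;

typedef struct {
    uint64_t xhigh;         /* txns >= xhigh started after snapshot */
    const uint64_t *active; /* txns in progress at snapshot time    */
    size_t n_active;
} snapshot_t;

/* A txn's effects are visible to the snapshot iff it committed before
 * the snapshot was taken. */
static bool committed_for(const snapshot_t *s, uint64_t xid) {
    if (xid == 0 || xid >= s->xhigh) return false;
    for (size_t i = 0; i < s->n_active; i++)
        if (s->active[i] == xid) return false;  /* was still running */
    return true;
}

/* Walk the chain newest-to-oldest and return the version the snapshot
 * may see: created by a visible txn and not yet deleted for it. */
const version_t *visible(const version_t *newest, const snapshot_t *s) {
    for (const version_t *v = newest; v; v = v->older)
        if (committed_for(s, v->xmin) && !committed_for(s, v->xmax))
            return v;
    return NULL;  /* no version qualifies, e.g. inserted after snapshot */
}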

Paper Accepted at EDBT

A. Bernhardt, S. Tamimi, F. Stock, A. Koch, T. Vincon, I. Petrov. Cache-Coherent Shared Locking for Transactionally Consistent Updates in Near-Data Processing DBMS on Smart Storage. In Proc. EDBT 2022.

We introduce a low-latency cache-coherent shared lock table for update NDP settings. It utilizes the novel CCIX interconnect technology and is integrated into neoDBMS, a near-data processing DBMS for smart storage.

Abstract:

Even though near-data processing (NDP) can provably reduce data transfers and increase performance, current NDP is utilized solely in read-only settings. Slow or tedious-to-implement synchronization and invalidation mechanisms between host and smart storage make NDP support for data-intensive update operations difficult. In this paper, we introduce a low-latency cache-coherent shared lock table for update NDP settings. It utilizes the novel CCIX interconnect technology and is integrated into neoDBMS, a near-data processing DBMS for smart storage. Our evaluation indicates end-to-end lock latencies of ∼80-100 ns and robust performance under contention.
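
For a rough idea of what such a table looks like from the host's side, here is a minimal C sketch of a fixed-size lock table driven by compare-and-swap on words in shared memory, which a coherent interconnect keeps consistent between host and device. It is a generic CAS design under stated assumptions, not the paper's actual CCIX implementation.

/* Generic CAS-based shared lock table sketch; one word per slot that a
 * coherent interconnect would keep consistent between host and device.
 * Not the paper's actual CCIX design. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOTS 1024
#define FREE 0u

/* 0 = free, otherwise the owning agent's ID (host thread or device). */
static _Atomic uint32_t lock_table[SLOTS];

static uint32_t slot_of(uint64_t page_id) { return page_id % SLOTS; }

/* Acquire: a single compare-and-swap on the shared word. */
bool try_lock(uint64_t page_id, uint32_t owner) {
    uint32_t expected = FREE;
    return atomic_compare_exchange_strong_explicit(
        &lock_table[slot_of(page_id)], &expected, owner,
        memory_order_acquire, memory_order_relaxed);
}

/* Release: a plain store with release ordering. */
void unlock(uint64_t page_id) {
    atomic_store_explicit(&lock_table[slot_of(page_id)], FREE,
                          memory_order_release);
}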

Paper Accepted at ADBIS

C. Riegger, I. Petrov. Storage Management with Multi-Version Partitioned BTrees. In Proc. ADBIS 2022.

We propose Multi-Version Partitioned BTrees (MV-PBT) as the sole storage and index management structure in key-sorted storage engines like K/V-Stores.

Abstract:

Database Management Systems and K/V-Stores operate on updatable datasets that massively exceed the size of available main memory. Tree-based K/V storage management structures have become particularly popular in storage engines. B+-Trees [1,4] allow constant search performance, but write-heavy workloads yield inefficient write patterns to secondary storage devices and poor performance characteristics. LSM-Trees overcome this issue by horizontally partitioning fractions of data that are small enough to fully reside in main memory, but they require frequent maintenance to sustain search performance. Firstly, we propose Multi-Version Partitioned BTrees (MV-PBT) as the sole storage and index management structure in key-sorted storage engines like K/V-Stores. Secondly, we compare MV-PBT against LSM-Trees. The logical horizontal partitioning in MV-PBT allows leveraging recent advances in modern B+-Tree techniques in a small, transparent and memory-resident portion of the structure. Structural properties sustain steady read performance, yielding efficient write patterns and reducing write amplification. We integrated MV-PBT in the WiredTiger K/V storage engine. MV-PBT offers an up to 2x increased steady throughput in comparison to LSM-Trees and several orders of magnitude in comparison to B+-Trees in a YCSB workload.
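
For intuition, the core partitioned-BTree trick can be shown compactly in C: partitions are encoded as an artificial leading key component, so a single ordinary B+-Tree clusters each partition in a contiguous key range, and a new memory-resident partition is opened by simply incrementing the partition number. The key layout below is an illustrative assumption, not MV-PBT's actual encoding.

/* Sketch of the partitioned-BTree idea: the partition number is an
 * artificial leading key component, so one ordinary B+-Tree holds all
 * partitions. Layout is illustrative, not MV-PBT's actual encoding. */
#include <stdint.h>

typedef struct {
    uint16_t partition;  /* 0 = main partition, higher = newer buffers */
    uint64_t user_key;   /* the application key                        */
    uint64_t version;    /* newer versions sort first (see cmp below)  */
} pbt_key_t;

/* Compare by (partition, user_key, version-descending): records of one
 * partition cluster in a contiguous key range, and opening a fresh
 * memory-resident partition just means bumping the partition number. */
int pbt_key_cmp(const pbt_key_t *a, const pbt_key_t *b) {
    if (a->partition != b->partition) return a->partition < b->partition ? -1 : 1;
    if (a->user_key != b->user_key) return a->user_key < b->user_key ? -1 : 1;
    if (a->version != b->version) return a->version > b->version ? -1 : 1;
    return 0;
}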