News
Paper Accepted - ACM TRETS
S. Tamimi, A. Bernhardt, F. Stock, I. Petrov, A. Koch. DANSEN: Database Acceleration on Native Computational Storage by Exploiting NDP. In ACM Transactions on Reconfigurable Technology and Systems (TRETS).
This paper presents the accelerator components for neoDBMS, a full-stack computational storage system designed to manage on-device execution of database queries/transactions as a Near-Data Processing operation.
Abstract:
This paper introduces DANSEN, the hardware accelerator component for neoDBMS, a full-stack computational storage system designed to manage on-device execution of database queries/transactions as a Near-Data Processing (NDP)-operation. The proposed system enables Database Management Systems (DBMS) to offload NDP-operations to the storage while maintaining control over data through a native storage interface. DANSEN provides an NDP-engine that enables DBMS to perform both low-level database tasks, such as performing database administration, as well as high-level tasks like executing SQL, on the smart storage device while observing the DBMS concurrency control. Furthermore, DANSEN enables the incorporation of custom accelerators as an NDP-operation, e.g., to perform hardware-accelerated ML inference directly on the stored data. We built the DANSEN storage prototype and interface on an UltraScale+ HBM FPGA and fully integrated it with PostgreSQL. Experimental results demonstrate that the proposed NDP approach outperforms software-only PostgreSQL using a fast off-the-shelf NVMe drive, and significantly improves the end-to-end execution time of an aggregation operation by 10.6x. The versatility of the proposed approach is also validated by integrating a compute-intensive data analytics application with multi-row results, outperforming PostgreSQL by 1.5x.
Paper Accepted
C. Riegger and I. Petrov. Storage Management with Multi-Version Partitioned BTrees. In Information Systems (2024).
In this paper, we propose MV-PBT as sole storage and index structure in key-sorted storage engines, with up to 2x better throughput.
Abstract:
We propose Multi-Version Partitioned BTrees (MV-PBT) as sole storage and index management structure in key-sorted storage engines like Key/Value-Stores, and compare MV-PBT against LSM-Trees. We demonstrate up to 2x better steady throughput than LSM-Trees and several orders of magnitude better than B+-Trees in a YCSB workload. Moreover, MV-PBT exhibits robust time-travel query performance, outperforming LSM-Trees by 20% and B+-Trees by an order of magnitude.
Paper Accepted at ARC 2023
S. Tamimi, A. Bernhardt, F. Stock, I. Petrov, A. Koch NVMulator: A Configurable Open-Source Non-volatile Memory Emulator for FPGAs In Proc. ARC 2023.
NVMulator is an open-source, easy-to-use hardware emulation module that can be seamlessly inserted between the NDP processing elements on the FPGA and a conventional DRAM memory.
Abstract:
We present NVMulator, an open-source, easy-to-use hardware emulation module that can be seamlessly inserted between the NDP processing elements on the FPGA and a conventional DRAM-based memory system. We demonstrate that, with suitable parametrization, the emulated NVM can come very close to the performance characteristics of actual NVM technologies, specifically Intel Optane. We achieve 0.62% and 1.7% accuracy for cache-line-sized read and write accesses, respectively, while utilizing only 0.54% of LUT logic resources on a Xilinx/AMD AU280 UltraScale+ FPGA board. We consider both file-system and database access patterns, examining the operation of the RocksDB database when running on real or emulated Optane-technology memories.
Paper Accepted at DAMON 2023
A. Bernhardt, A. Koch, I. Petrov. pimDB: From Main-Memory DBMS to Processing-In-Memory DBMS-Engines on Intelligent Memories. In Proc. DAMON 2023.
In this paper, we introduce pimDB and provide an initial comparison of processor-centric and PIM-DBMS approaches under different aspects, such as scalability and parallelism, cache-awareness, or PIM-specific compute/bandwidth tradeoffs.
Abstract:
In this paper, we introduce pimDB and provide an initial comparison of processor-centric and PIM-DBMS approaches under different aspects, such as scalability and parallelism, cache-awareness, or PIM-specific compute/bandwidth tradeoffs.
Ph.D. Defence
Christian Riegger has successfully defended his dissertation "Multi-version Indexing for large Datasets with high-rate continuous Insertions". Congratulations on behalf of the whole DBlab team!
Best Paper Award EDBT'23
Our paper "bloomRF: On Performing Range-Queries in Bloom-Filters with Piecewise-Monotone Hash Functions and Prefix Hashing" has been awarded a Best Paper Award at EDBT 2023.
We are extremely happy about the recognition and wish to thank the committee for considering our work.
HiPEAC 2022 Paper Award
Our paper (S. Tamimi, F. Stock, A. Bernhardt, I. Petrov, A. Koch. An Evaluation of Using CCIX for Cache-Coherent Host-FPGA Interfacing. In Proc. FCCM 2022) has been awarded a HiPEAC 2022 Paper Award.
Ph.D. Defence
Tobias Vincon has successfully defended his dissertation "Data-Intensive Systems on Modern Hardware: leveraging near-data processing to counter the growth of data".
Summer School
Arthur Bernhardt attends the 1st Summer School on Scalable Data Management for Future Hardware.
BW-CAR Membership
Ilia Petrov has been elected a member of the Baden-Württemberg Center of Applied Research (BW-CAR) and is a founding member of the doctoral consortium (Promotionsverband) Baden-Württemberg.
Paper Accepted at VLDB
T. Vincon, C. Knoedler, L. Solis-Vasquez, A. Bernhardt, S. Tamimi, L. Weber, F. Stock, A. Koch, I. Petrov: Near-Data Processing in Database Systems on Native Computational Storage under HTAP Workloads. In Proc. VLDB 2022.
In this paper we show that Near-Data Processing (NDP) naturally fits in the HTAP design space. We propose an architecture for update-aware NDP, allowing transactionally consistent in-situ executions of analytical operations in presence of concurrent updates in HTAP settings.
Abstract:
In this paper we show that Near-Data Processing (NDP) naturally fits in the HTAP design space. We propose an architecture for update-aware NDP, allowing transactionally consistent in-situ executions of analytical operations in presence of concurrent updates in HTAP settings.
Paper Accepted VLDBJ
T. Bang, N. May, I. Petrov, C. Binnig. The full story of 1000 cores. In VLDB Journal (2022).
In this paper, we further extend our analysis from DaMoN 2020, detailing the effect of hardware and workload characteristics via additional real hardware platforms (IBM Power8 and 9) and the full TPC-C transaction mix.
Paper Accepted EDBT
B. Moessner, C. Riegger, A. Bernhardt, I. Petrov. bloomRF: On Performing Range-Queries in Bloom-Filters with Piecewise-Monotone Hash Functions and Prefix Hashing. In Proc. EDBT 2023.
We introduce bloomRF as a unified point-range filter that extends Bloom-filters with range-queries.
Paper Accepted at ADBIS
C. Riegger, I. Petrov. Storage Management with Multi-Version Partitioned BTrees. In Proc. ADBIS 2022.
We propose Multi-Version Partitioned BTrees (MV-PBT) as sole storage and index management structure in key-sorted storage engines like K/V-Stores.
Abstract:
Database Management Systems and K/V-Stores operate on updatable datasets, massively exceeding the size of available main memory. Tree-based K/V storage management structures became particularly popular in storage engines. B+-Trees [1,4] allow constant search performance, however write-heavy workloads yield inefficient write patterns to secondary storage devices and poor performance characteristics. LSM-Trees overcome this issue by horizontally partitioning fractions of data, small enough to fully reside in main memory, but require frequent maintenance to sustain search performance. Firstly, we propose Multi-Version Partitioned BTrees (MV-PBT) as sole storage and index management structure in key-sorted storage engines like K/V-Stores. Secondly, we compare MV-PBT against LSM-Trees. The logical horizontal partitioning in MV-PBT allows leveraging recent advances in modern B+-Tree techniques in a small, transparent and memory-resident portion of the structure. Structural properties sustain steady read performance, yielding efficient write patterns and reducing write amplification. We integrated MV-PBT in the WiredTiger KV storage engine. MV-PBT offers an up to 2x increased steady throughput in comparison to LSM-Trees and several orders of magnitude in comparison to B+-Trees in a YCSB workload.
Paper Accepted at DAMON
T. Vincon, C. Knoedler, A. Bernhardt, L. Solis-Vasquez, L. Weber, A. Koch, I. Petrov. Result-Set Management for NDP Operations on Smart Storage. In Proc. DAMON 2022.
In this work, we introduce a set of in-situ NDP result-set management techniques, such as spilling, materialization, and reuse.
Paper Accepted at FCCM 2022
S. Tamimi, F. Stock, A. Bernhardt, I. Petrov, A. Koch. An Evaluation of Using CCIX for Cache-Coherent Host-FPGA Interfacing. In Proc. FCCM 2022.
In this work, we compare-and-contrast the use of CCIX with PCIe when interfacing an ARM-based host with two generations of CCIX-enabled FPGAs. We provide both low-level throughput and latency measurements for accesses and address translation, as well as examine an application-level use-case of using CCIX for fine-grained synchronization in an FPGA-accelerated database system.
Abstract:
For a long time, most discrete accelerators have been attached to host systems using various generations of the PCI Express interface. However, with its lack of support for coherency between accelerator and host caches, fine-grained interactions require frequent cache-flushes, or even the use of inefficient uncached memory regions. The Cache Coherent Interconnect for Accelerators (CCIX) was the first multi-vendor standard for enabling cache-coherent host-accelerator attachments, and already is indicative of the capabilities of upcoming standards such as Compute Express Link (CXL). In our work, we compare-and-contrast the use of CCIX with PCIe when interfacing an ARM-based host with two generations of CCIX-enabled FPGAs. We provide both low-level throughput and latency measurements for accesses and address translation, as well as examine an application-level use-case of using CCIX for fine-grained synchronization in an FPGA-accelerated database system. We can show that especially smaller reads from the FPGA to the host can benefit from CCIX by having roughly 33% shorter latency than PCIe. Small writes to the host have a latency roughly 32% higher than PCIe, though, since they carry a higher coherency overhead. For the database use-case, the use of CCIX allowed us to maintain a constant synchronization latency even with heavy host-FPGA parallelism.
Paper Accepted at ICDE
A. Bernhardt, S. Tamimi, T. Vincon, C. Knoedler, F. Stock, C. Heinz, A. Koch, I. Petrov: neoDBMS: In-situ Snapshots for Multi-Version DBMS on Native Computational Storage. In Proc. ICDE 2022.
In this paper, we showcase how neoDBMS performs snapshot computation in-situ.
Abstract:
Multi-versioning and MVCC are the foundations of many modern DBMSs. Under mixed workloads and large datasets, the creation of the transactional snapshot can become very expensive, as long-running analytical transactions may request old versions, residing on cold storage, for reasons of transactional consistency. Furthermore, analytical queries operate on cold data, stored on slow persistent storage. Due to the poor data locality, snapshot creation may cause massive data transfers and thus lower performance. Given the current trend towards computational storage and near-data processing, it has become viable to perform such operations in-storage to reduce data transfers and improve scalability. neoDBMS is a DBMS designed for near-data processing and computational storage. In this paper, we demonstrate how neoDBMS performs snapshot computation in-situ. We showcase different interactive scenarios, where neoDBMS outperforms PostgreSQL 12 by up to 5x.
Paper Accepted at EDBT
A. Bernhardt, S. Tamimi, F. Stock, A. Koch, T. Vincon, I. Petrov: Cache-Coherent Shared Locking for Transactionally Consistent Updates in Near-Data Processing DBMS on Smart Storage. In Proc. EDBT 2022.
We introduce a low-latency cache-coherent shared lock table for update NDP settings. It utilizes the novel CCIX interconnect technology and is integrated in neoDBMS, a near-data processing DBMS for smart storage.
Abstract:
Even though near-data processing (NDP) can provably reduce data transfers and increase performance, current NDP is solely utilized in read-only settings. Synchronization and invalidation mechanisms between host and smart storage that are slow or tedious to implement make NDP support for data-intensive update operations difficult. In this paper, we introduce a low-latency cache-coherent shared lock table for update NDP settings. It utilizes the novel CCIX interconnect technology and is integrated in neoDBMS, a near-data processing DBMS for smart storage. Our evaluation indicates end-to-end lock latencies of ~80-100 ns and robust performance under contention.
HardBD/Active 2022
DBlab is co-organising this year's HardBD/Active Workshop at ICDE 2022. HardBD/Active 2022 has a very strong programme. Follow it online and be part of it!
Both HardBD and Active are interested in exploiting hardware technologies for data-intensive systems. The workshop aims at providing a forum for academia and industry to exchange ideas through research and position papers.
New Paper Accepted
Christian Knoedler, Tobias Vincon, Arthur Bernhardt, Leonardo Solis-Vasquez, Lukas Weber, Ilia Petrov, Andreas Koch. A cost model for NDP-aware query optimization for KV-stores. In Proc. DAMON 2021.
We show the need for optimizations based on a cost model, since operations may execute either via the traditional stack or, increasingly, directly on computational storage.
Abstract:
Many modern DBMS architectures require transferring data from storage to process it afterwards. Given the continuously increasing amounts of data, data transfers quickly become a scalability limiting factor. Near-Data Processing and smart/computational storage emerge as promising trends allowing for decoupled in-situ operation execution, data transfer reduction and better bandwidth utilization. However, not every operation is suitable for an in-situ execution and a careful placement and optimization is needed. In this paper we present an NDP-aware cost model. It has been implemented in MySQL and evaluated with nKV. We make several observations underscoring the need for optimization.
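The placement trade-off such a cost model captures can be illustrated with a toy comparison (the function and its parameters are hypothetical, not the paper's actual model): ship the full table to the host, or filter in-situ and ship only the result set.

```python
def ndp_beneficial(rows, row_bytes, selectivity,
                   host_bw_gbps, dev_bw_gbps, dev_slowdown):
    """Toy cost model: is in-situ (NDP) execution cheaper than a host scan?

    Host cost: transfer the full table over the host interconnect.
    NDP cost:  scan the table at device bandwidth (scaled by a slowdown
               factor for the weaker device compute), then transfer only
               the selected rows to the host.
    """
    total_bytes = rows * row_bytes
    host_cost = total_bytes / host_bw_gbps
    ndp_cost = (total_bytes / dev_bw_gbps) * dev_slowdown \
               + (selectivity * total_bytes) / host_bw_gbps
    return ndp_cost < host_cost
```

A highly selective scan favors in-situ execution, while an operation returning most of the table does not, which is exactly why careful placement is needed.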
Paper Accepted at RAW@IPDPS
Lukas Weber, Lukas Sommer, Leonardo Solis-Vasquez, Tobias Vincon, Christian Knoedler, Arthur Bernhardt, Ilia Petrov, Andreas Koch. A Framework for the Automatic Generation of FPGA-based Near-Data Processing Accelerators in Smart Storage Systems. In Proc. Reconfigurable Architectures Workshop. RAW@IPDPS.
We introduce a framework for automatic generation of data format parsers and accessors for NDP DBMS on computational storage
Abstract:
Near-Data Processing is a promising approach to overcome the limitations of slow I/O interfaces in the quest to analyze the ever-growing amount of data stored in database systems. Next to CPUs, FPGAs will play an important role for the realization of functional units operating close to data stored in non-volatile memories such as Flash. It is essential that the NDP-device understands formats and layouts of the persistent data, to perform operations in-situ. To this end, carefully optimized format parsers and layout accessors are needed. However, designing such FPGA-based Near-Data Processing accelerators requires significant effort and expertise. To make FPGA-based Near-Data Processing accessible to non-FPGA experts, we will present a framework for the automatic generation of FPGA-based accelerators capable of data filtering and transformation for key-value stores based on simple data-format specifications. The evaluation shows that our framework is able to generate accelerators that are almost identical in performance compared to the manually optimized designs of prior work, while requiring little to no FPGA-specific knowledge and additionally providing improved flexibility and more powerful functionality.
Paper Accepted at DAPD
L. Weber, T. Vincon, C. Knoedler, L. Solis-Vasquez, A. Bernhardt, I. Petrov, A. Koch. On the Necessity of Explicit Cross-Layer Data Formats in Near-Data Processing Systems. In Journal of Distributed and Parallel Databases (DAPD).
The NDP-style processing requires an explicit definition of cross-layer data formats and accessors to ensure in-situ executions optimally utilizing the properties of the underlying NDP storage and compute elements.
Abstract:
Massive data transfers in modern data-intensive systems resulting from low data-locality and data-to-code system design hurt their performance and scalability. Near-Data processing (NDP) and a shift to code-to-data designs may represent a viable solution as packaging combinations of storage and compute elements on the same device has become feasible. The shift towards NDP system architectures calls for revision of established principles. Abstractions such as data formats and layouts typically spread multiple layers in traditional DBMS, the way they are processed is encapsulated within these layers of abstraction. The NDP-style processing requires an explicit definition of cross-layer data formats and accessors to ensure in-situ executions optimally utilizing the properties of the underlying NDP storage and compute elements. In this paper, we make the case for such data format definitions and investigate the performance benefits under RocksDB and the COSMOS hardware platform.
New Preprint Available
Christian Riegger, Arthur Bernhardt, Bernhard Moessner, Ilia Petrov. bloomRF: On Performing Range-Queries with Bloom-Filters based on Piecewise-Monotone Hash Functions and Dyadic Trace-Trees. [arXiv].
bloomRF is a unified method for approximate membership testing that can efficiently perform both point- and range-queries on a single data structure.
Abstract:
We introduce bloomRF as a unified method for approximate membership testing that supports both point- and range-queries on a single data structure. bloomRF extends Bloom-Filters with range-query support and may replace them. The core idea is to employ a dyadic interval scheme to determine the set of dyadic intervals covering a data point, which are then encoded and inserted. bloomRF introduces Dyadic Trace-Trees as a novel data structure that represents those covering intervals implicitly. A Trace-Tree encoding scheme represents the set of covering intervals efficiently, in a compact bit representation. Furthermore, bloomRF introduces novel piecewise-monotone hash functions that are locally order-preserving and thus support range querying. We present an efficient membership computation method for range-queries. Although bloomRF is designed for integers, it also supports string and floating-point data types. It can also handle multiple attributes and serve as a multi-attribute filter. We evaluate bloomRF in RocksDB and in a standalone library. bloomRF is more efficient and outperforms existing point-range-filters by up to 4x across a range of settings.
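The dyadic-interval idea at bloomRF's core can be sketched in a few lines (a simplified illustration under our own naming; the paper's implicit Trace-Tree encoding and piecewise-monotone hash functions are considerably more involved): in a domain of 2^L values, every point is covered by exactly one dyadic interval per level, and it is this set of covering intervals that gets encoded into the filter.

```python
def dyadic_cover(x: int, domain_bits: int):
    """Return the dyadic intervals covering point x, one per level.

    Level 0 is the point itself; level `domain_bits` is the whole domain.
    Each interval at level l is an aligned block of 2^l values.
    """
    cover = []
    for level in range(domain_bits + 1):
        lo = (x >> level) << level        # align x down to a 2^level boundary
        hi = lo + (1 << level) - 1        # interval spans 2^level values
        cover.append((level, lo, hi))
    return cover
```

For example, in a 3-bit domain the point 5 is covered by [5,5], [4,5], [4,7] and [0,7]; a range query then only needs to probe the few dyadic intervals that tile the queried range.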
New DBlab Member
The DBlab team is happy to welcome Christian Knoedler on board!
Christian will strengthen our neoDBMS-Team.
New Paper Accepted
T. Vincon, L. Weber, A. Bernhardt, C. Riegger, S. Hardock, C. Knoedler, F. Stock, L. Solis-Vasquez, S. Tamimi, A. Koch, I. Petrov. nKV in Action: Accelerating KV-Stores on Native Computational Storage with Near-Data Processing. In Proc. VLDB 2020.
In this paper we introduce nKV, which is a key/value store utilizing native computational storage and near-data processing.
Abstract:
Massive data transfers in modern data-intensive systems resulting from low data-locality and data-to-code system design hurt their performance and scalability. Near-data processing (NDP) designs represent a feasible solution, which although not new, has yet to see widespread use. In this paper we demonstrate various NDP alternatives in nKV, which is a key/value store utilizing native computational storage and near-data processing. We showcase the execution of classical operations (GET, SCAN) and complex graph-processing algorithms (Betweenness Centrality) in-situ, with 1.4x-2.7x better performance due to NDP. nKV runs on real hardware - the COSMOS+ platform.
New Paper Accepted
T. Bang, I. Oukid, N. May, I. Petrov, C. Binnig. Robust Performance of Main Memory Data Structures by Configuration. In Proc. SIGMOD 2020.
In this paper, we present a new approach for achieving robust performance of data structures making it easier to reuse the same design for different hardware generations but also for different workloads.
Abstract:
In this paper, we present a new approach for achieving robust performance of data structures making it easier to reuse the same design for different hardware generations but also for different workloads. To achieve robust performance, the main idea is to strictly separate the data structure design from the actual strategies to execute access operations and adjust the actual execution strategies by means of so-called configurations instead of hard-wiring the execution strategy into the data structure. In our evaluation we demonstrate the benefits of this configuration approach for individual data structures as well as complex OLTP workloads.
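The separation of structure design from execution strategy can be illustrated with a minimal sketch (the class and configuration keys are invented for illustration, not taken from the paper): one sorted-array set whose lookup strategy is selected by a configuration instead of being hard-wired into the data structure.

```python
import bisect

class SortedArraySet:
    """One data structure design; the access strategy comes from a config."""

    def __init__(self, config):
        self.items = []
        self.config = config          # e.g. {"lookup": "binary"}

    def insert(self, x):
        bisect.insort(self.items, x)  # keep the array sorted

    def contains(self, x):
        if self.config["lookup"] == "binary":
            i = bisect.bisect_left(self.items, x)
            return i < len(self.items) and self.items[i] == x
        return x in self.items        # linear scan: can win on tiny sets
```

Swapping the configuration changes the execution path without touching the structure itself, which is the route to reusing one design across hardware generations and workloads.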
New Paper Accepted
T. Bang, N. May, I. Petrov, C. Binnig. The Tale of 1000 Cores: An Evaluation of Concurrency Control on Real(ly) Large Multi-Socket Hardware. In Proc. DAMON 2020.
We follow up on this prior work with an evaluation of the characteristics of concurrency control schemes on real production multi-socket hardware with 1568 cores.
Abstract:
In this paper, we set out the goal to revisit the results of "Staring into the Abyss [...] of Concurrency Control with [1000] Cores" and analyse in-memory DBMSs on today's large hardware. Despite the original assumption of the authors, today we do not see single-socket CPUs with 1000 cores. Instead, multi-socket hardware made its way into production data centres. Hence, we follow up on this prior work with an evaluation of the characteristics of concurrency control schemes on real production multi-socket hardware with 1568 cores. To our surprise, we made several interesting findings which we report on in this paper.
New Paper Accepted
T. Vincon, L. Weber, A. Bernhardt, A. Koch, I. Petrov. nKV: Near-Data Processing with KV-Stores on Native Computational Storage. In Proc. DAMON 2020.
In this paper we introduce nKV, which is a key/value store utilizing native computational storage and near-data processing.
Abstract:
Massive data transfers in modern key/value stores resulting from low data-locality and data-to-code system design hurt their performance and scalability. Near-data processing (NDP) designs represent a feasible solution, which although not new, have yet to see widespread use. In this paper we introduce nKV, which is a key/value store utilizing native computational storage and near-data processing. On the one hand, nKV can directly control the data and computation placement on the underlying storage hardware. On the other hand, nKV propagates the data formats and layouts to the storage device, where software and hardware parsers and accessors are implemented. Both allow NDP operations to execute in a host-intervention-free manner, directly on physical addresses and thus better utilize the underlying hardware. Our performance evaluation is based on executing traditional KV operations (GET, SCAN) and complex graph-processing algorithms (Betweenness Centrality) in-situ, with 1.4x-2.7x better performance on real hardware, the COSMOS+ platform.
New Paper Accepted
T. Vincon, A. Bernhardt, L. Weber, A. Koch, I. Petrov. On the Necessity of Explicit Cross-Layer Data Formats in Near-Data Processing Systems. In Proc. HardBD 2020.
The NDP-style processing requires an explicit definition of cross-layer data formats and accessors to ensure in-situ executions optimally utilizing the properties of the underlying NDP storage and compute elements.
Abstract:
Massive data transfers in modern data-intensive systems resulting from low data-locality and data-to-code system design hurt their performance and scalability. Near-data processing (NDP) and a shift to code-to-data designs may represent a viable solution as packaging combinations of storage and compute elements on the same device has become feasible. The shift towards NDP system architectures calls for revision of established principles. Abstractions such as data formats and layouts typically spread multiple layers in traditional DBMS, the way they are processed is encapsulated within these layers of abstraction. The NDP-style processing requires an explicit definition of cross-layer data formats and accessors to ensure in-situ executions optimally utilizing the properties of the underlying NDP storage and compute elements. In this paper, we make the case for such data format definitions and investigate the performance benefits under RocksDB and the COSMOS hardware platform.
New Project Grant
pimDB: infrastructure for Processing-In-Memory in modern DBMS
Principal Investigators: Data Management Lab. Funding agency: MWK, Baden-Württemberg, Germany.
pimDB provides infrastructures for PIM research in modern main-memory DBMS.
New Paper Accepted
C. Riegger, T. Vincon, R. Gottstein, I. Petrov. MV-PBT: Multi-Version Indexing for Large Datasets and HTAP Workloads. In Proc. EDBT 2020.
MV-PBT is a version-aware index structure for HTAP workloads, supporting index-only visibility-checks and flash-friendly I/O patterns.
Abstract:
Modern mixed (HTAP) workloads execute fast update-transactions and long-running analytical queries on the same dataset and system. In multi-version (MVCC) systems, such workloads result in many short-lived versions and long version-chains as well as in increased and frequent maintenance overhead. Consequently, the index pressure increases significantly. Firstly, the frequent modifications cause frequent creation of new versions, yielding a surge in index maintenance overhead. Secondly and more importantly, index-scans incur extra I/O overhead to determine which of the resulting tuple-versions are visible to the executing transaction (visibility-check), as current designs only store version/timestamp information in the base table, not in the index. Such an index-only visibility-check is critical for HTAP workloads on large datasets. In this paper we propose the Multi-Version Partitioned B-Tree (MV-PBT) as a version-aware index structure, supporting index-only visibility-checks and flash-friendly I/O patterns. The experimental evaluation indicates a 2x improvement for analytical queries and 15% higher transactional throughput under HTAP workloads. MV-PBT offers 40% higher tx. throughput compared to WiredTiger's LSM-Tree implementation under YCSB.
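The index-only visibility-check can be sketched as follows (field names are hypothetical and aborted transactions are ignored; MV-PBT's actual scheme is richer): because version timestamps live in the index entry itself, deciding visibility against a snapshot needs no base-table I/O.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class IndexEntry:
    key: int
    created_ts: int                  # timestamp of the creating transaction
    invalidated_ts: Optional[int]    # timestamp of the invalidating tx, if any

def visible(entry: IndexEntry, snapshot_ts: int) -> bool:
    """Index-only visibility check against a transaction's snapshot:
    the version must have been created before the snapshot and not yet
    invalidated as of the snapshot."""
    created_before = entry.created_ts <= snapshot_ts
    still_live = entry.invalidated_ts is None or entry.invalidated_ts > snapshot_ts
    return created_before and still_live
```

An index scan can thus filter out invisible tuple-versions directly, which is exactly the extra base-table I/O the paper eliminates.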
New DBlab Member
The DBlab team is happy to welcome Arthur Bernhardt on board!
Arthur will strengthen our neoDBMS-Team.
New DFG Project Grant
neoDBMS: Hardware/Software Co-Design for Accelerated Near-Data Processing in Modern Database Systems
Principal Investigators: Embedded Systems and Applications Group, Technische Universitaet Darmstadt; Data Management Lab, Reutlingen University. Funding agency: DFG.
neoDBMS aims to explore new architectures, abstractions and algorithms for intelligent database storage capable of performing Near-Data Processing (NDP) and executing data- or compute-intensive DBMS operations in-situ.
Abstract:
With advances in semiconductor technologies, it has nowadays become economical to produce combinations of modern semiconductor storage (e.g., Non-volatile Memories) and powerful compute-units (FPGA, GPU, many-core CPUs) co-located on, or close to, the same device - yielding intelligent storage devices. Data movements have become a limiting factor in times of exponential data growth, since they are blocking, frequent, and impair scalability. However, existing solution approaches are mainly based on 40-year-old architectures, following the paradigm of transporting data to the processing elements. This procedure incurs both time and energy penalties. The "memory wall" and the "von Neumann bottleneck" amplify the negative performance impact of those deficiencies. The present project aims to explore new architectures, abstractions and algorithms for intelligent database storage capable of performing Near-Data Processing (NDP). We target intelligent storage devices, comprising Non-volatile Memories or next-generation 3D-DRAM (such as the HMC), as well as the use of FPGAs as computational-units. We intend to investigate the following research questions: 1) Support for NDP in update-environments and hybrid-workloads. 2) Support for NDP in DBMS on Non-volatile Memories and NDP-support for declarative data layouts. 3) NDP use of shared virtual memory.
PIM Survey Published
T. Vincon, A. Koch, I. Petrov. Moving Processing to Data: On the Influence of Processing-in-Memory on Data Management. arXiv.
Near-Data Processing ideally allows executing application-defined data- or compute-intensive operations in-situ, i.e. within (or close to) the physical data storage.
Abstract:
Near-Data Processing refers to an architectural hardware and software paradigm, based on the co-location of storage and compute units. Ideally, it allows executing application-defined data- or compute-intensive operations in-situ, i.e. within (or close to) the physical data storage. Thus, Near-Data Processing seeks to minimize expensive data movement, improving performance, scalability, and resource-efficiency. Processing-in-Memory is a sub-class of Near-Data Processing that targets data processing directly within memory (DRAM) chips. The effective use of Near-Data Processing mandates new architectures, algorithms, interfaces, and development toolchains.
nativeNDP: Processing Big Data Analytics on Native Storage Nodes
T. Vincon, S. Hardock, C. Riegger, A. Koch, I. Petrov. nativeNDP: Processing Big Data Analytics on Native Storage Nodes. In Proc. ADBIS 2019.
We propose nativeNDP, a framework for Near-Data Processing that pushes down primitive R tasks and executes them in-situ, directly within the storage device of a cluster-node.
Abstract:
Data analytics tasks on large datasets are computationally intensive and often demand the compute power of cluster environments. Yet, data cleansing, preparation, dataset characterization, and statistics or metrics computation steps are frequent. These are mostly performed ad hoc, in an explorative manner, and mandate low response times. But such steps are I/O-intensive and typically very slow due to low data locality and inadequate interfaces and abstractions along the stack. These typically result in prohibitively expensive scans of the full dataset and transformations on interface boundaries. In this paper we examine R as an analytical tool, managing large persistent datasets in Ceph, a widespread cluster file-system. We propose nativeNDP, a framework for Near-Data Processing that pushes down primitive R tasks and executes them in-situ, directly within the storage device of a cluster-node. Across a range of data sizes, we show that nativeNDP is more than an order of magnitude faster than other pushdown alternatives.
Indexing large updatable Datasets in Multi-Version Database Management Systems
C. Riegger, T. Vincon, I. Petrov. Indexing large updatable Datasets in Multi-Version Database Management Systems. In Proc. IDEAS 2019.
In this paper we present the implementation of Partitioned B-Trees in PostgreSQL extended with SIAS.
Abstract:
Database Management Systems (DBMS) need to handle large updatable datasets under OLTP workloads. Most modern DBMS provide snapshots of data under an MVCC transaction management scheme. Each transaction operates on a snapshot of the database, calculated from a set of tuple versions containing logical transaction timestamps. This transaction management scheme enables high parallelism and resource-efficient append-only data placement on secondary storage. One major issue in indexing tuple versions on modern hardware technologies is the high write amplification of tree indexes. Partitioned B-Trees (PBTs) are based on the structure and algorithms of the ubiquitous B+-Tree. They achieve near-optimal write amplification and beneficial sequential writes on secondary storage. In this paper we present the implementation of PBTs in PostgreSQL extended with SIAS. Compared to PostgreSQL's standard B+-Trees, PBTs achieve 50% better transactional throughput under TPC-C.
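The core idea behind Partitioned B-Trees can be illustrated with a small toy model (a sketch for intuition only, not the paper's implementation; all names are hypothetical): all partitions live inside one key-sorted structure by prefixing each key with a partition number, inserts always target the newest partition so writes cluster sequentially, and lookups probe partitions newest-first.

```python
import bisect

class PartitionedBTree:
    """Toy model of a Partitioned B-Tree: partitions coexist in one
    key-sorted structure via a partition-number key prefix. Inserts go
    to the newest partition; lookups probe partitions newest-first."""

    def __init__(self):
        self.keys = []    # sorted list of (partition, key) pairs
        self.vals = {}    # (partition, key) -> value
        self.current = 0  # newest (write-active) partition

    def insert(self, key, value):
        ck = (self.current, key)
        i = bisect.bisect_left(self.keys, ck)
        if i == len(self.keys) or self.keys[i] != ck:
            self.keys.insert(i, ck)
        self.vals[ck] = value

    def evict(self):
        """Seal the current partition (in the real system it would be
        written out sequentially) and open a new write-active one."""
        self.current += 1

    def lookup(self, key):
        # Newest partition wins: the most recent version of the key.
        for p in range(self.current, -1, -1):
            v = self.vals.get((p, key))
            if v is not None:
                return v
        return None

t = PartitionedBTree()
t.insert("a", 1)
t.evict()          # partition 0 becomes immutable
t.insert("a", 2)   # the newer version lands in partition 1
print(t.lookup("a"))  # -> 2
```

Because sealed partitions are never modified, writes to secondary storage stay sequential, which is the property the paper exploits on modern hardware.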
IPA-IDX: In-Place Appends for B-Tree Indices
S. Hardock, A. Koch, T. Vincon, I. Petrov. IPA-IDX: In-Place Appends for B-Tree Indices. In Proc. DaMoN 2019.
IPA-IDX is an approach to handle index modifications on modern storage technologies (NVM, Flash) as physical in-place appends, using simplified physiological log records.
Paper Accepted at DaMoN 2019
S. Hardock, A. Koch, T. Vincon, I. Petrov. IPA-IDX: In-Place Appends for B-Tree Indices. In Proc. DaMoN 2019.
Abstract:
We introduce IPA-IDX, an approach to handle index modifications on modern storage technologies (NVM, Flash) as physical in-place appends, using simplified physiological log records. IPA-IDX provides similar performance and longevity advantages for indexes as basic IPA does for tables. The selective application of IPA-IDX and basic IPA to certain regions and objects lowers the GC overhead by over 60%, while keeping the total space overhead at 2%. The combined effect of IPA and IPA-IDX increases performance by 28%.
Native Storage Techniques for Data Management
I. Petrov, A. Koch, S. Hardock, T. Vincon, C. Riegger
In Proc. ICDE 2019
Native storage approaches, architectures and techniques for data processing and data management.
21.11.2019 Paper Accepted at ICDE 2019
I. Petrov, A. Koch, S. Hardock, T. Vincon, C. Riegger. Native Storage Techniques for Data Management. In Proc. ICDE 2019.
Abstract:
In the present tutorial we perform a cross-cut analysis of database storage management from the perspective of modern storage technologies. We argue that neither the design of modern DBMS, nor the architecture of modern storage technologies are aligned with each other. Moreover, the majority of the systems rely on a complex multi-layer and compatibility-oriented storage stack. The result is needlessly suboptimal DBMS performance, inefficient utilization, or significant write amplification due to outdated abstractions and interfaces. In the present tutorial we focus on the concept of native storage, which is storage operated without intermediate abstraction layers over an open native storage interface and is directly controlled by the DBMS. We cover the following aspects of native storage: (i) architectural approaches and techniques; (ii) interfaces; (iii) storage abstractions; (iv) DBMS/system integration; (v) in-storage processing.
DBLab has open-sourced NoFTL, SIAS and cIPT
Check out DBLab's GitHub repository.
We have open-sourced
NoFTL,
SIAS, and
cIPT.
Clone, download, use ... and send us feedback.
New Project Grant
PANDAS: Programmable Appliance for Near Data Processing Accelerated Storage
Funding agency: BMBF
Principal Investigators: PRO DESIGN Electronic GmbH; Xelera Technologies GmbH; Embedded Systems and Applications Group, Technische Universität Darmstadt; Data Management Lab, Reutlingen University
Efficient Data and Indexing Structure for Blockchains in Enterprise Systems
C. Riegger, T. Vincon, I. Petrov.
In Proc. iiWAS 2018
read more ...
17.09.2018 Paper Accepted at iiWAS 2018
C. Riegger, T. Vincon, I. Petrov. Efficient Data and Indexing Structure for Blockchains in Enterprise Systems. In Proc. iiWAS 2018.
Abstract:
Blockchains introduce new workloads for database management systems and K/V-Stores. Distributed Ledger Technology (DLT) is a technique for managing transactions in 'trustless' distributed systems. Yet, clients of nodes in blockchain networks are backed by 'trustworthy' K/V-Stores, like LevelDB or RocksDB in Ethereum, which are based on Log-Structured Merge Trees (LSM-Trees). However, LSM-Trees do not fully match the properties of blockchains and enterprise workloads. In this paper, we claim that Partitioned B-Trees (PBTs) fit the properties of this DLT: uniformly distributed hash keys, immutability, consensus, invalid blocks, unspent and off-chain transactions, reorganization, and data state / version ordering in a distributed log-structure. PBTs can locate records of newly inserted key-value pairs, as well as data of unspent transactions, in separate partitions in main memory. Once several blocks acquire consensus, PBTs evict a whole partition, which becomes immutable, to secondary storage. This behavior minimizes write amplification and enables a beneficial sequential write pattern on modern hardware. Furthermore, DLT implies some form of log-based versioning. PBTs can serve as an MV-Store for data storage of logical blocks and for indexing in multi-version concurrency control (MVCC) transaction processing.
Two entries in Encyclopedia of Big Data Technologies, Sakr, Sherif, Zomaya, Albert (Eds.), Springer
I. Petrov, T. Vincon, A. Koch, J. Oppermann, S. Hardock, C. Riegger. Active Storage
In Enc. Big Data Technologies Sakr, Zomaya (Eds.) Springer 2018.
I. Petrov, A. Koch, T. Vincon, S. Hardock, C. Riegger. Transaction Processing on NVM
In Enc. Big Data Technologies Sakr, Zomaya (Eds.) Springer 2018.
NoFTL-KV: Tackling Write-Amplification on KV-Stores with Native Storage Management
T. Vincon, S. Hardock, C. Riegger, J. Oppermann, A. Koch, I. Petrov.
In Proc. EDBT 2018
read more ...
22.12.2017 Paper Accepted at EDBT 2018
T. Vincon, S. Hardock, C. Riegger, J. Oppermann, A. Koch, I. Petrov. NoFTL-KV: Tackling Write-Amplification on KV-Stores with Native Storage Management. In Proc. EDBT 2018.
[PDF]
Abstract:
Modern persistent Key/Value stores are designed to meet the demand for high transactional throughput and high data-ingestion rates. Still, they rely on a backwards-compatible storage stack and abstractions to ease space management and foster seamless proliferation and system integration. Their dependence on the traditional I/O stack has a negative impact on performance, causes unacceptably high write-amplification, and limits storage longevity.
In the present paper we introduce NoFTL-KV, an approach that results in a lean I/O stack, integrating physical storage management natively into the Key/Value store. NoFTL-KV eliminates backwards compatibility, allowing the Key/Value store to directly exploit the characteristics of modern storage technologies. NoFTL-KV is implemented under RocksDB. The performance evaluation under LinkBench shows that NoFTL-KV improves transactional throughput by 33%, while response times improve by up to 2.3x. Furthermore, NoFTL-KV reduces write-amplification by 19x and improves storage longevity by approximately the same factor.
Multi-Version Indexing and modern Hardware Technologies
A Survey of present Indexing Approaches
C. Riegger, T. Vincon, I. Petrov.
In Proc. iiWAS 2017
read more ...
02.10.2017 Paper Accepted at iiWAS 2017
C. Riegger, T. Vincon, I. Petrov. Multi-Version Indexing and modern Hardware Technologies - A Survey of present Indexing Approaches. In Proc. iiWAS 2017.
[PDF]
Abstract:
Characteristics of modern computing and storage technologies fundamentally differ from traditional hardware, and there is a need to optimally leverage their performance, endurance, and energy-consumption characteristics. Therefore, existing architectures and algorithms in modern high-performance database management systems have to be redesigned and advanced. Multi-Version Concurrency Control (MVCC) approaches in database management systems maintain multiple physically independent tuple versions. Snapshot isolation approaches enable high parallelism and concurrency in workloads with an almost serializable consistency level. Modern hardware technologies benefit from multi-version approaches. Indexing multi-version data on modern hardware is still an open research area. In this paper, we provide a survey of popular multi-version indexing approaches and an extended scope of high-performance single-version approaches. An optimal multi-version index structure balances the look-up efficiency for tuple versions visible to transactions against the index-maintenance effort, for different workloads on modern hardware technologies.
Write-Optimized Indexing with Partitioned B-Trees
C. Riegger, T. Vincon, I. Petrov.
In Proc. iiWAS 2017
read more ...
02.10.2017 Paper Accepted at iiWAS 2017
C. Riegger, T. Vincon, I. Petrov. Write-Optimized Indexing with Partitioned B-Trees. In Proc. iiWAS 2017.
[PDF]
Abstract:
Database management systems (DBMS) are a critical performance component in large-scale applications under modern update-intensive workloads. Additional access paths accelerate look-up performance in DBMS for frequently queried attributes, but the required maintenance slows down update performance. The ubiquitous B+-Tree is a commonly used key-indexed access path that supports many required functionalities with logarithmic access time to requested records. Modern processing and storage technologies and their characteristics require a reconsideration of matured indexing approaches for today's workloads. Partitioned B-Trees (PBTs) leverage the characteristics of modern hardware technologies and complex memory hierarchies, as well as high update rates and changes in workloads, by maintaining partitions within one single B+-Tree. This paper includes an experimental evaluation of PBT's optimized write pattern and performance improvements. With PBTs, transactional throughput under TPC-C increases by 30%; PBTs produce beneficial sequential write patterns even in the presence of updates and maintenance operations.
SIAS-Chains: Snapshot Isolation Append Storage Chains
R. Gottstein, I. Petrov, S. Hardock, A. Buchmann
In Proc. ADMS@VLDB 2017
read more ...
27.8.2017 Paper Accepted at ADMS@VLDB 2017
R. Gottstein, I. Petrov, S. Hardock, A. Buchmann. SIAS-Chains: Snapshot Isolation Append Storage Chains. In Proc. ADMS@VLDB 2017.
[PDF]
Abstract:
Asymmetric read/write storage technologies such as Flash are becoming a dominant trend in modern database systems. They introduce hardware characteristics and properties which are fundamentally different from those of traditional storage technologies such as HDDs.
Multi-Versioning Database Management Systems (MV-DBMSs) and Log-based Storage Managers (LbSMs) are concepts that can effectively address the properties of these storage technologies but are designed for the characteristics of legacy hardware. A critical component of MV-DBMSs is the invalidation model. Transactional timestamps are assigned to the old and the new version, resulting in two independent (physical) update operations. Those entail multiple random writes as well as in-place updates, sub-optimal for new storage technologies both in terms of performance and endurance. Traditional page-append LbSM approaches alleviate random writes and immediate in-place updates, hence reducing the negative impact of Flash read/write asymmetry. Nevertheless, they entail significant mapping overhead, leading to write amplification.
In this work we present Snapshot Isolation Append Storage Chains (SIAS-Chains), which combine multi-versioning with append storage management at tuple granularity and a novel singly-linked (chain-like) version organization.
SIAS-Chains features simplified buffer management and multi-version indexing, and introduces read/write optimizations to data placement on modern storage media. SIAS-Chains algorithmically avoids small in-place updates caused by in-place invalidation and converts them into appends. Every modification operation is executed as an append, and recently inserted tuple versions are co-located. SIAS-Chains is implemented in PostgreSQL and evaluated on modern Flash SSDs with a standard update-intensive workload. The performance evaluation under PostgreSQL shows: (i) higher transactional throughput - up to 30 percent; (ii) significantly lower response times - up to 7 times lower; (iii) significant write reduction - up to 97 percent; (iv) reduced space consumption; and (v) higher tolerable workload.
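The chain-like version organization can be sketched as a toy model (illustrative only, not the PostgreSQL implementation; all names are hypothetical): every modification is an append to a log, and each new tuple version links back to its predecessor, so the old version never needs an in-place invalidation write.

```python
class SIASChainStore:
    """Toy model of SIAS-Chains: every modification is an append.
    Tuple versions form a singly-linked chain per logical tuple id;
    the chain head is always the newest version, so no in-place
    invalidation of the old version is ever required."""

    def __init__(self):
        self.log = []    # append-only storage: (value, prev_slot)
        self.head = {}   # tuple_id -> slot of the newest version

    def write(self, tid, value):
        prev = self.head.get(tid)        # predecessor version, if any
        self.log.append((value, prev))   # append, never overwrite
        self.head[tid] = len(self.log) - 1

    def read(self, tid, max_slot=None):
        """Read the newest version of tid; with max_slot set, read as
        of an earlier point in the log (a crude snapshot)."""
        slot = self.head.get(tid)
        while slot is not None:
            if max_slot is None or slot <= max_slot:
                return self.log[slot][0]
            slot = self.log[slot][1]     # follow the chain backwards
        return None

s = SIASChainStore()
s.write("t1", "v1")   # slot 0
s.write("t1", "v2")   # slot 1, chained to slot 0
print(s.read("t1"))              # -> v2
print(s.read("t1", max_slot=0))  # -> v1
```

Since the log only grows and recently written versions are adjacent in it, the write pattern stays sequential and co-located, which is what makes the scheme Flash-friendly.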
Paper Accepted at ICDE 2017
read more ...
Selective In-Place Appends for Real: Reducing Erases on Wear-prone DBMS Storage
S. Hardock, I. Petrov, R. Gottstein, A. Buchmann.
In Proc. ICDE 2017 [PDF] [Video]
Abstract: In the present paper we demonstrate a novel technique applying the recently proposed approach of In-Place Appends (IPA): overwrites on Flash without a prior erase operation. IPA can be applied selectively: only to DB-objects that have frequent and relatively small updates. To do so we couple IPA to the concept of NoFTL regions, allowing the DBA to place update-intensive DB-objects into special IPA-enabled regions. The decision about region configuration can be (semi-)automated by an advisor analyzing DB log files in the background.
We showcase a Shore-MT based prototype of the above approach, operating on real Flash hardware. During the demonstration we allow the users to interact with the system and gain hands-on experience under different demonstration scenarios.
Paper Accepted at SIGMOD 2017
read more ...
From In-Place Updates to In-Place Appends: Revisiting Out-of-Place Updates on Flash.
S. Hardock, I. Petrov, R. Gottstein, A. Buchmann.
In Proc. SIGMOD 2017 [PDF]
Abstract: Under update-intensive workloads (TPC, LinkBench) small updates dominate the write behavior; e.g., 70% of all updates change less than 10 bytes across all TPC OLTP workloads. These are typically performed as in-place updates and result in random writes in page granularity, causing major write overhead on Flash storage, a write amplification of several hundred times, and lower device longevity.
In this paper we propose an approach that transforms those small in-place updates into small update deltas that are appended to the original page. We utilize the commonly ignored fact that modern Flash memories (SLC, MLC, 3D NAND) can handle appends to already programmed physical pages by using various low-level techniques such as ISPP to avoid expensive erases and page migrations. Furthermore, we extend the traditional NSM page-layout with a delta-record area that can absorb those small updates. We propose a scheme to control the write behavior as well as the space allocation and sizing of database pages. We describe how the DBMS buffer and storage manager must be adapted to handle page operations.
The proposed approach has been implemented under Shore-MT and evaluated on real Flash hardware (OpenSSD) and a Flash emulator. Compared to In-Page Logging it performs up to 62% fewer reads and writes and up to 74% fewer erases on a range of workloads. The experimental evaluation indicates: (i) a significant reduction of erase operations, resulting in twice the longevity of Flash devices under update-intensive workloads; (ii) 15%-60% lower read/write I/O latencies; (iii) up to 45% higher transactional throughput; (iv) a 2x to 3x reduction in overall write amplification.
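The delta-record idea can be sketched as a toy page model (an illustration of the concept only, not the Shore-MT implementation; constants and names are hypothetical): a page reserves a delta area, small updates are appended there as (offset, bytes) records, and a read replays the deltas over the base image.

```python
PAGE_SIZE = 4096
DELTA_AREA = 256   # bytes reserved on the page for update deltas

class IPAPage:
    """Toy model of In-Place Appends: a small update is appended to
    the page's delta-record area instead of rewriting (and thus
    erasing/relocating) the whole page out-of-place."""

    def __init__(self, data: bytes):
        assert len(data) <= PAGE_SIZE - DELTA_AREA
        self.base = bytearray(data)
        self.deltas = []          # appended (offset, payload) records
        self.delta_bytes = 0
        self.full_writes = 0      # whole-page (out-of-place) writes

    def update(self, offset, payload: bytes):
        if self.delta_bytes + len(payload) <= DELTA_AREA:
            # Small update, delta space available: append a delta.
            self.deltas.append((offset, bytes(payload)))
            self.delta_bytes += len(payload)
        else:
            # Delta area exhausted: fold deltas in, rewrite the page.
            self.base = bytearray(self.read())
            self.base[offset:offset + len(payload)] = payload
            self.deltas.clear()
            self.delta_bytes = 0
            self.full_writes += 1

    def read(self) -> bytes:
        # Replay appended deltas over the base image, oldest first.
        img = bytearray(self.base)
        for off, payload in self.deltas:
            img[off:off + len(payload)] = payload
        return bytes(img)

p = IPAPage(b"hello world")
p.update(0, b"H")
print(p.read()[:11])  # -> b"Hello world"
print(p.full_writes)  # -> 0: the update was absorbed as an append
```

In the real system the append lands on the very same physical Flash page without a preceding erase, which is where the longevity and throughput gains come from.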
Paper Accepted at EDBT 2017
read more ...
In-Place Appends for Real: DBMS Overwrites on Flash without Erase
S. Hardock, I. Petrov, R. Gottstein, A. Buchmann. In Proc. EDBT 2017 [PDF]
Abstract: Flash SSDs are nowadays the second-tier storage for DBMS. Compared to HDDs, they are faster, consume less power, produce less heat, and are cheaper in terms of $/IOPS. Furthermore, the replacement of HDDs with SSDs is typically trivial, since both kinds of storage devices utilize the same block-device interface and often even the same physical interfaces.
Recent research has shown that masking the management and the properties of native Flash memory behind the black-box abstraction realized by the on-device Flash Translation Layer (FTL) significantly lowers the performance and endurance characteristics of Flash. The alternative is the utilization of open Flash interfaces. In this paper we follow this idea and propose an extension to it: the approach of In-Place Appends.
In the present paper we demonstrate a novel approach to handling small updates on Flash called In-Place Appends (IPA). It allows the DBMS to revisit the traditional write behavior on Flash. Instead of writing whole database pages upon an update in an out-of-place manner on Flash, we transform those small updates into update deltas and append them to a reserved area on the very same physical Flash page. In doing so we utilize the commonly ignored fact that under certain conditions Flash memories can support in-place updates to Flash pages without a preceding erase operation.
The approach was implemented under Shore-MT and evaluated on real hardware. Under standard update-intensive workloads we observed 67% fewer page invalidations, resulting in 80% lower garbage collection overhead, which yields a 45% increase in transactional throughput, while doubling Flash longevity at the same time. IPA outperforms In-Page Logging (IPL) by more than 50%.
We showcase a Shore-MT based prototype of the above approach, operating on real Flash hardware -- the OpenSSD Flash research platform. During the demonstration we allow the users to interact with the system and gain hands-on experience of its performance under different demonstration scenarios. These involve various workloads such as TPC-B, TPC-C or TATP.
Congrats Robert!
read more ...
DBlab congratulates Robert Gottstein on the occasion of the successful defence of his doctoral thesis entitled "Impact of new storage technologies on an OLTP DBMS, its architecture and algorithms". [PDF]
Abstract:
New developments in hardware storage technology introduce fundamentally different performance characteristics and device properties. Storage technologies such as Flash and Non-Volatile Memories (NVMs) are asymmetric in terms of their read and write performance: they read much faster than they write. Modern DBMSs are not aware of the underlying asymmetric storage technologies. They are well-developed systems and, in principle, capable of working with asymmetric storage technologies as a mere replacement, yet they fail to exploit their key properties. Huge performance potential lies idle, and the durability of the storage media is shortened, which ultimately leads to higher costs. This work is a remedy for those shortcomings, making the DBMS aware of the underlying asymmetric Flash storage and questioning the existing multi-version DBMS (MV-DBMS) architecture, algorithms, and optimizations. We exploit the performance potential of the asymmetric Flash storage and increase its durability. A re-evaluation and redesign of components within the DBMS is necessary, inevitably leading to a redesign of the whole DBMS. Without such a redesign, the DBMS software stack will become the new I/O bottleneck. The combination of the MV-DBMS, multi-version concurrency control (MVCC), and append/log-based storage management (LbSM) on Flash storage delivers the optimal performance figures needed to satisfy the urgent demand for scalable performance in modern DBMSs.
mhp-Award for Best Bachelor's Project WS14/15
read more ...
Congratulations!
The students Felix Heldmaier, Florian Grötzner, Niels Shuchmacher, Samuel Sailer, Steffen Höser, Florian Hofstädter, Yannik Scheible, Yu-Ninig Wang (exchange student, NCTU, Taiwan), and Jiawei Liu (exchange student, Donghua University, Shanghai, China) won the mhp-Award for their Bachelor's project "Performance analysis of search algorithms for database indices".
Project Description (translated from German):
The goal of this project is to experimentally investigate the performance behavior of database indices with respect to differently distributed data. The workload consists of classical database queries such as point and interval queries.
The focus was on determining the influence of the data distributions on index performance. To this end, data generators were designed that produce data volumes of variable size according to a set of predefined statistical distributions: Weibull, Cosine, Cauchy, Normal, Lognormal, Exponential, Doublelog, Parabolic, Extreme Value. In this way, real-world data in database systems was approximated. In addition, statistical test methods (e.g., the Chi-squared test) were implemented to validate the actual distribution of the generated data.
The generated data was stored in several database systems, and indices were created on it. Index performance was measured with purpose-built micro-benchmarks: the database system was brought into an initial state (ramp-up phase), after which a series of different database queries was executed repeatedly and their response times were measured. The measurement results are presented graphically.
The complete measurement setup, consisting of data generators, benchmarks, and result visualization, offers both a graphical user interface for personal use and a command-line interface for server deployment.
Paper accepted at EDBT 2015
read more ...
NoFTL for Real: Databases on Real Native Flash Storage. In Proc. 18th International Conference on Extending Database Technology, Brussels, Belgium (EDBT 2015). [PDF] [Video]
Abstract:
Flash SSDs are omnipresent as database storage. HDD replacement is seamless since Flash SSDs implement the same legacy hardware and software interfaces to enable backward compatibility. Yet, the price paid is high, as backward compatibility masks the native behaviour, incurs significant complexity, and decreases I/O performance, making it non-robust and unpredictable. Flash SSDs are black boxes. Although DBMS have ample mechanisms to control hardware directly and utilize the performance potential of Flash memory, the legacy interfaces and black-box architecture of Flash devices prevent them from doing so.
In this paper we demonstrate NoFTL, an approach that enables native Flash access and integrates parts of the Flash-management functionality into the DBMS, yielding a significant performance increase and a simplification of the I/O stack. NoFTL is implemented on real hardware based on the OpenSSD research platform. The contributions of this paper include: (i) a description of the NoFTL native Flash storage architecture; (ii) its integration in Shore-MT; and (iii) a performance evaluation of NoFTL on a real Flash SSD and on an on-line data-driven Flash emulator under TPC-B, -C, -E, and -H workloads. The performance evaluation results indicate an improvement of at least 2.4x on real hardware over conventional Flash storage, as well as better utilisation of native Flash parallelism.
Associated Members
read more ...
DBlab welcomes Robert Gottstein and Sergey Hardock as associated members. Their research interests are in the field of data management on modern storage technologies. Both of them are affiliated with the Databases and Distributed Systems Group (DVS) at Technische Universität Darmstadt.
Associated Member of DFG GK 1994 AIPHES
read more ...
Ilia Petrov is appointed an associated member of DFG GK 1994 AIPHES: Adaptive Preparation of Information from Heterogeneous Sources.
He is responsible for database integration and high-performance data processing.
Ph.D. Scholarship
read more ...
DBlab participates with one Ph.D. student in the newly appointed graduate school "Services Computing" in cooperation with Universität Stuttgart. The research direction is in the field of Big Data and high-performance data management and analytics.
Paper Accepted at EDBT 2016
read more ...
Revisiting DBMS Space Management for Native Flash.
S. Hardock, I. Petrov, R. Gottstein, A. Buchmann.
In Proc. EDBT 2016.
Paper accepted at iiWAS 2015
read more ...
Real Time Charging Database Benchmarking.
J. Bogner, C. Dehner, T. Vincon, I. Petrov.
In Proc. iiWAS 2015.
Best Paper Award at DBKDA 2015
read more ...
Best paper award for
Tim Lessner, Fritz Laux
O|R|P|E - A Data Semantics Driven Concurrency Control Mechanism
In Proceedings DBKDA 2015 - The Seventh International Conference on Advances in Databases, Knowledge, and Data Applications, pp 147-152, May 24-29, 2015 - Rome, Italy
Paper Accepted at ICDE 2015
read more ...
27.11.2014 Paper Accepted at ICDE 2015
I. Petrov, R. Gottstein, S. Hardock. DBMS on Modern Storage Hardware. In Proc. International Conference on Data Engineering (ICDE) 2015.
[Slides]
Abstract:
In the present tutorial we perform a cross-cut analysis of database systems from the perspective of modern storage technology, namely Flash memory. We argue that neither the design of modern DBMS, nor the architecture of Flash storage technologies are aligned with each other. The result is needlessly suboptimal DBMS performance and inefficient Flash utilisation as well as low Flash storage endurance and reliability.
We showcase new DBMS approaches with improved algorithms and leaner architectures, designed to leverage the properties of modern storage technologies. We cover the area of transaction management and multi-versioning, putting a special emphasis on: (i) version organisation models and invalidation mechanisms in multi-versioning DBMS; (ii) Flash storage management especially on append-based storage in tuple granularity; (iii) Flash-friendly buffer management; as well as (iv) improvements in the searching and indexing models.
Furthermore, we present our NoFTL approach to native Flash access that integrates parts of the Flash-management functionality into the DBMS yielding significant performance increase and simplification of the I/O stack. In addition, we cover the basics of building large Flash storage for DBMS and revisit some of the RAID techniques and principles.
DBlab Talk
read more ...
Ilia Petrov is giving a talk "Advances in Flashing the Database Storage" hosted by the GI local group Stuttgart/Böblingen. The event takes place on Monday, 6 Oct 2014, 18:15-20:00 at Uni-Stuttgart.
Tim Lessner completed PhD
read more ...
Tim Lessner was awarded Doctor of Philosophy (Ph.D.)
Tim Lessner was awarded the degree of Doctor of Philosophy (Ph.D.) by the University of the West of Scotland, Paisley, in collaboration with Reutlingen University. The title of his thesis is "O|R|P|E - A High Performance Semantic Transaction Model for Disconnected Systems". From the abstract: "The thesis studies concurrency control and composition of transactions in computing environments for long living transactions where local data autonomy is indispensable".
Congratulations, Dr. Tim!
Best Paper Award at DBKDA 2014
read more ...
Best paper award for
Michael Schaidnagel, Fritz Laux
Feature Construction for Time Ordered Data Sequences
In Proceedings DBKDA 2014 - The Sixth International Conference on Advances in Databases, Knowledge, and Data Applications, pp 1-6, April 20-24, 2014 - Chamonix, France
PC Memberships
read more ...
Members of the DBlab are invited to serve on the programme committees of DATA 2014, INTERNET 2014 and WETICE 2014.
Paper Accepted at EDBT 2014
read more ...
The paper "SIAS-V in Action: Snapshot Isolation Append Storage - Vectors on Flash" by Robert Gottstein, Thorsten Peter, Ilia Petrov and Alejandro Buchmann has been accepted for publication at the "17th International Conference on Extending Database Technology" (EDBT 2014 - Demonstrations Track), held in Athens, Greece on March 24-28, 2014. [Demonstration Video]
Abstract:
Multi-Version Database Management Systems (MV-DBMS) are wide-spread and can effectively address the characteristics of new storage technologies such as Flash, yet they are mainly optimized for traditional storage. A modification of a tuple in a MV-DBMS results in a new version of that item and the invalidation of the old version. Under Snapshot Isolation (SI) the invalidation is performed as an in-place update, which is suboptimal for Flash. We introduce Snapshot Isolation Append Storage - Vectors (SIAS-V), which avoids the invalidation-related updates by organising tuple versions as a simple linked list and by utilizing bitmap vectors representing different states of a single version. SIAS-V sequentializes writes and reduces the write-overhead by appending in tuple-version granularity, writing out only completely filled pages, and eliminating in-place invalidation.
In this demonstration we showcase the SIAS-V implementation in PostgreSQL side-by-side with SI. Firstly, we demonstrate that the I/O distribution of PostgreSQL under a TPC-C style workload exhibits a dominant small-sequential write pattern for SIAS-V, as opposed to a random-write dominated pattern under SI. Secondly, we demonstrate how the dense packing of tuple versions on pages under SIAS-V significantly reduces the amount of data written. Thirdly, we show that SIAS-V yields stable write performance and low transaction response times under mixed loads. Last but not least, we demonstrate that SIAS-V also provides performance improvements for traditional HDDs.
Book Chapter to Appear
read more ...
Khalid Nawaz, Ilia Petrov, Alejandro Buchmann. Configurable, Energy-Efficient, Application- and Channel-aware Middleware Approaches for Cyber-Physical Systems. In Zeashan Khan, A. B. M. Shawkat Ali, Zahid Riaz: Computational Intelligence for Decision Support in Cyber-Physical Systems, Studies in Computational Intelligence 540, ISBN 978-981-4585-35-4, Springer, July 2014
Background:
Cyber-Physical Systems represent a class of systems composed of computing devices that monitor and control real-world physical processes. The monitoring task requires these devices to be equipped with sensing elements, which provide the primary input in the form of raw data to the computing elements of the system. The output of the computing element is generally channeled to the actuation elements of the system for the desired actuation to take place. The actuation serves as the controlling mechanism for the monitored physical process and also closes the sensing-processing-actuation loop. The interaction of these systems with the physical processes introduces challenges regarding physical characteristics such as the shape, size, and robustness of the devices, in addition to the more challenging problem of the impedance mismatch between the inherently concurrent physical processes and the inherently sequential computing processes. In order for these systems to perform monitoring and control functions on the physical processes, networking of the computing elements, generally on an ad hoc basis, is also necessary. In a nutshell, cyber-physical systems are composed of networked embedded computing elements equipped with sensing and actuation capabilities, so that they can monitor and control physical real-world processes. In Illustration 1 we show a simplified schematic of a typical cyber-physical system, indicating the information flows between the sensing, processing, and actuation parts. It also shows two optional user-interface elements: one is used for configuring the devices, and the other presents end users with their desired information. Such user-interface components mostly find use in intelligent industrial automation systems.
DBlab Talk
Dr. Christoph P. Neumann (EXASOL AG). EXASolution, an analytical database system, 100% made in Germany. When: 21.01.2014 at 9:45, Room 9-005
Abstract: Big Data is on everyone's lips, and many companies are all but drowning in the data they have accumulated. To master this data, the right tool is needed. With its product EXASolution, EXASOL offers a massively parallel in-memory database management system (DBMS) for column-oriented storage and SQL-based, transactionally protected processing of very large relational data volumes. In-database processing additionally allows unstructured or semi-structured data to be processed massively in parallel in the DBMS cluster. This talk provides insights into the work of the German database vendor EXASOL and its high-performance database system EXASolution.
Short Bio:
Dr. Christoph P. Neumann is a technical consultant at EXASOL AG in Nürnberg. He studied at Friedrich-Alexander-Universität Erlangen-Nürnberg and received his Diplom in computer science in 2005. He then worked for two years as a software engineer at Capgemini sd&m AG in Munich. He returned to Friedrich-Alexander-Universität in 2007 and pursued his doctorate under Prof. Dr. Richard Lenz at the Chair of Computer Science 6 (Data Management). For his teaching there he received two awards for excellence in teaching. His research interests include adaptive-evolutionary information systems, process management and agile process planning, as well as methods of system integration and distributed data synchronization. He completed his doctorate in November 2012 on distributed case files in healthcare, graded "with distinction"; his dissertation was also nominated for the "GI-Dissertationspreis 2012". Since March 2013 he has been working at EXASOL, an IT company based in Nürnberg that offers the demonstrably fastest relational database in the world for analytical applications. As a technical consultant he is responsible for conducting proof-of-concept projects with customers during the presales phase.
Paper Accepted at IJAS
Robert Gottstein, Ilia Petrov, Alejandro Buchmann. Multi-Version Databases on Flash: Append Storage and Access Paths. In International Journal On Advances in Software, Vol. 6, No. 3 & 4, 2013.
DBlab Talk
Götz Graefe Ph.D., HP Fellow, Hewlett-Packard Laboratories will give a talk "Instant Recovery From System Failures" on 28.10.2013 at 13:00 in Room 9-039
Title:
Instant Recovery From System Failures
Who:
Götz Graefe, Ph.D., HP Fellow, Hewlett-Packard Laboratories
When:
on 28.10.2013 at 13:00 in Room 9-039
Abstract:
Database system failures and the subsequent recovery disrupt many transactions and entire applications, almost always for an extended duration. For those failures, new on-demand "instant" recovery techniques reduce application downtime from minutes or hours to seconds. These new recovery techniques work for databases, file systems, key-value stores, and all other data stores that employ write-ahead logging. Most of the required techniques already exist in many transactional information systems.
Short CV:
Götz Graefe's contributions to database research and product development include query optimization in the Exodus research effort and in the Tandem SQL/MX product, query execution in the Volcano research prototype, and query processing in Microsoft's SQL Server product. In addition to query processing, his work has covered indexing, in particular novel techniques for B-trees, robust performance in query processing, for example a new integrated join algorithm, and transaction support, for example a new scheme for key-range locking. One of his current work streams focuses on database utilities, for example faster backup, restore, and recovery.
Paper Accepted at IDEAS 2013
G. Graefe, I. Petrov, T. Ivanov, V. Marinov. A hybrid page layout integrating PAX and NSM. In Proc. IDEAS 2013.
Abstract:
Prior work on in-page record formats has contrasted the "N-ary storage model" (NSM) and the "partition attributes across" (PAX) format. The former is the traditional standard page layout, whereas the latter "exhibits superior cache and memory bandwidth utilization", e.g., in data warehouse queries with large scans. Unfortunately, space management within each page is more complex due to the mini-pages in the PAX layout. In contrast, the NSM format simply grows a slot array and the data space from opposite ends of the page until all space is occupied.
The present paper explores a hybrid page layout (HPL) that aims to combine the advantages of NSM and PAX. Predicate evaluation in large scan queries has the same number of cache faults as PAX, and space management uses two data areas growing towards each other. Moreover, the design defines a continuum between NSM and PAX in order to support both efficient scans and efficient insertions and updates. This design is equally applicable to cache lines within RAM memory (the original design goal of PAX) and to small pages on flash storage within large disk pages.
Our experimental evaluation is based on a ShoreMT implementation. It demonstrates that the HPL design scans almost as fast as the scan-optimized PAX layout and updates almost as fast as the update-optimized NSM layout, i.e., it is competitive with both in their best use cases.
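The contrast between the two layouts can be made concrete with a minimal, purely illustrative Python sketch (not the paper's implementation) of how the same records are placed on a page under NSM versus PAX-style column grouping:

```python
# Illustrative sketch: three 2-attribute records placed on a page.
records = [(1, "a"), (2, "b"), (3, "c")]

# NSM: each record is stored contiguously; scanning one attribute
# touches every record's full width.
nsm_page = [field for rec in records for field in rec]

# PAX: attributes are grouped into per-column "mini-pages", so a scan
# of attribute 0 reads one contiguous run (better cache behaviour).
pax_page = [rec[0] for rec in records] + [rec[1] for rec in records]

print(nsm_page)  # [1, 'a', 2, 'b', 3, 'c']
print(pax_page)  # [1, 2, 3, 'a', 'b', 'c']
```

The HPL described above sits between these two extremes, trading off scan locality against the simpler space management of NSM.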
Paper Accepted at IDEAS 2013
R. Gottstein, I. Petrov, A. Buchmann. Read Optimisations for Append Storage on Flash. In Proc. IDEAS 2013.
Abstract:
Append-/Log-based Storage Managers (LbSM) for database systems represent a good match for the characteristics and behaviour of Flash technology. LbSM alleviate random writes, reducing the impact of Flash read/write asymmetry and increasing endurance and performance. A recently proposed combination of Multi-Versioning database approaches and LbSM called SIAS [9] offers further benefits: it substantially lowers the write rate due to tuple version append granularity and therefore improves the performance. In SIAS a page contains versions of tuples of the same table. Once appended, such a page is immutable. The only allowable operations are reads (lookups, scans, version visibility checks) in tuple version granularity. Optimising for them offers an essential performance increase. In the present work-in-progress paper we propose two types of read optimisations: Multi-Version Index and Ordered Log Storage.
Benefits of Ordered Log Storage: (i) read efficiency due to the use of parallel read streams; (ii) write efficiency since larger amounts of data are appended sequentially; (iii) fast garbage collection: read multiple sorted runs, filter dead tuples and write one single, large (combined) sorted run; (iv) possible cache-efficiency optimisations (for large scans).
Benefits of Multi-Version Indexing: (i) index-only visibility checks; (ii) postponing of index reorganisations; (iii) no invalid tuple bits in the index (in-place updates); (iv) pre-filtering of invisible tuple versions; (v) easy identification of tuple versions to be garbage collected.
Benefits of the combination of both approaches: (i) index and ordered access; (ii) facilitation of range searches in sorted runs; (iii) on-the-fly garbage collection (checking of one bit).
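The garbage-collection benefit of ordered log storage can be sketched in a few lines of Python. This is a hypothetical illustration of the idea only, not the paper's code: several key-sorted runs are merged into one combined sorted run, and dead tuple versions are filtered out on the fly by checking a single flag.

```python
import heapq

# Each tuple version is (key, version, dead). Runs are pre-sorted by key.
def merge_runs(runs):
    """Merge sorted runs into one run, dropping dead versions on the fly."""
    return [t for t in heapq.merge(*runs) if not t[2]]

runs = [
    [(1, 0, False), (4, 0, True)],
    [(2, 0, False), (3, 0, True), (5, 0, False)],
]
print(merge_runs(runs))  # [(1, 0, False), (2, 0, False), (5, 0, False)]
```

Because the inputs are already sorted, the merge reads each run sequentially (one parallel read stream per run) and emits one large sequential append, matching benefits (i)-(iii) above.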
Paper Accepted at ADMS@VLDB13
P. Dubs, I. Petrov, R. Gottstein, A. Buchmann. FBARC: I/O Asymmetry Aware Buffer Replacement Strategy. In Proc. ADMS 2013, in Conjunction with VLDB 2013
Abstract:
Buffer management is central to database systems; it minimizes the access gap between memory and disk. The primary criterion of most buffer management strategies is hitrate maximization (based on recency, frequency). New storage technologies exhibit characteristics such as read/write asymmetry and low read latency. These have a significant impact on the buffer manager: due to asymmetry, the cost of page eviction may be several times higher than the cost of fetching a page. Hence buffer management strategies for modern storage technologies must consider write-awareness and spatial locality besides hitrate.
In this paper we introduce FBARC - a buffer management strategy designed to address I/O asymmetry on Flash devices. FBARC is based on ARC and extends it by a write list utilizing the spatial locality of evicted pages to produce
semi-sequential write patterns. FBARC adds an additional list to host dirty pages grouping them into fixed regions called clusters based on their disk location. In comparison to LRU, CFLRU, CFDC, and FOR+, FBARC: (i) addresses write-efficiency and endurance; (ii) offers comparatively high hitrate; (iii) is computationally-efficient and uses static grid-based clustering of the page eviction list; (iv) adapts to workload changes; (v) is scan-resistant. Our experimental evaluation compares FBARC against LRU, CFLRU, CFDC, and FOR+ using trace-driven simulation, based on standard benchmark traces (e.g. TPC-C, TPC-H).
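The clustering idea behind the write list can be illustrated with a short Python sketch. This is a hypothetical simplification, not FBARC itself: dirty pages awaiting eviction are grouped into fixed-size disk regions ("clusters"), and the fullest cluster is flushed first, turning scattered evictions into one near-sequential write batch.

```python
from collections import defaultdict

CLUSTER_SIZE = 64  # pages per cluster; an illustrative value

def pick_eviction_batch(dirty_page_ids):
    """Group dirty pages by disk region and return the fullest group."""
    clusters = defaultdict(list)
    for pid in dirty_page_ids:
        clusters[pid // CLUSTER_SIZE].append(pid)
    # Evicting the fullest cluster amortizes one semi-sequential write
    # over the most pages.
    return max(clusters.values(), key=len)

print(pick_eviction_batch([3, 130, 7, 65, 1, 70]))  # [3, 7, 1]
```

The static, grid-based assignment (`pid // CLUSTER_SIZE`) is what keeps the bookkeeping computationally cheap, in line with property (iii) claimed above.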
PC Memberships
Members of the DBlab serve on the program committee of CISIS-2014. (Parallel Computing Track)
Paper Accepted at GMDS13
C. Thies, I. Petrov. Hospital Information Systems on High Performance and Energy Efficient Database Systems. In Proc. GMDS 2013. 58. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Lübeck, 01.-05.09.2013. Düsseldorf: German Medical Science GMS Publishing House; 2013.
Current Hospital Information Systems (HIS) face operational challenges such as long downtimes and data inconsistency during system updates, high energy costs, system slow-down during periods of high load, and incomplete failover solutions with a lot of costly and error-prone manual post-processing. A bottleneck for all of these problems is access to the HIS database and its consistency. To overcome these problems, the software architecture (SWA) of HIS currently uses only DBMS-specific features. Considering the most frequent types of transactions from a HIS, the database itself typically has a monolithic design. Here a tighter integration of hardware, DBMS and HIS architecture will increase efficiency and reliability and decrease HIS downtime. This is achievable by using recent developments in new memory technologies and affordable hardware, as well as a systematic overview of sophisticated database concepts and their specific integration in HIS architectures. This work describes the impact of such new concepts on the main challenges in HIS data management and HIS SWA.
Paper Accepted at SoEA4EE'2013
A. Zimmermann, M. Pretz, G. Zimmermann, D. Firesmith, I. Petrov, E. El-Sheikh. Towards Service-oriented Enterprise Architectures for Big Data Applications in the Cloud. In Proc. SoEA4EE'2013 in conjunction with EDOC 2013
Paper Accepted at IJDWM
G. Graefe, A. Nica, K. Stolze, T. Neumann, T. Eavis, I. Petrov, E. Pourabbas, D. Fekete. Elasticity in Cloud Databases and Their Query Processing. In International Journal of Data Warehousing and Mining (IJDWM) 9.2 (2013)
Abstract: A central promise of cloud services is elastic, on-demand provisioning. For data-intensive services such as data management, growing and shrinking the set of nodes implies copying data to nodes with temporary membership in a service. The provisioning of data on temporarily available nodes is what makes elastic database services a hard problem. At best, a node might retain (not destroy) its copy of the data while it provides another service; at worst, a node that rejoins the database service (or joins for the first time, or joins after a prior failure) requires a new copy of all its assigned data. The essential task that enables elastic data services is bringing a node and its data up-to-date. Strategies for high availability do not satisfy the need in this context because they bring nodes online and up-to-date by repeating history, e.g., by log shipping. We believe that nodes should become up-to-date and useful for query processing incrementally by key range. What is wanted is a technique such that in a newly added node, during each short period of time, an additional small key range becomes up-to-date, until eventually the entire dataset becomes up-to-date and useful for query processing, with overall update performance comparable to a traditional high-availability strategy that carries the entire dataset forward without regard to key ranges. Even without the entire dataset being available, the node is productive and participates in query processing tasks. Our proposed solution relies on techniques from partitioned B-trees, adaptive merging, deferred maintenance of secondary indexes and of materialized views, and query optimization using materialized views. The paper introduces a family of maintenance strategies for temporarily available copies, the space of possible query execution plans and their cost functions, and appropriate query optimization techniques.
Paper Accepted at Data Analytics
M. Schaidnagel, I. Petrov, F. Laux. DNA: An Online Algorithm for Credit Card Fraud Detection for Game Merchants. In Proc. Data Analytics 2013
Abstract: Online credit card fraud represents a significant challenge to online merchants. In 2011 alone, the total loss due to credit card fraud amounted to $7.60 billion, with a clear upward trend. Especially online games merchants have difficulties applying standard fraud detection algorithms to achieve timely and accurate detection. The present paper introduces a novel approach for online fraud detection, called DNA. It is based on a formula which uses attributes that are derived from a sequence of transactions. The influence of these attributes on the result of the formula reveals additional information about this sequence. The result represents a fraud level indicator, serving as a classification threshold. A systematic approach for finding these attributes and the mode of operation of the algorithm is given in detail. The experimental evaluation against several standard algorithms on a real-life data set demonstrates the superior fraud detection performance of the DNA approach (16.25% better fraud detection accuracy, 99.59% precision and low response time). In addition, several experiments were conducted in order to show the good scalability of the suggested algorithm.
08.08.2013 DFG Flashy-DB Extended
The DFG (Deutsche Forschungsgemeinschaft) has extended the research project FlashyDB. DFG FlashyDB aims to investigate the influence of Flash memory on database architecture, performance and algorithms.
06.08.2013 Program Committee Memberships
Members of the DBlab serve on the program committee of SIGMOD 2014 (Demo Track).
DBlab Talk
Dr. Knut Stolze, Architect IBM DB2 Analytics Accelerator, IBM Deutschland, will give a talk 'Managing Large Data Volumes Efficiently with IBM Netezza' on 18.06.2013 at 11:30 in Room 9-003.
Title:
Managing Large Data Volumes Efficiently with IBM Netezza
Who:
Dr. Knut Stolze, Architect IBM DB2 Analytics Accelerator, IBM Deutschland
When:
Tuesday, 18.06.2013 at 11:30 | Room 9-003
Abstract [PDF]:
Netezza is a highly specialized database management system for data warehousing operations. In this presentation, Dr. Knut Stolze gives an overview of its system architecture and internal query processing. He shows how very good performance can be delivered with a very simple user interface that avoids indexes. The IBM DB2 Analytics Accelerator is an integration project and commercial product that combines the strengths of Netezza's analytic query processing capabilities with DB2's superior OLTP performance. Knut highlights how the integration of both products is achieved in a (nearly) seamless way.
About the speaker:
Dr. Knut Stolze works in the Information Management department at the IBM Research & Development Lab in Böblingen, Germany. He focuses on relational database systems, specifically large-scale data warehouse systems. He gained his expertise and experience in academic and industrial research and in product development. His current research efforts focus on enterprise data warehouse systems, in particular technologies like in-memory processing, specialty hardware for high-performance query processing, and database federation. Knut Stolze is a senior software developer and master inventor at IBM. In his role as an architect, he is responsible for the design and implementation of the IBM DB2 Analytics Accelerator for z/OS. Prior to the current project, Dr. Stolze worked in the DB2 Spatial Extender development team, earned his PhD at the University of Jena, Germany, in 2006, and subsequently moved on to DB2 z/OS Utilities development.
Accepted at VLDB 2013
S. Hardock, I. Petrov, R. Gottstein, A. Buchmann. NoFTL: Database Systems on FTL-less Flash Storage. VLDB 2013 (Demonstrations Track). Riva del Garda, August 26-31, 2013. [Demonstration Video]
Abstract:
The database architecture and workhorse algorithms have been designed to compensate for hard disk properties. The I/O characteristics of Flash memories have a significant impact on database systems, and many algorithms and approaches taking advantage of those have been proposed recently. Nonetheless, at the system level Flash storage devices are still treated as HDD-compatible block devices, black boxes and fast HDD replacements. This backwards compatibility (both software and hardware) masks the native behaviour, incurs significant complexity and decreases I/O performance, making it non-robust and unpredictable. Database systems have a long tradition of operating natively on RAW storage, utilising the physical characteristics of the storage media to improve performance.
In this paper we demonstrate an approach called NoFTL that goes a step further. We show that allowing for native Flash access and integrating parts of the FTL functionality into the database system yields significant performance increase and simplification of the I/O stack. We created a real-time data-driven Flash emulator and integrated it accordingly into Shore-MT. We demonstrate a performance improvement of up to 3.7x compared to Shore-MT on RAW block-device Flash storage under various TPC workloads.
Best Paper Awards
Papers co-authored by members of the DBlab have received Best Paper Awards:
Christian Abele, Michael Schaidnagel, Fritz Laux, Ilia Petrov. Sales Prediction with Parametrized Time Series Analysis. DBKDA 2013.
Robert Gottstein, Ilia Petrov, Alejandro Buchmann. Aspects of Append-Based Database Storage Management on Flash Memories. DBKDA 2013.
Accepted Paper
R. Gottstein, I. Petrov, and A. Buchmann. Append storage in multi-version databases on flash. In Proc. of BNCOD 2013. Springer-Verlag, 2013.
DBKDA Papers
El-Sheikh, E., Bagui, S., Firesmith, D.G., Petrov, I., Wilde, N., Zimmermann, A.: Towards Semantic-Supported SmartLife System Architectures for Big Data Services in the Cloud. In Proc. Service Computation'13, (2013)
PC Memberships
iiWAS2013 and ACM PIKM 2013
Members of the DBlab serve on the program committees of iiWAS2013 and PIKM 2013, at the ACM CIKM 2013.
DBlab Technical Report
G. Graefe, I. Petrov, T. Ivanov, V. Marinov. A hybrid page layout integrating PAX and NSM. Technical Report (HPL-2012-240). 2012
A technical report (HPL-2012-240) entitled 'A hybrid page layout integrating PAX and NSM' has been published as a cooperation between Hewlett-Packard Laboratories, DBlab, Reutlingen University, DVS, Technical University Darmstadt.
The report is available online at: http://www.hpl.hp.com/techreports/2012/HPL-2012-240.html
DBKDA Papers
DBKDA Paper Accepted
Robert Gottstein, Ilia Petrov, Alejandro Buchmann. Aspects of Append-Based Database Storage Management on Flash Memories. In Proc. DBKDA 2013.
DBlab Talk
Robert Gottstein (Databases and Distributed Systems Group, TU-Darmstadt) will give a talk on the influence of new storage technologies on database systems.
Title: Data Intensive Systems on New Storage Technologies[pdf]
When: 13.12.2012 at 13:00
Where: 9-108.
Abstract: [pdf]
As new storage technologies with radically different properties are appearing (Flash and Non-Volatile Memories), a substantial architectural redesign is required if they are to be used efficiently in a high-performance data-intensive system.
Multi-version approaches to database systems (MVCC, SI) are gaining significant importance and are becoming a dominant trend. They not only offer characteristics that meet the requirements of enterprise workloads, but also provide concepts that can effectively address the properties of new storage technologies. Yet version management may produce unnecessary random writes, which are suboptimal for the new technologies.
A variant of SI called SI-CV collocates tuple versions created by a transaction in adjacent blocks and minimizes random writes at the cost of random reads. Relative to the original algorithm, its performance in overloaded systems under heavy transactional loads in TPC-C scenarios on Flash SSD storage increases significantly. At high loads that bring the original system into overload, the transactional throughput of SI-CV increases further, while maintaining response times that are multiple factors lower. SI produces a new version of a data item once it is modified. Both the new and the old version are timestamped accordingly, which in many cases results in two independent (physical) update operations, entailing multiple random writes as well as in-place updates. These are also suboptimal for new storage technologies, both in terms of performance and endurance.
We claim that the combination of multi-versioning and append storage effectively addresses the characteristics of modern storage technologies. Snapshot Isolation Append Storage (SIAS) improves on SI and traditional "page granularity" append-based storage managers. It manages versions as singly linked lists (chains) that are addressed using a virtual tuple ID (VID). In SIAS the creation of a new version implicitly invalidates the old one, resulting in an out-of-place write implemented as a logical append and eliminating the need for invalidation timestamps. SIAS is coupled to an append-based storage manager, appending units of tuple versions. SIAS shows up to 4x performance improvement on Flash SSD under a TPC-C workload, entailed by a significant write overhead reduction (up to 38x). SIAS achieves better space utilization due to denser version packing per page and allows for better I/O parallelism and up to 4x lower disk I/O execution times. SIAS aids endurance, due to the use of out-of-place writes as appends and the write overhead reduction. Compared to traditional page-granularity appends, SIAS achieves up to 85% higher read throughput and up to 38x write reduction.
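The append-with-implicit-invalidation scheme described in the talk can be sketched in a few lines of Python. This is a hypothetical illustration of the idea, not the SIAS implementation: each new version is appended out-of-place and linked to its predecessor via a virtual tuple ID (VID), so appending implicitly supersedes the old version without writing an invalidation timestamp in place.

```python
log = []          # append-only storage: (vid, value, index_of_prev_version)
chain_head = {}   # VID -> log index of the newest version

def append_version(vid, value):
    """Append a new tuple version; the old one is implicitly invalidated."""
    prev = chain_head.get(vid)       # link back to the superseded version
    log.append((vid, value, prev))   # pure append: no in-place update
    chain_head[vid] = len(log) - 1

append_version("t1", "v0")
append_version("t1", "v1")           # supersedes v0 without touching it
print(log[chain_head["t1"]])         # ('t1', 'v1', 0)
```

Note that the old version's log entry is never rewritten; its invalidity is implied by the existence of a newer chain head, which is exactly what removes the second physical write that SI would require.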
DBKDA Paper
DBKDA Paper Accepted
Christian Abele, Michael Schaidnagel, Fritz Laux, Ilia Petrov. Sales Prediction with Parametrized Time Series Analysis. In Proc. DBKDA 2013.
Data Mining Cup 2012
27.06.2012
Michael Schaidnagel and Christian Abele, students in the Data Management Lab, earned 7th place in the overall ranking for the second assignment of the Data Mining Cup 2012.
DBlab Web Page
19.10.2012
DBlab Web-Page created.