## News

### New Project Grant

pimDB: infrastructure for Processing-In-Memory in modern DBMS

Principal Investigators:
Data Management Lab

pimDB provides infrastructure for PIM research in modern main-memory DBMS.

### New Paper Accepted

C. Riegger, T. Vincon, R. Gottstein, I. Petrov. MV-PBT: Multi-Version Indexing for Large Datasets and HTAP Workloads. In Proc. EDBT 2020.

MV-PBT is a version-aware index structure for HTAP workloads, supporting index-only visibility-checks and flash-friendly I/O patterns.

Abstract:

Modern mixed (HTAP) workloads execute fast update-transactions and long-running analytical queries on the same dataset and system. In multi-version (MVCC) systems, such workloads result in many short-lived versions and long version-chains as well as in increased and frequent maintenance overhead. Consequently, the index pressure increases significantly. Firstly, the frequent modifications cause frequent creation of new versions, yielding a surge in index maintenance overhead. Secondly and more importantly, index-scans incur extra I/O overhead to determine which of the resulting tuple-versions are visible to the executing transaction (visibility-check), as current designs only store version/timestamp information in the base table - not in the index. Such index-only visibility-checks are critical for HTAP workloads on large datasets. In this paper we propose the Multi-Version Partitioned B-Tree (MV-PBT) as a version-aware index structure, supporting index-only visibility checks and flash-friendly I/O patterns. The experimental evaluation indicates a 2x improvement for analytical queries and 15% higher transactional throughput under HTAP workloads. MV-PBT offers 40% higher tx. throughput compared to WiredTiger's LSM-Tree implementation under YCSB.
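The core idea - storing version timestamps directly in the index entries so that visibility can be decided without fetching the base tuple - can be sketched as follows (a minimal illustration of our own, not the paper's implementation; all field names are assumptions):

```python
# Hypothetical sketch of an index-only visibility check in MVCC style:
# each index entry carries the version's creation/invalidation timestamps,
# so a scan can filter versions against a snapshot without base-table I/O.
from dataclasses import dataclass
from typing import Optional

@dataclass
class IndexEntry:
    key: int
    tuple_id: int
    created_ts: int                # commit timestamp of the creating transaction
    invalidated_ts: Optional[int]  # None while the version is current

def visible(entry: IndexEntry, snapshot_ts: int) -> bool:
    """Index-only visibility check: decided purely from the entry."""
    if entry.created_ts > snapshot_ts:
        return False  # version created after the snapshot was taken
    return entry.invalidated_ts is None or entry.invalidated_ts > snapshot_ts

# An index scan filters its result entries directly:
entries = [
    IndexEntry(42, 1, created_ts=5, invalidated_ts=9),    # older version
    IndexEntry(42, 2, created_ts=9, invalidated_ts=None), # current version
]
snapshot = 7
print([e.tuple_id for e in entries if visible(e, snapshot)])  # [1]
```

The point of the sketch is only that no base-table access appears anywhere in `visible` - in a design that keeps timestamps solely in the heap, each candidate entry would instead trigger a tuple fetch.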

### New DBlab Member

The DBlab team is happy to welcome Arthur Bernhardt on board!

Arthur will strengthen our neoDBMS-Team.

### New DFG Project Grant

neoDBMS: Hardware/Software Co-Design for Accelerated Near-Data Processing in Modern Database Systems

Principal Investigators:
Embedded Systems and Applications Group,

Data Management Lab, Reutlingen University
Funding agency: DFG

neoDBMS aims to explore new architectures, abstractions and algorithms for intelligent database storage capable of performing Near-Data Processing (NDP) and executing data- or compute-intensive DBMS operations in-situ.

Abstract:

With advances in semiconductor technologies, it has nowadays become economical to produce combinations of modern semiconductor storage (e.g., Non-volatile Memories) and powerful compute-units (FPGA, GPU, many-core CPUs) co-located on, or close to, the same device - yielding intelligent storage devices. Data movements have become a limiting factor in times of exponential data growth, since they are blocking, frequent, and impair scalability. However, existing solution approaches are mainly based on 40-year-old architectures, following the paradigm of *transporting* data to the processing elements. This procedure incurs both time and energy penalties. The "memory wall" and the "von Neumann bottleneck" amplify the negative performance impact of those deficiencies. The present project aims to explore new architectures, abstractions and algorithms for intelligent database storage capable of performing Near-Data Processing (NDP). We target intelligent storage devices, comprising Non-volatile Memories or next-generation 3D-DRAM (such as the HMC), as well as the use of FPGAs as computational units. We intend to investigate the following research questions: 1) Support for NDP in update-environments and hybrid workloads. 2) Support for NDP in DBMS on Non-volatile Memories and NDP-support for declarative data layouts. 3) NDP use of shared virtual memory.

### PIM Survey Published

T. Vincon, A. Koch, I. Petrov. Moving Processing to Data: On the Influence of Processing-in-Memory on Data Management. arXiv.

Near-Data Processing ideally allows executing application-defined data- or compute-intensive operations in-situ, i.e. within (or close to) the physical data storage.

Abstract:

Near-Data Processing refers to an architectural hardware and software paradigm based on the co-location of storage and compute units. Ideally, it allows executing application-defined data- or compute-intensive operations in-situ, i.e. within (or close to) the physical data storage. Thus, Near-Data Processing seeks to minimize expensive data movement, improving performance, scalability, and resource-efficiency. Processing-in-Memory is a sub-class of Near-Data Processing that targets data processing directly within memory (DRAM) chips. The effective use of Near-Data Processing mandates new architectures, algorithms, interfaces, and development toolchains.

### nativeNDP: Processing Big Data Analytics on Native Storage Nodes

T. Vincon, S. Hardock, C. Riegger, A. Koch, I. Petrov. nativeNDP: Processing Big Data Analytics on Native Storage Nodes. In Proc. ADBIS 2019.

We propose nativeNDP - a framework for Near-Data Processing that pushes down primitive R tasks and executes them in-situ, directly within the storage device of a cluster-node.

Abstract:

Data analytics tasks on large datasets are computationally intensive and often demand the compute power of cluster environments. Yet, data cleansing, preparation, dataset characterization and statistics or metrics computation steps are frequent. These are mostly performed ad hoc, in an explorative manner, and mandate low response times. But such steps are I/O intensive and typically very slow due to low data locality, inadequate interfaces and abstractions along the stack. These typically result in prohibitively expensive scans of the full dataset and transformations on interface boundaries. In this paper we examine R as an analytical tool, managing large persistent datasets in Ceph, a widespread cluster file-system. We propose nativeNDP - a framework for Near-Data Processing that pushes down primitive R tasks and executes them in-situ, directly within the storage device of a cluster-node. Across a range of data sizes, we show that nativeNDP is more than an order of magnitude faster than other pushdown alternatives.

### Indexing large updatable Datasets in Multi-Version Database Management Systems

C. Riegger, T. Vincon, I. Petrov. Indexing large updatable Datasets in Multi-Version Database Management Systems. In Proc. IDEAS 2019.

In this paper we present the implementation of Partitioned B-Trees in PostgreSQL extended with SIAS.

Abstract:

Database Management Systems (DBMS) need to handle large updatable datasets under OLTP workloads. Most modern DBMS provide snapshots of data in an MVCC transaction management scheme. Each transaction operates on a snapshot of the database, which is calculated from a set of tuple versions containing logical transaction timestamps. This transaction management scheme enables high parallelism and resource-efficient append-only data placement on secondary storage. One major issue in indexing tuple versions on modern hardware technologies is the high write amplification of tree indexes. Partitioned B-Trees (PBTs) are based on the structure and algorithms of the ubiquitous B+-Tree. They achieve near-optimal write amplification and beneficial sequential writes on secondary storage. In this paper we present the implementation of PBTs in PostgreSQL extended with SIAS. Compared to PostgreSQL's standard B+-Trees, PBTs deliver 50% better transactional throughput under TPC-C.
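The partitioning idea can be sketched roughly as follows (our simplified illustration, not the PostgreSQL implementation): all partitions live inside one sorted structure, keyed by an artificial (partition number, key) prefix; inserts only ever touch the newest partition, and look-ups probe partitions newest-first.

```python
# Simplified sketch of a Partitioned B-Tree: one sorted structure,
# entries keyed by (partition_no, key). Writes land in the current
# partition only, which keeps the write pattern sequential when a
# sealed partition is evicted to secondary storage.
import bisect

class PartitionedBTree:
    def __init__(self):
        self.data = []             # sorted list of ((partition, key), value)
        self.current_partition = 0

    def insert(self, key, value):
        bisect.insort(self.data, ((self.current_partition, key), value))

    def evict_partition(self):
        """Seal the current partition (now immutable) and start a new one."""
        self.current_partition += 1

    def lookup(self, key):
        # Probe partitions newest-first: the newest partition holds
        # the most recent entry for a key.
        for p in range(self.current_partition, -1, -1):
            i = bisect.bisect_left(self.data, ((p, key),))
            if i < len(self.data) and self.data[i][0] == (p, key):
                return self.data[i][1]
        return None

t = PartitionedBTree()
t.insert(7, "v1")
t.evict_partition()    # partition 0 becomes immutable
t.insert(7, "v2")      # newer entry lands in partition 1
print(t.lookup(7))     # v2 - the newest partition wins
```

A real PBT realizes the partitions inside a single B+-Tree by prefixing each key with the partition number, so standard B+-Tree algorithms remain usable unchanged; the sorted list here only stands in for that tree.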

### IPA-IDX: In-Place Appends for B-Tree Indices

S. Hardock, A. Koch, T. Vincon, I. Petrov. IPA-IDX: In-Place Appends for B-Tree Indices. In Proc. DaMoN 2019.

IPA-IDX is an approach to handle index modifications on modern storage technologies (NVM, Flash) as physical in-place appends, using simplified physiological log records.

Paper Accepted at DaMoN 2019

Abstract:

We introduce IPA-IDX - an approach to handle index modifications on modern storage technologies (NVM, Flash) as physical in-place appends, using simplified physiological log records. IPA-IDX provides similar performance and longevity advantages for indexes as basic IPA does for tables. The selective application of IPA-IDX and basic IPA to certain regions and objects lowers the GC overhead by over 60%, while keeping the total space overhead to 2%. The combined effect of IPA and IPA-IDX increases performance by 28%.
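A rough sketch of the in-place-append idea (our own illustration with assumed sizes, not the Shore-MT implementation): instead of rewriting a whole index page for every insertion, a small physiological log record is appended to a reserved tail area of the already-programmed page, and a full page rewrite happens only once that area is exhausted.

```python
# Hypothetical model of an index page with an append area: small
# modifications become appended log records (cheap on Flash, no erase);
# a page rewrite is the rare fallback when the area overflows.
PAGE_SIZE = 4096    # assumed page size
APPEND_AREA = 256   # assumed reserved tail area for appends

class IndexPage:
    def __init__(self, keys):
        self.keys = sorted(keys)  # base page image, written once
        self.deltas = []          # appended physiological log records
        self.delta_bytes = 0

    def insert(self, key, record_size=16):
        if self.delta_bytes + record_size <= APPEND_AREA:
            # cheap path: physical in-place append, no page rewrite
            self.deltas.append(("INSERT", key))
            self.delta_bytes += record_size
            return "appended"
        # append area full: consolidate deltas and rewrite the page
        self.keys = sorted(self.keys + [k for _, k in self.deltas] + [key])
        self.deltas, self.delta_bytes = [], 0
        return "rewritten"

    def search(self, key):
        # logical page view = base image plus replayed deltas
        return key in self.keys or ("INSERT", key) in self.deltas

p = IndexPage([10, 20, 30])
print(p.insert(25))   # appended - absorbed by the tail area
print(p.search(25))   # True
```

The ratio between append-area size and log-record size bounds how many modifications a page absorbs before a rewrite, which is where the reported GC reduction plausibly comes from.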

### Native Storage Techniques for Data Management

I. Petrov, A. Koch, S. Hardock, T. Vincon, C. Riegger
In Proc. ICDE 2019

Native storage approaches, architectures and techniques for data processing and data management.

21.11.2019 Paper Accepted at ICDE 2019

I. Petrov, A. Koch, S. Hardock, T. Vincon, C. Riegger. Native Storage Techniques for Data Management. In Proc. ICDE 2019.

Abstract:

In the present tutorial we perform a cross-cut analysis of database storage management from the perspective of modern storage technologies. We argue that neither the design of modern DBMS, nor the architecture of modern storage technologies are aligned with each other. Moreover, the majority of the systems rely on a complex multi-layer and compatibility-oriented storage stack. The result is needlessly suboptimal DBMS performance, inefficient utilization, or significant write amplification due to outdated abstractions and interfaces. In the present tutorial we focus on the concept of native storage, which is storage operated without intermediate abstraction layers over an open native storage interface and is directly controlled by the DBMS. We cover the following aspects of native storage: (i) architectural approaches and techniques; (ii) interfaces; (iii) storage abstractions; (iv) DBMS/system integration; (v) in-storage processing.

### DBLab has open-sourced NoFTL, SIAS and cIPT

Check out DBLab's GitHub repository.
We have open-sourced NoFTL, SIAS, and cIPT.

### New Project Grant

PANDAS: Programmable Appliance for Near Data Processing Accelerated Storage

Funding agency: BMBF

Principal Investigators:
PRO DESIGN Electronic GmbH
Xelera Technologies GmbH
Embedded Systems and Applications Group, Technische Universitaet Darmstadt
Data Management Lab, Reutlingen University

### Efficient Data and Indexing Structure for Blockchains in Enterprise Systems

C. Riegger, T. Vincon, I. Petrov.
In Proc. iiWAS 2018

17.09.2018 Paper Accepted at iiWAS 2018

C. Riegger, T. Vincon, I. Petrov. Efficient Data and Indexing Structure for Blockchains in Enterprise Systems. In Proc. iiWAS 2018.

Abstract:

Blockchains give rise to new workloads in database management systems and K/V-Stores. Distributed Ledger Technology (DLT) is a technique for managing transactions in "trustless" distributed systems. Yet, clients of nodes in blockchain networks are backed by "trustworthy" K/V-Stores, like LevelDB or RocksDB in Ethereum, which are based on Log-Structured Merge Trees (LSM-Trees). However, LSM-Trees do not fully match the properties of blockchains and enterprise workloads. In this paper, we claim that Partitioned B-Trees (PBT) fit the properties of this DLT: uniformly distributed hash keys, immutability, consensus, invalid blocks, unspent and off-chain transactions, reorganization and data state / version ordering in a distributed log-structure. PBT can locate records of newly inserted key-value pairs, as well as data of unspent transactions, in separate partitions in main memory. Once several blocks acquire consensus, PBTs evict a whole partition, which becomes immutable, to secondary storage. This behavior minimizes write amplification and enables a beneficial sequential write pattern on modern hardware. Furthermore, DLT implies some type of log-based versioning. PBTs can serve as an MV-Store for data storage of logical blocks and indexing in multi-version concurrency control (MVCC) transaction processing.

### Two entries in Encyclopedia of Big Data Technologies, Sakr, Sherif, Zomaya, Albert (Eds.), Springer

I. Petrov, T. Vincon, A. Koch, J. Oppermann, S. Hardock, C. Riegger. Active Storage
In Encyclopedia of Big Data Technologies, Sakr, Zomaya (Eds.), Springer 2018.

I. Petrov, A. Koch, T. Vincon, S. Hardock, C. Riegger. Transaction Processing on NVM
In Encyclopedia of Big Data Technologies, Sakr, Zomaya (Eds.), Springer 2018.

### NoFTL-KV: Tackling Write-Amplification on KV-Stores with Native Storage Management

T. Vincon, S. Hardock, C. Riegger, J. Oppermann, A. Koch, I. Petrov.
In Proc. EDBT 2018

22.12.2017 Paper Accepted at EDBT 2018

T. Vincon, S. Hardock, C. Riegger, J. Oppermann, A. Koch, I. Petrov. NoFTL-KV: Tackling Write-Amplification on KV-Stores with Native Storage Management. In Proc. EDBT 2018.

[PDF]

Abstract:

Modern persistent Key/Value stores are designed to meet the demand for high transactional throughput and high data-ingestion rates. Still, they rely on a backwards-compatible storage stack and abstractions to ease space management, foster seamless proliferation and system integration. Their dependence on the traditional I/O stack has a negative impact on performance, causes unacceptably high write-amplification, and limits storage longevity.
In the present paper we present NoFTL-KV, an approach that results in a lean I/O stack, integrating physical storage management natively into the Key/Value store. NoFTL-KV eliminates backwards compatibility, allowing the Key/Value store to directly leverage the characteristics of modern storage technologies. NoFTL-KV is implemented under RocksDB. The performance evaluation under LinkBench shows that NoFTL-KV improves transactional throughput by 33%, while response times improve up to 2.3x. Furthermore, NoFTL-KV reduces write-amplification by 19x and improves storage longevity by approximately the same factor.

### Multi-Version Indexing and modern Hardware Technologies

#### A Survey of present Indexing Approaches

C. Riegger, T. Vincon, I. Petrov.
In Proc. iiWAS 2017

02.10.2017 Paper Accepted at iiWAS 2017

C. Riegger, T. Vincon, I. Petrov. Multi-Version Indexing and modern Hardware Technologies - A Survey of present Indexing Approaches. In Proc. iiWAS 2017.

[PDF]

Abstract:

Characteristics of modern computing and storage technologies fundamentally differ from traditional hardware. There is a need to optimally leverage their performance, endurance and energy consumption characteristics. Therefore, existing architectures and algorithms in modern high-performance database management systems have to be redesigned and advanced. Multi-Version Concurrency Control (MVCC) approaches in database management systems maintain multiple physically independent tuple versions. Snapshot isolation approaches enable high parallelism and concurrency in workloads at an almost serializable consistency level. Modern hardware technologies benefit from multi-version approaches. Indexing multi-version data on modern hardware is still an open research area. In this paper, we provide a survey of popular multi-version indexing approaches and an extended scope of high-performance single-version approaches. An optimal multi-version index structure balances look-up efficiency for tuple versions that are visible to transactions against index maintenance effort for different workloads on modern hardware technologies.

### Write-Optimized Indexing with Partitioned B-Trees

C. Riegger, T. Vincon, I. Petrov.
In Proc. iiWAS 2017

02.10.2017 Paper Accepted at iiWAS 2017

C. Riegger, T. Vincon, I. Petrov. Write-Optimized Indexing with Partitioned B-Trees. In Proc. iiWAS 2017.

[PDF]

Abstract:

Database management systems (DBMS) are a critical performance component in large-scale applications under modern update-intensive workloads. Additional access paths accelerate look-up performance in DBMS for frequently queried attributes, but the required maintenance slows down update performance. The ubiquitous B+-Tree is a commonly used key-indexed access path that is able to support many required functionalities with logarithmic access time to requested records. Modern processing and storage technologies and their characteristics require reconsideration of matured indexing approaches for today's workloads. Partitioned B-Trees (PBT) leverage characteristics of modern hardware technologies and complex memory hierarchies, as well as high update rates and changes in workloads, by maintaining partitions within one single B+-Tree. This paper includes an experimental evaluation of PBT's optimized write pattern and performance improvements. With PBT, transactional throughput under TPC-C increases by 30%; PBT results in beneficial sequential write patterns even in the presence of updates and maintenance operations.

### SIAS-Chains: Snapshot Isolation Append Storage Chains

R. Gottstein, I. Petrov, S. Hardock, A. Buchmann

27.8.2017 Paper Accepted at ADMS@VLDB 2017

R. Gottstein, I. Petrov, S. Hardock, A. Buchmann. SIAS-Chains: Snapshot Isolation Append Storage Chains. In Proc. ADMS@VLDB 2017.

[PDF]

Abstract:

Asymmetric read/write storage technologies such as Flash are becoming a dominant trend in modern database systems. They introduce hardware characteristics and properties which are fundamentally different from those of traditional storage technologies such as HDDs.

Multi-Versioning Database Management Systems (MV-DBMSs) and Log-based Storage Managers (LbSMs) are concepts that can effectively address the properties of these storage technologies but are designed for the characteristics of legacy hardware. A critical component of MV-DBMSs is the invalidation model. Transactional timestamps are assigned to the old and the new version, resulting in two independent (physical) update operations. Those entail multiple random writes as well as in-place updates, sub-optimal for new storage technologies both in terms of performance and endurance. Traditional page-append LbSM approaches alleviate random writes and immediate in-place updates, hence reducing the negative impact of Flash read/write asymmetry. Nevertheless, they entail significant mapping overhead, leading to write amplification.

In this work we present Snapshot Isolation Append Storage Chains (SIAS-Chains), which employs a combination of multi-versioning with append storage management at tuple granularity and a novel singly-linked (chain-like) version organization.

SIAS-Chains features simplified buffer management and multi-version indexing, and introduces read/write optimizations to data placement on modern storage media. SIAS-Chains algorithmically avoids small in-place updates caused by in-place invalidation and converts them into appends. Every modification operation is executed as an append, and recently inserted tuple versions are co-located. SIAS-Chains is implemented in PostgreSQL and evaluated on modern Flash SSDs with a standard update-intensive workload. The performance evaluation under PostgreSQL shows: (i) higher transactional throughput - up to 30 percent; (ii) significantly lower response times - up to 7 times lower; (iii) significant write reduction - up to 97 percent; (iv) reduced space consumption; and (v) higher tolerable workload.
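The chain organization might be sketched like this (a simplified model of our own; the VID handling and storage layout are assumptions): every modification is appended to the log, and the new version's back-pointer implicitly invalidates its predecessor, so no invalidation timestamp is ever written to the old version.

```python
# Minimal model of SIAS-style version chains: each VID maps to the head
# of a singly linked chain of tuple versions stored in an append-only log.
log = []         # append-only storage: every modification is an append
chain_head = {}  # VID -> log index of the newest version

def write_version(vid, payload):
    prev = chain_head.get(vid)  # back-pointer to the old version, or None
    log.append({"vid": vid, "payload": payload, "prev": prev})
    # Installing the new head implicitly invalidates the predecessor -
    # the old version itself is never touched (no in-place update).
    chain_head[vid] = len(log) - 1

def read_chain(vid):
    """Walk one version chain newest-to-oldest."""
    out, i = [], chain_head.get(vid)
    while i is not None:
        out.append(log[i]["payload"])
        i = log[i]["prev"]
    return out

write_version("t1", "v1")
write_version("t1", "v2")  # appended; v1 is invalidated only implicitly
print(read_chain("t1"))    # ['v2', 'v1']
```

Because invalidation is implicit, the single physical write per modification is an append, which is exactly the access pattern Flash rewards.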

### Paper Accepted at ICDE 2017

Selective In-Place Appends for Real: Reducing Erases on Wear-prone DBMS Storage
S. Hardock, I. Petrov, R. Gottstein, A. Buchmann.
In Proc. ICDE 2017 [PDF] [Video]

Abstract: In the present paper we demonstrate a novel technique for applying the recently proposed approach of In-Place Appends (IPA) - overwrites on Flash without a prior erase operation. IPA can be applied selectively: only to DB-objects that have frequent and relatively small updates. To do so we couple IPA to the concept of NoFTL regions, allowing the DBA to place update-intensive DB-objects into special IPA-enabled regions. The decision about region configuration can be (semi-)automated by an advisor analyzing DB-log files in the background.

We showcase a Shore-MT based prototype of the above approach, operating on real Flash hardware. During the demonstration we allow the users to interact with the system and gain hands-on experience under different demonstration scenarios.

### Paper Accepted at SIGMOD 2017

S. Hardock, I. Petrov, R. Gottstein, A. Buchmann.
In Proc. SIGMOD 2017 [PDF]

Abstract: Under update-intensive workloads (TPC, LinkBench) small updates dominate the write behavior, e.g. 70% of all updates change less than 10 bytes across all TPC OLTP workloads. These are typically performed as in-place updates and result in random writes in page granularity, causing major write overhead on Flash storage, a write amplification of several hundred times, and lower device longevity.

In this paper we propose an approach that transforms those small in-place updates into small update deltas that are appended to the original page. We utilize the commonly ignored fact that modern Flash memories (SLC, MLC, 3D NAND) can handle appends to already programmed physical pages by using various low-level techniques such as ISPP to avoid expensive erases and page migrations. Furthermore, we extend the traditional NSM page-layout with a delta-record area that can absorb those small updates. We propose a scheme to control the write behavior as well as the space allocation and sizing of database pages. We describe how the DBMS buffer and storage manager must be adapted to handle page operations.

The proposed approach has been implemented under Shore-MT and evaluated on real Flash hardware (OpenSSD) and a Flash emulator. Compared to In-Page Logging it performs up to 62% less reads and writes and up to 74% less erases on a range of workloads. The experimental evaluation indicates: (i) significant reduction of erase operations resulting in twice the longevity of Flash devices under update-intensive workloads; (ii) 15%-60% lower read/write I/O latencies; (iii) up to 45% higher transactional throughput; (iv) 2x to 3x reduction in overall write amplification.
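The write-amplification argument can be illustrated with back-of-the-envelope arithmetic (the page and delta-record sizes below are our assumptions for illustration, not the paper's measurements):

```python
# Why small in-place updates are so expensive on Flash: a 10-byte logical
# change forces at least a full page program, so the physical-to-logical
# write ratio is already in the hundreds - before erase-block copying.
PAGE = 4096   # assumed Flash page size in bytes
update = 10   # bytes logically changed (the common case per the abstract)

wa_in_place = PAGE / update
print(f"page-granular write amplification: {wa_in_place:.0f}x")

# With a delta append, only a small delta record is physically written:
delta_record = 24  # assumed delta-record size incl. header
wa_delta = delta_record / update
print(f"delta-append write amplification: {wa_delta:.1f}x")
```

Even with these rough numbers the gap is two orders of magnitude, which is consistent with the "several hundred times" amplification the abstract attributes to page-granular in-place updates.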

### Paper Accepted at EDBT 2017

In-Place Appends for Real: DBMS Overwrites on Flash without Erase
S. Hardock, I. Petrov, R. Gottstein, A. Buchmann. In Proc. EDBT 2017 [PDF]

### 08.08.2013 DFG Flashy-DB Extended

The DFG (Deutsche Forschungsgemeinschaft) has extended the research project FlashyDB. DFG FlashyDB aims to investigate the influence of Flash memory on database architecture, performance and algorithms.

### 06.08.2013 Program Committee Memberships

Members of the DBlab serve on the program committee of SIGMOD 2014 (Demo Track).

### DBlab Talk

Dr. Knut Stolze, Architect of the IBM DB2 Analytics Accelerator, IBM Deutschland, will give a talk 'Managing Large Data Volumes Efficiently with IBM Netezza' on 18.06.2013 at 11:30 in room 9-003.

Title:
Managing Large Data Volumes Efficiently with IBM Netezza

Who:
Dr. Knut Stolze, Architect IBM DB2 Analytics Accelerator, IBM Deutschland

When:
Tue, 18.06.2013, 11:30 | Room 9-003

Abstract [PDF]:
Netezza is a highly specialized database management system for data warehousing operations. In this presentation, Dr. Knut Stolze gives an overview of its system architecture and internal query processing. It is shown how very good performance can be delivered with a very simple user interface that avoids indexes. Next, he presents the IBM DB2 Analytics Accelerator, an integration project and commercial product that combines the strengths of Netezza's analytic query processing capabilities with DB2's superior OLTP performance. Knut highlights how the integration of both products is achieved in a (nearly) seamless way.

About the speaker:
Dr. Knut Stolze is working for the Information Management department at the IBM Research & Development Lab in Böblingen, Germany. He focuses on relational database systems, specifically large-scale data warehouse systems. He gained his expertise and experience in academic and industrial research and in product development. His current research efforts focus on enterprise data warehouse systems, in particular technologies like in-memory, specialty hardware for high-performance query processing, and database federation. Knut Stolze is a senior software developer and master inventor at IBM. In his role as an architect, he is responsible for the design and implementation of the IBM DB2 Analytics Accelerator for z/OS. Prior to the current project, Dr. Stolze worked in the DB2 Spatial Extender development team, earned his PhD at the University of Jena, Germany, in 2006, and subsequently moved on to the DB2 z/OS Utilities development.

### Accepted at VLDB 2013

S. Hardock, I. Petrov, R. Gottstein, A. Buchmann. NoFTL: Database Systems on FTL-less Flash Storage. VLDB 2013 (Demonstrations Track). Riva del Garda, August 26-31, 2013. [Demonstration Video]

Abstract

The database architecture and workhorse algorithms have been designed to compensate for hard disk properties. The I/O characteristics of Flash memories have a significant impact on database systems, and many algorithms and approaches taking advantage of them have been proposed recently. Nonetheless, at the system level, Flash storage devices are still treated as HDD-compatible block devices, black boxes and fast HDD replacements. This backwards compatibility (both software and hardware) masks the native behaviour, incurs significant complexity and decreases I/O performance, making it non-robust and unpredictable. Database systems have a long tradition of operating directly on RAW storage natively, utilising the physical characteristics of storage media to improve performance.

In this paper we demonstrate an approach called NoFTL that goes a step further. We show that allowing for native Flash access and integrating parts of the FTL functionality into the database system yields significant performance increase and simplification of the I/O stack. We created a real-time data-driven Flash emulator and integrated it accordingly into Shore-MT. We demonstrate a performance improvement of up to 3.7x compared to Shore-MT on RAW block-device Flash storage under various TPC workloads.

### Best Paper Awards

Papers co-authored by members of the DBlab have received Best Paper Awards.

### Accepted Paper

R. Gottstein, I. Petrov, and A. Buchmann. Append storage in multi-version databases on flash. In Proc. of BNCOD 2013. Springer-Verlag, 2013.

### DBKDA Papers

El-Sheikh, E., Bagui, S., Firesmith, D.G., Petrov, I., Wilde, N., Zimmermann, A.: Towards Semantic-Supported SmartLife System Architectures for Big Data Services in the Cloud. In Proc. Service Computation'13, (2013)

### PC Memberships

iiWAS2013 and ACM PIKM 2013

Members of the DBlab serve on the program committees of iiWAS2013 and PIKM 2013, at the ACM CIKM 2013.

### DBlab Technical Report

G. Graefe, I. Petrov, T. Ivanov, V. Marinov. A hybrid page layout integrating PAX and NSM. Technical Report (HPL-2012-240). 2012

A technical report (HPL-2012-240) entitled 'A hybrid page layout integrating PAX and NSM' has been published as a cooperation between Hewlett-Packard Laboratories; the DBlab, Reutlingen University; and DVS, Technical University Darmstadt.
The report is available online at: http://www.hpl.hp.com/techreports/2012/HPL-2012-240.html

### DBKDA Papers

DBKDA Paper Accepted

Robert Gottstein, Ilia Petrov, Alejandro Buchmann. Aspects of Append-Based Database Storage Management on Flash Memories. In Proc. DBKDA 2013.

### DBlab Talk

Robert Gottstein (Databases and Distributed Systems Group, TU-Darmstadt) will give a talk on the influence of new storage technologies on database systems.

Title: Data Intensive Systems on New Storage Technologies[pdf]
When: 13.12.2012 at 13:00
Where: 9-108.

Abstract: [pdf]

As new storage technologies with radically different properties are appearing (Flash and Non-Volatile Memories), a substantial architectural redesign is required if they are to be used efficiently in a high-performance data-intensive system.

Multi-Version approaches to database systems (MVCC, SI) are gaining significant importance and becoming a dominant trend. They not only offer characteristics that meet the requirements of enterprise workloads, but also provide concepts that can effectively address the properties of new storage technologies. Yet version management may produce unnecessary random writes, which are suboptimal for the new technologies.

A variant of SI called SI-CV collocates tuple versions created by a transaction in adjacent blocks and minimizes random writes at the cost of random reads. Relative to the original algorithm, its performance increases significantly in overloaded systems under heavy transactional loads in TPC-C scenarios on Flash SSD storage. At high loads that bring the original system into overload, the transactional throughput of SI-CV increases further, while maintaining response times that are multiple factors lower. SI produces a new version of a data item once it is modified. Both the new and the old version are timestamped accordingly, which in many cases results in two independent (physical) update operations, entailing multiple random writes as well as in-place updates. These are also suboptimal for new storage technologies, both in terms of performance and endurance.

We claim that the combination of multi-versioning and append storage effectively addresses the characteristics of modern storage technologies. Snapshot Isolation Append Storage (SIAS) improves on SI and traditional "page granularity" append-based storage managers. It manages versions as singly linked lists (chains) that are addressed using a virtual tuple ID (VID). In SIAS the creation of a new version implicitly invalidates the old one, resulting in an out-of-place write implemented as a logical append and eliminating the need for invalidation timestamps. SIAS is coupled to an append-based storage manager, appending units of tuple versions. SIAS indicates up to 4x performance improvement on Flash SSD under a TPC-C workload, entailed by a significant write overhead reduction (up to 38x). SIAS achieves better space utilization due to denser version packing per page and allows for better I/O parallelism and up to 4x lower disk I/O execution times. SIAS aids endurance, due to the use of out-of-place writes as appends and write overhead reduction. Compared to traditional page granularity appends, SIAS achieves up to 85% higher read throughput and up to 38x write reduction.

### DBKDA Paper

DBKDA Paper Accepted

Christian Abele, Michael Schaidnagel, Fritz Laux, Ilia Petrov. Sales Prediction with Parametrized Time Series Analysis. In Proc. DBKDA 2013.

### Data Mining Cup 2012

27.06.2012

Michael Schaidnagel and Christian Abele, students in the Data Management Lab, earned 7th place in the overall ranking for the second assignment of the Data Mining Cup 2012.