News

Two entries in the Encyclopedia of Big Data Technologies, Sherif Sakr, Albert Zomaya (Eds.), Springer

I. Petrov, T. Vincon, A. Koch, J. Oppermann, S. Hardock, C. Riegger. Active Storage.
In Encyclopedia of Big Data Technologies, Sakr, Zomaya (Eds.), Springer 2018.

I. Petrov, A. Koch, T. Vincon, S. Hardock, C. Riegger. Transaction Processing on NVM.
In Encyclopedia of Big Data Technologies, Sakr, Zomaya (Eds.), Springer 2018.

NoFTL-KV: Tackling Write-Amplification on KV-Stores with Native Storage Management

T. Vincon, S. Hardock, C. Riegger, J. Oppermann, A. Koch, I. Petrov.
In Proc. EDBT 2018

read more ...

22.12.2017 Paper Accepted at EDBT 2018

T. Vincon, S. Hardock, C. Riegger, J. Oppermann, A. Koch, I. Petrov. NoFTL-KV: Tackling Write-Amplification on KV-Stores with Native Storage Management. In Proc. EDBT 2018.

[PDF]

Abstract:

Modern persistent Key/Value stores are designed to meet the demand for high transactional throughput and high data-ingestion rates. Still, they rely on a backwards-compatible storage stack and abstractions to ease space management, foster seamless proliferation and system integration. Their dependence on the traditional I/O stack has a negative impact on performance, causes unacceptably high write-amplification, and limits storage longevity.
In the present paper we present NoFTL-KV, an approach that results in a lean I/O stack, integrating physical storage management natively into the Key/Value store. NoFTL-KV eliminates backwards compatibility, allowing the Key/Value store to directly consume the characteristics of modern storage technologies. NoFTL-KV is implemented under RocksDB. The performance evaluation under LinkBench shows that NoFTL-KV improves transactional throughput by 33%, while response times improve by up to 2.3x. Furthermore, NoFTL-KV reduces write-amplification by 19x and improves storage longevity by approximately the same factor.
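
To build intuition for why a lean stack matters, consider a toy back-of-the-envelope model (our illustration, not a result from the paper): write amplification compounds multiplicatively when the KV store's LSM compaction, the file system and the on-device FTL each rewrite data they did not originate, whereas native storage management keeps only the KV store's own factor. The individual factors below are made up purely to show the arithmetic.

```python
# Toy model of compounding write amplification (WA) across storage layers.
# All factors are illustrative assumptions, not measurements from NoFTL-KV.
def end_to_end_wa(layer_factors):
    wa = 1.0
    for factor in layer_factors:
        wa *= factor              # each layer rewrites the writes of the layer above
    return wa

legacy_stack = end_to_end_wa([4.0, 1.5, 3.0])   # LSM compaction, FS journaling, FTL GC
lean_stack = end_to_end_wa([4.0])               # native storage management: one mapping layer
print(f"legacy: {legacy_stack:.1f}x, lean: {lean_stack:.1f}x, "
      f"reduction: {legacy_stack / lean_stack:.1f}x")
```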

Multi-Version Indexing and modern Hardware Technologies

A Survey of present Indexing Approaches

C. Riegger, T. Vincon, I. Petrov.
In Proc. iiWAS 2017

read more ...

02.10.2017 Paper Accepted at iiWAS 2017

C. Riegger, T. Vincon, I. Petrov. Multi-Version Indexing and modern Hardware Technologies - A Survey of present Indexing Approaches. In Proc. iiWAS 2017.

[PDF]

Abstract:

Characteristics of modern computing and storage technologies fundamentally differ from those of traditional hardware. There is a need to optimally leverage their performance, endurance and energy consumption characteristics. Therefore, existing architectures and algorithms in modern high performance database management systems have to be redesigned and advanced. Multi-Version Concurrency Control (MVCC) approaches in database management systems maintain multiple physically independent tuple versions. Snapshot isolation approaches enable high parallelism and concurrency in workloads with an almost serializable consistency level. Modern hardware technologies benefit from multi-version approaches. Indexing multi-version data on modern hardware is still an open research area. In this paper, we provide a survey of popular multi-version indexing approaches and an extended scope of high-performance single-version approaches. An optimal multi-version index structure balances look-up efficiency for tuple versions that are visible to transactions against the effort of index maintenance for different workloads on modern hardware technologies.

Write-Optimized Indexing with Partitioned B-Trees

C. Riegger, T. Vincon, I. Petrov.
In Proc. iiWAS 2017

read more ...

02.10.2017 Paper Accepted at iiWAS 2017

C. Riegger, T. Vincon, I. Petrov. Write-Optimized Indexing with Partitioned B-Trees. In Proc. iiWAS 2017.

[PDF]

Abstract:

Database management systems (DBMS) are a critical performance component in large-scale applications under modern update-intensive workloads. Additional access paths accelerate look-up performance in DBMS for frequently queried attributes, but the required maintenance slows down update performance. The ubiquitous B+-Tree is a commonly used key-indexed access path that is able to support many required functionalities with logarithmic access time to requested records. Modern processing and storage technologies and their characteristics require a reconsideration of matured indexing approaches for today's workloads. Partitioned B-Trees (PBT) leverage characteristics of modern hardware technologies and complex memory hierarchies as well as high update rates and changes in workloads by maintaining partitions within one single B+-Tree. This paper includes an experimental evaluation of PBT's optimized write pattern and performance improvements. With PBT, transactional throughput under TPC-C increases by 30%; PBT results in beneficial sequential write patterns even in the presence of updates and maintenance operations.
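
As an intuition for the core mechanism, the following minimal Python sketch keeps several partitions inside one ordered structure (standing in for a single B+-Tree) by prefixing every key with an artificial partition number; new entries always land in the active partition, which keeps writes sequential, and a lookup probes partitions from newest to oldest. The partition numbering and probe order are simplifying assumptions for illustration, not the paper's exact design.

```python
import bisect

class PartitionedBTree:
    """Toy stand-in for a PBT: one sorted structure, keys prefixed by partition."""

    def __init__(self):
        self.entries = []            # sorted list of ((partition, key), value)
        self.active_partition = 0    # new writes go here

    def start_new_partition(self):
        # E.g. when the write-optimized (in-memory) partition has filled up.
        self.active_partition += 1

    def insert(self, key, value):
        bisect.insort(self.entries, ((self.active_partition, key), value))

    def lookup(self, key):
        # Probe partitions newest-first so the most recent entry wins.
        for p in range(self.active_partition, -1, -1):
            i = bisect.bisect_left(self.entries, ((p, key),))
            if i < len(self.entries) and self.entries[i][0] == (p, key):
                return self.entries[i][1]
        return None

tree = PartitionedBTree()
tree.insert("k1", "v1")
tree.start_new_partition()
tree.insert("k1", "v1-updated")      # appended to the new partition, no in-place change
assert tree.lookup("k1") == "v1-updated"
```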

SIAS-Chains: Snapshot Isolation Append Storage Chains

R. Gottstein, I. Petrov, S. Hardock, A. Buchmann
In Proc. ADMS@VLDB 2017

read more ...

27.8.2017 Paper Accepted at ADMS@VLDB 2017

R. Gottstein, I. Petrov, S. Hardock, A. Buchmann. SIAS-Chains: Snapshot Isolation Append Storage Chains. In Proc. ADMS@VLDB 2017.

[PDF]

Abstract:

Asymmetric read/write storage technologies such as Flash are becoming a dominant trend in modern database systems.They introduce hardware characteristics and properties which are fundamentally different from those of traditional storage technologies such as HDDs.

Multi-Versioning Database Management Systems (MV-DBMSs) and Log-based Storage Managers (LbSMs) are concepts that can effectively address the properties of these storage technologies but are designed for the characteristics of legacy hardware. A critical component of MV-DBMSs is the invalidation model. Transactional timestamps are assigned to the old and the new version, resulting in two independent (physical) update operations. Those entail multiple random writes as well as in-place updates, sub-optimal for new storage technologies both in terms of performance and endurance. Traditional page-append LbSM approaches alleviate random writes and immediate in-place updates, hence reducing the negative impact of Flash read/write asymmetry. Nevertheless, they entail significant mapping overhead, leading to write amplification.

In this work we present Snapshot Isolation Append Storage Chains (SIAS-Chains), an approach that combines multi-versioning with append storage management at tuple granularity and a novel singly-linked (chain-like) version organization.

SIAS-Chains features simplified buffer management and multi-version indexing, and introduces read/write optimizations to data placement on modern storage media. SIAS-Chains algorithmically avoids small in-place updates, caused by in-place invalidation, and converts them into appends. Every modification operation is executed as an append and recently inserted tuple versions are co-located. SIAS-Chains is implemented in PostgreSQL and evaluated on modern Flash SSDs with a standard update-intensive workload. The performance evaluation under PostgreSQL shows: (i) higher transactional throughput - up to 30 percent; (ii) significantly lower response times - up to 7 times lower; (iii) significant write reduction - up to 97 percent; (iv) reduced space consumption and (v) higher tolerable workload.
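
The chain organization can be illustrated with a minimal sketch (not the PostgreSQL implementation): each virtual tuple ID maps to the newest version in an append-only log, every modification appends a version record that links back to its predecessor, and the older version is thereby invalidated implicitly, never updated in place. Field names and the simplified visibility check are assumptions made for illustration only.

```python
from collections import namedtuple

Version = namedtuple("Version", "vid value created_xid prev_slot")

class AppendStore:
    def __init__(self):
        self.log = []      # append-only storage, one entry per tuple version
        self.head = {}     # VID -> log slot of the newest version (chain head)

    def write(self, vid, value, xid):
        prev_slot = self.head.get(vid)               # link to the older version
        self.log.append(Version(vid, value, xid, prev_slot))
        self.head[vid] = len(self.log) - 1           # new chain head

    def read(self, vid, snapshot_xid):
        # Walk the chain newest-to-oldest until a version is visible to the
        # reader's snapshot (a deliberately simplified SI visibility rule).
        slot = self.head.get(vid)
        while slot is not None:
            version = self.log[slot]
            if version.created_xid <= snapshot_xid:
                return version.value
            slot = version.prev_slot
        return None

store = AppendStore()
store.write(vid=1, value="a", xid=10)
store.write(vid=1, value="b", xid=20)    # append only; implicitly invalidates "a"
assert store.read(1, snapshot_xid=15) == "a"
assert store.read(1, snapshot_xid=25) == "b"
```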

Paper Accepted at ICDE 2017

read more ...

Selective In-Place Appends for Real: Reducing Erases on Wear-prone DBMS Storage

S. Hardock, I. Petrov, R. Gottstein, A. Buchmann.
In Proc. ICDE 2017 [PDF] [Video]

Abstract: In the present paper we demonstrate a novel technique for applying the recently proposed approach of In-Place Appends (IPA) – overwrites on Flash without a prior erase operation. IPA can be applied selectively: only to DB-objects that have frequent and relatively small updates. To do so we couple IPA to the concept of NoFTL regions, allowing the DBA to place update-intensive DB-objects into special IPA-enabled regions. The decision about region configuration can be (semi-)automated by an advisor analyzing DB-log files in the background.

We showcase a Shore-MT based prototype of the above approach, operating on real Flash hardware. During the demonstration we allow the users to interact with the system and gain hands-on experience under different demonstration scenarios.

Paper Accepted at SIGMOD 2017

read more ...

From In-Place Updates to In-Place Appends: Revisiting Out-of-Place Updates on Flash.
S. Hardock, I. Petrov, R. Gottstein, A. Buchmann.
In Proc. SIGMOD 2017 [PDF]

Abstract: Under update-intensive workloads (TPC, LinkBench) small updates dominate the write behavior, e.g. 70% of all updates change less than 10 bytes across all TPC OLTP workloads. These are typically performed as in-place updates and result in random writes at page granularity, causing major write overhead on Flash storage, a write amplification of several hundred times and lower device longevity.

In this paper we propose an approach that transforms those small in-place updates into small update deltas that are appended to the original page. We utilize the commonly ignored fact that modern Flash memories (SLC, MLC, 3D NAND) can handle appends to already programmed physical pages by using various low-level techniques such as ISPP to avoid expensive erases and page migrations. Furthermore, we extend the traditional NSM page-layout with a delta-record area that can absorb those small updates. We propose a scheme to control the write behavior as well as the space allocation and sizing of database pages. We describe how the DBMS buffer and storage manager must be adapted to handle page operations.
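
A minimal sketch of the page-level idea follows; the delta encoding, the delta-area size and the consolidation policy are illustrative assumptions, and the Flash-level ISPP mechanics are not modelled.

```python
class IPAPage:
    """Toy NSM-style page with a reserved delta area for small updates."""

    def __init__(self, delta_capacity=8):
        self.records = {}                # slot -> bytearray with the base record image
        self.deltas = []                 # appended (slot, offset, bytes) update deltas
        self.delta_capacity = delta_capacity

    def insert(self, slot, record_bytes):
        self.records[slot] = bytearray(record_bytes)

    def update(self, slot, offset, new_bytes):
        if len(self.deltas) >= self.delta_capacity:
            self._consolidate()          # delta area full: fall back to rewriting the page
        self.deltas.append((slot, offset, bytes(new_bytes)))

    def read(self, slot):
        image = bytearray(self.records[slot])
        for s, off, data in self.deltas:             # replay deltas in append order
            if s == slot:
                image[off:off + len(data)] = data
        return bytes(image)

    def _consolidate(self):
        for s, off, data in self.deltas:
            self.records[s][off:off + len(data)] = data
        self.deltas.clear()              # in a real system this triggers a page write

page = IPAPage()
page.insert(0, b"balance=0100")
page.update(0, 8, b"0250")               # a 4-byte delta instead of a whole-page write
assert page.read(0) == b"balance=0250"
```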

The proposed approach has been implemented under Shore-MT and evaluated on real Flash hardware (OpenSSD) and a Flash emulator. Compared to In-Page Logging it performs up to 62% fewer reads and writes and up to 74% fewer erases on a range of workloads. The experimental evaluation indicates: (i) a significant reduction of erase operations, resulting in twice the longevity of Flash devices under update-intensive workloads; (ii) 15%-60% lower read/write I/O latencies; (iii) up to 45% higher transactional throughput; (iv) a 2x to 3x reduction in overall write amplification.

Paper Accepted at EDBT 2017

read more ...

In-Place Appends for Real: DBMS Overwrites on Flash without Erase
S. Hardock, I. Petrov, R. Gottstein, A. Buchmann. In Proc. EDBT 2017 [PDF]

Abstract: Flash SSD is THE second tier storage for DBMS nowadays. Compared to HDDs, it is faster, consumes less power, produces less heat, and is cheaper in terms of $/IOPS. Furthermore, the replacement of HDDs with SSDs is typically trivial since both kinds of storage devices utilize the same block-device interface and often even the same physical interfaces.

Recent research has shown that masking the management and the properties of native Flash memory using the black-box abstraction realized by the on-device Flash Translation Layer (FTL) significantly lowers the performance and endurance characteristics of Flash. The alternative is the utilization of open Flash interfaces. In this paper we follow this idea and propose an extension to it - the approach of In-Place Appends.

In the present paper we demonstrate a novel approach to handling small updates on Flash called In-Place Appends (IPA). It allows the DBMS to revisit the traditional write behavior on Flash. Instead of writing whole database pages upon an update in an out-of-place manner on Flash, we transform those small updates into update deltas and append them to a reserved area on the very same physical Flash page. In doing so we utilize the commonly ignored fact that under certain conditions Flash memories can support in-place updates to Flash pages without a preceding erase operation.

The approach was implemented under Shore-MT and evaluated on real hardware. Under standard update-intensive workloads we observed 67% less page invalidations resulting in 80% lower garbage collection overhead, which yields a 45% increase in transactional throughput, while doubling Flash longevity at the same time. The IPA outperforms In-Page Logging (IPL) by more than 50%.

We showcase a Shore-MT based prototype of the above approach, operating on real Flash hardware -- the OpenSSD Flash research platform. During the demonstration we allow the users to interact with the system and gain hands-on experience of its performance under different demonstration scenarios. These involve various workloads such as TPC-B, TPC-C or TATP.

Congrats Robert!

read more ...

DBlab congratulates Robert Gottstein on the occasion of the successful defence of his doctoral thesis entitled "Impact of new storage technologies on an OLTP DBMS, its architecture and algorithms". [PDF]

Abstract:

New developments in hardware storage technology introduce fundamentally different performance characteristics and device properties. Storage technologies such as Flash and Non-Volatile Memories (NVMs) are asymmetric in terms of their read and write performance: they read much faster than they write. Modern DBMSs are not aware of the underlying asymmetric storage technologies. They are well-developed systems and in principle capable of working with asymmetric storage technologies as a mere replacement, yet they fail to exploit their key properties. Huge performance potential lies idle and the durability of the storage media is shortened, which ultimately leads to higher costs. This work is a remedy for those shortcomings, making the DBMS aware of the underlying asymmetric Flash storage and questioning existing multi-version DBMS (MV-DBMS) architecture, algorithms and optimizations. We exploit the performance potential of the asymmetric Flash storage and increase its durability. A re-evaluation and redesign of components within the DBMS is necessary, inevitably leading to a redesign of the whole DBMS. Without such a redesign, the DBMS software stack will become the new I/O bottleneck. The combination of the MV-DBMS, multi-versioning concurrency control (MVCC) and append/log-based storage management (LbSM) on Flash storage delivers the optimal performance figures needed to satisfy the urgent demand for scalable performance in modern DBMSs.

mhp-Award for Best Bachelor's Project WS14/15

read more ...

Congratulations!

The students Felix Heldmaier, Florian Grötzner, Niels Shuchmacher, Samuel Sailer, Steffen Höser, Florian Hofstädter, Yannik Scheible, Yu-Ninig Wang (exchange student, NCTU, Taiwan) and Jiawei Liu (exchange student, Donghua University, Shanghai, China) won the mhp-Award for their Bachelor's Project "Performance analysis of search algorithms for database indices".


Project Description:

The goal of this project was to experimentally investigate the performance behaviour of database indices for differently distributed data. The workload consists of classical database queries such as point and range queries.

The focus was on determining the influence of data distributions on index performance. To this end, data generators were designed that produce data sets of variable size according to nine predefined statistical distributions: Weibull, Cosine, Cauchy, Normal, Lognormal, Exponential, Doublelog, Parabolic, Extreme Value. In this way, real-world data in database systems was approximated. In addition, statistical test methods (e.g. the Chi-squared test) were implemented to validate the actual distribution of the generated data.

The generated data was loaded into several database systems and indices were created on it. Index performance was measured with purpose-built micro-benchmarks. Each database system was brought into an initial state (ramp-up phase), after which a series of different database queries was executed repeatedly and their response times were measured. The measurement results are presented graphically.

The complete test bench, consisting of data generators, benchmarks and result visualization, offers both a graphical user interface for personal use and a command-line interface for server deployment.
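
A small sketch of the two building blocks described above, assuming NumPy and SciPy are available (our illustration, not the students' tool): generate a data set that follows a chosen distribution, then validate the sample with a chi-squared goodness-of-fit test against equi-probable bins of that distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample = rng.weibull(1.5, size=100_000)          # Weibull-distributed key values

# Build 20 equi-probable bins from the theoretical quantiles of the distribution.
edges = stats.weibull_min.ppf(np.linspace(0.0, 1.0, 21), c=1.5)
edges[-1] = sample.max() + 1.0                   # ppf(1.0) is infinite; cap it
observed, _ = np.histogram(sample, bins=edges)
expected = np.full(len(observed), sample.size / len(observed))

chi2, p_value = stats.chisquare(observed, expected)
print(f"chi2 = {chi2:.1f}, p = {p_value:.3f}")   # a large p-value: the fit is not rejected
```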

Paper accepted at EDBT 2015

read more ...

NoFTL for Real: Databases on Real Native Flash Storage. In Proc. 18th International Conference on Extending Database Technology, Brussels, Belgium (EDBT 2015). [PDF] [Video]

Abstract:

Flash SSDs are omnipresent as database storage. HDD replacement is seamless since Flash SSDs implement the same legacy hardware and software interfaces to enable backward compatibility. Yet, the price paid is high as backward compatibility masks the native behaviour, incurs significant complexity and decreases I/O performance, making it non-robust and unpredictable. Flash SSDs are black boxes. Although DBMS have ample mechanisms to control hardware directly and utilize the performance potential of Flash memory, the legacy interfaces and black-box architecture of Flash devices prevent them from doing so.
In this paper we demonstrate NoFTL, an approach that enables native Flash access and integrates parts of the Flash-management functionality into the DBMS yielding significant performance increase and simplification of the I/O stack. NoFTL is implemented on real hardware based on the OpenSSD research platform. The contributions of this paper include: (i) a description of the NoFTL native Flash storage architecture; (ii) its integration in Shore-MT and (iii) performance evaluation of NoFTL on a real Flash SSD and on an on-line data-driven Flash emulator under TPC-B, C, E and H workloads. The performance evaluation results indicate an improvement of at least 2.4x on real hardware over conventional Flash storage, as well as better utilisation of native Flash parallelism.

Associated Members

read more ...

DBlab welcomes Robert Gottstein and Sergey Hardock as associated members. Their research interests are in the field of data management on modern storage technologies. Both of them are affiliated with the Databases and Distributed Systems Group (DVS) at Technische Universität Darmstadt.

Associated Member of DFG GK 1994 AIPHES

read more ...

Ilia Petrov is appointed an associated member of DFG GK 1994 AIPHES: Adaptive Preparation of Information from Heterogeneous Sources.
He is responsible for database integration and high-performance data processing.

Ph.D. Scholarship

read more ...

DBlab participates with one Ph.D. position in the newly established graduate school "Services Computing" in cooperation with Universität Stuttgart. The research direction is in the field of Big Data and high-performance data management and analytics.

Paper Accepted at EDBT 2016

read more ...

Revisiting DBMS Space Management for Native Flash.
S. Hardock, I. Petrov, R. Gottstein, A. Buchmann.
In Proc. EDBT 2016.

Paper accepted at iiWAS 2015

read more ...

Real Time Charging Database Benchmarking.
J. Bogner, C. Dehner, T. Vincon, I. Petrov.
In Proc. iiWAS 2015.

Best Paper Award at DBKDA 2015

read more ...

Best paper award for
Tim Lessner, Fritz Laux
O|R|P|E - A Data Semantics Driven Concurrency Control Mechanism

In Proceedings DBKDA 2015 - The Seventh International Conference on Advances in Databases, Knowledge, and Data Applications, pp 147-152, May 24-29, 2015 - Rome, Italy

Paper Accepted at ICDE 2015

read more ...

27.11.2014 Paper Accepted at ICDE 2015

I. Petrov, R. Gottstein, S. Hardock. DBMS on Modern Storage Hardware. In Proc. International Conference on Data Engineering (ICDE) 2015.

[Slides]

Abstract:

In the present tutorial we perform a cross-cut analysis of database systems from the perspective of modern storage technology, namely Flash memory. We argue that the design of modern DBMS and the architecture of Flash storage technologies are not aligned with each other. The result is needlessly suboptimal DBMS performance and inefficient Flash utilisation as well as low Flash storage endurance and reliability.

We showcase new DBMS approaches with improved algorithms and leaner architectures, designed to leverage the properties of modern storage technologies. We cover the area of transaction management and multi-versioning, putting a special emphasis on: (i) version organisation models and invalidation mechanisms in multi-versioning DBMS; (ii) Flash storage management especially on append-based storage in tuple granularity; (iii) Flash-friendly buffer management; as well as (iv) improvements in the searching and indexing models.
Furthermore, we present our NoFTL approach to native Flash access that integrates parts of the Flash-management functionality into the DBMS yielding significant performance increase and simplification of the I/O stack. In addition, we cover the basics of building large Flash storage for DBMS and revisit some of the RAID techniques and principles.

DBlab Talk

read more ...

Ilia Petrov is giving a talk "Advances in Flashing the Database Storage" hosted by the GI local group Stuttgart/Böblingen. The event takes place on Monday, 6 Oct. 2014, 18:15 – 20:00 at Uni-Stuttgart.

Tim Lessner completed PhD

read more ...

Tim Lessner was awarded Doctor of Philosophy (Ph.D.)

Tim Lessner was awarded Doctor of Philosophy (Ph.D.) from the University of the West of Scotland, Paisley, in collaboration with Reutlingen University. The title of his thesis is "O|R|P|E - A High Performance Semantic Transaction Model for Disconnected Systems". From the Abstract: "The thesis studies concurrency control and composition of transactions in computing environments for long living transactions where local data autonomy is indispensable".

Congratulations, Dr. Tim!

Best Paper Award at DBKDA 2014

read more ...

Best paper award for
Michael Schaidnagel, Fritz Laux
Feature Construction for Time Ordered Data Sequences
In Proceedings DBKDA 2014 - The Sixth International Conference on Advances in Databases, Knowledge, and Data Applications, pp 1-6, April 20-24, 2014 - Chamonix, France

PC Memberships

read more ...

Members of the DBlab are invited to serve on the programme committees of DATA 2014, INTERNET 2014 and WETICE 2014.

Paper Accepted at EDBT 2014

read more ...

The paper "SIAS-V in Action: Snapshot Isolation Append Storage - Vectors on Flash" by Robert Gottstein, Thorsten Peter, Ilia Petrov and Alejandro Buchmann has been accepted for publication at the "17th International Conference on Extending Database Technology" (EDBT 2014 - Demonstrations Track), held in Athens, Greece on March 24-28, 2014. [Demonstration Video]

Abstract:

Multi-Version Database Management Systems (MV-DBMS) are wide-spread and can effectively address the characteristics of new storage technologies such as Flash, yet they are mainly optimized for traditional storage. A modification of a tuple in a MV-DBMS results in a new version of that item and the invalidation of the old version. Under Snapshot Isolation (SI) the invalidation is performed as an in-place update, which is suboptimal for Flash. We introduce Snapshot Isolation Append Storage – Vectors (SIAS-V), which avoids the invalidation-related updates by organising tuple versions as a simple linked list and by utilizing bitmap vectors representing different states of a single version. SIAS-V sequentializes writes and reduces the write overhead by appending in tuple-version granularity, writing out only completely filled pages, and eliminating in-place invalidation.

In this demonstration we showcase the SIAS-V implementation in PostgreSQL side by side with SI. Firstly, we demonstrate that the I/O distribution of PostgreSQL under a TPC-C style workload exhibits a dominant small-sequential write pattern for SIAS-V, as opposed to a random-write-dominated pattern under SI. Secondly, we demonstrate how the dense packing of tuple versions on pages under SIAS-V significantly reduces the amount of data written. Thirdly, we show that SIAS-V yields stable write performance and low transaction response times under mixed loads. Last but not least, we demonstrate that SIAS-V also provides performance improvements for traditional HDDs.

Book Chapter to Appear

read more ...

Khalid Nawaz, Ilia Petrov, Alejandro Buchmann. Configurable, Energy-Efficient, Application- and Channel-aware Middleware Approaches for Cyber-Physical Systems. In Zeashan Khan, A. B. M. Shawkat Ali, Zahid Riaz: Computational Intelligence for Decision Support in Cyber-Physical Systems, Studies in Computational Intelligence 540, ISBN 978-981-4585-35-4, Springer, July 2014

Background:

Cyber-Physical Systems represent a class of systems that are composed of computing devices that monitor and control the real world physical processes. The monitoring task requires these devices to be equipped with sensing elements which provide the primary input in the form of raw data to the computing elements of the system. The output of the computing element is generally channeled to the actuation elements of the system for the desired actuation to take place. The actuation serves as the controlling mechanism for the monitored physical process and also closes the sensing-processing-actuation loop. The interaction of these systems with the physical processes introduces some challenges regarding the physical characteristics such as shape, size and robustness of the devices in addition to the more challenging problem of impedance mismatch between the inherently concurrent physical processes and inherently sequential computing processes. In order for these systems to perform monitoring and control functions on the physical processes, networking of the computing elements, generally on an ad hoc basis, is also necessary. In a nutshell, cyber-physical systems are composed of networked embedded computing elements that are equipped with sensing and actuation capabilities so that they can monitor and control physical real-world processes. In Illustration 1 we show a simplified schematic of a typical cyber-physical system indicating the information flows between sensing, processing and actuation parts. It also shows two optional user interface elements, one is used for configuring the devices and the other one to present end users with their desired information. Such user interface components mostly find use in intelligent industrial automation systems.

DBlab Talk

read more ...

Dr. Christoph P. Neumann (EXASOL AG). EXASolution, an Analytical Database System, 100% Made in Germany. When: 21.01.2014 at 9:45, Room 9-005


Abstract: Big Data is on everyone's lips and many companies are practically drowning in the data they have accumulated. Yet to master these data one needs the right tools. With its product EXASolution, EXASOL offers a massively parallel in-memory database management system (DBMS) for column-oriented management and SQL-based, transactionally protected processing of very large relational data volumes. In-database processing additionally allows unstructured or semi-structured data to be processed massively in parallel in the DBMS cluster. This talk provides insights into the work of the German database vendor EXASOL and its high-performance database system EXASolution.

Short Bio:

Dr. Christoph P. Neumann is a technical consultant at EXASOL AG in Nürnberg. He studied at Friedrich-Alexander-Universität Erlangen-Nürnberg and received his Diplom in computer science in 2005. He then worked for two years as a software engineer at Capgemini sd&m AG in Munich. In 2007 he returned to Friedrich-Alexander-Universität and pursued his doctorate under Prof. Dr. Richard Lenz at the Chair of Computer Science 6 (Data Management). For his teaching there he received two awards for excellent teaching. His research interests include adaptive-evolutionary information systems, process management and agile process planning, as well as methods for system integration and distributed data synchronization. He completed his doctorate in November 2012 on distributed case files in healthcare, graded "passed with distinction"; his dissertation was also nominated for the "GI-Dissertationspreis 2012". Since March 2013 he has been working at EXASOL, an IT company based in Nürnberg that offers the demonstrably fastest relational database in the world for analytical applications. In his role as technical consultant he is responsible for conducting proof-of-concept projects with customers in the presales phase.

Paper Accepted at IJAS

read more ...

Robert Gottstein, Ilia Petrov, Alejandro Buchmann. Multi-Version Databases on Flash: Append Storage and Access Paths. In International Journal On Advances in Software, Vol. 6, Numbers 3 and 4, 2013.

DBlab Talk

read more ...

Götz Graefe, Ph.D., HP Fellow, Hewlett-Packard Laboratories, will give a talk "Instant Recovery From System Failures" on 28.10.2013 at 13:00 in Room 9-039.

Title:
Instant Recovery From System Failures

Who:
Götz Graefe, Ph.D., HP Fellow, Hewlett-Packard Laboratories

When:
28.10.2013 at 13:00 in Room 9-039

Abstract:
Database system failures and the subsequent recovery disrupt many transactions and entire applications, almost always for an extended duration. For those failures, new on-demand "instant" recovery techniques reduce application downtime from minutes or hours to seconds. These new recovery techniques work for databases, file systems, key-value stores, and all other data stores that employ write-ahead logging. Most of the required techniques already exist in many transactional information systems.

Short CV:
Götz Graefe's contributions to database research and product development include query optimization in the Exodus research effort and in the Tandem SQL/MX product, query execution in the Volcano research prototype, and query processing in Microsoft's SQL Server product. In addition to query processing, his work has covered indexing, in particular novel techniques for B-trees, robust performance in query processing, for example a new integrated join algorithm, and transaction support, for example a new scheme for key-range locking. One of his current work streams focuses on database utilities, for example faster backup, restore, and recovery.

Paper Accepted at IDEAS 2013

read more ...

G. Graefe, I. Petrov, T. Ivanov, V. Marinov. A hybrid page layout integrating PAX and NSM. In Proc. IDEAS 2013.

Abstract:

Prior work on in-page record formats has contrasted the “N-ary storage model” (NSM) and the “partition attributes across” (PAX) format. The former is the traditional standard page layout whereas the latter “exhibits superior cache and memory bandwidth utilization”, e.g., in data warehouse queries with large scans. Unfortunately, space management within each page is more complex due to the mini-pages in the PAX layout. In contrast, the NSM format simply grows a slot array and the data space from opposite ends of the page until all space is occupied.

The present paper explores a hybrid page layout (HPL) that aims to combine the advantages of NSM and PAX. Predicate evaluation in large scan queries has the same number of cache faults as PAX, and space management uses two data areas growing towards each other. Moreover, the design defines a continuum between NSM and PAX in order to support both efficient scans and efficient insertions and updates. This design is equally applicable to cache lines within RAM memory (the original design goal of PAX) and to small pages on flash storage within large disk pages.

Our experimental evaluation is based on a Shore-MT implementation. It demonstrates that the HPL design scans almost as fast as the scan-optimized PAX layout and updates almost as fast as the update-optimized NSM layout, i.e., it is competitive with both in their best use cases.
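
The NSM-PAX continuum can be illustrated with a short sketch (a simplification under our own assumptions, not the paper's exact page layout): records are grouped into segments of a configurable size and stored attribute-major inside each segment, so a group size of one behaves like NSM, while a group size as large as the page behaves like PAX and lets a scan touch only the requested attribute.

```python
class HybridPage:
    """Toy page layout with a tunable knob between NSM and PAX."""

    def __init__(self, num_attrs, group_size):
        self.num_attrs = num_attrs
        self.group_size = group_size     # 1 ~ NSM, page capacity ~ PAX
        self.segments = []               # each segment: one value list per attribute

    def insert(self, record):
        assert len(record) == self.num_attrs
        if not self.segments or len(self.segments[-1][0]) == self.group_size:
            self.segments.append([[] for _ in range(self.num_attrs)])
        segment = self.segments[-1]
        for attr, value in enumerate(record):        # column-wise within the segment
            segment[attr].append(value)

    def scan_attribute(self, attr):
        # PAX-like scan: touches only the requested attribute's mini-columns.
        for segment in self.segments:
            yield from segment[attr]

nsm_like = HybridPage(num_attrs=3, group_size=1)
pax_like = HybridPage(num_attrs=3, group_size=1000)
for page in (nsm_like, pax_like):
    page.insert((1, "alice", 10.0))
    page.insert((2, "bob", 20.0))
assert list(pax_like.scan_attribute(2)) == [10.0, 20.0]
```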

Paper Accepted at IDEAS 2013

read more ...

R. Gottstein, I. Petrov, A. Buchmann. Read Optimisations for Append Storage on Flash. In Proc. IDEAS 2013.

Abstract:

Append-/Log-based Storage Managers (LbSM) for database systems represent a good match for the characteristics and behaviour of Flash technology. LbSM alleviate random writes, reducing the impact of Flash read/write asymmetry and increasing endurance and performance. A recently proposed combination of Multi-Versioning database approaches and LbSM called SIAS [9] offers further benefits: it substantially lowers the write rate due to tuple version append granularity and therefore improves the performance. In SIAS a page contains versions of tuples of the same table. Once appended such a page is immutable. The only allowable operations are reads (lookups, scans, version visibility checks) in tuple version granularity. Optimising for them offers an essential performance increase. In the present work-in-progress paper we propose two types of read optimisations: Multi-Version Index and Ordered Log Storage.

Benefits of Ordered Log Storage: (i) read efficiency due to the use of parallel read streams; (ii) write efficiency since larger amounts of data are appended sequentially; (iii) fast garbage collection: read multiple sorted runs, filter dead tuples and write one single, large (combined) sorted run; (iv) possible cache-efficiency optimisations (for large scans).

Benefits of Multi-Version Indexing: (i) index-only visibility checks; (ii) postponing of index reorganisations; (iii) no invalid tuple bits in the index (in-place updates); (iv) pre-filtering of invisible tuple versions; (v) easy identification of tuple versions to be garbage collected.

Benefits of the combination of both approaches: (i) index and ordered access; (ii) range searches in sorted runs; (iii) on-the-fly garbage collection (checking of one bit).

Paper Accepted at ADMS@VLDB 2013

read more ...

P. Dubs, I. Petrov, R. Gottstein, A. Buchmann. FBARC: I/O Asymmetry Aware Buffer Replacement Strategy. In Proc. ADMS 2013, in Conjunction with VLDB 2013

Abstract:

Buffer Management is central to database systems; it minimizes the access gap between memory and disk. The primary criterion of most buffer management strategies is hitrate maximization (based on recency, frequency). New storage technologies exhibit characteristics such as read/write asymmetry and low read latency. These have a significant impact on the buffer manager: due to asymmetry the cost of page eviction may be several times higher than the cost of fetching a page. Hence buffer management strategies for modern storage technologies must consider write-awareness and spatial locality besides hitrate.

In this paper we introduce FBARC - a buffer management strategy designed to address I/O asymmetry on Flash devices. FBARC is based on ARC and extends it by a write list that utilizes the spatial locality of evicted pages to produce semi-sequential write patterns. FBARC adds an additional list to host dirty pages, grouping them into fixed regions called clusters based on their disk location. In comparison to LRU, CFLRU, CFDC, and FOR+, FBARC: (i) addresses write-efficiency and endurance; (ii) offers comparatively high hitrate; (iii) is computationally efficient and uses static grid-based clustering of the page eviction list; (iv) adapts to workload changes; (v) is scan-resistant. Our experimental evaluation compares FBARC against LRU, CFLRU, CFDC, and FOR+ using trace-driven simulation, based on standard benchmark traces (e.g. TPC-C, TPC-H).
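
The write-list idea can be sketched in isolation (the ARC hit-rate machinery is deliberately left out, and the cluster size and victim policy are illustrative assumptions): dirty pages selected for eviction are parked in clusters keyed by their on-disk location and later flushed as one sorted, semi-sequential batch.

```python
from collections import defaultdict

class WriteList:
    """Toy version of FBARC's write list: cluster dirty pages by disk location."""

    def __init__(self, cluster_size=256):
        self.cluster_size = cluster_size
        self.clusters = defaultdict(dict)   # cluster id -> {page_no: page_data}

    def add_dirty(self, page_no, page_data):
        self.clusters[page_no // self.cluster_size][page_no] = page_data

    def flush_victim_cluster(self, write_page):
        if not self.clusters:
            return
        # Pick the fullest cluster so a single flush frees the most buffer frames.
        victim = max(self.clusters, key=lambda c: len(self.clusters[c]))
        for page_no in sorted(self.clusters.pop(victim)):
            write_page(page_no)             # ascending page numbers -> semi-sequential I/O

wl = WriteList(cluster_size=4)
for page_no in (9, 8, 11, 3, 100):
    wl.add_dirty(page_no, b"...")
flushed = []
wl.flush_victim_cluster(flushed.append)
assert flushed == [8, 9, 11]                # pages 8, 9 and 11 share one cluster
```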

PC Memberships

read more ...

Members of the DBlab serve on the program committee of CISIS-2014 (Parallel Computing Track).

Paper Accepted at GMDS 2013

read more ...

C. Thies, I. Petrov. Hospital Information Systems on High Performance and Energy Efficient Database Systems. In Proc. GMDS 2013. 58. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Lübeck, 01.-05.09.2013. Düsseldorf: German Medical Science GMS Publishing House; 2013.

Current Hospital Information Systems (HIS) face operational challenges such as long downtimes/data inconsistency during system updates, high energy costs, system slow-down during periods of high load and incomplete failover solutions with lots of costly and error prone manual post-processing. A bottleneck for all of these problems is accessing the HIS database and its consistency. To overcome these problems the software architecture (SWA) of HIS currently only uses DBMS specific features. Considering the most frequent types of transactions from a HIS the database itself typically has a monolithic design. Here a higher integration of hardware, DBMS and HIS architecture will increase efficiency, reliability and decrease HIS downtime. This is achievable by using recent development of new memory technologies and affordable hardware as well as a systematic overview of sophisticated database concepts and their specific integration in HIS architectures. This work describes the impact of such new concepts on the main challenges in HIS data management and HIS SWA.

Paper Accepted at SoEA4EE'2013

read more ...

A. Zimmermann, M. Pretz, G. Zimmermann, D. Firesmith, I. Petrov, E. El-Sheikh. Towards Service-oriented Enterprise Architectures for Big Data Applications in the Cloud. In Proc. SoEA4EE'2013 in conjunction with EDOC 2013

Paper Accepted at IJDWM

read more ...

G. Graefe, A. Nica, K. Stolze, T. Neumann, T. Eavis, I. Petrov, E. Pourabbas, D. Fekete. Elasticity in Cloud Databases and Their Query Processing. In International Journal of Data Warehousing and Mining (IJDWM) 9.2 (2013)

Abstract: A central promise of cloud services is elastic, on-demand provisioning. For data-intensive services such as data management, growing and shrinking the set of nodes implies copying data to nodes with temporary membership in a service. The provisioning of data on temporarily available nodes is what makes elastic database services a hard problem. At best, a node might retain (not destroy) its copy of the data while it provides another service; at worst, a node that rejoins the database service (or joins for the first time, or joins after a prior failure) requires a new copy of all its assigned data. The essential task that enables elastic data services is bringing a node and its data up-to-date. Strategies for high availability do not satisfy the need in this context because they bring nodes online and up-to-date by repeating history, e.g., by log shipping. We believe that nodes should become up-to-date and useful for query processing incrementally by key range. What is wanted is a technique such that in a newly added node, during each short period of time, an additional small key range becomes up-to-date, until eventually the entire dataset becomes up-to-date and useful for query processing, with overall update performance comparable to a traditional high-availability strategy that carries the entire dataset forward without regard to key ranges. Even without the entire dataset being available, the node is productive and participates in query processing tasks. Our proposed solution relies on techniques from partitioned B-trees, adaptive merging, deferred maintenance of secondary indexes and of materialized views, and query optimization using materialized views. The paper introduces a family of maintenance strategies for temporarily available copies, the space of possible query execution plans and their cost functions, and appropriate query optimization techniques.
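
A minimal sketch of the incremental, key-range-wise catch-up idea (our illustration of the stated goal, not the paper's algorithm built on partitioned B-trees and adaptive merging): a newly added node brings one small key range up-to-date per step and can already answer queries for the ranges it has completed.

```python
class ElasticNode:
    """Toy node that becomes useful for query processing one key range at a time."""

    def __init__(self, key_ranges):
        self.pending = list(key_ranges)   # ranges still to be copied/brought up-to-date
        self.ready = []                   # ranges already usable for queries

    def catch_up_step(self):
        if self.pending:
            self.ready.append(self.pending.pop(0))   # catch up one small range

    def can_answer(self, key):
        return any(lo <= key < hi for lo, hi in self.ready)

node = ElasticNode([(0, 100), (100, 200), (200, 300)])
node.catch_up_step()
assert node.can_answer(50) and not node.can_answer(150)
```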

Paper Accepted at Data Analytics

read more ...

M. Schaidnagel, I. Petrov, F. Laux. DNA: An Online Algorithm for Credit Card Fraud Detection for Game Merchants. In Proc. Data Analytics 2013

Abstract: Online credit card fraud represents a significant challenge to online merchants. In 2011 alone, the total loss due to credit card fraud amounted to $7.60 billion with a clear upward trend. Especially online games merchants have difficulties applying standard fraud detection algorithms to achieve timely and accurate detection. The present paper introduces a novel approach for online fraud detection, called DNA. It is based on a formula which uses attributes that are derived from a sequence of transactions. The influence of these attributes on the result of the formula reveals additional information about this sequence. The result represents a fraud level indicator, serving as a classification threshold. A systematic approach for finding these attributes and the mode of operation of the algorithm is given in detail. The experimental evaluation against several standard algorithms on a real-life data set demonstrates the superior fraud detection performance of the DNA approach (16.25% better fraud detection accuracy, 99.59% precision and low response time). In addition to that, several experiments were conducted in order to show the good scalability of the suggested algorithm.

08.08.2013 DFG Flashy-DB Extended

read more ...

The DFG (Deutsche Forschungsgemeinschaft) has extended the research project FlashyDB. FlashyDB aims to investigate the influence of Flash memory on database architecture, performance and algorithms.

06.08.2013 Program Committee Memberships

read more ...

Members of the DBlab serve on the program committee of SIGMOD 2014 (Demo Track).

DBlab Talk

read more ...

Dr. Knut Stolze, Architect IBM DB2 Analytics Accelerator, IBM Deutschland, will give a talk 'Managing Large Data Volumes Efficiently with IBM Netezza' on 18.06.2013 at 11:30 in Room 9-003.

Title:
Managing Large Data Volumes Efficiently with IBM Netezza

Who:
Dr. Knut Stolze, Architect IBM DB2 Analytics Accelerator, IBM Deutschland

When:
Tue, 18.06.2013 at 11:30 | Room 9-003

Abstract [PDF]:
Netezza is a highly specialized database management system for data warehousing operations. In this presentation, Dr. Knut Stolze gives an overview of its system architecture and the internal query processing. It is shown how very good performance can be delivered with a very simple user interface that avoids indexes. Next, the IBM DB2 Analytics Accelerator is presented, an integration project and commercial product that combines the strengths of Netezza's analytic query processing capabilities with DB2's superior OLTP performance. Knut highlights how the integration of both products is achieved in a (nearly) seamless way.

About the Speaker:
Dr. Knut Stolze is working for the Information Management department in the IBM Research & Development Lab in Böblingen, Germany. He focuses on relational database systems, specifically large-scale data warehouse systems. He gained his expertise and experience in academic and industrial research and in product development. His current research efforts focus on enterprise data warehouse systems, in particular technologies like in-memory, specialty hardware for high performance query processing, and database federation. Knut Stolze is a senior software developer and master inventor at IBM. In his role as an architect, he is responsible for the design and implementation of the IBM DB2 Analytics Accelerator for z/OS. Prior to the current project, Dr. Stolze worked in the DB2 Spatial Extender development team, earned his PhD at the University of Jena, Germany, in 2006, and subsequently moved on to the DB2 z/OS Utilities development.

Accepted at VLDB 2013

read more ...

S. Hardock, I. Petrov, R. Gottstein, A. Buchmann. NoFTL: Database Systems on FTL-less Flash Storage. VLDB 2013 (Demonstrations Track). Riva del Garda, August 26-31, 2013. [Demonstration Video]

Abstract

The database architecture and workhorse algorithms have been designed to compensate for hard disk properties. The I/O characteristics of Flash memories have a significant impact on database systems, and many algorithms and approaches taking advantage of them have been proposed recently. Nonetheless, at the system level Flash storage devices are still treated as HDD-compatible block devices, black boxes and fast HDD replacements. This backwards compatibility (both software and hardware) masks the native behaviour, incurs significant complexity and decreases I/O performance, making it non-robust and unpredictable. Database systems have a long tradition of operating natively on raw storage, utilising the physical characteristics of storage media to improve performance.

In this paper we demonstrate an approach called NoFTL that goes a step further. We show that allowing for native Flash access and integrating parts of the FTL functionality into the database system yields significant performance increase and simplification of the I/O stack. We created a real-time data-driven Flash emulator and integrated it accordingly into Shore-MT. We demonstrate a performance improvement of up to 3.7x compared to Shore-MT on RAW block-device Flash storage under various TPC workloads.

Accepted Paper

read more ...

R. Gottstein, I. Petrov, and A. Buchmann. Append storage in multi-version databases on flash. In Proc. of BNCOD 2013. Springer-Verlag, 2013.

DBKDA Papers

read more ...

El-Sheikh, E., Bagui, S., Firesmith, D.G., Petrov, I., Wilde, N., Zimmermann, A.: Towards Semantic-Supported SmartLife System Architectures for Big Data Services in the Cloud. In Proc. Service Computation'13, (2013)

PC Memberships

read more ...

iiWAS2013 and ACM PIKM 2013

Members of the DBlab serve on the program committees of iiWAS 2013 and PIKM 2013, held at ACM CIKM 2013.

DBlab Technical Report

read more ...

G. Graefe, I. Petrov, T. Ivanov, V. Marinov. A hybrid page layout integrating PAX and NSM. Technical Report (HPL-2012-240). 2012

A technical report (HPL-2012-240) entitled 'A hybrid page layout integrating PAX and NSM' has been published as a cooperation between Hewlett-Packard Laboratories, the DBlab at Reutlingen University, and DVS at Technische Universität Darmstadt.
The report is available online at: http://www.hpl.hp.com/techreports/2012/HPL-2012-240.html

DBKDA Papers

read more ...

DBKDA Paper Accepted

Robert Gottstein, Ilia Petrov, Alejandro Buchmann. Aspects of Append-Based Database Storage Management on Flash Memories. In Proc. DBKDA 2013.

DBlab Talk

read more ...

Robert Gottstein (Databases and Distributed Systems Group, TU-Darmstadt) will give a talk on the influence of new storage technologies on database systems.

Title: Data Intensive Systems on New Storage Technologies [PDF]
When: 13.12.2012 at 13:00
Where: 9-108.

Abstract: [PDF]

As new storage technologies with radically different properties are appearing (Flash and Non-Volatile Memories), a substantial architectural redesign is required if they are to be used efficiently in a high-performance data-intensive system.

Multi-Version approaches to database systems (MVCC, SI) are gaining significant importance and are becoming a dominant trend. They not only offer characteristics that meet the requirements of enterprise workloads, but also provide concepts that can effectively address the properties of new storage technologies. Yet version management may produce unnecessary random writes, which are suboptimal for the new technologies.

A variant of SI called SI-CV collocates tuple versions, created by a transaction, in adjacent blocks and minimizes random writes at the cost of random reads. Its performance, relative to the original algorithm, in overloaded systems under heavy transactional loads in TPC-C scenarios on Flash SSD storage increases significantly. At high loads that bring the original system into overload, the transactional throughput of SI-CV increases further, while maintaining response times that are multiple factors lower. SI produces a new version of a data item once it is modified. Both the new and the old version are timestamped accordingly, which in many cases results in two independent (physical) update operations, entailing multiple random writes as well as in-place updates. These are also suboptimal for new storage technologies both in terms of performance and endurance.

We claim that the combination of multi-versioning and append storage effectively addresses the characteristics of modern storage technologies. Snapshot Isolation Append Storage (SIAS) improves on SI and traditional "page granularity" append-based storage managers. It manages versions as singly linked lists (chains) that are addressed using a virtual tuple ID (VID). In SIAS the creation of a new version implicitly invalidates the old one, resulting in an out-of-place write implemented as a logical append and eliminating the need for invalidation timestamps. SIAS is coupled to an append-based storage manager, appending units of tuple versions. SIAS indicates up to 4x performance improvement on Flash SSD under a TPC-C workload, entailed by a significant write overhead reduction (up to 38x). SIAS achieves better space utilization due to denser version packing per page and allows for better I/O parallelism and up to 4x lower disk I/O execution times. SIAS aids better endurance, due to the use of out-of-place writes as appends and the write overhead reduction. Compared to traditional page granularity appends, SIAS achieves up to 85% higher read throughput and up to 38x write reduction.

DBKDA Paper

read more ...

DBKDA Paper Accepted

Christian Abele, Michael Schaidnagel, Fritz Laux, Ilia Petrov. Sales Prediction with Parametrized Time Series Analysis. In Proc. DBKDA 2013.

Data Mining Cup 2012

read more ...

27.06.2012

Michael Schaidnagel and Christian Abele, students in the Data Management Lab, earned 7th place in the overall ranking for the second assignment of the Data Mining Cup 2012.

DBlab Web Page

read more ...

19.10.2012

DBlab Web-Page created.