This project is part of the BBDC (Berlin Big Data Center) and focuses on efficient data management, especially for temporary data, during high-volume data analysis.
We use our distributed file system XtreemFS as a basis and adapt it to the demands of BBDC data analysis.

Publications

2021
Combining XOR and Partner Checkpointing for Resilient Multilevel Checkpoint/Restart. 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 277-288, 2021. Masoud Gholami, Florian Schintke.
BBDC WP12
2018
From Application to Disk: Tracing I/O Through the Big Data Stack. High Performance Computing: ISC High Performance 2018 International Workshops, Frankfurt/Main, Germany, June 24-28, 2018, Revised Selected Papers, Workshop on Performance and Scalability of Storage Systems (WOPSSS), pp. 89-102, 2018. Robert Schmidtke, Florian Schintke, Thorsten Schütt.
BBDC WP12
2016
Big Data Analytics on Cray XC Series DataWarp using Hadoop, Spark and Flink. CUG Proceedings, 2016. Robert Schmidtke, Guido Laubender, Thomas Steinke.
BBDC WP12