As part of the Collaborative Research Center FONDA, ZIB is involved in the subproject "B4: Exploiting Software-Defined Networks for Efficient Data Management in Next-Generation Data Analysis Workflows".

Running a Data Analysis Workflow (DAW) on a computational infrastructure other than the one it was developed for often incurs severe performance penalties. One reason is that DAWs are typically designed for a specific infrastructure, which leads to hard-coded decisions about file locations, file movement, and the means of network-based data exchange between tasks. This subproject will investigate the use of software-defined networks (SDNs) to bring the data access requirements of a DAW and the capabilities of the underlying physical infrastructure closer together. It thus aims to improve the portability and adaptability of DAW execution engines by adapting the underlying infrastructure. Technically, it will develop a lightweight declarative specification language for annotating DAWs with their communication and computation demands, which connects to the work of A2 in the related field of data access patterns. It will furthermore cooperate with A2 on annotations for specifying data access properties and with B1 on the interplay of file placement and scheduling.

Publications

2024
Validity constraints for data analysis workflows. Future Generation Computer Systems, Vol. 157, pp. 82-97, 2024. Florian Schintke, Khalid Belhajjame, Ninon De Mecquenem, David Frantz, Vanessa Emanuela Guarino, Marcus Hilbrich, Fabian Lehmann, Paolo Missier, Rebecca Sattler, Jan Arne Sparka, Daniel T. Speckhard, Hermann Stolte, Anh Duc Vu, Ulf Leser
2023
CAWL: A Cache-aware Write Performance Model of Linux Systems. arXiv, pp. 1-22, 2023. Masoud Gholami, Florian Schintke
Proactive Resource Management to Optimize Distributed Workflow Executions. IEEE International Conference on Big Data, BigData 2023, Sorrento, Italy, December 15-18, 2023, pp. 6305-6307, 2023. Joel Witzke, Florian Schintke, Ansgar Lößer, Björn Scheuermann
Validity Constraints for Data Analysis Workflows. arXiv, pp. 1-28, 2023. Florian Schintke, Ninon De Mecquenem, Vanessa Emanuela Guarino, Marcus Hilbrich, Fabian Lehmann, Rebecca Sattler, Jan Arne Sparka, Daniel Speckhard, Hermann Stolte, Anh Duc Vu, Ulf Leser
2022
BottleMod: Modeling Data Flows and Tasks for Fast Bottleneck Analysis. arXiv, pp. 1-20, 2022. Ansgar Lößer, Joel Witzke, Florian Schintke, Björn Scheuermann
BottleMod: Modeling Data Flows and Tasks for Fast Bottleneck Analysis. 2022 IEEE International Conference on Big Data, Posters, 2022. Ansgar Lößer, Joel Witzke, Florian Schintke, Björn Scheuermann
IOSIG: Declarative I/O-Stream Properties Using Pragmas. Datenbank-Spektrum, 22(2), pp. 109-119, 2022. Masoud Gholami, Florian Schintke
Towards Advanced Monitoring for Scientific Workflows. 2022 IEEE International Conference on Big Data, 11th Workshop on Scalable Cloud Data Management, 2022. Jonathan Bader, Joel Witzke, Soeren Becker, Ansgar Lößer, Fabian Lehmann, Leon Doehler, Anh Duc Vu, Odej Kao
2021
The Collaborative Research Center FONDA. Datenbank-Spektrum, 21(3), pp. 255-260, 2021. Ulf Leser, Marcus Hilbrich, Claudia Draxl, Peter Eisert, Lars Grunske, Patrick Hostert, Dagmar Kainmüller, Odej Kao, Birte Kehr, Timo Kehrer, Christoph Koch, Volker Markl, Henning Meyerhenke, Tilmann Rabl, Alexander Reinefeld, Knut Reinert, Kerstin Ritter, Björn Scheuermann, Florian Schintke, Nicole Schweikardt, Matthias Weidlich