Wende, Florian

Position: Guest
Room: 3154
Division:
Department:
Research group:
Mail: wende@zib.de
Phone: +49 30 84185-323
Fax: +49 30 84185-125
Publications
2021
Jakob Schneck, Martin Weiser, Florian Wende | Impact of mixed precision and storage layout on additive Schwarz smoothers | Numerical Linear Algebra with Applications, 2021 (accepted for publication, preprint available as ZIB-Report 18-62)
2020
Samer Alhaddad, Jens Förstner, Stefan Groth, Daniel Grünewald, Yevgen Grynko, Frank Hannig, Tobias Kenter, Franz-Josef Pfreundt, Christian Plessl, Merlind Schotte, Thomas Steinke, Jürgen Teich, Martin Weiser, Florian Wende | HighPerMeshes - A Domain-Specific Language for Numerical Algorithms on Unstructured Grids | Euro-Par 2020, 2020 (accepted for publication)
2019
Florian Wende | C++ Data Layout Abstractions through Proxy Types | 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 14th International Workshop on Automatic Performance Tuning (iWAPT), pp. 758-767, 2019
2018
Florian Wende, Martijn Marsman, Jeongnim Kim, Fedor Vasilev, Zhengji Zhao, Thomas Steinke | OpenMP in VASP: Threading and SIMD | International Journal of Quantum Chemistry, p. e25851, 2018 (in press)
2017
Matthias Noack, Florian Wende, Georg Zitzlsberger, Michael Klemm, Thomas Steinke | KART - A Runtime Compilation Library for Improving HPC Application Performance | High Performance Computing: ISC High Performance 2017 International Workshops, DRBSD, ExaComm, HCPM, HPC-IODC, IWOPH, IXPUG, P^3MA, VHPC, Visualization at Scale, WOPSSS, Frankfurt, Germany, June 18-22, 2017, Revised Selected Papers, Springer International Publishing, pp. 389-403, 2017 (preprint available as ZIB-Report 16-48)
Zhengji Zhao, Martijn Marsman, Florian Wende, Jeongnim Kim | Performance of Hybrid MPI/OpenMP VASP on Cray XC40 Based on Intel Knights Landing Many Integrated Core Architecture | CUG Conference Proceedings, 2017
Florian Wende, Martijn Marsman, Zhengji Zhao, Jeongnim Kim | Porting VASP from MPI to MPI+OpenMP [SIMD] | Scaling OpenMP for Exascale Performance and Portability - 13th International Workshop on OpenMP, IWOMP 2017, Stony Brook, NY, USA, September 20-22, 2017, pp. 107-122, Vol.8766, LNCS, 2017
Helge Knoop, Tobias Gronemeier, Matthias Sühring, Peter Steinbach, Matthias Noack, Florian Wende, Thomas Steinke, Christoph Knigge, Siegfried Raasch, Klaus Ketelsen | Porting the MPI-parallelized LES model PALM to multi-GPU systems and many integrated core processors: an experience report | International Journal of Computational Science and Engineering. Special Issue on: Novel Strategies for Programming Accelerators, 2017 (accepted for publication on 2017-04-29)
2016
Olaf Krzikalla, Florian Wende, Markus Höhnerbach | Dynamic SIMD Vector Lane Scheduling | High Performance Computing, ISC High Performance 2016 International Workshops, ExaComm, E-MuCoCoS, HPC-IODC, IXPUG, IWOPH, P^3MA, VHPC, WOPSSS, pp. 354-365, Vol.9945, LNCS, 2016
Florian Wende, Martijn Marsman, Thomas Steinke | On Enhancing 3D-FFT Performance in VASP | CUG Proceedings, 2016
Florian Wende, Matthias Noack, Thomas Steinke, Michael Klemm, Georg Zitzlsberger, Chris J. Newburn | Portable SIMD Performance with OpenMP* 4.x Compiler Directives | Euro-Par 2016: Parallel Processing, 22nd International Conference on Parallel and Distributed Computing, Pierre-Francois Dutot, Denis Trystram (Eds.), LNCS, 2016, ISBN: 978-3-319-43659-3
2015
Florian Wende, Matthias Noack, Thorsten Schütt, Stephen Sachs, Thomas Steinke | Application Performance on a Cray XC30 Evaluation System with Xeon Phi Coprocessors at HLRN-III | Cray User Group, 2015
Matthias Noack, Florian Wende, Klaus-Dieter Oertel | OpenCL: There and Back Again | High Performance Parallelism Pearls, James Reinders, Jim Jeffers (Eds.), Morgan Kaufmann, Elsevier, pp. 355-378, 2015, ISBN: 978-0-12-803819-2
Florian Wende | SIMD Enabled Functions on Intel Xeon CPU and Intel Xeon Phi Coprocessor | ZIB-Report 15-17
Florian Wende, Thomas Steinke, Alexander Reinefeld | The Impact of Process Placement and Oversubscription on Application Performance: A Case Study for Exascale Computing | Proceedings of the 3rd International Conference on Exascale Applications and Software, EASC 2015, A. Gray, L. Smith, M. Weiland (Eds.), pp. 13-18, 2015, ISBN: 978-0-9926615-1-9 (preprint available as ZIB-Report 15-05)
2014
Matthias Noack, Florian Wende, Thomas Steinke, Frank Cordes | A Unified Programming Model for Intra- and Inter-Node Offloading on Xeon Phi Clusters | SC '14: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. SC14, November 16-21, 2014, New Orleans, Louisiana, USA, 2014
Florian Wende, Frank Cordes, Thomas Steinke | Concurrent Kernel Execution on Xeon Phi within Parallel Heterogeneous Workloads | Euro-Par 2014: Parallel Processing. 20th International Conference, Porto, Portugal, August 25-29, 2014, Proceedings, pp. 788-799, Vol.8632, Lecture Notes in Computer Science, 2014
Florian Wende, Thomas Steinke, Michael Klemm, Alexander Reinefeld | Concurrent Kernel Offloading | High Performance Parallelism Pearls, James Reinders, Jim Jeffers (Eds.), Morgan Kaufmann, Elsevier, 2014, ISBN: 978-0128021187 (in press)
Florian Wende, Guido Laubender, Thomas Steinke | Integration of Intel Xeon Phi Servers into the HLRN-III Complex: Experiences, Performance and Lessons Learned | CUG2014 Proceedings, 2014 (preprint available as ZIB-Report 14-15)
Florian Wende, Thomas Steinke, Frank Cordes | Multi-threaded Kernel Offloading to GPGPU Using Hyper-Q on Kepler Architecture | ZIB-Report 14-19
2013
Florian Wende | Dynamic Load Balancing on Massively Parallel Computer Architectures | Bachelor's thesis, Freie Universität Berlin, Helmut Alt, Alexander Reinefeld, Thomas Steinke (Advisors), 2013
Florian Wende, Thomas Steinke | Swendsen-Wang Multi-Cluster Algorithm for the 2D/3D Ising Model on Xeon Phi and GPU | SC '13: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Article No. 83, ACM, New York, NY, USA, 2013 (preprint available as ZIB-Report 13-44)
2012
Florian Wende, Frank Cordes, Thomas Steinke | On Improving the Performance of Multi-threaded CUDA Applications with Concurrent Kernel Execution by Kernel Reordering | Application Accelerators in High Performance Computing (SAAHPC), 2012 Symposium on, pp. 74-83, 2012
2010
Florian Wende | Simulation of Spin Models on Nvidia Graphics Cards using CUDA | Master's thesis, Humboldt-Universität zu Berlin, Ulrich Wolff, Michael Müller-Preussker, Hinnerk Stüben (Advisors), 2010