When mathematicians combine mathematical theory with real-world applications, mathematical research software complements chalk and blackboard, or pencil and paper. Today, mathematical software plays a central role in research and key technologies and has an increasing impact on mathematical education.

Since its creation more than 30 years ago, ZIB has promoted this combination as a significant extension of scientific work. This has resulted in many software packages, some of which are of outstanding importance to the scientific community. Prominent examples today include SCIP in the area of optimization, KASKADE for numerical mathematics, and AMIRA for scientific visualization and data analysis. An overview of software developed at ZIB can be found here: http://www.zib.de/software.

Software Sustainability is a new working area that has been gaining momentum in the past few years. With the continuing shift toward data-centric science, the software utilized to produce the scientific output is increasingly regarded as an equally important “product of science” itself.

A general understanding is evolving that not only a publication, but also the documented and available datasets should get scientific credit. Moreover, there is a growing consensus in the research-data-management community that research software could be considered as datasets (i.e. research data). Publication and reuse of software are necessary to ensure validity and reproducibility and to drive further discoveries. Since common research-data-management practice fosters publication along the principles of Open Access, software should also be made publicly available, ideally in accordance with the FAIR principles of research-data management: data should be Findable, Accessible, Interoperable, and Reusable.

Transferring these principles to the management of research software is a true challenge, especially because software is characterized by an exceptional variety of dependencies on other entities (hardware, OS, algorithms, etc.), which makes it very difficult to define an encapsulated status as a datum. In addition, software is a living entity that evolves over time. The concept of Software Sustainability involves development, deployment, maintenance, and publication efforts, with the aim of ensuring the ongoing functionality of research software. The incorporation of Software Sustainability methods at ZIB is part of increasing efforts to conduct research in accordance with open-science principles.

Software Sustainability at ZIB

Search interface of ZIB OPUS institutional repository

Establishing sustainable software development practices requires a variety of infrastructural and organizational measures. The following approaches, services, and project results form the basis and starting point for these measures.

In recent years, there have been several activities to promote research-data management. One of the essential first steps was the formation of a focus group that supports the handling of research data and is composed of researchers from different divisions and departments at ZIB. This group has started to identify common needs across all departments and divisions. It acts as a contact point for all questions related to research-data management, including the preparation of data-management plans. Its most important action item is the publication of research data and research software.

A twofold strategy is being pursued to increase the visibility of the software produced. In addition to the project page (i.e., the product page of the research software, which is maintained by the respective research group), all released versions of a software package are intended to be published as research data in the institutional repository OPUS (https://opus4.kobv.de/opus4-zib/home). In this context, the Kooperativer Bibliotheksverbund Berlin-Brandenburg (KOBV), part of ZIB since 1997, provides support for OPUS.

OPUS is utilized to present landing pages (called front doors in OPUS) for specific software with an additional set of metadata and a download link. The software packages will reside on a dedicated download server. OPUS itself is an open-source product of ZIB, published under a GNU General Public License. Current work on OPUS involves other important features with regard to open-science principles: automatic DOI minting and ORCID implementation.

Another aspect of the long-term availability of research software is being addressed by the experimental application of digital preservation techniques. The OAIS-compliant digital preservation system EWIG at ZIB, which is composed of freely available open-source components, is used as a research-software source-code archive, accompanied by extensive metadata. To date, the incorporation of digital preservation strategies within the context of software sustainability has not been straightforward. There is a gap to be closed between software archiving and maintenance approaches on the one hand, and long-term aspects of digital curation of software on the other. 

A further step toward software sustainability at ZIB involves the use of the swMATH portal, which analyzes mathematical publications for software citations. The aim is to identify the connections between research software and the scientific articles that present results based on the use of this software.

swMATH – An Open-Access Database for Mathematical Software

How does one find software for a specific mathematical problem? Is there already a solution or an implementation? What is the mathematical background? Who are the authors? What hardware do you need? Is there any documentation? Is it free for educational use or do you need a license? Moreover, it is also a real problem to cite software when writing a scientific paper. Where to find the software? Which version was used? Is it accessible? Can you reproduce the results? 

The project swMATH (www.swmath.org) is an attempt to develop and establish an information service for mathematical software and mathematical research data. It started as a project of Mathematisches Forschungsinstitut Oberwolfach (MFO) and FIZ Karlsruhe, and is presently continued as a project of the Research Campus MODAL at ZIB. swMATH provides information about mathematical software and its mathematical background. It is intended to improve the visibility of software and to strengthen the role of software within mathematics. Although swMATH focuses on software, it also lists benchmarks, data collections, and manuals.

Central Idea: Publication-Based Approach

The most informative and relevant secondary source for information about mathematical software is the corresponding scientific literature. Therefore, the most complete bibliographic database of mathematical literature, zbMATH (the former Zentralblatt MATH), is used as a basis to extract information about mathematical software. The fact that zbMATH covers almost all mathematical journals with a focus on mathematical software is of crucial importance. 

zbMATH provides a review or a summary, characteristic key phrases, the reference lists, and a classification of the mathematical subjects and application areas of mathematical publications. Information about the authors, the sources, and the language of each publication is also presented. Today, the database zbMATH stores the bibliographic data of nearly four million peer-reviewed mathematical publications, with an increase of 10,000 items per month. Because the underlying publications are peer-reviewed, this approach generally ensures quality control.

Analyzing zbMATH 

The manual maintenance of Web databases is both expensive and time-consuming. Therefore, the use of machine-based methods for data analysis and content generation has been a significant aspect of the design of the swMATH service from its beginning. Heuristic methods have been developed to identify and analyze software information in zbMATH entries. The information from publications citing a software is aggregated into a profile of the software and its context, covering use cases, mathematical background, acceptance, life cycle, related software, and a list of publications citing the software. The publication-based approach is the basic step in the swMATH workflow and provides general information about mathematical software. However, details such as source code, versions, the technical environment, license information, documentation, manuals, or installation guides, as well as links to related benchmarks or data collections, are missing.
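The identification and aggregation step described above can be illustrated with a minimal sketch. The actual swMATH heuristics are not described in this text, so the cue words, the name pattern, and the sample entries below are purely hypothetical assumptions chosen for illustration.

```python
import re
from collections import defaultdict

# Hypothetical cue words that often accompany a software name in an abstract.
CUES = r"(?:software|package|solver|library|toolbox|program)"
# Hypothetical name pattern: starts with a capital letter and contains at
# least one further capital letter or digit (e.g. "SCIP", "KASKADE").
NAME = r"([A-Z][A-Za-z0-9]*[A-Z0-9][A-Za-z0-9]*)"

PATTERNS = [
    re.compile(rf"\b{CUES}\s+{NAME}\b"),  # e.g. "the solver SCIP"
    re.compile(rf"\b{NAME}\s+{CUES}\b"),  # e.g. "the SCIP solver"
]


def build_profiles(entries):
    """Map each detected software name to the entries that mention it."""
    profiles = defaultdict(list)
    for entry_id, text in entries.items():
        names = set()
        for pattern in PATTERNS:
            names.update(pattern.findall(text))
        for name in sorted(names):
            profiles[name].append(entry_id)
    return dict(profiles)


# Invented sample texts standing in for bibliographic entries.
entries = {
    "zb1": "We solve the model with the solver SCIP and compare running times.",
    "zb2": "Adaptive finite elements are implemented in the KASKADE package.",
    "zb3": "Results obtained with the SCIP solver confirm the bound.",
    "zb4": "A purely theoretical note on integer programming.",
}
profiles = build_profiles(entries)
print(profiles)
```

A pattern-based heuristic of this kind misses names written in ordinary capitalization and can produce false positives, which is one reason why such automatically generated profiles are usually complemented by further analysis.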

Web-Based Approach 

This kind of information can be found on the websites related to the software, in repositories, or on portals that provide information about and access to software for a special subject. Therefore, some concepts for capturing and analyzing further information about software from websites, software repositories, and Internet archives have been developed. In principle, we are faced with the same tasks as in the publication-based approach: identifying the relevant information, then analyzing and structuring the information about a software. The difference is that the information has to be extracted from the Web instead of from publications.

The swMATH pages represent an online portal to retrieve information regarding mathematical software: they provide general information about the software, namely the profile of a software derived from the publication-based approach and links to other relevant Web resources that contain more detailed information. Each swMATH page has a unique identifier, which can be used for the citation of a software. 

Finding Software in swMATH

swMATH offers several ways to find software. Based on the referenced publications, swMATH analyzes the abstracts and the MSC classification for each software entry. This allows software to be searched by keyword (e.g. “integer programming”) or by browsing through the MSC classification scheme (e.g. “90C10”). Additionally, swMATH classifies all entries by type, for example, programming languages, benchmarks, data collections, or Web services. Last but not least, you also have access to documentation and manuals wherever they are accessible.
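The two search modes described above, keyword search and browsing by MSC code, can be sketched on a small in-memory catalogue. This is only an illustration under stated assumptions: the record layout, the function names, and the two sample entries ("SolverA", "MeshToolB") are hypothetical and do not reflect the actual swMATH data model or interface.

```python
from dataclasses import dataclass, field


@dataclass
class SoftwareEntry:
    """Hypothetical catalogue record derived from the citing publications."""
    name: str
    keywords: set = field(default_factory=set)
    msc_codes: set = field(default_factory=set)


def search_by_keyword(catalogue, phrase):
    """Return names of entries having a keyword that contains the phrase."""
    phrase = phrase.lower()
    return [e.name for e in catalogue
            if any(phrase in k.lower() for k in e.keywords)]


def browse_by_msc(catalogue, prefix):
    """Return names of entries with an MSC code starting with the prefix."""
    return [e.name for e in catalogue
            if any(code.startswith(prefix) for code in e.msc_codes)]


# Invented sample entries; names and assignments are placeholders.
catalogue = [
    SoftwareEntry("SolverA", {"integer programming", "branch and bound"}, {"90C10"}),
    SoftwareEntry("MeshToolB", {"finite elements"}, {"65N30"}),
]

print(search_by_keyword(catalogue, "integer programming"))
print(browse_by_msc(catalogue, "90C"))
```

Matching MSC codes by prefix mirrors browsing a hierarchical classification: a shorter prefix such as "90" selects a whole subject area, while a full code such as "90C10" selects one class.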

An important side effect of analyzing research papers is the discovery of related software: if software S1 is referred to by paper P1, and paper P2 refers not only to software S1 but also to software S2, we conclude that S1 is related to S2. This helps to identify more than one possible software solution for a given problem.
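The relatedness rule with S1, S2, P1, and P2 amounts to a co-citation relation: two software packages are related if at least one paper refers to both. A minimal sketch follows, assuming the citation data is available as a mapping from paper identifiers to sets of software names; the function name and data layout are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def related_software(citations):
    """Map each software name to the set of software co-cited with it.

    `citations` maps a paper identifier to the set of software it refers to.
    """
    related = defaultdict(set)
    for software in citations.values():
        # Every pair of software cited in the same paper is related.
        for a, b in combinations(sorted(software), 2):
            related[a].add(b)
            related[b].add(a)
    return dict(related)


# The example from the text: P1 refers to S1, P2 refers to S1 and S2.
citations = {
    "P1": {"S1"},
    "P2": {"S1", "S2"},
}
related = related_software(citations)
print(related)
```

In this sketch a single shared paper is enough to relate two packages; counting the number of shared papers would be a natural refinement for ranking related software.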

How To Cite Software?

When analyzing zbMATH articles, you will find different citation styles for software. Software companies, repositories, and publishers give different recommendations for citing software or research data. swMATH will close this gap. In cooperation with related communities, e.g. the Software Citation Working Group of the FORCE11 Initiative, we are working on standards for citing software and research data.
