Funder: Netherlands eScience Center (Spearhead Project, Open eScience Call 2021)
Principal Investigators: Anna-Lena Lamprecht (University of Potsdam), Magnus Palmblad (Leiden University Medical Center)


Scientific workflows are systematic, structured processes that researchers use to organize, automate, and document their computational experiments or data analysis tasks. These tasks can be executed sequentially, in parallel, or in a distributed computing environment. Workflows can involve various software tools, libraries, and data sources, allowing researchers to integrate diverse resources and automate the execution of their experiments. Composing purpose-specific workflows from the wealth of available resources can be a tedious and challenging endeavor. In particular, creating optimal workflows for specific data analysis problems requires two interacting activities: exploring the latest relevant tool combinations, and benchmarking selected workflow candidates against reference data to determine the best-performing ones. Because adequate tooling is lacking, this is rarely done systematically, and as a result many workflows compromise on scientific quality.

To tackle this problem, we have started the Workflomics project. The term "workflomics" combines "workflows" and "omics", following the pattern of other scientific disciplines such as genomics and metabolomics, and thus means the systematic, large-scale study of workflows. With support from the Netherlands eScience Center (NLeSC), we are developing a novel software system that puts the workflomics idea into practice. Its key contribution will be a new integration of tools and metadata with technologies for automated workflow exploration and benchmarking. The system will provide a much-needed platform for systematic workflow generation and evaluation that complements, and can be interfaced with, existing state-of-the-art workflow systems.