GCB 2025

Workshops

All workshops will take place on Monday, 22 September 2025.

Overview:

WS1) Bioinformatics education
WS2) Leveraging Cloud Computing for Bioinformatics: A SimpleVM Workshop Featuring a Metagenomics Use Case
WS3) From a Collection of Scripts to a Pipeline – Writing Nextflow Workflows with nf-core Best Practices
WS4) Datavzrd: Low-Code, Maintenance-Free Visualization and Communication of Tabular Data
WS5) Mastering GHGA: From Metadata Preparation to Secure Data Access
WS6) Computational Pangenomics
WS7) Automated metabolic modelling: Building, analysing and simulating genome-scale metabolic models in Python
WS8) Spatial domain identification: computational methods for discovering tissue architecture

 

Detailed Workshop Programme:

WS1) Bioinformatics education

Organizers: Jan Grau (MLU Halle); Stefan Kurtz (Universität Hamburg); Kay Nieselt (Eberhard Karls Universität Tübingen); Sven Rahmann (Universität des Saarlandes, Saarbrücken); Ralf Zimmer (Ludwig-Maximilians-Universität München)

Participants: max. 30

Description: This workshop will bring together people involved in bioinformatics education. Currently, bioinformatik.de lists 38 B.Sc. programs with substantial bioinformatics content (14 with bioinformatics as the major topic) and 35 M.Sc. programs (17 with a bioinformatics major) in Germany. These programs put varying emphasis on certain bioinformatics topics and skills and have different access requirements. In previous workshops at GCB 2023 and GCB 2024, we collected an overview of bioinformatics B.Sc. and M.Sc. programs in Germany and discussed essential, dispensable and desirable topics in bioinformatics B.Sc. programs, as summarized in a workshop synopsis. This year, we would like to follow up on the results of the previous workshops to arrive at common standards for a bioinformatics B.Sc. These shall serve as a guideline when developing new or updating existing B.Sc. bioinformatics programs. In the long run, such a guideline might play a role similar to the recommendations for Computer Science B.Sc. programs issued by the GI and the “Fakultätentag Informatik”. Specifically, we would like to discuss:

  • Which essential topics and skills should be covered by a B.Sc. in bioinformatics with regard to
    • Mathematics
    • Computer Science, incl. theory and application of Machine Learning / Artificial Intelligence
    • Life Sciences
    • Core Bioinformatics
  • How should ECTS/CPs be distributed among these topics? What minimum proportion should core bioinformatics modules account for?
  • What is a reasonable balance between theoretical and practical (wet lab, programming, etc.) courses?
  • What proportion of ECTS/CPs should remain flexible to
    • allow for setting a university-specific focus/specialization or
    • allow students to follow their topics of interest?
  • What language requirements should be stated? Should the B.Sc. be taught (entirely) in English?

Target audience: Persons involved in bioinformatics education on the study program development and/or implementation level, as well as student representatives.

Provisional schedule:

  • Summary of the results of previous workshops, open questions (Jan Grau, Stefan Kurtz, Kay Nieselt, Sven Rahmann, Ralf Zimmer)
  • Joint discussion on topics stated above
  • Drafting a core B.Sc. bioinformatics program

WS2) Leveraging Cloud Computing for Bioinformatics: A SimpleVM Workshop Featuring a Metagenomics Use Case

Organizers: Peter Belmann(1), David Weinholz(1), Viktor Rudko(1), Qiqi Mok(1)

(1) IBG-5: Computational Metagenomics, Institute of Bio- and Geosciences (IBG), Research Center Jülich GmbH, Germany

Participants: max. 30

Participants must have basic Linux command-line skills and be registered with the de.NBI Cloud (https://cloud.denbi.de/wiki/registration/#denbi-cloud-access-registration-guide). If you have any questions, please contact the organizers.

Description: SimpleVM is a self-service platform within the OpenStack-based de.NBI Cloud, designed to simplify access to computational resources for life sciences research. It offers a variety of computational options, including basic data processing, GPU-accelerated machine learning, and cluster computing, all secured by an intrusion prevention system (IPS). SimpleVM also provides pre-configured Virtual Research Environments (VREs) accessible via web browsers or SSH, encompassing integrated development environments (IDEs) and data notebooks.

In this workshop, participants will delve into a metagenomics use case, where they will learn how to scale their analysis using SimpleVM. The workshop is designed to provide both theoretical knowledge and hands-on experience with cloud computing and the advanced features of SimpleVM. Participants will use VREs, SimpleVM Cluster and S3 to search for a genome of interest in the metagenome SRA mirror of the de.NBI Cloud site Bielefeld.

This workshop is tailored for researchers and educators seeking to optimize their computational tasks. Whether you’re a seasoned professional or just starting out, SimpleVM’s intuitive platform and robust features will empower you to achieve more in less time. The only requirements are basic knowledge of the Linux command line and a de.NBI Cloud account (see the registration link above).

Provisional Schedule:

  1. Introduction to SimpleVM and Cloud Computing
    Provide an overview of SimpleVM and cloud computing basics.
  2. Hands-On Session: Starting Your First VM
    Guide participants in launching and configuring their first virtual machine on SimpleVM.
  3. Metagenomics Use Case: Practical Application of SimpleVM
    Introduce a real-world scenario to demonstrate the application of SimpleVM in metagenomics analysis.
  4. Hands-On Session: Installing and Testing Tools
    Equip participants with essential tools needed for the metagenomics use case.
  5. S3 Object Storage: Efficient Data Management in SimpleVM
    Explore the usage of S3-compatible object storage within SimpleVM for scalable data management.
  6. Hands-On Session: Searching in SRA Mirror and Scaling Analysis
    Practice accessing public datasets and scaling analyses using SimpleVM’s features.
  7. Advanced Features: VRE and Cluster Modes in SimpleVM
    Delve into advanced functionalities of SimpleVM, including VREs and Cluster mode.
  8. Hands-On Session: Visualizing Results with VRE and Scaling Analyses Further
    Apply advanced techniques to visualize data and extend the scalability of analyses.

WS3) From a Collection of Scripts to a Pipeline – Writing Nextflow Workflows with nf-core Best Practices

Organizers: Mark Polster, Famke Bäuerle & Sven Nahnsen (University of Tübingen)

Participants: max. 20 - participants must bring their own laptops

Description: Bioinformatics analyses often begin as a set of scattered scripts, but scaling them into reproducible and maintainable workflows can be challenging. In this hands-on workshop we aim to guide you through transforming your scripts into a robust Nextflow pipeline using nf-core components and best practices.

We will cover essential topics such as pipeline structuring, version control, and best practices for collaboration and reproducibility. Whether you’re new to Nextflow or looking to refine your workflow development skills, this workshop will provide practical insights and hands-on experience to help you understand and utilize the nf-core framework for your own research.
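The script-to-pipeline transition this workshop addresses can be illustrated with a deliberately simplified sketch in plain Python (not Nextflow; the step names, file names, and caching logic below are invented for illustration): a pipeline is a set of steps with declared inputs and outputs, and outputs that are already up to date are not recomputed on a re-run, which is roughly what Nextflow's `-resume` provides.

```python
from pathlib import Path

def step(inputs, output, command):
    """Run `command` only if `output` is missing or older than an input;
    a crude stand-in for the caching a workflow engine provides."""
    out = Path(output)
    if out.exists() and all(out.stat().st_mtime >= Path(i).stat().st_mtime
                            for i in inputs):
        return output  # up to date: skip recomputation
    command(inputs, out)
    return output

# Placeholder "tools"; real steps would invoke bioinformatics software.
def trim(inputs, out):
    out.write_text(Path(inputs[0]).read_text().upper())

def count(inputs, out):
    out.write_text(str(len(Path(inputs[0]).read_text())))

# Chain steps by wiring outputs to inputs, much like Nextflow channels.
Path("reads.txt").write_text("acgt" * 3)
trimmed = step(["reads.txt"], "trimmed.txt", trim)
report = step([trimmed], "report.txt", count)
```

A workflow engine adds on top of this sketch what hand-rolled scripts rarely get right: containerized software environments, parallel execution, and portable configuration.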

Provisional schedule:

  • Nextflow principles
  • nf-core and its relationship to Nextflow
  • exploration of an example pipeline
  • hands-on creation of a pipeline
    • using the nf-core template
    • exploration of available modules
    • writing new modules
  • nf-test for continuous integration

WS4) Datavzrd: Low-Code, Maintenance-Free Visualization and Communication of Tabular Data

Organizers:  Johannes Köster (Bioinformatics and Computational Oncology, University of Duisburg-Essen); Felix Wiegand (Bioinformatics and Computational Oncology, University of Duisburg-Essen)

Participants: max. 30. Participants must bring their own laptop and should bring their own tabular data for the hands-on session.

Description: Tabular data is central to scientific analysis, but effectively communicating and visualizing it can be a challenge. In this hands-on workshop, we introduce Datavzrd, a low-code tool that enables the creation of interactive, portable reports from tabular data without the need for specialized software or server maintenance. Participants will receive an introduction to the core features of Datavzrd, followed by a step-by-step tutorial on how to use the tool for their own data. Attendees are encouraged to bring their own analysis or research data to configure a report tailored to their needs. This session aims to empower both computational and non-computational researchers to easily create and share interactive data visualizations that scale from small tables to large datasets.
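To give a flavour of what “low-code” means here, a Datavzrd report is driven by a small YAML configuration that points at a table and describes how to render it. The sketch below is hypothetical and its keys are reproduced from memory, so the exact schema should be checked against the official Datavzrd documentation:

```yaml
# Hypothetical minimal configuration (keys may differ between versions).
datasets:
  results:
    path: results.csv      # the tabular data to visualize
    separator: ","
views:
  results-overview:
    dataset: results
    render-table:
      columns:
        gene:
          label: Gene      # human-readable column header
```

The report is then generated with a single CLI call on the configuration file; because the output is a portable, self-contained set of HTML files, no server or ongoing maintenance is needed.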

Provisional schedule:

  • Introduction (15 minutes): Overview of Datavzrd and its capabilities.
  • Tutorial (60 minutes): Step-by-step tutorial based on the official Datavzrd tutorial, where participants will work with the provided example dataset or their own data to generate reports.
  • Q&A/Hands-on Session (45 minutes): Participants work with their own datasets, with guidance from organizers. 

WS5) Mastering GHGA: From Metadata Preparation to Secure Data Access

Organizers: Fritzi Maike Brück (DKFZ); Vanessa González Ribao (DKFZ); Anandhi Iyappan (EMBL); Julia Leimeister (University of Tübingen); Ulrike Träger (DKFZ)

Participants: max. 40

Description: As the volume of genomic and omics data continues to grow, the need for efficient and standardized data access, sharing, and submission processes has become critical. The German Human Genome-Phenome Archive (GHGA) provides a secure and GDPR-compliant infrastructure for storing and accessing omics data while ensuring adherence to FAIR (Findable, Accessible, Interoperable, and Reusable) principles. This workshop will guide participants through the key functionalities of the GHGA Data Portal, including browsing and searching for datasets, requesting access, adhering to best-practice data access guidelines, and preparing data for submission with a focus on legal, ethical, and metadata requirements.

Participants will gain hands-on experience with GHGA’s interface, develop an understanding of the principles governing data access, management and sharing in compliance with regulatory frameworks, and learn how to prepare and submit data efficiently. At the end of the workshop they will be able to confidently use GHGA for accessing and sharing data while ensuring legal and ethical compliance and to recognize the value of well-structured metadata in making data discoverable and reusable. This workshop is designed for researchers, data managers, and students who are looking to optimize their interaction with the GHGA Data Portal.
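To illustrate why well-structured metadata matters for submission, here is a minimal sketch of a pre-submission completeness check. The field names are invented for the example and are not the actual GHGA metadata schema:

```python
# Hypothetical required fields - the real GHGA submission schema
# defines its own, more extensive set.
REQUIRED = {"study_title", "dataset_description", "access_policy"}

def missing_fields(record):
    """Return the required metadata fields absent from `record`."""
    return REQUIRED - record.keys()

record = {"study_title": "Example study",
          "dataset_description": "hypothetical sequencing dataset"}
print(sorted(missing_fields(record)))  # the access policy is still missing
```

Checks of this kind, run before submission, catch omissions early and are one reason structured metadata makes archived data discoverable and reusable.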

Provisional schedule:

  • Introduction to GHGA & FAIR Principles (30 minutes overview of GHGA, its objectives, and the importance of FAIR data).
  • Metadata & Data Preparation for Submission (40 minutes hands-on session on how to structure metadata and format datasets).
  • Q&A Session (20 minutes open discussion).
  • Browsing & Searching for Data (20 minutes hands-on demonstration of the GHGA Data Portal interface).
  • Best Practices for Data Access (40 minutes interactive activities and discussion on ethics, GDPR compliance, and data request workflows).
  • Final Q&A Session and Closing Remarks (15 minutes Wrap-up discussion, feedback, and additional resources).

WS6) Computational Pangenomics

Organizers: Tizian Schulz (Bielefeld University); Jens Stoye (Bielefeld University); Andreas Rempel (Bielefeld University); Luca Parmigiani (Bielefeld University); Roland Wittler (Bielefeld University)

Participants: max. 20

Participants should have a basic understanding of Linux operating systems to participate in hands-on sessions of the workshop and must bring their own laptop.

Description: Computational pangenomics deals with the joint analysis of all genomic sequences of a species. Continuing advances in DNA sequencing technologies make more and more genomic sequences available for many species, making pangenomic studies increasingly attractive. Pangenomics approaches have already been successfully applied to various tasks in many research areas.

The focus of this workshop is to give participants an overview and understanding of commonly used pangenomics tools. Besides an introduction into the motivation and theory behind questions from the field of pangenomics, we will look at specific tools (such as panacus, Corer, and SANS) and let the participants explore their usage in hands-on sessions.
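The core/accessory distinction that tools such as Corer operationalize can be sketched with a toy presence/absence view of a pangenome. This is a simplified stand-in: the gene names and genomes below are invented, and actual tools operate on sequence data or pangenome graphs rather than on a gene table.

```python
from collections import Counter

# Hypothetical presence/absence table: which gene families occur in
# which genomes of the species.
genomes = {
    "genome1": {"dnaA", "gyrB", "recA", "blaX"},
    "genome2": {"dnaA", "gyrB", "recA"},
    "genome3": {"dnaA", "gyrB", "recA", "tetM"},
}

def classify_pangenome(genomes, core_fraction=1.0):
    """Split gene families into core (present in at least core_fraction
    of the genomes) and accessory (everything else)."""
    counts = Counter(g for genes in genomes.values() for g in genes)
    threshold = core_fraction * len(genomes)
    core = {g for g, c in counts.items() if c >= threshold}
    accessory = set(counts) - core
    return core, accessory

core, accessory = classify_pangenome(genomes)  # core: dnaA, gyrB, recA
```

Lowering `core_fraction` below 1.0 gives a "soft" core that tolerates incomplete assemblies, a common practical choice in pangenome analyses.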

Provisional schedule:

  • Introduction to computational pangenomics
  • Investigating a pangenome’s diversity with panacus and hands-on
  • Pangenomic core detection with Corer and hands-on
  • Querying a graphical pangenome with PLAST and hands-on
  • Phylogenomic reconstruction with SANS and hands-on

WS7) Automated metabolic modelling: Building, analysing and simulating genome-scale metabolic models in Python

Organizers: Carolin Brune (MLU); Gwendolyn O. Döbel (MLU); Prof. Dr. Andreas Dräger (MLU)

Participants: max. 30

Description: Systems biology seeks to understand organisms by modelling them as context-specific systems. One such system, the metabolic network, can be reconstructed using constraint-based modelling techniques. The resulting models encode mathematical constraints that allow cellular behaviour, such as growth under defined environmental conditions, to be simulated using flux balance analysis. These models are powerful tools to explore and test multiple hypotheses in silico – accelerating research into new drug targets, antibiotics, or biotechnological production strains while saving time and resources.
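The optimization idea behind flux balance analysis can be seen in a deliberately tiny example (standard-library Python only; the pathway and bounds are invented, and genome-scale models are instead solved with linear programming, e.g. via toolboxes built on COBRApy): in a linear pathway, the steady-state assumption forces all fluxes to be equal, so the maximal growth flux is set by the tightest bound.

```python
# Toy pathway: substrate --v1--> A --v2--> B --v3--> biomass
# Hypothetical flux upper bounds (e.g. uptake or enzyme capacities):
upper_bounds = {"v1_uptake": 10.0, "v2_conversion": 6.5, "v3_biomass": 8.0}

def max_growth_flux(bounds):
    """At steady state a linear chain has v1 = v2 = v3, so maximizing
    the biomass flux subject to the bounds yields the pathway
    bottleneck, i.e. the smallest upper bound."""
    return min(bounds.values())

print(max_growth_flux(upper_bounds))  # limited by v2_conversion: 6.5
```

Real networks are branched, so the optimum is found by a linear-programming solver rather than a simple minimum, but the principle of maximizing an objective flux under steady-state and capacity constraints is the same.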

However, the reconstruction and curation of genome-scale metabolic models involves numerous well-characterised steps, many of which remain challenging to automate reliably.

In this tutorial, participants will be introduced to the principles of genome-scale metabolic modelling and guided through the model curation process using the open-source toolbox refineGEMs. Building on this, the workflow collection SPECIMEN will be used to demonstrate how these steps can be automated to generate higher-quality models with drastically reduced manual effort. The session includes hands-on experiences and practical examples, enabling attendees to directly apply the tools to their research or learn about this type of modelling in general.

Provisional schedule:

        1. Session part (9:00 - 11:00 am)

  • Talk (~30 min): Introduction to systems biology and constraint-based metabolic modelling
  • Demonstration (~30 min): Using the toolbox refineGEMs for constraint-based metabolic modelling. This part will provide instructions on how to install the toolbox, how to collect the materials used for the next section, and how to use the main functionalities of the toolbox. The demonstration includes running code live to engage and prepare the participants for the next part of this session.
  • Hands-on session (~1 h): Working with refineGEMs. This session will include guided exercises for essential toolbox functionalities using Jupyter Notebook/Google Colab for quick and easy accessibility. Participants will be able to run numerous functionalities of refineGEMs by themselves, thereby gaining experience in genome-scale metabolic modelling.

        2. Session part (11:30 am - 12:30 pm)
  • Demonstration (~30 min): Using the workflow collection SPECIMEN for high-quality model reconstruction and curation. Similar in format to the previous demonstration, this part will focus on installing and working with the different workflows provided by SPECIMEN, and will also offer some insights on when to use which workflow.
  • Discussion (~30 min): Time for questions about the tools as well as discussion of their advantages and limits.

WS8) Spatial domain identification: computational methods for discovering tissue architecture

Organizer: Robin Khatri (University Medical Center Hamburg-Eppendorf)

Participants: max. 15

The workshop requires prior exposure to single-cell and/or spatial transcriptomics. Participants should be comfortable with basic programming tasks in Python. Attendees must bring their own laptops.

Description: This workshop will focus on computational approaches for identifying and characterizing spatial domains in single-cell spatial transcriptomics (ST) data. Spatial domains are tissue regions that share similar features, such as similar gene expression profiles and cell type abundances. For analysis of ST data, it is generally necessary to identify these domains to understand their dynamics under different tissue conditions, such as between healthy and disease states.

As ST data is increasingly used in research due to the benefit of in-situ identification of transcripts and cells, several computational approaches for spatial domain identification have been developed. In this workshop, participants will learn methods for unsupervised detection of spatial domains with distinct molecular signatures and understand techniques for biological interpretation of spatial domains along with associated caveats. Through hands-on tutorials, participants will learn about and apply state-of-the-art domain identification algorithms to real spatial transcriptomics datasets.
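The difference between plain clustering and spatial-domain identification can be sketched in a toy example (the coordinates, expression values, and single marker gene below are invented; real methods use full expression profiles and graph-based or probabilistic models): labels derived from expression alone are smoothed using spatial neighbourhoods, so that isolated noisy spots are absorbed into the surrounding domain.

```python
import math
from collections import Counter

# Hypothetical spots: (x, y) position plus one marker gene's expression.
spots = [
    ((0.0, 0.0), 0.1), ((0.0, 1.0), 0.2), ((1.0, 0.0), 0.15),
    ((1.0, 1.0), 2.9),  # a noisy "high" spot inside the low domain
    ((5.0, 5.0), 3.0), ((5.0, 6.0), 2.8), ((6.0, 5.0), 3.1), ((6.0, 6.0), 2.7),
]

def initial_labels(spots, threshold=1.0):
    """Naive per-spot labelling from expression alone (no spatial info)."""
    return ["high" if expr > threshold else "low" for _, expr in spots]

def smooth_labels(spots, labels, k=3):
    """Spatial smoothing: each spot takes the majority label among its k
    nearest neighbours (including itself). Using spatial context is the
    key idea distinguishing domain identification from plain clustering."""
    smoothed = []
    for (xi, yi), _ in spots:
        nearest = sorted(range(len(spots)),
                         key=lambda j: math.dist((xi, yi), spots[j][0]))[:k]
        majority = Counter(labels[j] for j in nearest).most_common(1)[0][0]
        smoothed.append(majority)
    return smoothed

labels = smooth_labels(spots, initial_labels(spots))
```

After smoothing, the lone high-expression spot at (1.0, 1.0) is reassigned to the surrounding "low" domain, while the coherent cluster around (5, 5) keeps its "high" label.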

Github link for materials: https://github.com/robinredX/spatial-workshop-GCB-2025 (please check the repository one week before the workshop).

Provisional schedule:

  • Introduction to spatial domain identification
  • Methodological foundations
  • Tutorial I—Preprocessing, implementation and evaluation metrics
  • Tutorial II—Analysis, biological interpretation and discussion
DECHEMA e.V.

 

Supported by

Heinrich Heine University Düsseldorf; FaBI; Gesellschaft für Biochemie und Molekularbiologie (GBM); West German Genome Center