hello-world.nf meets GitHub Actions

Automate testing and executing hello-world.nf using custom inputs with GitHub Actions

Mhoroi nyika!* It’s been more than a quarter since I learned Nextflow back in March 2024, and I’ve been wanting to combine it with continuous integration (CI). So it’s time to revisit the Hello World Nextflow tutorial. I’ve added a YAML file to that repo that automates running the workflow with custom inputs and saves the resulting artifacts.

Overview

The YAML file defines a GitHub Actions workflow that runs a Nextflow workflow with specific input parameters. I’ve set it up so that it is triggered manually via the workflow_dispatch event and performs the following key steps:

  1. Check out the repository code.
  2. Set up Java Development Kit (JDK).
  3. Set up and install Nextflow.
  4. Run a Nextflow workflow with custom inputs.
  5. Upload the resulting output files as artifacts for later inspection.

Explanation of key sections of the YAML file

on: workflow_dispatch
  • on: This specifies the event that triggers the workflow. workflow_dispatch allows the workflow to be manually triggered from the GitHub Actions tab in your repository.
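
The workflow in this post hardcodes its inputs in Step 4, but workflow_dispatch can also ask for values in the "Run workflow" form. Below is a minimal sketch of that option; the input names greeting and language are hypothetical and are not part of the original YAML file.

```yaml
on:
  workflow_dispatch:
    inputs:
      greeting:                 # hypothetical input name
        description: 'Greeting to write into greetings.txt'
        required: true
        default: 'mhoroi'
      language:                 # hypothetical input name
        description: 'Language to write into languages.txt'
        required: true
        default: 'shona'
```

A step could then read these values with expressions such as ${{ inputs.greeting }} instead of the fixed strings.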

Jobs Definition

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        java: ['11']
        nextflow: ['latest-edge', 'latest-stable' ]
  • jobs: This section defines the jobs to run as part of the workflow. Here, we have one job named test.
  • runs-on: Specifies the environment in which the job runs. ubuntu-latest indicates that the job runs on the most recent Ubuntu runner image that GitHub Actions currently labels as latest.
  • strategy: The matrix strategy allows the workflow to run multiple combinations of configurations. Here it runs the job once for each combination of Java version and Nextflow version. With one Java version and two Nextflow versions, that makes two jobs.
    • java: Specifies the versions of Java to test. Here, Java 11 is being used. I could specify other versions of Java as well.
    • nextflow: Specifies the versions of Nextflow to test, using both latest-edge and latest-stable versions.
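
To illustrate how the matrix grows, here is a sketch with a second Java version added. The value '17' and the fail-fast setting are my own assumptions for the example and are not in the original file. Two Java versions times two Nextflow versions would give four jobs.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false          # assumption: let the other combinations finish if one fails
      matrix:
        java: ['11', '17']      # '17' is an added, illustrative value
        nextflow: ['latest-edge', 'latest-stable']
```

Each job sees its own combination through ${{ matrix.java }} and ${{ matrix.nextflow }}, which is what the later steps rely on.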

Workflow Steps

Step 1: Checkout Code

    steps:
    - name: Checkout code
      uses: actions/checkout@v4
  • steps: Defines the series of steps that the job will execute.
  • uses: actions/checkout@v4 Jobs defined in a GitHub Actions workflow are executed in a virtual machine (runner). To ensure the runner has access to the code for testing and execution, we use the actions/checkout action to retrieve the repository’s contents.

Step 2: Install JDK

    - name: Install JDK ${{ matrix.java }}
      uses: actions/setup-java@v1
      with:
          java-version: ${{ matrix.java }}
  • name: Provides a name for the step. We use the Java version from the matrix.
  • uses: actions/setup-java@v1 This action sets up the Java Development Kit (JDK) in the environment. The version of Java is determined by the matrix configuration (${{ matrix.java }}), which in this case is Java 11.
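
The v1 tag of this action is quite old. Newer major versions of actions/setup-java also expect a distribution input, so upgrading is not just a matter of changing the tag. A sketch of what the step might look like, assuming v4 and the Temurin distribution (both are my choices, not the original setup):

```yaml
    - name: Install JDK ${{ matrix.java }}
      uses: actions/setup-java@v4
      with:
        distribution: 'temurin'            # assumed distribution; others are available
        java-version: ${{ matrix.java }}
```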

Step 3: Setup Nextflow

    - name: Setup Nextflow ${{ matrix.nextflow }}
      uses: nf-core/setup-nextflow@v1
      with:
          version: "${{ matrix.nextflow }}"
  • name: Describes the step as setting up Nextflow, and it dynamically includes the version from the matrix.
  • uses: nf-core/setup-nextflow@v1 This action installs the specified version of Nextflow, either the latest-edge or latest-stable version.

Step 4: Run Nextflow Workflow with Custom Inputs

    - name: Run Nextflow Workflow with Custom Inputs
      run: |
        echo "mhoroi" > greetings.txt
        echo "shona" > languages.txt
        nextflow run hello-world.nf --input_file "greetings.txt" --lang_file "languages.txt" --output_file "results.txt" -ansi-log false
  • name: Indicates that this step runs the Nextflow workflow with custom inputs.
  • run: This step runs a series of shell commands:
    • echo "mhoroi" > greetings.txt creates a file named greetings.txt containing the word "mhoroi".
    • echo "shona" > languages.txt creates a file named languages.txt containing the word "shona".
    • nextflow run hello-world.nf --input_file "greetings.txt" --lang_file "languages.txt" --output_file "results.txt" -ansi-log false runs the Nextflow workflow defined in hello-world.nf, passing greetings.txt and languages.txt as inputs and results.txt as the output file name. The double-dash options are parameters of the workflow itself, while -ansi-log false (single dash, with a hyphen) is an option of nextflow run. It disables the ANSI-formatted log output, which is easier to read in environments such as CI logs that don’t render it.
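
Since the point of the run is to test the workflow, it can help to check the result in the job log before uploading anything. The extra step below is a sketch and is not in the original file. It assumes the output file is named upper-shona-results.txt (the name used in Step 5) and searches for it anywhere under the working directory, because I am not assuming a particular publish location.

```yaml
    - name: Check and print the output file
      run: |
        # Find the first matching file; fail the step if nothing was produced
        result="$(find . -name 'upper-shona-results.txt' -print -quit)"
        test -n "$result"
        # Print the file to the job log for a quick visual check
        cat "$result"
```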

Step 5: Upload Output Files as Artifacts

    - name: Upload output files as artifacts
      uses: actions/upload-artifact@v4
      with:
        name: nextflow-outputs
        path: '**/upper-shona-results.txt'
        if-no-files-found: warn
  • name: Describes the step as uploading output files.
  • uses: actions/upload-artifact@v4 This action uploads the specified file as an artifact, which can be downloaded from the GitHub Actions interface after the workflow has completed.
  • with: Provides additional parameters for the action:
    • name: Names the artifact nextflow-outputs.
    • path: Specifies the path to the file to be uploaded. The wildcard **/upper-shona-results.txt is used to search for upper-shona-results.txt in any directory.
    • if-no-files-found: warn Issues a warning if no files are found matching the path, rather than failing the workflow.
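
One thing to watch with a matrix: as far as I understand, upload-artifact@v4 does not let two jobs in the same run upload to the same artifact name, so the second matrix job could fail at this step. A possible workaround is to include the matrix values in the name. This is a sketch, and the naming pattern is my own:

```yaml
    - name: Upload output files as artifacts
      uses: actions/upload-artifact@v4
      with:
        # Unique name per matrix job, e.g. nextflow-outputs-java11-latest-stable
        name: nextflow-outputs-java${{ matrix.java }}-${{ matrix.nextflow }}
        path: '**/upper-shona-results.txt'
        if-no-files-found: warn
```

Each matrix job then gets its own entry under ‘Artifacts’.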

Saving artifacts is useful for keeping track of how the pipeline's output changes after new code or edits are introduced and for debugging purposes if something goes wrong.

By clicking the number under ‘Artifacts’ I could download the output file ‘upper-shona-results.txt’. Note that the value of path in the YAML file must match the name of the output file; otherwise nothing is uploaded.

Key Takeaways

This workflow can be particularly useful for continuous integration (CI) scenarios where you want to test your Nextflow scripts against different environments. For me, however, it is just a sandbox for learning how GitHub Actions works. I have not used nf-core before; at least, none of my work so far involves it. I believe that regular Nextflow users use its templates to streamline the development of Nextflow-based bioinformatics pipelines.

Also, running a workflow on test data from start to finish is not necessarily best practice for workflow testing. Relying solely on such high-level tests has drawbacks: they may miss edge cases, and when they fail they may not pinpoint which part of the workflow is responsible. In my case, my first few runs failed because I used the wrong symbols for comments. Because my test wasn’t granular enough, I had to consider many possibilities before narrowing down the cause.

To address these challenges, the nf-test package provides a framework that makes it easy to add both module-level and workflow-level tests to existing pipelines, ensuring more comprehensive and effective testing.

*Mhoroi is a Shona word that means “greetings” or “hello” in English. Shona is a Bantu language spoken by the Shona people of Zimbabwe.