
nf-lakefs @ 0.6.1

Provider: cytoreason
Claimed: 07 Sep 2025 21:37:16 (UTC)
Description: This plugin integrates Nextflow with lakeFS, enabling native `lakefs://` URI support for reading inputs and publishing outputs directly to lakeFS repositories and branches. It solves data versioning, reproducibility, and atomic operations challenges in bioinformatics and data processing workflows. Data scientists and bioinformaticians who use Nextflow would benefit from seamless data management without manual pre-loading or post-processing steps.
Latest version: 0.6.1
Total downloads: 2.8K

Summary

This plugin integrates Nextflow with lakeFS, allowing you to use lakefs:// URIs as inputs and outputs in your pipelines. It brings the power of data versioning, reproducibility, and atomic operations to your Nextflow workflows. The plugin treats lakeFS repositories and branches as a native file system, enabling seamless data management without requiring manual pre-loading or post-processing steps.

Key Features:

  • Native URI Support: Use lakefs://<repo>/<branch>/<path> URIs directly in your Nextflow scripts and nextflow.config
  • Read Inputs: Stage input files from a lakeFS repository for use in your processes
  • Publish Outputs: Publish output files from your processes directly to a lakeFS repository branch
  • Transparent Authentication: Securely configure your lakeFS credentials
  • Multiple Transfer Modes: Choose between signed_url or physical_path modes for data access
  • Auto-Create Branches: Optionally create new branches automatically when writing to non-existent branches

Get Started

Installation

Add the following to your nextflow.config file. Replace <version> with the desired plugin version (e.g., 0.6.1):

plugins {
    id 'nf-lakefs@<version>'
}

Configuration

Configure the plugin with your lakeFS server details and credentials in your nextflow.config:

lakefs {
    apiUrl    = 'https://your-lakefs-server.example.com/api/v1'
    accessKey = 'YOUR_LAKEFS_ACCESS_KEY'
    secretKey = 'YOUR_LAKEFS_SECRET_KEY'
    transferMode = 'signed_url'
    autoCreateBranch = false
    autoCreateBranchSource = 'main'
}

Configuration Options:

Option                          Required  Default     Description
apiUrl                          Yes       -           lakeFS API endpoint URL (must end with /api/v1)
accessKey                       Yes       -           lakeFS access key
secretKey                       Yes       -           lakeFS secret key
transferMode                    No        signed_url  Transfer mode: signed_url or physical_path
autoCreateBranch                No        false       Automatically create branch if it doesn't exist when writing
autoCreateBranchSource          No        main        Source branch to create new branches from
readTimeout                     No        60s         HTTP read timeout for lakeFS API calls
connectTimeout                  No        30s         HTTP connect timeout for lakeFS API calls
allowedSchemaBucketsForLinking  No        []          Whitelist of cloud storage scheme+bucket prefixes allowed as link sources
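
To keep credentials out of version control, the required options can be read from the launch environment instead of being hard-coded. The following is a minimal sketch, assuming a Nextflow release that provides the env() config function; the environment variable names are hypothetical, and the duration-string format for the timeouts is an assumption based on the defaults listed in the table:

```groovy
// nextflow.config -- sketch only; the environment variable names are hypothetical
lakefs {
    apiUrl    = env('LAKEFS_API_URL')     // must end with /api/v1
    accessKey = env('LAKEFS_ACCESS_KEY')
    secretKey = env('LAKEFS_SECRET_KEY')

    // Assumption: timeouts accept duration strings like the defaults shown above
    readTimeout    = '120s'
    connectTimeout = '30s'
}
```

Export the three variables in the shell that launches Nextflow before running the pipeline.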

Transfer Modes

The plugin supports two modes for reading and writing data:

  1. signed_url - The plugin requests a signed URL from lakeFS for file access; for write requests, it generates a signed URL on the cloud provider's storage
  2. physical_path - The plugin requests the cloud-provider-specific physical path (s3:// or gs://) and delegates the request to the corresponding Nextflow plugin. Ensure that your Nextflow process has permission to call the cloud provider API
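
One way to switch between the two modes without editing the main configuration is to use Nextflow configuration profiles. This is a sketch only; the profile names are hypothetical, and the remaining lakefs options are assumed to be set elsewhere in nextflow.config:

```groovy
// nextflow.config -- sketch only; the profile names are hypothetical
profiles {
    signed {
        // Access data through signed URLs
        lakefs.transferMode = 'signed_url'
    }
    physical {
        // Delegate to the cloud storage plugin via the physical path (s3:// or gs://)
        lakefs.transferMode = 'physical_path'
    }
}
```

A profile is selected at launch time, for example with nextflow run main.nf -profile physical.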

Advanced Configuration: Auto-Create Branch

When autoCreateBranch is enabled, the plugin automatically creates a new branch if it doesn't exist when you attempt to write to it:

lakefs {
    apiUrl    = 'https://your-lakefs-server.example.com/api/v1'
    accessKey = 'YOUR_LAKEFS_ACCESS_KEY'
    secretKey = 'YOUR_LAKEFS_SECRET_KEY'
    autoCreateBranch = true
    autoCreateBranchSource = 'main'
}

Writing to lakefs://my-repo/new-branch/file.txt will automatically create new-branch from main if it doesn't already exist.
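
With autoCreateBranch enabled, the target branch can be made a pipeline parameter so that each run can publish to its own branch. The script below is a minimal sketch under that assumption; the repository, branch, path, and process names are placeholders rather than part of the plugin:

```groovy
// main.nf -- sketch only; repo, branch, and path names are placeholders
params.branch     = 'new-branch'
params.output_dir = "lakefs://my-repo/${params.branch}/results/"

process WRITE_RESULT {
    // Publishing to a branch that does not exist yet relies on autoCreateBranch = true
    publishDir params.output_dir, mode: 'copy'

    output:
    path 'result.txt'

    script:
    """
    echo "hello from ${params.branch}" > result.txt
    """
}

workflow {
    WRITE_RESULT()
}
```

Passing --branch experiment-1 on the command line would then publish to lakefs://my-repo/experiment-1/results/.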

AWS Configuration (for physical_path mode with S3)

When using physical_path with an S3-backed lakeFS repository, also configure your AWS credentials:

aws {
    accessKey = '<AWS_ACCESS_KEY_ID>'
    secretKey = '<AWS_SECRET_ACCESS_KEY>'
    region    = 'eu-central-1'
}

allowedSchemaBucketForLinking Configuration

When using java.nio.file.Files.createLink() or link mode when publishing, whitelist the cloud storage scheme+bucket combinations that may be treated as equivalent backends:

lakefs {
    allowedSchemaBucketForLinking = [
            'gs://bucket1',
            'gs://bucket2'
    ]
}

For S3-backed repos:

lakefs {
    allowedSchemaBucketForLinking = [
            's3://bucket1',
            's3://bucket2'
    ]
}

Examples

Basic Pipeline with lakeFS Input and Output

This example reads a configuration file from a lakeFS repository, processes it, and publishes the result back to the repository:

main.nf

// Inputs are matched with a glob; in the output URI, 'output' is the target branch
params.input = 'lakefs://my-repo/main/path/to/**/config.yaml'
params.output_dir = 'lakefs://my-repo/output/path/to/'

process EXAMPLE_PROCESS {
    // Publish result.txt back to the lakeFS repository
    publishDir params.output_dir, mode: 'copy'

    input:
    path my_file

    output:
    path 'result.txt'

    script:
    """
    echo "Processing ${my_file}..."
    wc -l < "${my_file}" > result.txt
    """
}

workflow {
    Channel.fromPath(params.input, type: 'file')
            | view
            | EXAMPLE_PROCESS
}

Run the pipeline:

nextflow run main.nf 
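
When the input is a single known object rather than a glob, the standard Nextflow file() function can resolve the URI directly. The following is a sketch, assuming the plugin's lakefs:// support also applies to file(); the URI, script name, and process name are placeholders:

```groovy
// single_file.nf -- sketch only; the URI below is a placeholder
params.config = 'lakefs://my-repo/main/path/to/config.yaml'

process COUNT_LINES {
    input:
    path cfg

    output:
    stdout

    script:
    """
    wc -l < "${cfg}"
    """
}

workflow {
    // checkIfExists makes the run fail early if the object is missing
    cfg = file(params.config, checkIfExists: true)
    COUNT_LINES(cfg).view()
}
```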

License

This Nextflow plugin is currently published for read-only usage.

Nextflow version: >=25.10.6
Depends On: -
Release Date: 28 Jun 2026 21:04:38 (UTC)
Release Notes: -
Download URL: https://registry.nextflow.io/api/v1/plugins/nf-lakefs/0.6.1/download/nf-lakefs-0.6.1.zip
Store URL: https://public.cr.seqera.io/v2/nextflow/plugin/nf-lakefs/blobs/sha256:9ec2dbae2ed82024a9a1b6882881bfcd03aa07b797f6036ed7003d1249ae2a76
Size: 7.0 MB
Checksum: cb4fc5b310d84800db8751b25793c529eefebcb99755767bd57b0737c7201838d413b074d6898324fa7be62883c5375bfb2c64c4b63e1006876163d5be23642b
Total downloads: 2
Version  Nextflow version  Date                        Status  Downloads
0.6.1    >=25.10.6         28 Jun 2026 21:04:38 (UTC)  -       2
0.6.0    >=25.10.4         25 Jun 2026 23:32:42 (UTC)  -       1
0.5.0    >=25.04.0         03 Jun 2026 10:40:08 (UTC)  -       565
0.4.1    >=25.04.0         08 Feb 2026 15:09:40 (UTC)  -       680
0.4.0    >=25.04.0         07 Feb 2026 23:26:44 (UTC)  -       464
0.3.4    >=25.04.0         06 Feb 2026 18:30:33 (UTC)  -       4
0.3.3    >=25.04.0         20 Jan 2026 21:24:16 (UTC)  -       291
0.3.2    >=25.04.0         20 Jan 2026 20:29:37 (UTC)  -       4
0.3.1    >=25.04.0         22 Dec 2025 18:27:40 (UTC)  -       395
0.3.0    >=25.04.0         16 Dec 2025 22:36:36 (UTC)  -       83
0.2.2    >=25.10.0         06 Dec 2025 17:23:47 (UTC)  -       160
0.2.1    >=25.04.0         03 Nov 2025 20:18:00 (UTC)  -       97
0.2.0    >=25.04.0         08 Sep 2025 22:03:34 (UTC)  -       48