nf-lakefs @ 0.6.1
Summary
This plugin integrates Nextflow with lakeFS, allowing you to use lakefs:// URIs as inputs and outputs in your pipelines. It adds data versioning, reproducibility, and atomic operations to your Nextflow workflows. The plugin treats lakeFS repositories and branches as a native file system, so pipelines can read and write versioned data without manual pre-loading or post-processing steps.
Key Features:
- Native URI Support: Use lakefs://<repo>/<branch>/<path> URIs directly in your Nextflow scripts and nextflow.config
- Read Inputs: Stage input files from a lakeFS repository for use in your processes
- Publish Outputs: Publish output files from your processes directly to a lakeFS repository branch
- Transparent Authentication: Securely configure your lakeFS credentials
- Multiple Transfer Modes: Choose between signed_url or physical_path modes for data access
- Auto-Create Branches: Optionally create new branches automatically when writing to non-existent branches
Get Started
Installation
Add the following to your nextflow.config file. Replace <version> with the desired plugin version (e.g., 0.6.1):
plugins {
    id 'nf-lakefs@<version>'
}
Configuration
Configure the plugin with your lakeFS server details and credentials in your nextflow.config:
lakefs {
    apiUrl = 'https://your-lakefs-server.example.com/api/v1'
    accessKey = 'YOUR_LAKEFS_ACCESS_KEY'
    secretKey = 'YOUR_LAKEFS_SECRET_KEY'
    transferMode = 'signed_url'
    autoCreateBranch = false
    autoCreateBranchSource = 'main'
}
Configuration Options:
| Option | Required | Default | Description |
|---|---|---|---|
| apiUrl | Yes | - | lakeFS API endpoint URL (must end with /api/v1) |
| accessKey | Yes | - | lakeFS access key |
| secretKey | Yes | - | lakeFS secret key |
| transferMode | No | signed_url | Transfer mode: signed_url or physical_path |
| autoCreateBranch | No | false | Automatically create branch if it doesn't exist when writing |
| autoCreateBranchSource | No | main | Source branch to create new branches from |
| readTimeout | No | 60s | HTTP read timeout for lakeFS API calls |
| connectTimeout | No | 30s | HTTP connect timeout for lakeFS API calls |
| allowedSchemaBucketsForLinking | No | [] | Whitelist of cloud storage scheme+bucket prefixes allowed as link sources |
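To keep the access key and secret key out of version control, the credentials can be read from the launch environment instead of being written into nextflow.config. The following is a minimal sketch, assuming a Nextflow version that provides the env() config function, and assuming the variables LAKEFS_ACCESS_KEY and LAKEFS_SECRET_KEY (hypothetical names, not defined by the plugin) are exported before the pipeline is launched:

```groovy
// nextflow.config
// Assumption: LAKEFS_ACCESS_KEY and LAKEFS_SECRET_KEY are illustrative
// variable names exported in the shell that launches Nextflow.
lakefs {
    apiUrl    = 'https://your-lakefs-server.example.com/api/v1'
    accessKey = env('LAKEFS_ACCESS_KEY')   // read at launch time
    secretKey = env('LAKEFS_SECRET_KEY')   // never stored in the config file
}
```

Options that are not set here, such as transferMode, keep the defaults listed in the table above.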
Transfer Modes
The plugin supports two modes for reading and writing data:
- signed_url - The plugin requests a signed URL from lakeFS for file access, or generates a cloud provider storage signed URL for write requests
- physical_path - The plugin requests the cloud provider specific physical path (s3:// or gs://) and delegates the request to the relevant Nextflow plugin. Ensure your Nextflow process has permission to call the cloud provider API
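If the same pipeline runs in environments with different storage permissions, the transfer mode can be selected per run with Nextflow config profiles. This is a sketch only; the profile names signed and physical are hypothetical and are not defined by the plugin:

```groovy
// nextflow.config
lakefs {
    apiUrl    = 'https://your-lakefs-server.example.com/api/v1'
    accessKey = 'YOUR_LAKEFS_ACCESS_KEY'
    secretKey = 'YOUR_LAKEFS_SECRET_KEY'
}

profiles {
    // Hypothetical profile: access data through signed URLs
    signed {
        lakefs.transferMode = 'signed_url'
    }
    // Hypothetical profile: resolve s3:// or gs:// paths and let the
    // matching Nextflow cloud plugin perform the transfer
    physical {
        lakefs.transferMode = 'physical_path'
    }
}
```

A profile is then chosen at launch, for example with nextflow run main.nf -profile physical.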
Advanced Configuration: Auto-Create Branch
When autoCreateBranch is enabled, the plugin automatically creates a new branch if it doesn't exist when you attempt to write to it:
lakefs {
    apiUrl = 'https://your-lakefs-server.example.com/api/v1'
    accessKey = 'YOUR_LAKEFS_ACCESS_KEY'
    secretKey = 'YOUR_LAKEFS_SECRET_KEY'
    autoCreateBranch = true
    autoCreateBranchSource = 'main'
}
Writing to lakefs://my-repo/new-branch/file.txt will automatically create new-branch from main if it doesn't already exist.
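One way to use this behavior is to send the outputs of each run to a dedicated branch. The following is a sketch, assuming a hypothetical run_branch parameter and the my-repo repository from the example above:

```groovy
// nextflow.config
// Assumption: run_branch is an illustrative parameter name, not a plugin option.
params.run_branch = 'experiment-1'
params.output_dir = "lakefs://my-repo/${params.run_branch}/results/"

lakefs {
    apiUrl = 'https://your-lakefs-server.example.com/api/v1'
    accessKey = 'YOUR_LAKEFS_ACCESS_KEY'
    secretKey = 'YOUR_LAKEFS_SECRET_KEY'
    autoCreateBranch = true            // create the branch on first write
    autoCreateBranchSource = 'main'    // new branches start from main
}
```

With these settings, output written under params.output_dir goes to the experiment-1 branch, which is created from main if it doesn't already exist.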
AWS Configuration (for physical_path mode with S3)
When using physical_path with an S3-backed lakeFS repository, also configure your AWS credentials:
aws {
    accessKey = '<AWS_ACCESS_KEY_ID>'
    secretKey = '<AWS_SECRET_ACCESS_KEY>'
    region = 'eu-central-1'
}
allowedSchemaBucketForLinking Configuration
When using java.nio.file.Files.createLink() or Link mode on publishing, whitelist cloud storage scheme+bucket combinations allowed to be treated as equivalent backends:
lakefs {
    allowedSchemaBucketForLinking = [
        'gs://bucket1',
        'gs://bucket2'
    ]
}
For S3-backed repos:
lakefs {
    allowedSchemaBucketForLinking = [
        's3://bucket1',
        's3://bucket2'
    ]
}
Examples
Basic Pipeline with lakeFS Input and Output
This example reads a configuration file from a lakeFS repository, processes it, and publishes the result back to the repository:
main.nf
params.input = 'lakefs://my-repo/main/path/to/**/config.yaml'
params.output_dir = 'lakefs://my-repo/output/path/to/'

process EXAMPLE_PROCESS {
    // Publish result.txt to the lakeFS output location
    publishDir params.output_dir, mode: 'copy'

    input:
    path my_file

    output:
    path 'result.txt'

    script:
    """
    echo "Processing ${my_file}..."
    cat "${my_file}" | wc -l > result.txt
    """
}

workflow {
    // Emit every config.yaml matched by the glob on the main branch
    Channel.fromPath(params.input, type: 'file')
        | view
        | EXAMPLE_PROCESS
}
Run the pipeline:
nextflow run main.nf
License
This Nextflow plugin is currently published for read-only usage.
| Nextflow version | >=25.10.6 |
|---|---|
| Depends On | - |
| Release Date | 28 Jun 2026 21:04:38 (UTC) |
| Release Notes | - |
| Download URL | https://registry.nextflow.io/api/v1/plugins/nf-lakefs/0.6.1/download/nf-lakefs-0.6.1.zip |
| Store URL | https://public.cr.seqera.io/v2/nextflow/plugin/nf-lakefs/blobs/sha256:9ec2dbae2ed82024a9a1b6882881bfcd03aa07b797f6036ed7003d1249ae2a76 |
| Size | 7.0 MB |
| Checksum | cb4fc5b310d84800db8751b25793c529eefebcb99755767bd57b0737c7201838d413b074d6898324fa7be62883c5375bfb2c64c4b63e1006876163d5be23642b |
| Total downloads | 2 |

| Version | Nextflow version | Date | Downloads |
|---|---|---|---|
| 0.6.1 | >=25.10.6 | 28 Jun 2026 21:04:38 (UTC) | 2 |
| 0.6.0 | >=25.10.4 | 25 Jun 2026 23:32:42 (UTC) | 1 |
| 0.5.0 | >=25.04.0 | 03 Jun 2026 10:40:08 (UTC) | 565 |
| 0.4.1 | >=25.04.0 | 08 Feb 2026 15:09:40 (UTC) | 680 |
| 0.4.0 | >=25.04.0 | 07 Feb 2026 23:26:44 (UTC) | 464 |
| 0.3.4 | >=25.04.0 | 06 Feb 2026 18:30:33 (UTC) | 4 |
| 0.3.3 | >=25.04.0 | 20 Jan 2026 21:24:16 (UTC) | 291 |
| 0.3.2 | >=25.04.0 | 20 Jan 2026 20:29:37 (UTC) | 4 |
| 0.3.1 | >=25.04.0 | 22 Dec 2025 18:27:40 (UTC) | 395 |
| 0.3.0 | >=25.04.0 | 16 Dec 2025 22:36:36 (UTC) | 83 |
| 0.2.2 | >=25.10.0 | 06 Dec 2025 17:23:47 (UTC) | 160 |
| 0.2.1 | >=25.04.0 | 03 Nov 2025 20:18:00 (UTC) | 97 |
| 0.2.0 | >=25.04.0 | 08 Sep 2025 22:03:34 (UTC) | 48 |