diff --git a/VERSION b/VERSION
Lines changed: 1 addition & 1 deletion
diff --git a/adr/20260310-seqera-dataset-filesystem.md b/adr/20260310-seqera-dataset-filesystem.md
Lines changed: 136 additions & 0 deletions
diff --git a/changelog.txt b/changelog.txt
Lines changed: 32 additions & 0 deletions
diff --git a/docs/cli.md b/docs/cli.md
Lines changed: 6 additions & 6 deletions
diff --git a/docs/migrations/26-04.md b/docs/migrations/26-04.md
Lines changed: 2 additions & 2 deletions
diff --git a/docs/modules/using-modules.md b/docs/modules/using-modules.md
Lines changed: 5 additions & 5 deletions
@@ -1 +1 @@
-26.03.2-edge
+26.03.3-edge
@@ -0,0 +1,136 @@
+# NIO Filesystem for Seqera Platform Datasets
+
+- Authors: Jorge Ejarque
+- Status: draft
+- Date: 2026-03-10
+- Tags: nio, filesystem, seqera, datasets, nf-tower
+
+Technical Story: Enable Nextflow pipelines to read Seqera Platform datasets as ordinary file paths using `seqera://` URIs.
+
+## Summary
+
+Add a Java NIO `FileSystemProvider` to the `nf-tower` plugin that registers the `seqera://` scheme, allowing pipelines to reference Seqera Platform datasets (CSV/TSV) as standard file paths without manual download steps. The implementation reuses the existing `TowerClient` for all HTTP communication, inheriting authentication and retry behaviour.
+
+## Problem Statement
+
+Nextflow users managing datasets on the Seqera Platform must currently download dataset files manually or through custom scripts before referencing them in pipelines. There is no native integration between Nextflow's file abstraction and the Seqera Platform dataset API. This creates friction in workflows where datasets are the primary input and forces users to handle authentication, versioning, and file staging outside the pipeline definition.
+
+## Goals or Decision Drivers
+
+- Transparent access to Seqera Platform datasets using standard Nextflow file path syntax
+- Reuse of existing nf-tower plugin infrastructure (authentication, HTTP client, retry/backoff)
+- Hierarchical path browsing matching the platform's org/workspace/dataset structure
+- Extensible architecture that can support future Seqera-managed resource types (e.g. data-links)
+- No new plugin or module — feature lives within nf-tower
+
+## Non-goals
+
+- Streaming large datasets — the Platform API does not support streaming; content is fully buffered on download
+- Implementing resource types beyond `datasets` — only the extensible architecture is required
+- Local caching across pipeline runs — Nextflow's standard task staging handles caching
+- Dataset management operations (delete, rename) — the filesystem is read-only in the initial implementation
+
+## Considered Options
+
+### Option 1: Standalone plugin with dedicated HTTP client
+
+A new `nf-seqera-fs` plugin with its own HTTP client configuration and authentication setup.
+
+- Good, because it isolates the filesystem code from the nf-tower plugin
+- Bad, because it duplicates authentication configuration and HTTP client setup
+- Bad, because two separate HTTP clients sharing a refresh token would corrupt each other's auth state
+
+### Option 2: NIO filesystem within nf-tower using TowerClient delegation
+
+Add the filesystem to nf-tower, delegating all HTTP through the existing `TowerClient` singleton via a typed `SeqeraDatasetClient` wrapper.
+
+- Good, because it shares authentication and token refresh with TowerClient
+- Good, because it reuses existing retry/backoff configuration
+- Good, because no new dependencies are needed
+
+### Option 3: Direct HxClient usage within nf-tower
+
+Add the filesystem to nf-tower but use `HxClient` directly rather than going through TowerClient.
+
+- Good, because it gives full control over request construction
+- Bad, because exposing HxClient internals couples the filesystem to implementation details
+- Bad, because token refresh coordination with TowerClient becomes manual
+
+## Solution or decision outcome
+
+Option 2 — NIO filesystem within nf-tower using TowerClient delegation. All HTTP calls go through `TowerClient.sendApiRequest()`, ensuring a single point of authentication and retry logic.
+
+## Rationale & discussion
+
+### Path Hierarchy
+
+The `seqera://` path encodes the Platform's organizational structure directly:
+
+```
+seqera://                                        → ROOT (directory, depth 0)
+  └── <org>/                                     → ORGANIZATION (directory, depth 1)
+        └── <workspace>/                         → WORKSPACE (directory, depth 2)
+              └── datasets/                      → RESOURCE TYPE (directory, depth 3)
+                    └── <name>[@<version>]        → DATASET (file, depth 4)
+```
+
+Each level is a directory except the leaf dataset, which is a file. Version pinning uses an `@version` suffix on the dataset name segment (e.g. `seqera://acme/research/datasets/samples@2`). Without it, the latest non-disabled version is resolved.
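The path grammar described above can be sketched as a small parser. This is a minimal illustration only: `SeqeraPathSketch`, `DatasetRef`, and `parseDataset` are hypothetical names, not the plugin's `SeqeraPath` class, and the sketch assumes that the last `@` in the leaf segment separates the dataset name from the version.

```java
import java.util.Optional;

/** Illustrative parser for depth-4 seqera:// paths; not the plugin's SeqeraPath class. */
final class SeqeraPathSketch {

    /** Hypothetical holder for the parsed path segments. */
    record DatasetRef(String org, String workspace, String name, Optional<String> version) {}

    static DatasetRef parseDataset(String uri) {
        final String prefix = "seqera://";
        if (!uri.startsWith(prefix)) {
            throw new IllegalArgumentException("Not a seqera:// URI: " + uri);
        }
        // Depth 1..4: <org>/<workspace>/datasets/<name>[@<version>]
        String[] parts = uri.substring(prefix.length()).split("/");
        if (parts.length != 4 || !"datasets".equals(parts[2])) {
            throw new IllegalArgumentException(
                "Expected seqera://<org>/<workspace>/datasets/<name>[@<version>]: " + uri);
        }
        // Assumption: the last '@' in the leaf separates name and version.
        String leaf = parts[3];
        int at = leaf.lastIndexOf('@');
        String name = at < 0 ? leaf : leaf.substring(0, at);
        Optional<String> version =
            at < 0 ? Optional.empty() : Optional.of(leaf.substring(at + 1));
        if (name.isEmpty() || version.map(String::isEmpty).orElse(false)) {
            throw new IllegalArgumentException("Empty dataset name or version: " + uri);
        }
        return new DatasetRef(parts[0], parts[1], name, version);
    }

    public static void main(String[] args) {
        // Pinned version, then "latest" (no @version suffix).
        System.out.println(parseDataset("seqera://acme/research/datasets/samples@2"));
        System.out.println(parseDataset("seqera://acme/research/datasets/samples"));
    }
}
```

An empty `version` stands for "resolve the latest non-disabled version"; the sketch leaves that lookup to the API layer.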
+
+### Name-to-ID Resolution
+
+The path uses human-readable names but the Platform API requires numeric IDs. Resolution is built from two API calls at filesystem initialization:
+
+1. `GET /user-info` → obtain `userId`
+2. `GET /user/{userId}/workspaces` → returns all accessible org/workspace pairs
+
+This single source provides both directory listing content and name→ID mapping. Results are cached in `SeqeraFileSystem` with invalidation on write operations. `GET /orgs` is intentionally not used as it returns all platform orgs, not scoped to user membership.
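One way to picture how a single workspace listing can serve both directory listing and name→ID lookup is the index below. It is a sketch under stated assumptions: `WorkspaceIndexSketch` and `Entry` are hypothetical, the field names are not taken from the Platform API or from `OrgAndWorkspaceDto`, and cache invalidation is left out.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;
import java.util.Set;
import java.util.TreeMap;

/** Illustrative name-to-ID index built from one list of org/workspace pairs. */
final class WorkspaceIndexSketch {

    /** Hypothetical shape of one org/workspace pair; field names are assumptions. */
    record Entry(String orgName, long workspaceId, String workspaceName) {}

    // org name -> (workspace name -> workspace ID); sorted for stable listings
    private final Map<String, Map<String, Long>> workspacesByOrg = new TreeMap<>();

    WorkspaceIndexSketch(List<Entry> entries) {
        for (Entry e : entries) {
            workspacesByOrg
                .computeIfAbsent(e.orgName(), k -> new TreeMap<>())
                .put(e.workspaceName(), e.workspaceId());
        }
    }

    /** Directory listing at depth 0: organization names. */
    Set<String> listOrgs() {
        return Collections.unmodifiableSet(workspacesByOrg.keySet());
    }

    /** Directory listing at depth 1: workspace names within an organization. */
    Set<String> listWorkspaces(String org) {
        Map<String, Long> ws = workspacesByOrg.get(org);
        return ws == null ? Set.of() : Collections.unmodifiableSet(ws.keySet());
    }

    /** Name-to-ID resolution for a workspace; empty when the pair is unknown. */
    OptionalLong workspaceId(String org, String workspace) {
        Map<String, Long> ws = workspacesByOrg.get(org);
        Long id = ws == null ? null : ws.get(workspace);
        return id == null ? OptionalLong.empty() : OptionalLong.of(id);
    }
}
```

Because the listing and the lookup read the same map, a name that appears in a directory listing always resolves to an ID.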
+
+### Component Structure
+
+```
+plugins/nf-tower/src/main/io/seqera/tower/plugin/
+├── fs/                             ← NIO layer
+│   ├── SeqeraFileSystemProvider    ← FileSystemProvider (scheme: "seqera")
+│   ├── SeqeraFileSystem            ← FileSystem with org/workspace/dataset caches
+│   ├── SeqeraPath                  ← Path implementation (depth 0–4)
+│   ├── SeqeraFileAttributes        ← BasicFileAttributes
+│   ├── SeqeraPathFactory           ← PF4J FileSystemPathFactory extension
+│   └── DatasetInputStream          ← SeekableByteChannel over InputStream
+├── dataset/                        ← API client layer
+│   ├── SeqeraDatasetClient         ← Typed HTTP client wrapping TowerClient
+│   ├── DatasetDto                  ← Dataset API response model
+│   ├── DatasetVersionDto           ← Version API response model
+│   ├── OrgAndWorkspaceDto          ← Org/workspace list model
+│   └── WorkspaceOrgDto             ← Workspace/org mapping model
+└── resources/META-INF/services/
+    └── java.nio.file.spi.FileSystemProvider
+```
+
+### Key Design Decisions
+
+1. **TowerClient delegation**: `SeqeraDatasetClient` delegates all HTTP through `TowerFactory.client()` → `TowerClient.sendApiRequest()`. This ensures shared authentication state and avoids the token refresh corruption that would occur with separate HTTP client instances.
+
+2. **One filesystem per JVM**: `SeqeraFileSystemProvider` maintains a single `SeqeraFileSystem` keyed by scheme. This matches the `TowerClient` singleton-per-session pattern.
+
+3. **Read-only initial scope**: The filesystem reports `isReadOnly()=true`. Write support (dataset upload via multipart POST) is deferred to a future iteration.
+
+4. **Download filename constraint**: The Platform API's download endpoint (`GET /datasets/{id}/v/{version}/n/{fileName}`) requires the exact filename from upload time. The implementation always resolves `DatasetVersionDto.fileName` from `GET /datasets/{id}/versions` before constructing the download URL.
+
+5. **Extensible resource types**: The path hierarchy reserves depth 3 for a resource type segment (currently only `datasets`). Adding support for data-links or other resource types requires only a new handler at the directory listing and I/O layers, with no changes to path resolution or authentication.
+
+6. **Thread safety**: `SeqeraFileSystem` cache methods and `SeqeraFileSystemProvider` lifecycle methods are `synchronized`. The filesystem map uses `LinkedHashMap` with external synchronization rather than `ConcurrentHashMap`, matching the low-contention access pattern.
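Design decision 4, combined with the "latest non-disabled version" rule from the path hierarchy, can be sketched as follows. The names `DownloadUrlSketch`, `Version`, `select`, and `downloadPath` are hypothetical, and the record fields are not claimed to match `DatasetVersionDto`. Percent-encoding the file name and returning a pinned version even when it is disabled are assumptions of this sketch, not behaviour confirmed by the ADR.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Illustrative version selection and download-path construction. */
final class DownloadUrlSketch {

    /** Hypothetical subset of a dataset version record. */
    record Version(long version, String fileName, boolean disabled) {}

    /** Picks the pinned version if given, otherwise the highest non-disabled one. */
    static Optional<Version> select(List<Version> versions, Optional<Long> pinned) {
        if (pinned.isPresent()) {
            long wanted = pinned.get();
            return versions.stream().filter(v -> v.version() == wanted).findFirst();
        }
        return versions.stream()
            .filter(v -> !v.disabled())
            .max(Comparator.comparingLong(Version::version));
    }

    /** Builds the path for GET /datasets/{id}/v/{version}/n/{fileName}. */
    static String downloadPath(String datasetId, Version v) {
        // Assumption: the file name is percent-encoded as a path segment.
        // URLEncoder emits '+' for spaces, so map it to %20 afterwards.
        String encoded = URLEncoder.encode(v.fileName(), StandardCharsets.UTF_8)
            .replace("+", "%20");
        return "/datasets/" + datasetId + "/v/" + v.version() + "/n/" + encoded;
    }
}
```

The file name always comes from the selected version record rather than from the `seqera://` path, which mirrors the constraint that the download endpoint needs the exact upload-time name.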
+
+### Limitations
+
+- **No size metadata**: `SeqeraFileAttributes.size()` returns 0 for all paths because the Platform API does not expose content length in dataset metadata.
+- **Single endpoint per JVM**: The filesystem key is scheme-only; concurrent access to different Platform endpoints in the same JVM is not supported.
+
+### Streaming Downloads
+
+Dataset downloads use `TowerClient.sendStreamingRequest()` which calls `HxClient.sendAsStream()` — the response body is returned as an `InputStream` streamed directly from the HTTP connection. This avoids the triple-buffering problem (`String` → `getBytes()` → `ByteArrayInputStream`) that would otherwise consume ~40 MB heap per 10 MB dataset. The `HxClient.sendAsStream()` method goes through the same `sendWithRetry()` path as `sendAsString()`, so retry logic and token refresh are preserved.
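The difference between the two patterns named above can be illustrated with plain JDK streams. This sketch does not call `TowerClient` or `HxClient`; `StreamingSketch`, `buffered`, and `copy` are hypothetical names, and the 8 KiB buffer size is an arbitrary choice.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Illustrative contrast between a buffered body and a streamed copy. */
final class StreamingSketch {

    /**
     * Buffered pattern: the whole body exists as a String, then again as a
     * byte[], before the caller reads a single byte.
     */
    static ByteArrayInputStream buffered(String body) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        return new ByteArrayInputStream(bytes);
    }

    /**
     * Streamed pattern: only one fixed-size chunk is held at a time,
     * regardless of the total body size. Returns the number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out) {
        byte[] chunk = new byte[8192];
        long total = 0;
        try {
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

In the streamed pattern the heap cost is bounded by the chunk size, whereas the buffered pattern scales with the dataset size.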
+
+## Links
+
+- [Spec](../specs/260310-seqera-dataset-fs/spec.md)
+- [Implementation plan](../specs/260310-seqera-dataset-fs/plan.md)
+- [Data model](../specs/260310-seqera-dataset-fs/data-model.md)
@@ -1,5 +1,37 @@
 NEXTFLOW CHANGE-LOG
 ===================
+26.03.3-edge - 20 Apr 2026
+- Add -files-from option to lint command to avoid ARG_MAX limit (#6858) [5a3cd830c]
+- Add 26.04 migration docs (#7000) [89ec31bbf]
+- Add option to disable printing workflow outputs (#7018) [791bb449c]
+- Allow cloning from local Git repositories when `--offline` (#7035) [0fa6b5dbd]
+- Allow running pipeline from URL and main script path (#6602) [83196d4be]
+- Apply socket timeout to S3 CRT connections (#7024) [6f4a21764]
+- Filter autoLabels to selected workflow-metadata fields (#7049) [ddc974fe6]
+- Fix S3FileSystemProvider.newInputStream() draining full object on close (#7046) [cf3867604]
+- Fix formatting issues with complex expressions (#7027) [ce661d1d8]
+- Fix generated process name in `module create` command (#7008) [f3d8de796]
+- Fix inconsistent indentation in nf-amazon (#7047) [df6855d7d]
+- Fix module info formatting separator (#7033) [44dff8fcc]
+- Fix nextflowVersion for nf-tower and nf-seqera plugins [cbc0a2d8e]
+- Fix resolution of `-with-tower` with `TOWER_API_ENDPOINT` (#7045) [ce962e882]
+- Fix saveCacheFiles early return skipping log file uploads (#7015) [6fb704838]
+- Fusion GPU metrics collection (#7022) [6289635b8]
+- Honour process.resourceLabels in nf-seqera executor (#7048) [979f684ff]
+- Manage AWS SDK exceptions to convert to the appropriate IO exceptions (#6707) [39c755663]
+- Rename `module info` subcommand to `module view` (#7052) [7fa1109aa]
+- Resolve structured process input types (#7014) [583935d88]
+- Simplify demo module README template (#7051) [6d04c9ebc]
+- Suppress lint progress logging with `-q` flag (#6880) [61793bb6e]
+- Update missing pf4j updates (#7016) [f38f0067d]
+- Use Fusion trace metrics to replace bash command-trace wrapper (#7041) [de4376649]
+- Bump org.bouncycastle:bcpkix-jdk18on from 1.79 to 1.84 (#7042) [59d847d52]
+- Bump nf-amazon@3.8.3
+- Bump nf-k8s@1.5.2
+- Bump nf-seqera@0.18.0
+- Bump nf-tower@1.25.0
+- Bump nf-wave@1.19.1
+
 26.03.2-edge - 7 Apr 2026
 - Add `module create` subcommand (#6992) [d6639a5e0]
 - Add `module spec` command (#6859) [049e2a40e]
 
@@ -323,19 +323,19 @@ See {ref}`cli-module-list` for more information.
 
 ### Viewing module information
 
-The `module info` command displays detailed metadata and usage information for a specific module from the registry.
+The `module view` command displays detailed metadata and usage information for a specific module from the registry.
 
 Use this to understand module requirements, view input/output specifications, see available tools, or generate usage templates before installing or running a module.
 
 ```console
-$ nextflow module info nf-core/fastqc
-$ nextflow module info nf-core/fastqc -version 1.0.0
-$ nextflow module info nf-core/fastqc -output json
+$ nextflow module view nf-core/fastqc
+$ nextflow module view nf-core/fastqc -version 1.0.0
+$ nextflow module view nf-core/fastqc -output json
 ```
 
 The output includes the module's version, description, authors, keywords, tools, input/output channels, and a generated usage template showing how to run the module. Use `-output json` for machine-readable output suitable for programmatic access.
 
-See {ref}`cli-module-info` for more information.
+See {ref}`cli-module-view` for more information.
 
 ### Running modules directly
 
@@ -358,7 +358,7 @@ $ nextflow module run nf-core/salmon \
     -resume
 ```
 
-Process inputs can be specified like params on the command line. For example, `--reads reads.fq` corresponds to the `reads` input in the `nf-core/salmon` module. Run `nextflow module info nf-core/salmon` to see the available params for the module.
+Process inputs can be specified like params on the command line. For example, `--reads reads.fq` corresponds to the `reads` input in the `nf-core/salmon` module. Run `nextflow module view nf-core/salmon` to see the available params for the module.
 
 See {ref}`cli-module-run` for more information.
 
 
@@ -23,8 +23,8 @@ nextflow module run nf-core/fastqc --meta.id 1 --reads sample1.fastq
 # Search for modules in registry
 nextflow module search bwa
 
-# Get info about a module
-nextflow module info nf-core/bwa/mem
+# View info about a module
+nextflow module view nf-core/bwa/mem
 
 # Install a module
 nextflow module install nf-core/bwa/mem
 
@@ -71,17 +71,17 @@ See {ref}`cli-module-list` for the full command reference.
 
 ## Viewing module information
 
-Use the `module info` command to view metadata and a usage template for a module:
+Use the `module view` command to view metadata and a usage template for a module:
 
 ```console
-$ nextflow module info nf-core/fastqc
-$ nextflow module info nf-core/fastqc -version 0.0.0-0c7146d
+$ nextflow module view nf-core/fastqc
+$ nextflow module view nf-core/fastqc -version 0.0.0-0c7146d
 ```
 
 The output includes the module's version, URL, description, authors, maintainers, keywords, tools, input/output channels, and a generated usage template.
 Use `-output json` for machine-readable output.
 
-See {ref}`cli-module-info` for the full command reference.
+See {ref}`cli-module-view` for the full command reference.
 
 ## Running modules directly
 
@@ -92,7 +92,7 @@ $ nextflow module run nf-core/fastqc --meta.id=test_sample --reads sample1_R1.fa
 ```
 
 :::{tip}
-Run `nextflow module info` to see the available inputs for a module.
+Run `nextflow module view` to see the available inputs for a module.
 :::
 
 The command automatically downloads the module if it is not already installed.