adr/20260202-global-cache.md
+32 -14 (32 additions & 14 deletions)
@@ -53,15 +53,18 @@ These limitations result in redundant computation when:
## Solution Approach

-Extend the existing `nf-cloudcache` plugin to support content-addressable global caching on cloud object storage (S3, GCS, Azure Blob).
+Refactor Nextflow's task caching into a plugin-extensible architecture so that different global cache implementations can be delivered as plugins, with the existing local/cloud cache behaviour preserved as the default. The first concrete global cache implementation extends `nf-cloudcache` for content-addressable caching on cloud object storage (S3, GCS, Azure Blob).

**Rationale for this approach:**
-- `nf-cloudcache` already exists and handles cloud storage integration
-- Cloud storage provides strong consistency guarantees for concurrent access
-- Many organizations already use cloud storage for work directories
-- Cloud providers support atomic operations needed for coordination
-- No new infrastructure required
-- Scalable and accessible from anywhere
+- A pluggable architecture lets different global cache implementations co-exist without forking core code.
+- Today's cache logic is hardcoded across `TaskProcessor`, `TaskHasher`, `CacheDB`, `PublishDir`, and `FilePorter`, but different cache implementations need different decisions about task identity, coordination, output storage, ref counting, and cleanup. These decisions belong behind interfaces, not in core.
+- `nf-cloudcache` already exists and handles cloud storage integration, providing a natural starting point for the first global cache implementation.
+- Cloud storage provides strong consistency guarantees for concurrent access and supports atomic operations needed for coordination.
+- Many organizations already use cloud storage for work directories.
+- No new infrastructure required for the cloud-storage variant.
+- Scalable and accessible from anywhere.
+
+The refactor itself is described in `specs/260507-pluggable-cache-architecture/spec.md`. It introduces five plugin-extensible seams (pluggable `TaskHasher`, outputs-shaped cache resolution via `beginTask`/`endTask`, workdir adoption, file-usage events, and an optional cleanup capability) and preserves byte-identical behaviour when no plugin is registered.
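
As a rough illustration of how the five seams could fit together, the sketch below models them in Python rather than Nextflow's Groovy. The hook names follow this ADR (`beginTask`/`endTask`, `adopt`, `WorkdirDisposition`, `notifyPublish`, `notifyFilePort`, `CleanupCapable`), written in snake_case. Every signature, the `TaskCache` base class, and the `InMemoryCache` example are assumptions made for illustration and are not the interfaces defined in the spec.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Optional


class WorkdirDisposition(Enum):
    KEEP = "keep"      # leave the task workdir in place (default behaviour)
    DELETE = "delete"  # outputs now live in managed storage; workdir may go


class TaskHasher(ABC):
    """Seam 1: decides task identity (hypothetical signature)."""

    @abstractmethod
    def hash_task(self, task: dict) -> str: ...


class TaskCache:
    """Seams 2-4. Every hook has a pass-through default, so a run with
    no plugin registered behaves exactly as before."""

    def begin_task(self, task_hash: str) -> Optional[dict]:
        return None  # None means cache miss: run the task

    def end_task(self, task_hash: str, outputs: dict) -> None:
        pass

    def adopt(self, task_hash: str, workdir: str) -> WorkdirDisposition:
        return WorkdirDisposition.KEEP

    def notify_publish(self, path: str) -> None:
        pass

    def notify_file_port(self, path: str) -> None:
        pass


class CleanupCapable(ABC):
    """Seam 5: optional capability a cache may additionally implement."""

    @abstractmethod
    def clean(self, dry_run: bool = True) -> list: ...


class InMemoryCache(TaskCache, CleanupCapable):
    """Toy plugin: stores outputs per hash and ref-counts published files."""

    def __init__(self):
        self.entries = {}  # task hash -> {output name: path}
        self.refs = {}     # path -> number of publish events seen

    def begin_task(self, task_hash):
        return self.entries.get(task_hash)

    def end_task(self, task_hash, outputs):
        self.entries[task_hash] = outputs

    def adopt(self, task_hash, workdir):
        return WorkdirDisposition.DELETE

    def notify_publish(self, path):
        self.refs[path] = self.refs.get(path, 0) + 1

    def clean(self, dry_run=True):
        # An entry is stale when none of its outputs was ever published.
        stale = [h for h, outs in self.entries.items()
                 if not any(self.refs.get(p, 0) > 0 for p in outs.values())]
        if not dry_run:
            for h in stale:
                del self.entries[h]
        return stale
```

The base class returning a miss and `KEEP` is what stands in here for "byte-identical behaviour when no plugin is registered"; a real default would delegate to the existing cache logic instead of doing nothing.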

**Trade-offs:**
- Higher latency than local filesystem (~100-500ms vs ~10ms)
@@ -1050,26 +1053,41 @@ process prod_analysis { ... }
### Implementation Plan

+The implementation is split between (a) a core refactor that introduces plugin-extensible cache interfaces (covered by `specs/260507-pluggable-cache-architecture/spec.md`) and (b) the global cache implementation itself, which becomes the first non-default consumer of those interfaces.
+
**Phase 0: Proof of concept (#6100)**
1. Associate nf-cloudcache path and workdir with the global-cache path and activate resume by default
2. Constant sessionId (0000-000-000), remove processName from task hash
3. Optional: Use deep cache mode

-**Phase 1: Core functionality**
-1. Implement global hash algorithm (no sessionId, no processName)
+3. Workdir adoption hook (`adopt`) + `WorkdirDisposition` (KEEP/DELETE) so cache implementations can move outputs into managed storage and dispose of the workdir.
+4. File-usage events (`notifyPublish`, `notifyFilePort`) for ref-counted cleanup.
+5. Optional `CleanupCapable` capability for `nextflow cache clean ...`.
+
+All five seams ship with default implementations that preserve byte-identical behaviour and on-disk format.
+
+**Phase 2: Global cache hasher**
+1. Implement global hash algorithm as a `TaskHasher` plugin (no sessionId, no processName).
+2. Implement content-based file hashing inside the global hasher.
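
The two hasher steps above can be sketched as follows. This is a minimal Python illustration, assuming SHA-256 and a task described only by its script text and a name-to-path map of input files. The function names, the chosen fields, and the length-prefixed encoding are all hypothetical and are not Nextflow's actual hashing scheme. The point it shows is that neither a session id nor a process name enters the hash, and that files contribute by content rather than by path.

```python
import hashlib
import tempfile
from pathlib import Path


def content_hash(path):
    """SHA-256 of the file's bytes; path and mtime are deliberately ignored."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def _feed(digest, text):
    # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
    data = text.encode("utf-8")
    digest.update(len(data).to_bytes(8, "big"))
    digest.update(data)


def global_task_hash(script, input_files):
    """Hash the script plus the content of each named input file.

    There is intentionally no session_id or process_name parameter.
    input_files maps an input name to a file path (hypothetical shape).
    """
    digest = hashlib.sha256()
    _feed(digest, script)
    for name in sorted(input_files):
        _feed(digest, name)
        _feed(digest, content_hash(input_files[name]))
    return digest.hexdigest()


# Usage: identical bytes at different paths give the same task hash.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp, "run1", "reads.fq")
    second = Path(tmp, "run2", "copy.fq")
    edited = Path(tmp, "run3", "reads.fq")
    for path, body in ((first, b"ACGT\n"), (second, b"ACGT\n"), (edited, b"ACGA\n")):
        path.parent.mkdir(parents=True)
        path.write_bytes(body)
    script = "wc -l reads.fq"
    h_first = global_task_hash(script, {"reads": first})
    h_second = global_task_hash(script, {"reads": second})
    h_edited = global_task_hash(script, {"reads": edited})
```

Here `h_first` and `h_second` match because only file contents are hashed, while `h_edited` differs because one byte of the input changed.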
+1. Implement a `CacheFactory` + `CacheDB` for the global cache, wiring outputs-shaped restore, adoption (with `DELETE` disposition for cache-managed outputs), and file-usage events.
+2. Add cloud storage lock acquisition (S3 conditional PUT, GCS `ifGenerationMatch=0`, Azure `If-None-Match: *`) inside the global `beginTask` implementation.
+3. Test race condition handling.
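
The three provider mechanisms named above all amount to "create this object only if it does not exist yet". The sketch below models that primitive with an in-memory stand-in and shows how a `beginTask`-style function might use it, plus a small race test. No cloud SDK is called: `FakeObjectStore`, the `<hash>/lock` and `<hash>/outputs` key layout, and the `"hit"`/`"run"`/`"wait"` return values are assumptions for illustration only.

```python
import threading


class FakeObjectStore:
    """In-memory stand-in for a bucket that supports create-if-absent writes."""

    def __init__(self):
        self._objects = {}
        self._mutex = threading.Lock()

    def put_if_absent(self, key, body):
        # Models a conditional write: succeeds only when the key is new.
        with self._mutex:
            if key in self._objects:
                return False
            self._objects[key] = body
            return True

    def put(self, key, body):
        with self._mutex:
            self._objects[key] = body

    def get(self, key):
        with self._mutex:
            return self._objects.get(key)


def begin_task(store, task_hash, run_id):
    """Decide what the caller should do for this task hash."""
    if store.get(f"{task_hash}/outputs") is not None:
        return "hit"   # outputs already cached: restore them
    if store.put_if_absent(f"{task_hash}/lock", run_id.encode("utf-8")):
        return "run"   # this caller won the lock and executes the task
    return "wait"      # another run holds the lock


def race(store, task_hash, workers):
    """Start several concurrent callers and collect their decisions."""
    results = [None] * workers

    def worker(index):
        results[index] = begin_task(store, task_hash, f"run-{index}")

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling `race(FakeObjectStore(), "abc123", 8)` yields exactly one `"run"` regardless of thread scheduling, which is the property a race-condition test against a real bucket would need to check. Lock expiry and recovery from a crashed lock holder are left out of this sketch.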

**Phase 4: Polish**
1. Add configuration options
-2. Implement cache cleanup commands
+2. Implement cache cleanup commands via `CleanupCapable`