Nextflow should not stage files that have the same name #470

I collect files from multiple subdirectories and work on them in a single process. Nextflow does not complain if two files have the same basename, which leads to silent data loss. It seems that when it stages them, the second symlink overwrites the first one in the working directory.

To reproduce, run `mkdir subdir1 subdir2 && echo hello > subdir1/file && echo world > subdir2/file` and then run this workflow:
```
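// A single channel item: a list of two files that share the basename 'file'
c = Channel.from([
  [file('subdir1/file'), file('subdir2/file')]])

process p {
  publishDir '.'

  input: file(x) from c
  output: file('concatenated')

  // Intended result: the contents of subdir1/file followed by subdir2/file
  "cat $x > concatenated"
}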
```
The intention was to get an output file that contains `hello\nworld\n`. Instead, I get `world\nworld\n`.
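
For what it's worth, the `world\nworld\n` result is exactly what you get if staging is imitated by hand with symlinks named after each input's basename. The snippet below is only a sketch of that assumed mechanism, not Nextflow's actual staging code; the `work` directory and the `ln -sf` loop are stand-ins I made up for illustration.

```shell
# Recreate the two inputs from the reproduction in a scratch directory.
tmp=$(mktemp -d)
mkdir "$tmp/subdir1" "$tmp/subdir2" "$tmp/work"
echo hello > "$tmp/subdir1/file"
echo world > "$tmp/subdir2/file"

# Hypothetical stand-in for staging: link every input into the work
# directory under its basename. The second link replaces the first.
for f in "$tmp/subdir1/file" "$tmp/subdir2/file"; do
  ln -sf "$f" "$tmp/work/$(basename "$f")"
done

# Both arguments now resolve to the same symlink, as in "cat $x".
result=$(cd "$tmp/work" && cat file file)
echo "$result"
```

Under this assumption, only one entry named `file` survives in the work directory and it points at `subdir2/file`, which matches the output I see.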

To give a little bit of context: in the actual pipeline, the process works with multiple FASTQ files that come from the same individual but were sequenced in different runs. They are stored in different directories, but the file basenames follow the standard Illumina scheme `<sample-name>_S<sample-index>_L<lane-index>_R1_001.fastq.gz`. Since the sample name is identical (the files come from the same individual), a collision occurs whenever, by chance, another run of that sample used the same sample index and the same lane.
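
Until Nextflow itself reports this, a check like the following could be run over the input paths before launching the pipeline. It is a minimal sketch: the function name `find_dup_basenames` and the `run1`/`run2` paths are made up for illustration, and the files do not need to exist for the check to work.

```shell
# Print every basename that occurs more than once among the given paths.
find_dup_basenames() {
  for p in "$@"; do
    basename "$p"
  done | sort | uniq -d
}

# Hypothetical inputs: two runs of the same sample, one colliding name.
dups=$(find_dup_basenames \
  run1/sampleA_S1_L001_R1_001.fastq.gz \
  run2/sampleA_S1_L001_R1_001.fastq.gz \
  run2/sampleA_S2_L001_R1_001.fastq.gz)
echo "$dups"
```

Any non-empty output means at least two inputs would end up under the same name in the working directory.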
