Commit 2728f21

Sample sheet check now raises an error when a sample-replicate has more than one antibody specified; added further explanation to the usage docs
1 parent fa49f46 commit 2728f21

3 files changed

Lines changed: 26 additions & 13 deletions

CHANGELOG.md

Lines changed: 4 additions & 2 deletions
@@ -21,8 +21,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - [[PR #493](https://github.com/nf-core/chipseq/pull/493)] - Follow up to #487.
 - [[#492](https://github.com/nf-core/chipseq/issues/492), [#417](https://github.com/nf-core/chipseq/issues/417)] - Refactor local modules to nf-core standard.
 - [[#416](https://github.com/nf-core/chipseq/issues/416)] - Moved the KHMER_UNIQUEKMERS logic to prepare_genome
-- [[#510](https://github.com/nf-core/chipseq/issues/510)] - Restrict the usage to one IP replicate against one control see: [#440](https://github.com/nf-core/chipseq/issues/440)
-replicate.
+- [[#440](https://github.com/nf-core/chipseq/issues/440), [#510](https://github.com/nf-core/chipseq/issues/510)] - Fix
+naming collisions when sample and replicate combination is identical for multiple antibodies see.
+- [[#467](https://github.com/nf-core/chipseq/issues/467), [#510](https://github.com/nf-core/chipseq/issues/510)] -
+Restrict the usage to one IP against one control replicate.

 ### Parameters

bin/check_samplesheet.py

Lines changed: 13 additions & 2 deletions
@@ -212,11 +212,13 @@ def check_samplesheet(file_in, file_out):
         sample,
     )

+set_antibodies = set()
 set_control_replicates = set()
+
 for idx, val in enumerate(sample_mapping_dict[sample][replicate]):
     control = "_REP".join(val[-1].split("_REP")[:-1])
     control_replicate = val[-1].split("_REP")[-1]
-    set_control_replicates.update(control_replicate)
+    set_control_replicates.add(control_replicate)

     if control and (
         control not in sample_mapping_dict.keys()
@@ -228,10 +230,19 @@ def check_samplesheet(file_in, file_out):
             val[-1],
         )

+for x in sample_mapping_dict[sample][replicate]:
+    set_antibodies.add(x[4])
+
+# Check that a given sample replicate only uses one antibody
+if len(set_antibodies) > 1:
+    print_error(
+        f"Sample: {sample}, replicate {replicate} has more than one antibody specified!"
+    )
+
 # Check that a given sample-replicate have only one control replicate
 if len(set_control_replicates) > 1:
     print_error(
-        f"Sample: {sample}, replicate {replicate} has more than one control replicate! Revise the experimental design, see: 'Note on IP and control replicates'"
+        f"Sample: {sample}, replicate {replicate} has more than one control replicate specified! Revise the experimental design, see: 'Note on IP and control replicates'"
     )

 ## Write to file
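
The switch from `set.update` to `set.add` in this file matters because `update` treats its argument as an iterable. One plausible reason for the change is that a replicate identifier held as a string would be split into its characters, so a single two-digit replicate could look like two different replicates. A minimal sketch of the difference, using a made-up control identifier (`CONTROL_REP10` is an assumption, not taken from the repository):

```python
# Hypothetical control identifier; the replicate suffix is parsed as a string.
control_replicate = "CONTROL_REP10".split("_REP")[-1]

via_update = set()
via_update.update(control_replicate)  # iterates the string character by character

via_add = set()
via_add.add(control_replicate)  # stores the whole string as one element

print(sorted(via_update))  # ['0', '1']
print(sorted(via_add))     # ['10']
```

With `update`, a check such as `len(set_control_replicates) > 1` would fire for a lone replicate `10`; with `add` the set only grows when the replicate values actually differ.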

docs/usage.md

Lines changed: 9 additions & 9 deletions
@@ -119,15 +119,15 @@ NAIVE_INPUT,BLA203A48_S39_L001_R1_001.fastq.gz,,2,,,
 NAIVE_INPUT,BLA203A49_S1_L006_R1_001.fastq.gz,,3,,,
 ```

-| Column | Description |
-| ------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `sample` | Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (`_`). |
-| `fastq_1` | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
-| `fastq_2` | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
-| `replicate` | Integer representing replicate number. This will be identical for re-sequenced libraries. Must start from `1..<number of replicates>`. |
-| `antibody` | Antibody name. This is required to segregate downstream analysis for different antibodies. Required when `control` is specified. |
-| `control` | Sample name for control sample. |
-| `control_replicate` | Integer representing replicate number for control sample. |
+| Column | Description |
+| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `sample` | Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (`_`). It should be unique and contain the antibody name. E.g: `{Treatment or cell type}_{antibody}_IP` |
+| `fastq_1` | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
+| `fastq_2` | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
+| `replicate` | Integer representing replicate number. This will be identical for re-sequenced libraries. Must start from `1..<number of replicates>`. |
+| `antibody` | Antibody name. This is required to segregate downstream analysis for different antibodies. Required when `control` is specified. |
+| `control` | Sample name for control sample. |
+| `control_replicate` | Integer representing replicate number for control sample. |

 Example design files have been provided with the pipeline for [paired-end](../assets/samplesheet_pe.csv) and [single-end](../assets/samplesheet_se.csv) data.

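
The new `sample` description and the antibody check address the same situation: two rows that share a sample name and replicate but name different antibodies. The sketch below illustrates that rule on in-memory rows. It is not the pipeline's `check_samplesheet.py`; the function `find_multi_antibody_replicates`, the row layout, and the sample and antibody names are all assumptions made for illustration.

```python
from collections import defaultdict


def find_multi_antibody_replicates(rows):
    """Return (sample, replicate) pairs that list more than one antibody.

    `rows` is an iterable of dicts with "sample", "replicate" and "antibody" keys
    (a hypothetical layout, loosely mirroring the samplesheet columns).
    """
    antibodies = defaultdict(set)
    for row in rows:
        antibodies[(row["sample"], int(row["replicate"]))].add(row["antibody"])
    return sorted(key for key, names in antibodies.items() if len(names) > 1)


# Hypothetical rows: the sample name "WT" omits the antibody, so two antibodies collide.
colliding = [
    {"sample": "WT", "replicate": "1", "antibody": "H3K27ac"},
    {"sample": "WT", "replicate": "1", "antibody": "H3K4me3"},
]

# Following the `{Treatment or cell type}_{antibody}_IP` convention keeps the names unique.
unique = [
    {"sample": "WT_H3K27ac_IP", "replicate": "1", "antibody": "H3K27ac"},
    {"sample": "WT_H3K4me3_IP", "replicate": "1", "antibody": "H3K4me3"},
]

print(find_multi_antibody_replicates(colliding))  # [('WT', 1)]
print(find_multi_antibody_replicates(unique))     # []
```

Embedding the antibody in the sample name means each sample-replicate pair maps to exactly one antibody, so the new error is never triggered for a correctly named design.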