nf-core
diff --git a/.dockstore.yml b/.dockstore.yml
Lines changed: 5 additions & 0 deletions
diff --git a/.github/.dockstore.yml b/.github/.dockstore.yml
Lines changed: 5 additions & 0 deletions
diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
Lines changed: 154 additions & 4 deletions
diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md
Lines changed: 21 additions & 0 deletions
diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md
Lines changed: 7 additions & 5 deletions
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
Lines changed: 2 additions & 0 deletions
diff --git a/.github/markdownlint.yml b/.github/markdownlint.yml
Lines changed: 3 additions & 0 deletions
diff --git a/.github/workflows/awsfulltest.yml b/.github/workflows/awsfulltest.yml
Lines changed: 39 additions & 0 deletions
diff --git a/.github/workflows/awstest.yml b/.github/workflows/awstest.yml
Lines changed: 39 additions & 0 deletions
diff --git a/.github/workflows/branch.yml b/.github/workflows/branch.yml
Lines changed: 26 additions & 5 deletions
@@ -0,0 +1,5 @@
+# Dockstore config version, not pipeline version
+version: 1.2
+workflows:
+  - subclass: nfl
+    primaryDescriptorPath: /nextflow.config
@@ -0,0 +1,5 @@
+# Dockstore config version, not pipeline version
+version: 1.2
+workflows:
+  - subclass: nfl
+    primaryDescriptorPath: /nextflow.config
@@ -18,8 +18,9 @@ If you'd like to write some code for nf-core/eager, the standard workflow is as
 1. Check that there isn't already an issue about your idea in the [nf-core/eager issues](https://github.com/nf-core/eager/issues) to avoid duplicating work
     * If there isn't one already, please create one so that others know you're working on this
 2. [Fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the [nf-core/eager repository](https://github.com/nf-core/eager) to your GitHub account
-3. Make the necessary changes / additions within your forked repository
-4. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged
+3. Make the necessary changes / additions within your forked repository (following [code contribution guidelines](https://github.com/nf-core/eager/blob/dev/.github/CONTRIBUTING.md))
+4. Use `nf-core schema build .` and add any new parameters to the pipeline JSON schema (requires nf-core tools >= 1.10).
+5. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged
 
 If you're not used to this workflow with git, you can start with some [docs from GitHub](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests) or even their [excellent `git` resources](https://try.github.io/).
 
@@ -46,12 +47,161 @@ These tests are run both with the latest available version of `Nextflow` and als
 
 ## Patch
 
-: warning: Only in the unlikely and regretful event of a release happening with a bug.
+:warning: Only in the unlikely and regretful event of a release happening with a bug.
 
 * On your own fork, make a new branch `patch` based on `upstream/master`.
 * Fix the bug, and bump version (X.Y.Z+1).
 * A PR should be made on `master` from `patch` to directly address this particular bug.
 
 ## Getting help
 
-For further information/help, please consult the [nf-core/eager documentation](https://nf-co.re/nf-core/eager/docs) and don't hesitate to get in touch on the nf-core Slack [#eager](https://nfcore.slack.com/channels/eager) channel ([join our Slack here](https://nf-co.re/join/slack)).
+For further information/help, please consult the [nf-core/eager documentation](https://nf-co.re/eager/latest/usage) and don't hesitate to get in touch on the nf-core Slack [#eager](https://nfcore.slack.com/channels/eager) channel ([join our Slack here](https://nf-co.re/join/slack)).
+
+# Code Contribution Guidelines
+
+To make the EAGER2 code and processing logic more understandable for new contributors, and to ensure quality, we are attempting to somewhat standardise the way the code is written.
+
+If you wish to contribute a new module, please use the following coding standards.
+
+The typical workflow for adding a new module is as follows:
+
+1. Define the corresponding input channel into your new process from the expected previous process channel (or re-routing block, see below).
+2. Write the process block (see below).
+3. Define the output channel if needed (see below).
+4. Add any new flags/options to `nextflow.config` with a default (see below).
+5. Add any new flags/options to `nextflow_schema.json` with help text (with `nf-core schema build .`)
+6. Add any new flags/options to the help message (for integer/text parameters, print to help the corresponding `nextflow.config` parameter).
+7. Add sanity checks for all relevant parameters.
+8. Add any new software to the `scrape_software_versions.py` script in `bin/` and the version command to the `scrape_software_versions` process in `main.nf`.
+9. Do local tests that the new code works properly and as expected.
+10. Add a new test command in `.github/workflows/ci.yml`.
+11. If applicable, add a [MultiQC](https://multiqc.info/) module.
+12. Update MultiQC config `assets/multiqc_config.yaml` so relevant suffixes, name clean up, General Statistics Table column order, and module figures are in the right order.
+13. Add new flags/options to 'usage' documentation under `docs/usage.md`.
+14. Add any descriptions of MultiQC report sections and output files to `docs/output.md`.
+
+## Default Values
+
+Default values should go in `nextflow.config` under the `params` scope, and in `nextflow_schema.json` (the latter with `nf-core schema build .`).
+
+## Default resource processes
+
+Recommended 'minimum' resource requirements (CPUs/memory) for a process should be defined in `conf/base.config`. These can then be used within the process via the `${task.cpus}` or `${task.memory}` variables in the `script:` block.
+
+## Process Concept
+
+We are providing a highly configurable pipeline, with many options to turn on and off different processes in different combinations. This can make a very complex graph structure that can cause a large amount of duplicated channels coming out of every process to account for each possible combination.
+
+The EAGER pipeline can currently be broken down into the following 'stages', where a stage is a collection of non-terminal, mutually exclusive processes whose output is used by another file-processing module (but not a reporting module!).
+
+* Input
+* Convert BAM
+* PolyG Clipping
+* AdapterRemoval
+* Mapping (either `bwa`, `bwamem`, or `circularmapper`)
+* BAM Filtering
+* Deduplication (either `dedup` or `markduplicates`)
+* BAM Trimming
+* PMDtools
+* Genotyping
+
+Every step can potentially be skipped, therefore the output of a previous stage must be able to be passed to the next stage, if the given stage is not run.
+
+To somewhat simplify this logic, we have implemented the following structure.
+
+The concept is as follows:
+
+* Every 'stage' of the pipeline (i.e. collection of mutually exclusive processes) must always have an if-else statement following it.
+* This if-else 'bypass' statement collects and standardises all possible input files into single channel(s) for the next stage.
+* Importantly - within the bypass statement, a channel from the previous stage's bypass mixes into these output channels. This additional channel is named `ch_previousstage_for_skipcurrentstage`. This contains the output from the previous stage, i.e. not the modified version from the current stage.
+* The bypass statement works as follows:
+  * If the current stage is turned on: will mix the previous stage and current stage output and filter for file suffixes unique to the current stage output.
+  * If the current stage is turned off or skipped: will mix the previous stage and current stage output. However, as there are no files in the output channel from the current stage, no filtering is required and the files in the `ch_XXX_for_skipXXX` channel will be used.
+
+This ensures that the channel inputs to the next stage are 'homogeneous', i.e. they all come from the same source (the bypass statement).
+
+An example schematic is as follows:
+
+```nextflow
+// PREVIOUS STAGE OUTPUT
+// (schematic only: `params.run_convertbam` stands in for whichever flag toggles the previous stage)
+if (params.run_convertbam) {
+    ch_input_for_skipconvertbam.mix(ch_output_from_convertbam)
+        .filter { it =~/.*converted.fq/ }
+        .into { ch_convertbam_for_fastp; ch_convertbam_for_skipfastp }
+} else {
+    ch_input_for_skipconvertbam
+        .into { ch_convertbam_for_fastp; ch_convertbam_for_skipfastp }
+}
+
+// SKIPPABLE CURRENT STAGE PROCESS
+process fastp {
+    publishDir "${params.outdir}/fastp", mode: 'copy'
+
+    when:
+    params.run_fastp
+
+    input:
+    file fq from ch_convertbam_for_fastp
+
+    output:
+    file "*pG.fq" into ch_output_from_fastp
+
+    script:
+    """
+    echo "I have been fastp'd" > ${fq}  
+    mv ${fq} ${fq}.pG.fq
+    """
+}
+
+// NEXT STAGE INPUT PREPARATION
+if (params.run_fastp) {
+    ch_convertbam_for_skipfastp.mix(ch_output_from_fastp)
+        .filter { it =~/.*pG.fq/ }
+        .into { ch_fastp_for_adapterremoval; ch_fastp_for_skipadapterremoval }
+} else {
+    ch_convertbam_for_skipfastp
+        .into { ch_fastp_for_adapterremoval; ch_fastp_for_skipadapterremoval }
+}
+
+ ```
+
+## Naming Schemes
+
+Please use the following naming schemes, to make it easy to understand what is going where.
+
+* process output: `ch_output_from_<process>` (this should always go into the bypass statement described above).
+* skipped process output: `ch_<previousstage>_for_<skipprocess>` (this goes out of the bypass statement described above)
+* process inputs: `ch_<previousstage>_for_<process>` (this goes into a process)
+
+## Nextflow Version Bumping
+
+If you have agreement from reviewers, you may bump the 'default' minimum version of nextflow (e.g. for testing), with `nf-core bump-version`.
+
+## Software Version Reporting
+
+If you add a new tool to the pipeline, please ensure you add the information of the tool to the `get_software_version` process.
+
+Add to the script block of the process, something like the following:
+
+```bash
+<YOUR_TOOL> --version &> v_<YOUR_TOOL>.txt 2>&1 || true
+```
+
+or
+
+```bash
+<YOUR_TOOL> --help | head -n 1 &> v_<YOUR_TOOL>.txt 2>&1 || true
+```
+
+You then need to edit the script `bin/scrape_software_versions.py` to
+
+1. add a (Python) regex for your tool's `--version` output (as stored in the `v_<YOUR_TOOL>.txt` file), to ensure the version is reported as a `v` followed by the version number, e.g. `v2.1.1`
+2. add a HTML block entry to the `OrderedDict` for formatting in MultiQC.
+
+> If a tool unfortunately does not offer any way of printing version data, you may add this 'manually', e.g. with `echo "v1.1" > v_<YOUR_TOOL>.txt`
+
+## Images and Figures
+
+For all internal nf-core/eager documentation images we are using the 'Kalam' font by the Indian Type Foundry, licensed under the Open Font License. It can be downloaded [here](https://fonts.google.com/specimen/Kalam).
+
+For the overview image we follow the nf-core [style guidelines](https://nf-co.re/developers/design_guidelines).
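
Reviewer sketch for the 'Software Version Reporting' section above: a minimal Python example of turning the first line of a captured `v_<YOUR_TOOL>.txt` into the `v<number>` form. The tool name `mytool`, the sample output string and the helper `normalise_version` are hypothetical and only illustrate the idea; the actual `bin/scrape_software_versions.py` may be organised differently.

```python
import re

# Hypothetical pattern: the first dotted number after the (assumed) tool name "mytool".
MYTOOL_VERSION = re.compile(r"mytool\D*(\d+(?:\.\d+)+)")


def normalise_version(raw_output, pattern=MYTOOL_VERSION):
    """Return the version as 'v<number>', or None if nothing matched."""
    match = pattern.search(raw_output)
    if match is None:
        return None
    return "v" + match.group(1)


# Example with an invented --version line.
print(normalise_version("mytool version 2.1.1 (build 42)"))
```

Returning `None` for unmatched output makes it easy to spot tools whose version string needs a different pattern, or the manual `echo` fallback.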
@@ -1,9 +1,24 @@
 # nf-core/eager bug report
 
+<!--
 Hi there!
 
 Thanks for telling us about a problem with the pipeline.
+
 Please delete this text and anything that's not relevant from the template below:
+-->
+
+## Check Documentation
+
+Have you checked the following places for your error?
+
+- [ ] [Frequently Asked Questions](https://github.com/nf-core/eager/blob/master/docs/usage.md#troubleshooting-and-faqs)
+      (for nf-core/eager specific information)
+- [ ] [Troubleshooting](https://nf-co.re/usage/troubleshooting)
+      (for nf-core specific information)
+
+Please also check the corresponding version's documentation on GitHub if you are not
+testing the latest release.
 
 ## Describe the bug
 
@@ -20,6 +35,12 @@ Steps to reproduce the behaviour:
 
 A clear and concise description of what you expected to happen.
 
+## Log files
+
+1. The command used to run the pipeline
+2. The `.nextflow.log` file (which is a hidden file in whichever place you _ran_
+   the pipeline from - not necessarily in the output directory!)
+
 ## System
 
 - Hardware: <!-- [e.g. HPC, Desktop, Cloud...] -->
 
@@ -1,24 +1,26 @@
 # nf-core/eager feature request
+<!--
 
 Hi there!
 
 Thanks for suggesting a new feature for the pipeline!
 Please delete this text and anything that's not relevant from the template below:
+-->
 
 ## Is your feature request related to a problem? Please describe
 
-A clear and concise description of what the problem is.
+<!-- A clear and concise description of what the problem is. -->
 
-Ex. I'm always frustrated when [...]
+<!-- e.g. [I'm always frustrated when ...] -->
 
 ## Describe the solution you'd like
 
-A clear and concise description of what you want to happen.
+<!-- A clear and concise description of what you want to happen. -->
 
 ## Describe alternatives you've considered
 
-A clear and concise description of any alternative solutions or features you've considered.
+<!-- A clear and concise description of any alternative solutions or features you've considered. -->
 
 ## Additional context
 
-Add any other context about the feature request here.
+<!-- Add any other context about the feature request here. -->
@@ -1,9 +1,11 @@
 # nf-core/eager pull request
+<!--
 
 Many thanks for contributing to nf-core/eager!
 
 Please fill in the appropriate checklist below (delete whatever is not relevant).
 These are the most common things requested on pull requests (PRs).
+-->
 
 ## PR checklist
 
 
@@ -7,3 +7,6 @@ no-inline-html:
     allowed_elements:
         - img
         - p
+        - kbd
+        - details
+        - summary
@@ -0,0 +1,39 @@
+
+name: nf-core AWS full size tests
+# This workflow is triggered when a release is published.
+
+on:
+  release:
+    types: [published]
+
+jobs:
+  run-awstest:
+    name: Run AWS full tests
+    if: github.repository == 'nf-core/eager'
+    runs-on: ubuntu-latest
+    steps:
+      - name: Setup Miniconda
+        uses: goanpeca/setup-miniconda@v1.0.2
+        with:
+          auto-update-conda: true
+          python-version: 3.7
+      - name: Install awscli
+        run: conda install -c conda-forge awscli
+      - name: Start AWS batch job
+        # Add full size test data (but still relatively small datasets for few samples)
+        # on the `test_full.config` test runs with only one set of parameters
+        # Then specify `-profile test_full` instead of `-profile test` on the AWS batch command
+        env:
+          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
+          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
+          TOWER_ACCESS_TOKEN: ${{ secrets.AWS_TOWER_TOKEN }}
+          AWS_JOB_DEFINITION: ${{ secrets.AWS_JOB_DEFINITION }}
+          AWS_JOB_QUEUE: ${{ secrets.AWS_JOB_QUEUE }}
+          AWS_S3_BUCKET: ${{ secrets.AWS_S3_BUCKET }}
+        run: |
+          aws batch submit-job \
+            --region eu-west-1 \
+            --job-name nf-core-eager \
+            --job-queue $AWS_JOB_QUEUE \
+            --job-definition $AWS_JOB_DEFINITION \
+            --container-overrides '{"command": ["nf-core/eager", "-r '"${GITHUB_SHA}"' -profile awsfulltest --outdir s3://'"${AWS_S3_BUCKET}"'/eager/results-'"${GITHUB_SHA}"' -w s3://'"${AWS_S3_BUCKET}"'/eager/work-'"${GITHUB_SHA}"' -with-tower"], "environment": [{"name": "TOWER_ACCESS_TOKEN", "value": "'"$TOWER_ACCESS_TOKEN"'"}]}'
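
Reviewer sketch: the nested shell quoting in the `--container-overrides` argument above is easy to get wrong, so the following Python example shows the JSON structure it is meant to produce. The function name `build_container_overrides` and all argument values are hypothetical placeholders; this is not part of the workflow itself.

```python
import json


def build_container_overrides(revision, profile, bucket, tower_token):
    """Build the JSON string passed to `aws batch submit-job --container-overrides`."""
    prefix = f"s3://{bucket}/eager"
    # Mirrors the workflow's two-element command list: pipeline name, then one argument string.
    args = (
        f"-r {revision} -profile {profile} "
        f"--outdir {prefix}/results-{revision} "
        f"-w {prefix}/work-{revision} -with-tower"
    )
    overrides = {
        "command": ["nf-core/eager", args],
        "environment": [{"name": "TOWER_ACCESS_TOKEN", "value": tower_token}],
    }
    return json.dumps(overrides)


# Placeholder values only; a real run would take these from the job environment.
print(build_container_overrides("abc123", "awsfulltest", "example-bucket", "dummy-token"))
```

Serialising with `json.dumps` leaves escaping to the library, which avoids hand-balancing single and double quotes.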
@@ -0,0 +1,39 @@
+name: nf-core AWS test
+# This workflow is triggered on push to the master branch.
+# It runs the -profile 'test' on AWS batch
+
+on:
+  push:
+    branches:
+      - master
+
+jobs:
+  run-awstest:
+    name: Run AWS tests
+    if: github.repository == 'nf-core/eager'
+    runs-on: ubuntu-latest
+    steps:
+      - name: Setup Miniconda
+        uses: goanpeca/setup-miniconda@v1.0.2
+        with:
+          auto-update-conda: true
+          python-version: 3.7
+      - name: Install awscli
+        run: conda install -c conda-forge awscli
+      - name: Start AWS batch job
+        # For example: adding multiple test runs with different parameters
+        # Remember that you can parallelise this by using strategy.matrix
+        env:
+          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
+          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
+          TOWER_ACCESS_TOKEN: ${{ secrets.AWS_TOWER_TOKEN }}
+          AWS_JOB_DEFINITION: ${{ secrets.AWS_JOB_DEFINITION }}
+          AWS_JOB_QUEUE: ${{ secrets.AWS_JOB_QUEUE }}
+          AWS_S3_BUCKET: ${{ secrets.AWS_S3_BUCKET }}
+        run: |
+          aws batch submit-job \
+          --region eu-west-1 \
+          --job-name nf-core-eager \
+          --job-queue $AWS_JOB_QUEUE \
+          --job-definition $AWS_JOB_DEFINITION \
+          --container-overrides '{"command": ["nf-core/eager", "-r '"${GITHUB_SHA}"' -profile test_tsv_complex --outdir s3://'"${AWS_S3_BUCKET}"'/eager/results-'"${GITHUB_SHA}"' -w s3://'"${AWS_S3_BUCKET}"'/eager/work-'"${GITHUB_SHA}"' -with-tower"], "environment": [{"name": "TOWER_ACCESS_TOKEN", "value": "'"$TOWER_ACCESS_TOKEN"'"}]}'
@@ -3,14 +3,35 @@ name: nf-core branch protection
 # It fails when someone tries to make a PR against the nf-core `master` branch instead of `dev`
 on:
   pull_request:
-    branches:
-    - master
+    branches: [master]
 
 jobs:
   test:
-    runs-on: ubuntu-18.04
+    runs-on: ubuntu-latest
     steps:
-      # PRs are only ok if coming from an nf-core `dev` branch or a fork `patch` branch
+      # PRs to the nf-core repo master branch are only ok if coming from the nf-core repo `dev` or any `patch` branches
       - name: Check PRs
+        if: github.repository == 'nf-core/eager'
         run: |
-          { [[ $(git remote get-url origin) == *nf-core/eager ]] && [[ ${GITHUB_HEAD_REF} = "dev" ]]; } || [[ ${GITHUB_HEAD_REF} == "patch" ]]
+          { [[ ${{github.event.pull_request.head.repo.full_name}} == nf-core/eager ]] && [[ $GITHUB_HEAD_REF = "dev" ]]; } || [[ $GITHUB_HEAD_REF == "patch" ]]
+
+
+      # If the above check failed, post a comment on the PR explaining the failure
+      # NOTE - this doesn't currently work if the PR is coming from a fork, due to limitations in GitHub actions secrets
+      - name: Post PR comment
+        if: failure()
+        uses: mshick/add-pr-comment@v1
+        with:
+          message: |
+            Hi @${{ github.event.pull_request.user.login }},
+
+            It looks like this pull-request has been made against the ${{github.event.pull_request.head.repo.full_name}} `master` branch.
+            The `master` branch on nf-core repositories should always contain code from the latest release.
+            Because of this, PRs to `master` are only allowed if they come from the ${{github.event.pull_request.head.repo.full_name}} `dev` branch.
+
+            You do not need to close this PR, you can change the target branch to `dev` by clicking the _"Edit"_ button at the top of this page.
+
+            Thanks again for your contribution!
+          repo-token: ${{ secrets.GITHUB_TOKEN }}
+          allow-repeats: false
+
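
Reviewer sketch: the condition in the `Check PRs` step above can be restated as a small Python predicate, which may make the intended rule easier to check. The function name `pr_to_master_allowed` and the example fork name `someuser/eager` are hypothetical; the workflow itself evaluates the shell expression shown in the diff.

```python
def pr_to_master_allowed(head_repo, head_ref, upstream="nf-core/eager"):
    """Mirror the shell test: (head repo is upstream AND branch is dev) OR branch is patch."""
    return (head_repo == upstream and head_ref == "dev") or head_ref == "patch"


# A few illustrative combinations of source repository and source branch.
for repo, ref in [
    ("nf-core/eager", "dev"),
    ("someuser/eager", "patch"),
    ("someuser/eager", "dev"),
]:
    print(repo, ref, pr_to_master_allowed(repo, ref))
```

Note that, as in the shell version, a `patch` branch is accepted from any repository, while `dev` is accepted only from the upstream repository.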