Next generation sequencing (NGS) revolutionised biology by providing rapid and cheap access to huge amounts of DNA sequence data.
One unexpected benefit of the technology used in Illumina NGS sequencers was that it was also ideal for sequencing ultra-short ancient DNA.

-In this chapter, we will go through a brief overview of how DNA is structured, how DNA sequencing works, and how most NGS sequenced DNA sequences are digitally represented.
+In this chapter, we will go through:
+
+- A brief overview of how DNA is structured
+- How DNA sequencing works
+- How most NGS sequenced DNA sequences are digitally represented
+
Finally, we will cover some important considerations of NGS sequencing for ancient metagenomic datasets.
## Basic Concepts
@@ -66,7 +71,7 @@ This concept is important because it is the basis of how DNA sequencing works. W
To understand why specifically _NGS_ sequencing revolutionised the field of palaeogenomics, we need to briefly compare the differences between how we get modern and ancient DNA.
To get DNA from 'modern' samples (i.e., living organisms), first a biological tissue or sample is acquired.
-You then typically break down (lyse) the cell membrane and/or walls, to release the molecular contents of the cell [@Danaeifar2022-sk].
+We then typically break down (lyse) the cell membrane and/or walls, to release the molecular contents of the cell [@Danaeifar2022-sk].
Extraction protocols then use a variety of enzymes or other mechanisms to degrade the other biomolecules in the cell (e.g., proteins, lipids, RNA) so that they do not 'interfere' with the extraction of the DNA itself.
Finally, the DNA is separated and isolated out from the rest of the now-broken cell contents (purification).
@@ -239,10 +244,10 @@ The process as depicted in @fig-intro-ngs-fig-sequencingbysynthesis can be broke
On Illumina sequencers, the number of repetitions (known as cycles) is typically 50, 75, or 125, depending on the machine and the type of sequencing chemistry kit.

-You can see a small fraction of such a flow cell in @fig-intro-ngs-fig-sbsimagecapture.
+We can see a small fraction of such a flow cell in @fig-intro-ngs-fig-sbsimagecapture.
Each coloured dot corresponds to a cluster of DNA molecules.
At each cycle (each photo), a new nucleotide is added to the strand, and a laser is fired to excite the fluorophores.
-You can see two different clusters emit different lights, as they are different DNA molecules and thus have different nucleotides at that particular 'cycle' (or position in the sequence) of the 'replication' process.
+We can see two different clusters emit different lights, as they are different DNA molecules and thus have different nucleotides at that particular 'cycle' (or position in the sequence) of the 'replication' process.
By converting the emitted light to the known corresponding A, C, G, T, at each photo, we can reconstruct the sequence of the DNA molecule.
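
The conversion from emitted light to bases can be illustrated with a minimal Python sketch. The colour-to-base mapping, the cluster names, and the `call_bases` helper are all hypothetical and chosen only for illustration; real instruments use their own chemistry-specific mappings and far more involved image analysis.

```python
# Hypothetical colour-to-base mapping (an assumption for illustration only;
# real sequencing chemistries differ between instruments).
COLOUR_TO_BASE = {"green": "A", "blue": "C", "yellow": "G", "red": "T"}


def call_bases(cluster_colours):
    """Turn the colour observed at each cycle into a base.

    Colours that are not in the mapping are reported as 'N' (unknown base).
    """
    return "".join(COLOUR_TO_BASE.get(colour, "N") for colour in cluster_colours)


# One entry per cluster, one colour per cycle (i.e. per photo).
images = {
    "cluster_1": ["green", "red", "red", "blue"],
    "cluster_2": ["yellow", "green", "blue", "red"],
}

# Reconstruct one read per cluster by walking through the cycles in order.
reads = {name: call_bases(colours) for name, colours in images.items()}
print(reads)
```

Each cluster is handled independently, which mirrors the idea that every cluster is a different DNA molecule with its own sequence of colours across the cycles.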
 via [EBI Training](https://www.ebi.ac.uk/training/online/courses/functional-genomics-ii-common-technologies-and-data-analysis-methods/next-generation-sequencing/second-generation-sequencing/illumina-sequencing/)](assets/images/chapters/intro-to-ngs/fig-intro-ngs-fig-sbsimagecapture.png){#fig-intro-ngs-fig-sbsimagecapture height=300px}
@@ -387,7 +392,7 @@ The rest of a FASTQ file is simply just a repeated set of these four lines.
Each set of four lines corresponds to an independent DNA cluster - and thus DNA molecule - that was sequenced.
In the case of Illumina paired-end sequencing, we will normally have two FASTQ files for each sample - and we can match the forward and reverse reads of each molecule by the metadata line and a `/1` or `/2` at the end of the ID[^1].
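
As a rough illustration of this matching, the following Python sketch reads two small in-memory FASTQ files and pairs reads by their ID after removing the `/1` or `/2` suffix. The read names, sequences, and the helper functions `read_fastq` and `strip_mate_suffix` are made up for this example, and it assumes simple four-line records without line wrapping; in practice we would rely on established tools rather than hand-written parsers.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or an empty line) stops the iteration
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # separator line starting with '+', not needed here
        quality = handle.readline().rstrip("\n")
        yield header[1:], sequence, quality  # drop the leading '@'


def strip_mate_suffix(read_id):
    """Remove a trailing '/1' or '/2' from the first word of the ID."""
    first = read_id.split()[0]
    if first.endswith("/1") or first.endswith("/2"):
        return first[:-2]
    return first


# Hypothetical forward and reverse files for one sample.
forward = io.StringIO("@read_a/1\nACGT\n+\nIIII\n@read_b/1\nTTGA\n+\nIIII\n")
reverse = io.StringIO("@read_a/2\nACGA\n+\nIIII\n@read_b/2\nTCAA\n+\nIIII\n")

# Index the reverse reads by their shared ID, then look up each forward read.
reverse_by_id = {strip_mate_suffix(rid): seq for rid, seq, _ in read_fastq(reverse)}
pairs = {}
for rid, seq, _ in read_fastq(forward):
    key = strip_mate_suffix(rid)
    if key in reverse_by_id:
        pairs[key] = (seq, reverse_by_id[key])

print(pairs)
```

Indexing one file in a dictionary keeps the sketch short, but it holds all reverse reads in memory; since paired files are normally written in the same order, real tools usually walk through both files in step instead.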

-[^1]: You can occasionally see a format called 'interleaved' FASTQ files, where the forward and reverse reads are placed right after one another in the same file, but this is not common practice any more.
+[^1]: We can occasionally encounter a format called 'interleaved' FASTQ files, where the forward and reverse reads are placed right after one another in the same file, but this is not common practice any more.

## Sequencing and considerations for ancient metagenomics