Browsing and searching datasets
Note: afterParty is under active development, so the current version may differ slightly from screenshots.
Links to help on individual pages:
Studies and species pages
The assembly page
The contig set page
Searching contig annotation
BLASTing sequences against contigs
Viewing details for a single contig
The highest level of organization for afterParty websites is the study. Most afterParty instances (e.g. the public-facing instance at afterparty.bio.ed.ac.uk) hold multiple studies. A study represents a collection of related transcriptomes, often from a number of related species. Each study contains one or more compound samples. A compound sample is often (but not necessarily) a species. Each species can have one or more assemblies. Here's a screenshot of the study page for a typical transcriptome derived from an RNA-seq experiment. This study has only a single species.
Clicking on the species name takes us to the page for that species:
##The species page
Several of the elements on this page are re-used throughout afterParty, so we will examine them in order from top to bottom.
####Breadcrumbs bar
Most pages in afterParty have a bar like this at the top, which you can use to navigate around a dataset. Links go to pages higher up in the hierarchy.
####Action buttons for a set of contigs
A row of buttons which allow you to carry out different actions on the set of contigs belonging to this species.
Clicking "View contig set" will take you to a page showing details of all contigs for this species (more on these pages later). Clicking "Search contigs" will bring up a search form which allows you to search the contig annotations for a search term.
Enter your query in the box and press the "search" button. You can use boolean operators like & and |. You can also control the number of results returned - asking for more results usually takes a little bit longer.
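As a rough illustration of how a query combining `&` and `|` can be read, here is a minimal Python sketch. It assumes `&` means AND, `|` means OR, that `&` binds more tightly than `|`, and that terms match as case-insensitive substrings. These are assumptions made for illustration; the `matches` function is hypothetical and afterParty's own matching rules may differ.

```python
def matches(query: str, annotation: str) -> bool:
    """Evaluate a flat boolean query against one annotation description.

    Assumptions (not taken from afterParty itself): & is AND, | is OR,
    & binds more tightly than |, and terms are matched as
    case-insensitive substrings. Parentheses and negation are not handled.
    """
    text = annotation.lower()
    for alternative in query.split("|"):
        # Each alternative is a group of terms that must all be present.
        terms = [term.strip().lower() for term in alternative.split("&")]
        terms = [term for term in terms if term]
        if terms and all(term in text for term in terms):
            return True
    return False


# Made-up annotation descriptions, for illustration only.
annotations = [
    "Protein kinase domain",
    "ABC transporter membrane protein",
    "Heat shock protein 70",
]

for query in ("kinase | transporter", "membrane & transporter", "kinase & membrane"):
    hits = [a for a in annotations if matches(query, a)]
    print(query, "->", hits)
```

Read this way, `|` widens a search and `&` narrows it.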
Clicking "BLAST contigs" will bring up a form that allows you to paste in a sequence and carry out a BLAST search against the contigs for this species.
Clicking "Download contigs" will download all the contigs for this species in FASTA format.
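Once downloaded, the FASTA file can be processed with a few lines of code. The following is a minimal Python sketch, not part of afterParty, that reads FASTA-formatted text and reports the length and GC content of each contig. The contig names and sequences in the example are made up.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return a mapping of contig name -> sequence from FASTA-formatted lines."""
    contigs: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: use the first whitespace-delimited token as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            contigs[name] = ""
        elif name is not None:
            # Sequence line: append to the current contig.
            contigs[name] += line
    return contigs


def gc_content(sequence: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Made-up example data standing in for a downloaded file.
example = """>contig_1 hypothetical example
ATGCGC
GGCC
>contig_2
ATATAT
"""

contigs = parse_fasta(example.splitlines())
for contig_name, seq in contigs.items():
    print(contig_name, len(seq), round(gc_content(seq), 2))
```

To use this on a real download, pass an open file handle to `parse_fasta` instead of the example lines.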
####Sample list
A list of all the samples for this species - none in the case of this particular dataset.
####Assembly list
A list of the assemblies for this species. This one has just a single assembly. Clicking on the assembly name will take you to the assembly page.
####Contig table
An interactive table showing a list of contigs - in this case, all the contigs belonging to this species. Contig tables appear on many of the pages in afterParty and they all behave the same way.
The columns are fairly self-explanatory: contig name, length, mean quality and coverage (where available - this particular dataset doesn't have quality / coverage data so these fields all read 1) and GC content. The Annotation column contains a number of lines, each of which shows the highest-scoring annotation description of a particular type (BLAST, PFam, TIGRFam and Gene3D) along with its E-value. Clicking on a contig name takes you to the contig detail page where you can see all annotations. You can use the pagination buttons at the bottom to page through the list of contigs and the column headers to sort (be patient!). You can type in the search box to quickly filter contigs by annotation (but if you want to carry out a more thorough search, use the search form as described above).
If we click on the name of the assembly in the assembly list, we'll end up on the assembly page, which looks very similar to the species page:
There's a small table of summary statistics in the top right, but other than that it has more-or-less the same elements. Notice that there's one more layer in the breadcrumb bar. The action buttons and the contig table refer to all contigs in this assembly, but they behave exactly the same as on the species page.
The contig set page lets you view one or more contig sets in detail. A contig set is just a collection of contigs within a study, so there are lots of ways to end up on one of these pages. We've already seen that there are per-species and per-assembly contig sets, but you can also create a contig set from the results of an annotation search (more on that below) or a BLAST search.
The top of the contig set page has the familiar elements: a row of action buttons and a table of contigs.
The bottom of the contig set page has an interactive chart showing different metrics. This part of the page can take a long time to load, as it's transferring a lot of data. If possible, use the Google Chrome browser to view contig set pages - its JavaScript engine is far faster than those of other browsers. When the page first loads, it's set to show a histogram of contig length:
If you hover over a point on the chart, you'll see a popup with the exact values. There are various things we can do with the chart. Use the "Y-axis scale" button to switch from a linear to log scale:
Or use the "Frequency scaling" button to switch the Y-axis from absolute number of contigs to number per thousand contigs:
Use the "x-axis displays" button to view histograms of a different metric e.g. GC content:
To zoom in on an interesting bit of the chart, use the "cursor function" button to switch to zoom mode then click and drag to select the bit of the chart you want to see. Use the red "reset zoom" button to see the whole chart again. Here's the portion of the length histogram chart from 0 to 1400 bases:
If you're browsing afterParty on a screen with a low resolution, you can use the "chart size" button to make the chart area smaller. Alternatively, if you have a big monitor and high resolution, you can make it bigger which makes it easier to see individual points.
As well as histograms, this page can also display scatter plots where each point is a contig. Use the "chart type" button to switch. Here's a scatter plot of the same contig set showing length vs GC:
Mini-histograms are automatically drawn at the top and side. You can change the x and y axis independently using both log and linear scales. Hovering over a point on the chart shows a pop-up with detailed metrics for the contig:
and you can select an area to zoom in:
Once you've selected an area of the chart, you can use the green "save as contig set" button to create a new contig set from only the contigs inside that area. This makes it easy to create sets of contigs with particular properties (e.g. longer than 3000 bases, coverage greater than 20, or GC content between 45% and 50%).
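Selecting an area of the chart amounts to filtering contigs on their metrics. Here is a minimal Python sketch of that idea, using the thresholds from the example above. The `Contig` class, the `select_contigs` function and the sample values are hypothetical and are not part of afterParty.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Contig:
    name: str
    length: int      # length in bases
    coverage: float  # mean coverage
    gc: float        # GC content as a fraction between 0 and 1


def select_contigs(
    contigs: List[Contig],
    longer_than: int = 0,
    coverage_above: float = 0.0,
    gc_range: Optional[Tuple[float, float]] = None,
) -> List[Contig]:
    """Keep contigs that pass every threshold (GC range is inclusive)."""
    selected = []
    for contig in contigs:
        if contig.length <= longer_than:
            continue
        if contig.coverage <= coverage_above:
            continue
        if gc_range is not None and not (gc_range[0] <= contig.gc <= gc_range[1]):
            continue
        selected.append(contig)
    return selected


# Made-up contigs, for illustration only.
example_contigs = [
    Contig("contig_a", 3500, 25.0, 0.47),
    Contig("contig_b", 3500, 25.0, 0.60),
    Contig("contig_c", 1200, 40.0, 0.48),
    Contig("contig_d", 5000, 10.0, 0.46),
]

selected = select_contigs(
    example_contigs, longer_than=3000, coverage_above=20, gc_range=(0.45, 0.50)
)
print([contig.name for contig in selected])
```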
All charts can be filtered by minimum sequence length and minimum coverage using the text entry fields above the chart.
afterParty is designed so that you can search contig annotation from most places where it makes sense to do so. Anywhere you see the search form:
you can search a set of contigs. Currently the search form appears on the Compound Sample, Assembly and Contig Set pages. If you select one or more of the Contig Sets listed at the bottom of the Study page, you can also search their contigs' annotation.
The search results page contains the usual contig table, showing just the contigs that had matching annotation.
From here you can do all the usual things - page through the results, search within the results, click on a contig name to go to the contig details page, or save the set of search results as a new contig set.
As well as searching contig annotation, you can search for contigs with similarity to a sequence of interest. You can BLAST against contigs using the same form as when you search. Just click the "BLAST contigs" button and you'll see the BLAST form:
Paste in your sequence and hit "submit sequence". The BLAST results page has a graphical view of the matching contigs, and a table. As you might expect by now, you can also save this set of contigs as a named set.
Clicking on the name of a contig in any of the various contig tables that you will encounter in afterParty will take you to the detail page for that contig. This page shows all the metrics and annotation currently associated with the contig.
The annotation diagram at the top has graphs of quality and coverage along the contig if they're available. Quality and coverage scores will generally only be available if the assembly was uploaded in ACE format; if the user who created the dataset you are looking at used FASTA format, then quality/coverage data will not be available. Quality scores follow the PHRED specification; for more details see Ewing B, Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186-194.
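To help interpret the quality graph, recall that a PHRED score Q corresponds to an estimated base-calling error probability P through Q = -10 log10(P). The short Python sketch below converts between the two. The function names are illustrative and are not part of afterParty.

```python
import math


def phred_to_error_probability(quality: float) -> float:
    """Estimated probability that a base call with this PHRED score is wrong."""
    return 10 ** (-quality / 10)


def error_probability_to_phred(probability: float) -> float:
    """PHRED score corresponding to an error probability in (0, 1]."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be greater than 0 and at most 1")
    return -10 * math.log10(probability)


for quality in (10, 20, 30, 40):
    print(quality, phred_to_error_probability(quality))
```

A score of 20 therefore corresponds to roughly one expected error per 100 bases, and a score of 30 to roughly one per 1000.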
After the charts come diagrams showing the distribution of various types of annotation (BLAST hits to known sequences, protein domain families, and localization features). These annotations are also displayed in tables at the bottom of the page, which link out to relevant resources.