AWS megatests outputs page#514

Merged
ewels merged 24 commits into nf-core:master from mashehu:add-aws-results-page
Oct 30, 2020

Conversation

@mashehu
Contributor

@mashehu mashehu commented Oct 2, 2020

Basically copy-pasted from aws-js-s3-explorer. Files are currently downloaded when clicking their name in the table and not yet rendered on the page, as was suggested.
I'm actually not 100% sure how to handle that. Should we basically have a download for non-HTML files, while HTML files are rendered on the page (with the file table above or below)?

We also need to figure out a way to handle the dev branch, since it doesn't appear in pipeline.json yet, so we don't have the commit hash for it (which is needed for the path in the AWS bucket). Similarly, for versions without AWS results, the "example output" tab should be hidden.

Closes #435.

add releases hashes to $releases array
@mashehu mashehu requested a review from ewels October 2, 2020 16:52
@ewels
Member

ewels commented Oct 6, 2020

Absolutely love it 🚀

For dev, could maybe show a list of unrecognised commits? Or even just query the GitHub API to find the commit for the latest dev? Not sure what would be best.

Rendering HTML / txt files etc. would definitely be nice. May be tricky, as I guess they download due to the header being set by Amazon though. Not sure how we can change / bypass that?

Minor stuff:

  • Missing path separators in the top path [screenshot] (needs some CSS :after magic for the breadcrumbs, I guess)
  • Full-width page might be nicer?
  • When I load a new pipeline the breadcrumbs show that I'm in the folder for the results, but the browser shows that top-level directory that I have to click to see inside
  • Would be nice to hide the top-level directories above that release perhaps, so that you can't (easily) browse off to other pipelines etc or lose your place if you click too high in the breadcrumbs

@ewels
Member

ewels commented Oct 7, 2020

Also, I have a feeling that in some cases files are being shown even when they should be hidden within a folder? eg. digging around various viralrecon runs.

There are also a crap-tonne of browser console logs (expected if still dev code I guess), plus a loading spinner would be quite nice when it's having a proper think about stuff 👀

Still totally love how well this works - is pretty much exactly what I was hoping for 😍

@ewels
Member

ewels commented Oct 8, 2020

Added a couple of very minor tweaks just so that I can take some screengrabs for a talk 😉

@mashehu mashehu marked this pull request as ready for review October 9, 2020 15:27
@ggabernet
Member

Great work, the viewer looks amazing! 😍 🚀

I just have some minor things that I saw when testing it with the dark website mode:

  • When using the dark view mode of the website, the HTML preview background is also dark, which makes it difficult to read.
  • In the dark theme, the AWS full path at the top is difficult to read because of the light colour.

It might be nice to have the possibility of downloading the whole results folder in addition to individual files, if that is possible.

Alright, now just the example outputs need to be in the bucket! I'll try to go over the most recent releases of the pipelines and execute the runs in AWS with the release commit, so that at least the "small" test results are there as a showcase.

Member

@ewels ewels left a comment


Just had another look at this and I absolutely love it 😍 The new file icons are an absolute winner!

The one thing I find a bit unintuitive still is that clicking a file opens it in a new tab to download. I expect it to open folders, but the download surprises me every time.

What do you think about a file click opening the preview box underneath for everything? Then we could remove the "Preview file" buttons and just have the box preview if it meets criteria. If it doesn't, we can get the preview box to say "Binary file, 20kb. Click here to download" or whatever. I'm thinking kind of like how GitHub does it:

[screenshot of GitHub's file preview box]

This then requires two clicks to download the file, which I think is appropriate given that some of the files are huge and that most people won't actually want to download files most of the time.

Comment thread includes/header.php Outdated
Comment thread includes/header.php Outdated
Comment thread public_html/assets/css/nf-core-dark.css Outdated
Comment thread public_html/assets/js/aws-s3-explorer.js Outdated
Co-authored-by: Phil Ewels <phil.ewels@scilifelab.se>
@mashehu
Contributor Author

mashehu commented Oct 28, 2020

After testing it, I am not so sure about the "Open in new tab" button, because clicking it directly starts a download for HTML and image files (on Firefox) and only opens text files in a new tab.
And since we now have a "download file" button in every file row, maybe we should remove it from there and rename the "Open in new tab" button to a "download file" button.

I couldn't figure out the URL to download a whole directory, so I just added a copy button for the s3 url of it (the button label "Copy Bucket S3 URL" needs some rephrasing I think 🙂)

…-aws-results-page

# Conflicts:
#	includes/pipeline_page/_index.php
@ewels ewels changed the title from "first proof-of-concept version of example outputs page" to "AWS megatests outputs page" Oct 28, 2020
@ewels
Member

ewels commented Oct 28, 2020

Ok nice - yes I agree, switch the text for the new tab button for a download button (as that's what it does most of the time) and remove the download button from the table 👍🏻

Can we also add some additional checks to prevent previewing the file in the browser in some cases? eg. very large files and certain file extensions (.xml). On the mhcquant page I nearly crashed my browser by clicking on a 6MB XML file that was loaded into the DOM (didn't show anything except a very very long empty <pre> element with lots of invisible content).

Checked with .bam and .bai files and it showed "No preview available for binary or compressed files." very nicely 🙌🏻

Only other request that I can think of is making the fontawesome file icon part of the link. I keep clicking on them and nothing happens 😉

@ewels
Member

ewels commented Oct 28, 2020

Just poking a bit and seems to be using <iframe srcdoc=""> with inline file contents. Any reason to do this instead of just <iframe src=""> with the URL?

@mashehu
Contributor Author

mashehu commented Oct 29, 2020

I used <iframe srcdoc=""> because some files, for example the execution_*.html files inside the pipeline_info directory, don't have the correct content-type when requested. So an <iframe src=""> just shows a white page and starts a file download instead.
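The srcdoc workaround could be sketched roughly like this (a hypothetical helper for illustration, not the actual aws-s3-explorer.js code): fetch the file contents ourselves and inline them whenever the Content-Type header would otherwise trigger a download.

```javascript
// Hypothetical sketch of the srcdoc workaround (not the actual site code).
// S3 serves some HTML files with Content-Type: application/octet-stream,
// which makes <iframe src="..."> download the file instead of rendering it.

function needsSrcdoc(contentType) {
  // Anything not explicitly served as HTML gets inlined via srcdoc.
  return !/^text\/html/.test(contentType || '');
}

async function renderInIframe(iframe, url) {
  const resp = await fetch(url);
  if (needsSrcdoc(resp.headers.get('content-type'))) {
    // Inline the fetched text, sidestepping the bad header entirely.
    iframe.srcdoc = await resp.text();
  } else {
    // Header is fine; let the browser render the URL directly.
    iframe.src = url;
  }
}
```

The trade-off discussed below is that inlined pages lose their base URL, so relative links inside the report break.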

matthiasho added 2 commits October 29, 2020 14:37
fix iframe src setting, remove file download buttons, add icons into link, fix xml preview.
@ewels
Member

ewels commented Oct 29, 2020

Right, gotcha - yes and it works beautifully to get around the HTML headers. So we're stuck between a rock and a hard place - either download some HTML files or break the rendering of others.

I can see the difference between these two HTML files in their metadata on s3:

execution_report.html (triggers download)

qualimapReport.html (renders in browser)

[screenshots of the S3 object metadata for both files, showing the different Content-Type values]

@pditommaso - do you have any idea why files with the same filetype would be uploaded to s3 by Nextflow with different Content-type headers?

As the in-browser rendering is working great for all other filetypes, we could simply have a cron-job that loops over all HTML files in the s3 bucket and fixes the content-type headers. Then we can use regular iframes and have the reports load properly. What do you think @mashehu?

@ewels
Member

ewels commented Oct 29, 2020

Confirmed - if you edit the content-type header to text/html, the execution HTML reports render in the browser (assuming that your browser doesn't have a cached copy of them).

@mashehu
Contributor Author

mashehu commented Oct 29, 2020

Yes, I think the majority should load fine by using src=. So whatever solution you find for the AWS side of this is fine for me. 🙂

I was playing around with how to handle the dev version. I couldn't figure out a way to find the most recent results- folder in a bucket on AWS (it looks like only files have a date, and the directory structures are a bit too complicated/inconsistent for me to sort them based on that). So for now, dev shows all available result directories for a pipeline.

I fixed the URL based rendering, so now we can deep link into a specific AWS directory, e.g. http://localhost:8888/mhcquant/example_output#sarek/results-bce378e09de25bb26c388b917f93f84806d3ba27/Reports/1234N_1234N_M1/bamQC/1234N_1234N_M1/

Oh and the XML file rendering works now as well (the large XML files killed your browser because it was trying to render them as HTML files 🙄).
I set a file size limit of 10MB, because some MultiQC reports can be quite large and I think they are one of the more important things to have a preview for. But I am happy to lower the limit if you think this is not necessary.
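The size and extension gating described above might look something like this. The 10 MB cap matches the comment; the extension allow-list and function name are assumptions for illustration, not the site's actual code.

```javascript
// Hedged sketch of the preview checks: a size cap plus an allow-list of
// text-like extensions. Anything else falls back to the download link.
const MAX_PREVIEW_BYTES = 10 * 1024 * 1024; // 10 MB, as in the comment above
const PREVIEWABLE = ['html', 'txt', 'log', 'tsv', 'csv', 'json', 'yml', 'xml', 'md'];

function canPreview(key, sizeBytes) {
  // Take the extension from the S3 object key and check both criteria.
  const ext = key.split('.').pop().toLowerCase();
  return sizeBytes <= MAX_PREVIEW_BYTES && PREVIEWABLE.includes(ext);
}

console.log(canPreview('multiqc_report.html', 5 * 1024 * 1024)); // true
console.log(canPreview('sample.bam', 1024)); // false: binary extension
```

Checking the extension before the size keeps .bam/.bai and similar binary files out of the preview box regardless of how small they are.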

@ewels
Member

ewels commented Oct 29, 2020

The other thing that we could do to avoid messing around with AWS content headers is to tunnel HTML files through a PHP script on the server. I've done this in other cases to strip annoying Access-Control-Allow-Origin headers that prevent files being loaded into the DOM.

Basically, just have a PHP file with the following:

<?php
header("Content-type: text/html; charset=utf-8");
echo file_get_contents($_GET['url']);

Then load that in the iframe instead. Might be easier than messing around with cron jobs and AWS authentication etc.

@pditommaso
Contributor

do you have any idea why files with the same filetype would be uploaded to s3 by Nextflow with different Content-type headers?

Because NF is not aware of support for MIME provided by S3

@mashehu
Contributor Author

mashehu commented Oct 29, 2020

The other thing that we could do to avoid messing around with AWS content headers is to tunnel HTML files through a PHP script on the server. I've done this in other cases to strip annoying Access-Control-Allow-Origin headers that prevent files being loaded into the DOM.

Basically, just have a PHP file with the following:

<?php
header("Content-type: text/html; charset=utf-8");
echo file_get_contents($_GET['url']);

Then load that in the iframe instead. Might be easier than messing around with cron jobs and AWS authentication etc.

The problem with this solution is that we generate the iframe dynamically with javascript, so I am not sure how to do this then...

@ewels
Member

ewels commented Oct 29, 2020

@pditommaso - Because NF is not aware of support for MIME provided by S3

So I guess there is some kind of AWS magic somewhere that is setting these automatically? They are there and they vary across different HTML files, so something is doing it 🤔

@mashehu - The problem with this solution is that we generate the iframe dynamically with javascript, so I am not sure how to do this then...

Just set the iframe src attribute (not srcdoc) to '/s3?url=' + url so that the URL ends up being something like:

https://nf-co.re/s3?url=http%3A%2F%2Fnf-core-awsmegatests.s3-eu-west-1.amazonaws.com%2Fsarek%2Fresults-bce378e09de25bb26c388b917f93f84806d3ba27%2FReports%2F1234N_1234N_M1%2FbamQC%2F1234N_1234N_M1%2FqualimapReport.html

Edit: updated the URL to use a link encoded with the JavaScript encodeURIComponent() function. PHP then decodes this with urldecode(), and it's safe to use in a URL.

Can start being more clever with nicer URLs and url / file extension / file size validation etc, but that's the core concept. Should work fine with the iframe with dynamic content I think.
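The encoding step above can be sketched as a one-line helper (the /s3 endpoint is taken from the example URL in this comment; the helper name is hypothetical):

```javascript
// Build the proxy URL for the PHP tunnel script described above.
// encodeURIComponent() escapes ':' and '/' so the full S3 URL survives
// as a single ?url= query parameter; PHP decodes it on the other side.
function proxyUrl(fileUrl) {
  return '/s3?url=' + encodeURIComponent(fileUrl);
}

const url = 'http://nf-core-awsmegatests.s3-eu-west-1.amazonaws.com/sarek/qualimapReport.html';
// → "/s3?url=http%3A%2F%2Fnf-core-awsmegatests.s3-eu-west-1.amazonaws.com%2Fsarek%2FqualimapReport.html"
console.log(proxyUrl(url));
```

The iframe's src attribute would then be set to the return value of this helper instead of the raw S3 URL.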

@mashehu
Contributor Author

mashehu commented Oct 29, 2020

I think we've reached the limits of my PHP skills here. 🙂
Where should I add handlers in the PHP code for these special URLs? Can you maybe point me to examples of what you are thinking of?

@ewels
Member

ewels commented Oct 29, 2020

Ok I had a quick play with the redirection thing and whilst it works (headers are stripped and the files display in the browser properly), the web pages with relative links still break in the same way as they do with the iframe srcdoc. So let's just stick with that. I'm thinking that there shouldn't be so many HTML reports that do this anyway - hopefully just Qualimap. If it's a big issue then we can always come back to it.

@ewels
Member

ewels commented Oct 29, 2020

@mashehu - are you happy to revert the changes back to srcdoc then? Sorry about this. Then I think that we're good to merge 👍🏻

Member

@ewels ewels left a comment


Just a couple of comments left which I think can be deleted, but then LGTM for merge - amazing work 🙌🏻

Ongoing discussion on Slack #aws-megatests about fixing the content-type headers for all HTML files, so we may revisit this in the future if we can get that to work. But I'd rather merge this now, as it is basically working, and come back to it if anything happens.

Comment thread includes/pipeline_page/example_output.php Outdated
Comment thread includes/pipeline_page/example_output.php Outdated
Comment thread public_html/assets/css/nf-core-dark.css Outdated
@ewels ewels merged commit 99aafb4 into nf-core:master Oct 30, 2020
@ewels
Member

ewels commented Oct 30, 2020

Hah, found my first bug as soon as I merged: the "Copy Bucket S3 URL" button doesn't seem to do anything:

[screenshot of the Copy Bucket S3 URL button]

Shows a toast saying it worked, but nothing changed in my clipboard..

@ewels
Member

ewels commented Oct 30, 2020

Other copy buttons for specific files are working..

@apeltzer
Member

It also doesn't seem to show anything for e.g. atacseq, eager, ...

@ggabernet
Member

@apeltzer, unfortunately the problem of the container not being built before the release AWS test is triggered persists, and we haven't found a good solution for that yet.

However, I've manually re-triggered the workflow again for the latest release commit of the most recently released pipelines (until two months ago). The tests were triggered again for these. In the case of eager, it failed and I made a PR to hopefully fix that: nf-core/eager#600.

Proteomicslfq and epitopeprediction also fail on AWS, so the output can't be shown.

I've re-triggered chipseq with Tower, which was passing in the past; it's running at the moment, so hopefully this solves it for this pipeline as well.

@ggabernet
Member

Hi @ewels @mashehu, I'll look into triggering a Lambda function now on all the html files. What exactly in the html needs to be converted to what?

This is one example of an HTML file that is only displayed when clicking download.

Is it this line? And exactly which tag?

<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

@mashehu
Contributor Author

mashehu commented Nov 3, 2020

Some have the content-type application/octet-stream, but they should all be text/html. See #514 (comment)

@ggabernet
Member

Hi, sorry I still don't get exactly the issue. For me, the execution_report.html of your example, which I guess would be this one, renders fine in the browser, even though the metadata is Content-type: application/octet-stream.

@ggabernet
Member

Ah gotcha, it triggers a download in a new window

@ggabernet
Member

ggabernet commented Nov 3, 2020

Unfortunately after reading about it, it does not seem to be so straight-forward:

Each Amazon S3 object has data, a key, and metadata. The object key (or key name) uniquely identifies the object in a bucket. Object metadata is a set of name-value pairs. You can set object metadata at the time you upload it. After you upload the object, you cannot modify object metadata. The only way to modify object metadata is to make a copy of the object and set the metadata.
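In practice, the "copy the object and set the metadata" step can be done in place with the AWS CLI, roughly like this (a sketch; the object path is a placeholder and the bucket name is taken from the URLs in this thread):

```shell
# Copy the object onto itself, replacing its metadata with a corrected
# Content-Type. --metadata-directive REPLACE is required; without it,
# S3 keeps the original metadata and the copy changes nothing.
aws s3 cp \
  s3://nf-core-awsmegatests/path/to/execution_report.html \
  s3://nf-core-awsmegatests/path/to/execution_report.html \
  --content-type "text/html" \
  --metadata-directive REPLACE
```

A Lambda function or cron job would just loop this over every *.html key in the bucket.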


Development

Successfully merging this pull request may close these issues.

Show #awsmegatests results on pipeline page

5 participants