Andrea Guarracino edited this page Apr 3, 2020 · 51 revisions

Investigating Multi-Layer Perceptrons, Convolutional Neural Networks, Regression Models, and Ensemble methods for prediction of disease progression, impact of geographical distribution, etc.

(Side Note: There seems to be some overlap between this Task and the BioStatistics Task. It may be worth considering merging these two.)

Communication

For the time being, there is a #machinelearning channel on the Slack group (check out the virtual-biohackathon@googlegroups.com group for the invitation link). During the BioHackathon, we'll update this section.

Resources

Please check out the Datasets and Tools page.

If you have any new resources in mind, please add them there directly.

Ideas for Projects

Machine learning often requires substantial computing resources, in many cases GPUs. Kubeflow, a portable, cloud-native platform for workflows, is designed with machine learning in mind. Containerised workloads can easily be ported onto it.

  • Apply Markov Clustering (MCL) to the SARS-CoV-2 sequences currently available in GenBank in order to identify potential groupings beyond the traditional phylogenetic ones. Apply it at both the nucleotide (NT) and the amino acid (AA) level, based on a number of distance metrics (e.g. e-value or string distance).

  • Diagnose COVID-19 based on image data from CT scans and X-rays, using neural net models for image classification.
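
As a starting point for the MCL idea above, here is a minimal pure-Python sketch of the core MCL loop (expansion followed by inflation on a column-stochastic matrix). It is only an illustration, not the reference `mcl` program, which works on sparse matrices with pruning. The function names, the `inflation` default, and the dense list-of-lists input are assumptions made for this example. It also assumes the caller has already turned pairwise distances (e-value, string distance, ...) into non-negative similarities.

```python
def normalize_columns(m):
    """Rescale every column so it sums to 1 (all-zero columns are left as zeros)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Dense square matrix product; fine for toy inputs, too slow for real data."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(similarity, inflation=2.0, max_iter=100, tol=1e-8):
    """Cluster items from a symmetric, non-negative similarity matrix.

    Hypothetical helper for illustration: returns a sorted list of clusters,
    each cluster being a sorted list of item indices.
    """
    n = len(similarity)
    m = [[float(x) for x in row] for row in similarity]
    # Add self-loops where missing (assumed weight 1.0).
    for i in range(n):
        if m[i][i] == 0.0:
            m[i][i] = 1.0
    m = normalize_columns(m)

    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: flow spreads along 2-step walks
        inflated = normalize_columns(
            [[x ** inflation for x in row] for row in expanded]
        )  # inflation: strong flows are boosted, weak flows fade
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break

    # Each row that still carries flow defines a cluster: its non-zero columns.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    # Toy similarity matrix (made-up values): two groups of three items
    # with no similarity between the groups.
    toy = [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ]
    print(mcl(toy))
```

For real sequence sets, the same loop would need a sparse matrix representation, and the inflation value would have to be tuned, since higher inflation generally yields finer-grained clusters.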

Participants
