Fotis E. Psomopoulos edited this page Apr 5, 2020 · 51 revisions

Investigating feature identification approaches for AA sequences, K-mers, in-silico prediction of epitopes, and study of secondary structures.

Project Home

GitHub Repo

Communication

For the time being, there is a #machinelearning channel on the Slack group (check out the virtual-biohackathon@googlegroups.com group for the invitation link). During the BioHackathon, we'll update this section.

We've set up a dedicated GitHub organization here. For a detailed list of all tasks, code, and resources, please go there. This page will be updated with the main points only.

Coordination calls

  • 1st e-meeting: Sunday, April 5th @ 17:00 CEST, using Zoom.
  • 2nd e-meeting: TBD

Participants

Resources

Please check out the Datasets and Tools page.

If you have any new resources in mind, please add them there directly.

Ideas for Projects

Kept here for reference only (legacy); refer to the covid19-bh-machine-learning GitHub repo for details.

Investigating Multi-Layer Perceptrons, Convolutional Neural Networks, Regression Models, and Ensemble methods for prediction of disease progression, impact of geographical distribution, etc.

(Side Note: There seems to be some overlap between this Task and the BioStatistics Task. It may be worth considering merging these two.)

  • Machine learning requires substantial computing resources, in many cases GPUs. Kubeflow is a portable, cloud-native platform for workflows that is designed for machine learning, and containerised workloads can easily be ported onto it.

  • Apply Markov Clustering (MCL) to the currently available SARS-CoV-2 GenBank sequences in order to identify potential groupings beyond the traditional phylogenetic ones. Apply it at both the NT and the AA level, based on a number of distance metrics (e.g. e-value, string distance, etc.).

  • Diagnose COVID-19 based on image data from CT scans and X-rays, using neural net models for image classification.
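
For the MCL idea above, here is a minimal sketch of the expansion/inflation loop in Python with NumPy, to show the shape of the computation. It is a simplified dense implementation for illustration only, not the reference `mcl` tool. The function name `mcl`, its parameters and defaults, and the toy similarity matrix are all assumptions; in practice the matrix would be derived from e-values or string distances between sequences.

```python
import numpy as np

def mcl(similarity, inflation=2.0, expansion=2, max_iter=100, tol=1e-6):
    """Simplified Markov Clustering on a dense, non-negative similarity matrix.

    Hypothetical helper for illustration; parameter defaults are assumptions.
    """
    m = np.asarray(similarity, dtype=float).copy()
    np.fill_diagonal(m, 1.0)                 # add self-loops
    m /= m.sum(axis=0, keepdims=True)        # make columns sum to 1
    for _ in range(max_iter):
        prev = m
        m = np.linalg.matrix_power(m, expansion)   # expansion: random-walk step
        m = m ** inflation                         # inflation: element-wise power
        m /= m.sum(axis=0, keepdims=True)          # re-normalise columns
        if np.abs(m - prev).max() < tol:
            break
    # Rows that keep non-zero mass act as attractors; the columns they
    # cover are the cluster members. Duplicate rows are collapsed.
    clusters = set()
    for row in m:
        members = tuple(int(i) for i in np.flatnonzero(row > 1e-8))
        if members:
            clusters.add(members)
    return sorted(clusters)

# Toy similarity matrix (made-up values): items 0-2 resemble each other,
# items 3-4 resemble each other, and the two groups share no similarity.
sim = np.array([
    [1.0, 0.9, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.9, 0.0, 0.0],
    [0.9, 0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.9],
    [0.0, 0.0, 0.0, 0.9, 1.0],
])
clusters = mcl(sim)
print(clusters)
```

A higher `inflation` value generally yields finer-grained clusters, so it would be worth trying several values for both the NT and the AA distance matrices. For the full GenBank set, a sparse implementation or the original `mcl` program is likely a better fit than this dense sketch.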
