
Text Mining & Analysis

Juan M. Banda edited this page Apr 7, 2020 · 18 revisions

This page is dedicated to the 'Text mining & Analysis' division of the CoVid 2019-Biohackathon 2020.

Participants will carry out rich analyses, build explanatory visualizations and dashboards, and curate and maintain datasets for future scientific research projects.

Research hypotheses & mini-publications are invited in the realms of:

Transmission, incubation, and environmental stability of SARS-CoV-2

Risk factors for CoVid 2019

Genetics, origin, and evolution of SARS-CoV-2

Therapeutics & Vaccines against SARS-CoV-2

Health systems' capacity to deal with the CoVid 2019 pandemic

Non-pharmaceutical interventions against the CoVid 2019 pandemic

Diagnostics & Surveillance against the CoVid 2019 pandemic

Information sharing and Inter-sectoral collaboration against the CoVid 2019 pandemic

Ethical and social science considerations related to dealing with & combating the CoVid 2019 pandemic

Resources

Paul Mooney's elaborative schema, which breaks the 'tasks' of the COVID-19 Open Research Dataset Challenge (CORD-19): An AI challenge with AI2, CZI, MSR, Georgetown, NIH & The White House down to their minute but significant details, will guide you further.

Twitter data analysis using the dataset at https://zenodo.org/record/3735274.

Potential tasks:

  • Identification of symptoms reported by Twitter users - Quantify how many users claim to have symptoms.
  • Characterization of the information/misinformation around potential COVID-19 treatments using Twitter data.
  • Identification of persons who may have recovered - We only know the number of people who recover in hospitals; what about those outside of them? Are people talking about this on Twitter?
  • Sentiment analysis towards particular regulations such as social distancing measures - How are these measures perceived over time in the Twitter space?
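
As a starting point for the first task, the sketch below counts how many distinct users mention each symptom. It is a minimal illustration, not a validated method: it assumes the tweets are already available as (user_id, text) pairs, and the symptom lexicon and the function names are hypothetical and chosen only for this example.

```python
import re
from collections import defaultdict

# Hypothetical symptom lexicon (an assumption for illustration);
# a real study would use a curated, validated vocabulary.
SYMPTOM_TERMS = {
    "fever": ["fever", "high temperature"],
    "cough": ["cough", "coughing"],
    "loss_of_smell": ["loss of smell", "can't smell", "cannot smell"],
}


def compile_patterns(lexicon):
    """Build one case-insensitive, word-bounded regex per symptom."""
    patterns = {}
    for symptom, terms in lexicon.items():
        alternation = "|".join(re.escape(term) for term in terms)
        patterns[symptom] = re.compile(
            r"\b(?:" + alternation + r")\b", re.IGNORECASE
        )
    return patterns


def users_per_symptom(tweets, lexicon=SYMPTOM_TERMS):
    """Return {symptom: number of distinct users mentioning it}.

    `tweets` is an iterable of (user_id, text) pairs.
    """
    patterns = compile_patterns(lexicon)
    users = defaultdict(set)
    for user_id, text in tweets:
        for symptom, pattern in patterns.items():
            if pattern.search(text):
                # A set keeps each user once, however often they tweet.
                users[symptom].add(user_id)
    return {symptom: len(users[symptom]) for symptom in lexicon}


# Made-up sample tweets, for demonstration only.
sample = [
    ("u1", "Day 3 of fever and a dry cough"),
    ("u1", "Still coughing today"),
    ("u2", "I can't smell my coffee"),
    ("u3", "Stay home, everyone"),
]
print(users_per_symptom(sample))
```

Keyword matching only detects mentions. Deciding whether a user is actually claiming to have a symptom (rather than sharing news or joking) would need manual annotation or a trained classifier on top of this filter.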

Collaborative Covid-19 literature annotation @ PubAnnotation

Since the LitCovid (by NCBI) and CORD-19 (by Allen Institute for AI) datasets were released, many groups have been producing and releasing annotations to these datasets. We have set up an environment to collect and integrate those annotation datasets at PubAnnotation, a public repository of literature annotation, and are organizing collaborative annotation of the Covid-19 literature datasets. Production and collection of various annotation datasets is ongoing, and we aim to release a meaningful amount of rich annotations by the end of the hackathon. Contribution of annotation datasets is completely open, and all contributed annotation datasets are immediately integrated and made accessible in various ways, including search, visualization, and fine-grained access.
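
For contributors preparing annotations programmatically, the sketch below builds a single annotation document as JSON. It assumes the commonly documented PubAnnotation layout (a `text` field plus `denotations` with `begin`/`end` character spans and an `obj` label); please verify the field names against the current PubAnnotation format documentation before submitting. The function name, the entity labels, and the `sourceid` value are hypothetical placeholders.

```python
import json


def make_annotation(text, entities, sourcedb, sourceid):
    """Build an annotation document from (surface form, label) pairs.

    Only the first occurrence of each surface form is annotated;
    surface forms not found in the text are skipped.
    """
    denotations = []
    for surface, label in entities:
        begin = text.find(surface)
        if begin == -1:
            continue
        denotations.append(
            {
                "id": "T{}".format(len(denotations) + 1),
                "span": {"begin": begin, "end": begin + len(surface)},
                "obj": label,
            }
        )
    return {
        "sourcedb": sourcedb,
        "sourceid": sourceid,
        "text": text,
        "denotations": denotations,
    }


# Made-up sentence and placeholder identifiers, for demonstration only.
doc = make_annotation(
    "SARS-CoV-2 binds ACE2.",
    [("SARS-CoV-2", "Virus"), ("ACE2", "Protein")],
    sourcedb="PubMed",
    sourceid="00000000",
)
print(json.dumps(doc, indent=2))
```

Spans are character offsets into `text`, with `end` exclusive, so `text[begin:end]` recovers the annotated string.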

Communication

Join the Slack workspace & head over to the 'text-mining-and-analysis' channel. It shall be fun.

Participants

Coordinator Ali Haider Bangash

Coordinator Yagoub A I Adam
