Science vs. COVID-19

Posted by Sean Gourley

Science is one of humanity’s most important weapons in the fight against COVID-19. It is through science that we will understand how COVID-19 spreads through a population, how it interacts with the cells in our lungs, and how it jumped from animals to humans. It is also through science that we will create vaccines and treatments to stop this pandemic. Since the virus was first observed in China in December 2019, over 8,000 authors have written more than 2,500 articles in PubMed, arXiv, medRxiv, and bioRxiv. In total, we estimate that this research represents hundreds of thousands of hours of work from some of the brightest minds in the world.

This body of research is remarkable, but the sheer volume of information makes it impossible for anyone to read it all, let alone for researchers to track progress day after day.

Ten days ago, I was brainstorming with our team at Primer about what we could do to support this scientific effort, and we immediately saw an opportunity to take our internal tool for analyzing arXiv machine learning papers and apply it to scientific research on COVID-19. We spun up a small team that has been working around the clock to build the public resource we are releasing today:

Covid-19 Primer

This one-stop website tracks all the latest COVID-19 scientific research, as well as news coverage and social media discussions of this research. We created it as a resource to help researchers, scientists, policymakers, and journalists get clarity on the science as it unfolds in real time. The site automatically updates every 24 hours at 8am GMT with all of the newly published research included in the analysis, generating a one-page daily briefing. It also generates a weekly briefing email that anyone can sign up for.

The website ingests all of the scientific papers about COVID-19 and analyzes them overnight. Our software classifies the papers into the top-level COVID-19 research categories prioritized by the White House call to action to the AI community. It extracts the key authors and their affiliations (so you can look for research done at Johns Hopkins University, at Chinese or British institutions, etc.), and it extracts and defines the thousands of jargon terms that appear in the literature. This gives everyone a quick look-up guide to see, for example, that ARBs stands for angiotensin receptor blockers, along with all the associated papers that discuss them.
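To make the glossary idea concrete, here is a minimal Python sketch of one common way to pull acronym definitions out of text. It is an illustration only, not Primer's implementation, which this post does not describe. It assumes that a definition is written as "long form (ACRONYM)" with one word per letter. The function name `extract_glossary`, the `papers` mapping, and the sample abstracts are all hypothetical.

```python
import re

# Assumption: definitions look like "long form (ACRONYM)", one word per letter,
# optionally with a trailing plural "s" inside the parentheses, e.g. "(ARBs)".
ACRONYM_IN_PARENS = re.compile(r"\(([A-Z]{2,10})(s?)\)")


def extract_glossary(papers):
    """Return {acronym: {"expansion": str, "papers": [paper ids]}}.

    `papers` is a hypothetical mapping of paper id -> abstract text.
    """
    glossary = {}
    for paper_id, text in papers.items():
        for match in ACRONYM_IN_PARENS.finditer(text):
            letters, plural = match.group(1), match.group(2)
            # Take as many preceding words as the acronym has letters.
            preceding = text[:match.start()].split()
            candidate = preceding[-len(letters):]
            if len(candidate) != len(letters):
                continue
            # Accept the candidate only if its word initials spell the acronym.
            if any(w[0].upper() != c for w, c in zip(candidate, letters)):
                continue
            # Keep the first expansion seen and record every defining paper.
            entry = glossary.setdefault(
                letters + plural,
                {"expansion": " ".join(candidate), "papers": []},
            )
            if paper_id not in entry["papers"]:
                entry["papers"].append(paper_id)
    return glossary


# Made-up sample abstracts, for illustration only.
papers = {
    "paper-1": "Patients taking angiotensin receptor blockers (ARBs) were followed.",
    "paper-2": "We review angiotensin receptor blockers (ARBs) and acute "
               "respiratory distress syndrome (ARDS).",
}
glossary = extract_glossary(papers)
print(glossary["ARBs"]["expansion"])  # angiotensin receptor blockers
print(glossary["ARBs"]["papers"])     # ['paper-1', 'paper-2']
```

A heuristic like this misses acronyms that contain digits or skip small words, so a production system would likely need something more robust.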

In addition to the top-level research categories, we have implemented a bottom-up method for finding emerging topics in the research and for seeing which researchers are driving those topics forward. An increasing share of this research is in preprint form, so we have included both mainstream news media and social media content that links to these scientific papers in order to surface the online discussions about the latest research. If you don’t have time to engage directly with the scientific papers, you can go to the newsfeed and read all the stories that talk about the new COVID-19 research.

The website is currently tracking all COVID-19 research going back to December, when the first paper about the novel coronavirus emerged in Wuhan, China, including:
• Over 8,000 scientific authors, along with all of their quotes that have been published in news articles
• Over 2,000 COVID-19 research papers in PubMed, arXiv, bioRxiv, and medRxiv, with hundreds of new papers published daily
• An auto-organized view of research progress on the top-level COVID-19 research categories prioritized by the White House call to action to the AI community
• Emerging research topics, for example Cell Epitopes & Peptide-HLA, that are automatically detected and extracted
• Over 200,000 tweets in which COVID-19 research is shared and discussed, which, like the research papers, are growing exponentially
• An auto-generated glossary of over 1,000 technical terms, growing daily

Here’s how you can help:

Share this website with your friends. The more people we have engaging with the scientific research around COVID-19, the better decisions the scientific community will make. There is a lot of noise out there about COVID-19. Science gives us a much more grounded place to start.

Help us tag the data. The next step we’re undertaking is a large-scale tagging project to train machine learning models to identify key findings from the data. Sign up if you have experience with any of these research areas and want to help us tag data.

Help create Wikipedia pages. Many of these researchers don’t have Wikipedia pages. If you see a prominent researcher here without a page, it would be great if you could help create a page for them.

Please email us with feedback. If you see any mistakes in the classifiers or language generation, please email us. We will be bringing in UI elements to let users provide feedback directly from the product. Also, if there’s a news site or blog that is providing good coverage of this research and we don’t have it, please send it to us and we can update our data sources.