COVID-19: How Data Science helps in battle against Coronavirus

COVID-19, the disease caused by the RNA virus SARS-CoV-2, most likely originated in December 2019 in Wuhan, China. It has since spread across the world, causing a global pandemic and forcing many countries to impose massive lockdowns.

Self-quarantine at home, hand sanitizers, face masks, and Zoom video conferences have become familiar to everyone. Frontline workers such as doctors, nurses, and paramedics have bravely been fighting the rampant virus. Meanwhile, numerous professionals such as data scientists, biotech researchers, and computational biologists have been working tirelessly behind the scenes to understand the nature of the virus and devise ways to combat it. Data science and AI have proved, yet again, to be indispensable in these testing times. Tracking and reporting affected cases, curbing the spread of the virus, uncovering its underlying mechanisms, and speeding up the search for treatments are a few of the many ways in which data science and artificial intelligence are helping in the battle against COVID-19.


One of the most basic ways in which data science has aided in these challenging circumstances is by tracking and reporting the spread of the novel coronavirus, including up-to-date statistics on the number of cases, deaths, and recoveries. This has been made possible by connecting interactive local databases with national data repositories, which provide real-time data. Healthcare workers use this data to draw an accurate picture of patient numbers and death rates, and the general public uses it to stay informed of the risk in their respective localities. Furthermore, renowned institutions such as the World Health Organization (WHO) have also contributed data for analysis.
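As a rough illustration of how local figures might roll up into a national dashboard, the sketch below sums per-region reports and derives a case fatality rate. The `LocalReport` structure, the region names, and the numbers are all hypothetical; real repositories have their own schemas and handle corrections, delays, and duplicates.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LocalReport:
    """One region's cumulative counts (hypothetical schema)."""
    region: str
    cases: int
    deaths: int
    recoveries: int


def aggregate(reports):
    """Sum local reports into national totals and derive active cases."""
    totals = defaultdict(int)
    for r in reports:
        totals["cases"] += r.cases
        totals["deaths"] += r.deaths
        totals["recoveries"] += r.recoveries
    # Active cases are those neither recovered nor deceased.
    totals["active"] = totals["cases"] - totals["deaths"] - totals["recoveries"]
    return dict(totals)


def case_fatality_rate(totals):
    """Deaths divided by confirmed cases; 0.0 when no cases are recorded."""
    return totals["deaths"] / totals["cases"] if totals["cases"] else 0.0


# Made-up sample data for two regions.
reports = [
    LocalReport("Region A", cases=1000, deaths=20, recoveries=600),
    LocalReport("Region B", cases=500, deaths=5, recoveries=300),
]
national = aggregate(reports)
cfr = case_fatality_rate(national)
print(national, f"CFR = {cfr:.2%}")
```

Note that a case fatality rate computed this way reflects confirmed cases only, so it shifts as testing coverage changes.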

COVID-19 Case Tracking Dashboard, Government of Pakistan

Moreover, the flow of data has been a two-way street: some platforms also allow people to feed data into the system. COVID Symptom Tracker is one such data-sharing app, developed by the health science company Zoe. The app lets users enter their signs and symptoms, which brings a double benefit: users can check whether they may have COVID-19, and the data they enter helps researchers better understand the virus. A major issue with this disease is the similarity of its symptoms to those of other conditions such as hay fever, pneumonia, and influenza. This creates confusion amongst possible carriers, who are unsure whether to get tested for COVID-19. It is also a source of uncertainty for researchers and biologists who are working to differentiate it from other diseases in order to tackle it better. Data-sharing platforms have therefore become instrumental in detecting the subtle differences between the symptoms of these diseases.


A pandemic caused by an infectious agent such as SARS-CoV-2 necessitates contact tracing. Contact tracing means tracking down individuals who have been in recent contact with a person who has tested positive for the virus. These individuals are then contacted, usually via text, and tested, isolated, and treated accordingly. In a world without data science, it would have been almost impossible to reach every patient's recent contacts in person. Additionally, the magnitude of COVID-19 and its high rate of transmission have revealed inefficiencies in pre-existing contact-tracing apps. This highlights that the link between data science and biology needs to be stronger in order to deal with novel health threats, and that applying AI to the biomedical sciences could change how cures are found.

Various approaches to digital contact tracing have now been developed in an effort to curb the spread of the virus. One of these is Decentralized Privacy-Preserving Proximity Tracing (DP-3T). In this method, cell phones exchange short-lived codes over Bluetooth to build a web of contacts, allowing people to be notified if they are at potential risk. Governments can also make use of digitally generated graphs, which may contain surveillance data on the places a person has visited and the number of people encountered. One such surveillance database was created in China; it helps track people's movements through flight numbers and license plate numbers. Such a database proved to be a viable tool for tracing the spread of the virus and subsequently mitigating risk.
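The decentralized idea can be sketched in a few lines. This is a simplified illustration only, not the DP-3T specification: the `Phone` class, the HMAC-based code derivation, and the 15-minute epoch length are assumptions made for the example, and the real protocol's key schedule and cryptographic details differ.

```python
import hashlib
import hmac
import secrets
from typing import Iterable, List

EPOCHS_PER_DAY = 96  # assumed: one short-lived code per 15-minute window


def ephemeral_ids(daily_key: bytes) -> List[bytes]:
    """Derive the day's short-lived broadcast codes from a secret daily key."""
    return [
        hmac.new(daily_key, epoch.to_bytes(2, "big"), hashlib.sha256).digest()[:16]
        for epoch in range(EPOCHS_PER_DAY)
    ]


class Phone:
    """Hypothetical handset taking part in proximity tracing."""

    def __init__(self) -> None:
        self.daily_key = secrets.token_bytes(32)  # never leaves the phone unless the owner tests positive
        self.heard = set()                        # codes received from nearby phones

    def broadcast(self, epoch: int) -> bytes:
        """Code this phone would advertise over Bluetooth in the given epoch."""
        return ephemeral_ids(self.daily_key)[epoch]

    def listen(self, code: bytes) -> None:
        """Store a code heard from a nearby phone; it reveals no identity."""
        self.heard.add(code)

    def exposed_to(self, published_keys: Iterable[bytes]) -> bool:
        """Locally re-derive codes from published keys and look for a match."""
        return any(self.heard & set(ephemeral_ids(k)) for k in published_keys)


alice, bob, carol = Phone(), Phone(), Phone()
bob.listen(alice.broadcast(10))   # Bob was near Alice during epoch 10
published = [alice.daily_key]     # Alice tests positive and shares her key

print(bob.exposed_to(published))    # True
print(carol.exposed_to(published))  # False
```

The matching happens on each phone, so no central server learns who met whom; only the keys of people who test positive are shared.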


Collaboration between Data Scientists and Clinical Researchers is succor for the other

Perhaps the most groundbreaking partnership to arise from the pandemic is the one between data scientists and clinical researchers. The two professions are working hand in hand, and each one's contribution supports the other. To elaborate, coronaviruses are not a newly emerged family of viruses; rather, SARS-CoV-2 is a novel member of that family. This means that a rich body of literature on coronaviruses already exists and can aid in finding a cure. It is, however, painstaking and tedious for researchers to go through all the available publications and pick out the critical factors that could help. It is also time-consuming, and the rate at which this virus spreads does not allow for the years of research usually required to create and perfect a vaccine. Big Data has played a crucial role in overcoming this problem. Data mining and Natural Language Processing (NLP) allow researchers to search the available resources for key terms and then work only with the relevant articles, which speeds up the process many times over. This is no small feat considering that the literature related to COVID-19 is one of the largest collections in science, with approximately 4000 new publications submitted each week.
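A minimal sketch of the key-term filtering idea follows. It simply counts how often the chosen terms occur in each abstract and keeps the articles that mention them; production NLP pipelines are far more sophisticated (stemming, synonyms, embeddings). The function names, article titles, and abstracts are invented for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping hyphenated terms whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def relevance(abstract, key_terms):
    """Count occurrences of the (lowercase) key terms in an abstract."""
    tokens = tokenize(abstract)
    return sum(tokens.count(term) for term in key_terms)


def filter_articles(articles, key_terms, min_score=1):
    """Return titles of articles scoring at least min_score, best match first."""
    scored = [(relevance(a["abstract"], key_terms), a["title"]) for a in articles]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [title for score, title in ranked if score >= min_score]


# Invented sample records; a real corpus would be loaded from a dataset.
articles = [
    {"title": "Spike protein binding",
     "abstract": "The spike protein binds the ACE2 receptor; spike mutations alter binding."},
    {"title": "Lockdown economics",
     "abstract": "Lockdowns reduced mobility and retail spending."},
    {"title": "Receptor overview",
     "abstract": "ACE2 is expressed in lung tissue."},
]

relevant = filter_articles(articles, key_terms=["spike", "ace2"])
print(relevant)  # ['Spike protein binding', 'Receptor overview']
```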

An example is the database created by the White House in coalition with the Allen Institute for AI, in partnership with the Chan Zuckerberg Initiative, Georgetown University's Center for Security and Emerging Technology, Microsoft Research, IBM, and the National Library of Medicine. It is known as the COVID-19 Open Research Dataset (CORD-19) and is one of the largest existing databases related to the coronavirus. Although useful, it is hard to navigate because of the overwhelming amount of data it contains. Jevin West is a data scientist who has provided a solution by creating a user-focused search tool, SciSight, which employs data mining to let researchers quickly scan and list the articles relevant to their search terms. It also displays connections between papers as browsable maps. This means that a researcher studying, say, the molecular interactions of the virus will not first have to go through thousands of articles focusing on other aspects. They can simply filter out all articles except those pertaining to the molecular synthesis and interactions of the virus, reducing trial time and boosting the quality of research.
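To give a feel for what a map of connected papers could look like as data, the sketch below links any two papers that share a topic tag. This is not how SciSight itself works (the article does not describe its internals); the function name, paper titles, and tags are hypothetical.

```python
from itertools import combinations


def build_paper_map(papers):
    """Link every pair of papers that share at least one topic tag.

    papers: dict mapping a title to a set of tags.
    Returns an adjacency dict mapping each title to the set of connected titles.
    """
    graph = {title: set() for title in papers}
    for a, b in combinations(papers, 2):
        if papers[a] & papers[b]:  # non-empty intersection means a shared tag
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Invented sample data.
papers = {
    "Paper 1": {"spike", "ace2"},
    "Paper 2": {"ace2", "lung"},
    "Paper 3": {"vaccine"},
}

paper_map = build_paper_map(papers)
print(paper_map["Paper 1"])  # {'Paper 2'}
print(paper_map["Paper 3"])  # set()
```

A browsing interface could then start from one paper and walk outwards along these links instead of scanning the whole collection.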


Drug trials are time-intensive; however, using Big Data, the process can be considerably shortened

Data science has been at the forefront of vaccine development, not only through Big Data but also by allowing scientists to digitally visualize the structure and molecular basis of the virus. Under usual circumstances, a virus is studied for many years and physically isolated in order to understand its composition. Various proteins and chemicals suspected to be potential antagonists are then allowed to interact with it, and the resulting reactions are recorded, both short-term and long-term. For instance, medicines containing the active ingredient hydroxychloroquine, used to treat malaria, were at first thought to be a possible treatment for COVID-19. However, drug trials soon indicated that it may be ineffective.

These trials are time-intensive, but in partnership with data scientists, algorithms have been developed that can study the interactions between various drugs and the virus. Many data science professionals, including graduates from Columbia University, have applied machine learning to the process of uncovering antibodies. Machine learning can quickly analyze molecular associations to identify antibodies predicted to have high success rates, potentially reducing years of wet lab work to a mere week.


The world has been taken by storm and thrown into chaos by a microscopic entity. Perhaps the only silver lining in this cloud is that it has exposed outdated medical practices and shown how they should be re-engineered. The part played by data science in the current situation will leave its mark on the future. A rise in AI-enabled drug development and computational biology is expected well beyond COVID-19. An interdisciplinary approach to medicine, bringing together data science and biology, is a requirement of present-day society. Telemedicine and contactless interactions are also expected to persist. The importance of Big Data and predictive analytics has been thrown into the spotlight. It has been shown, yet again, that artificial intelligence is not merely applicable to robotics; rather, it is a powerful tool at our disposal, capable of transforming the world. Its clinical applications, developed in dealing with COVID-19, will remain long after the virus has left us.

We hope you found this article insightful. If you would like to discuss ideas, opportunities, and/or corporate trainings in data & analytics, please contact us:

· Konain Qurban, CEO, Vancouver/Karachi:

· Behjat Qurban, Managing Director, London:

· Sana J. Khan, Public Relations Director, Karachi:

We aim to bring highly buildable solutions to you, and to help you execute your strategies.

