Simulating COVID-19 propagation at large scale

EpiGraph is an agent-based parallel simulator that performs realistic stochastic simulations of the propagation of the COVID-19 virus across wide geographic expanses. EpiGraph was originally designed in the Computer Architecture research group of University Carlos III and was later developed further in collaboration with the Barcelona Supercomputing Center. EpiGraph’s team is currently supporting the Spanish Ministry of Health by evaluating different vaccination scenarios.

The current implementation of EpiGraph models the population as a realistic interconnection network based on actual individual interactions extracted from social networks and demographic data. This network includes the characteristics of each individual, their relationships at work, school, home and during leisure time, and a transportation model which simulates the spatial dynamics of the virus’s propagation over large-scale areas. EpiGraph also includes a model of the interaction between COVID-19 spread and climate and meteorological factors, such as temperature, atmospheric pressure and humidity levels.




EpiGraph has been recently upgraded with new features that include: 


    • New social collectives including different professions (health, education, catering, etc.) and different elderly collectives (residing in nursing homes, living at home, etc.). Each collective has a particular social interaction pattern.
    • Social contact patterns based on contact matrices: the number and distribution of an individual’s contacts are age- and profession-dependent.
    • Infectious agents: influenza or COVID-19, including multiple COVID-19 variants (Wuhan, British strain, etc.).
    • Non-pharmaceutical interventions: different classes of face mask, worn by the whole population or by specific collectives, and different social-distancing measures including school and work closures, lockdowns and travel restrictions.
    • COVID-19 vaccination: EpiGraph currently models and simulates the Pfizer-BioNTech, Moderna, AstraZeneca and Janssen vaccines.
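
As an illustration of the contact-matrix idea in the list above, here is a minimal Python sketch. It is not EpiGraph's code: the three age groups, the matrix values and the function name sample_contact_groups are hypothetical and chosen only to show how a row of a contact matrix can drive the choice of whom an individual meets.

```python
import random

# Hypothetical age groups and illustrative (not empirical) values.
AGE_GROUPS = ["child", "adult", "elderly"]

# CONTACT_MATRIX[i][j]: mean daily contacts that a person in group i
# has with people in group j (assumed numbers, for illustration only).
CONTACT_MATRIX = [
    [8.0, 4.0, 1.0],
    [4.0, 6.0, 2.0],
    [1.0, 2.0, 3.0],
]


def sample_contact_groups(age_group, n_contacts, rng):
    """Draw the age groups of n_contacts contacts for one individual.

    Each contact's group is chosen with probability proportional to the
    individual's row of the contact matrix.
    """
    row = CONTACT_MATRIX[AGE_GROUPS.index(age_group)]
    return rng.choices(AGE_GROUPS, weights=row, k=n_contacts)


rng = random.Random(42)  # fixed seed so the draw is reproducible
contacts = sample_contact_groups("child", 10, rng)
```

A profession-dependent pattern could be sketched the same way by adding a second matrix indexed by profession; how EpiGraph combines the two is not described here.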


Research areas

  • Analysis of the efficacy of different vaccination strategies.
  • Evaluation of the impact of the propagation of new COVID-19 strains, taking into account different transmission rates and vaccine efficacies for each variant.
  • Study of the efficiency of enforcement policies for slowing the spread of the epidemic.
  • Analysis of the impact of climate conditions on the epidemic outcome.
  • Assessment of different COVID-19 testing protocols.

Data management

The figure below shows the different data sources involved in a simulation. EpiGraph consists of two main components: the scenario generator (upper part of the figure), which creates the scenarios, and the simulator (lower part of the figure), which simulates COVID-19 propagation on them. The first input data source for scenario generation is the city geolocation data provided by web applications, which is used to identify the geographic coordinates of each city and its related NUTS code, as well as the distances between each pair of cities. The second data source is Eurostat, together with its Spanish equivalent INE, which provide the demographic data used by the simulator. This information includes, among other things, the population pyramid and the distribution of employment for each city. Two different social-network graphs and contact matrices are used to generate the contact patterns of each individual.
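
The distance step of scenario generation can be sketched as follows. This is an assumption-laden illustration, not EpiGraph's implementation: the text does not say how distances are computed, so the sketch uses the great-circle (haversine) distance, and the city coordinates are approximate values entered by hand.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates (assumed for illustration).
CITIES = {
    "Madrid": (40.4168, -3.7038),
    "Barcelona": (41.3874, 2.1686),
    "Valencia": (39.4699, -0.3763),
}


def pairwise_distances(cities):
    """Return {(city_a, city_b): distance_km} for every unordered pair of cities."""
    return {
        (a, b): haversine_km(*cities[a], *cities[b])
        for a, b in combinations(sorted(cities), 2)
    }


distances = pairwise_distances(CITIES)
```

Keys are ordered alphabetically within each pair, so the Madrid to Barcelona distance is stored under ("Barcelona", "Madrid").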

Regarding the data sources involved in the simulation stage (lower part of the figure), the COVID-19 model parameters were taken from the existing literature. The non-pharmaceutical interventions (NPIs) applied in each region, the coronavirus incidence, and the vaccination data (used to model vaccine efficacy and the doses already administered in each region of Spain) were taken from the existing literature and government databases. Finally, the meteorological data consists of a collection of meteorological measurements.


Simulating the spread in Spain

EpiGraph is currently used to analyze COVID-19 propagation in Spain at both national and regional levels (Madrid metropolitan area). The following figures show the temporal distribution of real and simulated cases for the First Wave (on the left) and Third Wave (on the right) in Madrid. For the First Wave, the reported cases (shown in red) have been scaled based on the seroprevalence of SARS-CoV-2 infection in Spain on June 26th, 2020. The red curve shown on the left corresponds to the number of infected individuals for the Madrid metropolitan area on June 26th. For the Third Wave, we have assumed that the reported cases correspond to 70% of the overall number of cases. The real number of cases has been extrapolated from the reported ones using this scale factor.
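
Read literally, the Third Wave extrapolation above amounts to dividing each reported count by 0.70. The short Python sketch below shows that arithmetic; the function name and the sample counts are hypothetical, and the actual processing pipeline may differ.

```python
def estimate_total_cases(reported_cases, detection_fraction=0.70):
    """Extrapolate total cases, assuming reported cases are a fixed fraction of all cases."""
    if not 0.0 < detection_fraction <= 1.0:
        raise ValueError("detection_fraction must be in (0, 1]")
    return [count / detection_fraction for count in reported_cases]


# Hypothetical daily reported counts, for illustration only.
reported = [700, 1400, 2100]
estimated = estimate_total_cases(reported)
```

With a detection fraction of 0.70, every estimate is roughly 1.43 times the reported figure.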


Simulating the spread in Europe

The following figure shows the cities for which we have performed simulations in Europe. In total, there are 610 cities, corresponding to the largest cities in Europe, with an aggregated population of 198 million people. Each city was modeled using related information obtained from Eurostat and other European offices. We are currently evaluating different European scenarios with the simulator using the Tirant supercomputer at the University of Valencia.


EpiGraph’s strengths for the analysis of COVID-19 expansion

  • EpiGraph models every single individual of the population. It is possible to include personal characteristics of each individual such as state of health, occupation, age, sex, existing pathologies, etc. and use them in the simulation process.
  • Distributed and scalable simulator. We implement a scalable, fully distributed simulator in MPI. EpiGraph can be executed using hundreds of compute nodes and can perform simulations of a complete season for environments with hundreds of millions of inhabitants. We are currently performing European-scale simulations of COVID-19 propagation.
  • Vaccination and multiple COVID-19 strain modelling. EpiGraph is currently able to simulate different COVID-19 vaccines that may have different efficacies depending on an individual’s characteristics and the COVID-19 strain. This permits the simulation of existing vaccination scenarios for Europe where some vaccines are selectively applied to certain collectives.
  • Realistic simulations. We have validated the results of the simulations against other simulators as well as real data obtained from NYSDOH and influenza surveillance data obtained from the Spanish Influenza Sentinel Surveillance System (SISSS) corresponding to the 2010-2011 influenza season in Spain.
  • Study of different scenarios. EpiGraph permits the simulation of time-dependent R0 values. As such, we are able to analyze the effect of changes in climate on COVID-19 propagation, for example the repercussions of warmer climate conditions (related to the arrival of spring) on virus spread. In addition, we analyse and compare the impact of different potential vaccination policies on managing the virus’s dissemination process.
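
To make the notion of a time-dependent R0 concrete, here is a minimal sketch. It deliberately swaps EpiGraph's agent-based stochastic model for a simple deterministic, discrete-time SIR model, and every number in it (the R0 schedule, the recovery rate gamma, the population size) is an assumption chosen for illustration.

```python
from bisect import bisect_right

# Hypothetical schedule: (first day, R0 applied from that day onwards).
R0_SCHEDULE = [(0, 2.5), (30, 1.2), (60, 0.8)]


def r0_at(day, schedule=R0_SCHEDULE):
    """Return the piecewise-constant R0 in force on the given day."""
    starts = [start for start, _ in schedule]
    idx = bisect_right(starts, day) - 1
    if idx < 0:
        raise ValueError("day precedes the start of the schedule")
    return schedule[idx][1]


def simulate_sir(days, population=1_000_000, infected0=100, gamma=0.2):
    """Discrete-time SIR model whose transmission rate follows R0(t).

    beta(t) = R0(t) * gamma, with gamma the assumed daily recovery rate.
    Returns a list of (day, susceptible, infected, recovered) tuples.
    """
    s, i, r = float(population - infected0), float(infected0), 0.0
    history = []
    for day in range(days):
        beta = r0_at(day) * gamma
        new_infections = min(beta * s * i / population, s)
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((day, s, i, r))
    return history


history = simulate_sir(90)
```

In this toy model the number of infected individuals falls once R0 drops below 1 (from day 60 in the assumed schedule), which is the kind of effect a warmer-season scenario would be expected to explore.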


The team, coordinated by David Expósito Singh, comprises groups from University Carlos III de Madrid (UC3M), Barcelona Supercomputing Center (BSC), National Centre for Epidemiology (NCE) and Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP).

  • At UC3M: David Expósito Singh, Jesús Carretero, Miguel Guzmán Merino, Christian Durán González, Alberto Cascajo García
  • At BSC: María-Cristina Marinescu
  • At NCE and CIBERESP: Amparo Larrauri, Diana Gomez-Barroso and Concepción Delgado-Sanz

Former members

  • Diego Fernández Olombrada, Florin Isaila, Gonzalo Martín


This project has been funded by Instituto de Salud Carlos III, Ministry of Health and Innovation, under reference COV20/00935, by the Ministry of Health under the contract "Development of a tool for prediction of epidemiological scenarios and vaccination against COVID-19", and by the Spanish Supercomputing Network (RES) under projects BCV-2020-3-0008, BCV-2021-1-0011, BCV-2021-2-0011 and BCV-2021-3-0007. Additionally, this work has been partially funded by the Spanish Ministry of Science and Innovation project "New Data Intensive Computing Methods for High-End and Edge Computing Platforms (DECIDE)", ref. PID2019-107858GB-I00. We would also like to thank the Spanish Meteorological Agency (AEMET) for providing meteorological data for Spain.