Study design and data source

A longitudinal study was conducted using secondary data drawn from two survey datasets: the ‘Premise Malaria COVID-19 Health Services Disruption Survey 2020’ and its follow-up, the ‘Premise Malaria COVID-19 Health Services Disruption Survey 2021’. These surveys were designed to evaluate changes in the levels of malaria preventive measures and malaria service delivery over a specified period of the COVID-19 pandemic. The surveys were designed by the Institute for Health Metrics and Evaluation (IHME) in collaboration with the Bill and Melinda Gates Foundation (BMGF), and were implemented by Premise Data Corporation [19].

The surveys were conducted in 20 African countries where malaria is endemic: Benin, Burkina Faso, Côte d’Ivoire, Democratic Republic of the Congo, Ethiopia, Ghana, Kenya, Liberia, Mali, Mozambique, Niger, Nigeria, Rwanda, Senegal, Sierra Leone, Somalia, Tanzania, Uganda, Zambia, and Zimbabwe. As the surveys aimed to assess disruptions to malaria-related health services among the general population, no inclusion criteria were applied beyond age (16 years and older) and residence in one of the 20 target countries. Premise Data Corporation distributed the surveys via its smartphone-based data collection platform. Premise maintains a global network of users for whom it holds basic demographic data, including age range, gender, and country of residence. The surveys were made available to all users in the Premise network who were associated with any of the 20 target countries. Users were invited to participate through in-app notifications or prompts, and participation was voluntary. Weights were not calculated for these surveys [19].

Data for the ‘Premise Malaria COVID-19 Health Services Disruption Survey 2020’ were collected throughout July 2020 and provide information about malaria healthcare utilization for two time periods: December 2019 to February 2020 (before the COVID-19 outbreak in the various countries) and March 2020 to June 2020 (early onset of the COVID-19 outbreak in the various countries). For the follow-up survey, the ‘Premise Malaria COVID-19 Health Services Disruption Survey 2021’, data were collected throughout May–June 2021 to give further insight into changes in malaria healthcare utilization for February 2021–May 2021 (1 year after the onset of the COVID-19 pandemic in the various countries) [19].
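The three recall periods described above lend themselves to a long data layout, with one row per respondent per period, which is the usual input shape for repeated-measures models. The following is a minimal sketch only: the period codes, field names, and the `to_long` helper are hypothetical and are not taken from the survey files.

```python
# Hypothetical layout: period codes and field names are illustrative only.
PERIODS = [
    # (code, recall window, survey year in which the period was asked about)
    ("pre_covid", "December 2019-February 2020", 2020),
    ("covid_onset", "March 2020-June 2020", 2020),
    ("one_year_after", "February 2021-May 2021", 2021),
]


def to_long(respondent_id, tested_by_period):
    """Return one row per recall period for a single respondent.

    `tested_by_period` maps a period code to 1 (tested) or 0 (not tested);
    periods the respondent did not answer are kept as None.
    """
    rows = []
    for code, window, survey_year in PERIODS:
        rows.append({
            "id": respondent_id,
            "period": code,
            "window": window,
            "survey_year": survey_year,
            "tested": tested_by_period.get(code),
        })
    return rows


# A respondent who answered the 2020 survey but not the 2021 follow-up.
rows = to_long("r001", {"pre_covid": 1, "covid_onset": 0})
for row in rows:
    print(row["id"], row["period"], row["tested"])
```

Keeping unanswered periods as explicit missing rows, rather than dropping them, makes it easy to see how many respondents contribute to each period.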

Study variables and their definitions

The outcome variable for this study was malaria healthcare service utilization, which comprised three specific measures: being tested for malaria, visiting a healthcare provider (HCP) for malaria symptoms, and receiving treatment for malaria symptoms. Survey questions related to malaria service utilization were organized into three distinct time periods: before COVID-19 (December 2019–February 2020), during COVID-19 onset (March 2020–June 2020), and 1 year after the pandemic onset (February 2021–May 2021). The explanatory variables included in this study were age, gender, education status, employment status, financial status, geographical location, and region. Age was divided into four categories: 16–25 years, 26–35 years, 36–45 years, and over 45 years. Gender was coded as male or female. Education status was categorized as primary and lower, secondary and college, or above college. Employment status was categorized as employed, student, or unemployed. Financial status was a dichotomous variable indicating whether or not participants were able to afford basic requirements. Geographical location was based on type of residence and was categorized as urban, rural, or suburban. Regional classification was done by grouping the 20 malaria-endemic countries into four sub-Saharan African regions. East Africa included Ethiopia, Kenya, Rwanda, Somalia, Tanzania, and Uganda; West Africa included Benin, Burkina Faso, Côte d’Ivoire, Ghana, Liberia, Mali, Niger, Nigeria, Senegal, and Sierra Leone; Central Africa included the Democratic Republic of the Congo; and Southern Africa included Mozambique, Zambia, and Zimbabwe. These variables were selected based on their theoretical relevance and on prior research linking sociodemographic and regional factors to healthcare access and malaria outcomes.
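The age and region categories defined above can be expressed as a small recoding step. The sketch below is illustrative only: the names `REGIONS`, `COUNTRY_TO_REGION`, and `age_group` and the category labels are assumptions, not the coding used in the survey files or in the original analysis.

```python
# Hypothetical recoding helpers; names and labels are illustrative only.
REGIONS = {
    "East Africa": ["Ethiopia", "Kenya", "Rwanda", "Somalia", "Tanzania", "Uganda"],
    "West Africa": ["Benin", "Burkina Faso", "Côte d’Ivoire", "Ghana", "Liberia",
                    "Mali", "Niger", "Nigeria", "Senegal", "Sierra Leone"],
    "Central Africa": ["Democratic Republic of the Congo"],
    "Southern Africa": ["Mozambique", "Zambia", "Zimbabwe"],
}

# Invert the grouping so that each country maps to exactly one region.
COUNTRY_TO_REGION = {
    country: region
    for region, countries in REGIONS.items()
    for country in countries
}


def age_group(age):
    """Assign an age in years to one of the four study categories."""
    if age < 16:
        raise ValueError("respondents must be aged 16 years or older")
    if age <= 25:
        return "16-25"
    if age <= 35:
        return "26-35"
    if age <= 45:
        return "36-45"
    return "over 45"


print(COUNTRY_TO_REGION["Kenya"], age_group(30))
```

Deriving the country lookup from the region grouping keeps a single source for the classification, so a country cannot be assigned to two regions by mistake.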

Statistical analysis and model building

The data analysis was conducted using STATA version 18. Descriptive statistics were used to summarize the baseline characteristics of the study population and to present the distribution of malaria healthcare utilization across the categories of the explanatory variables. The 2021 Premise Malaria COVID-19 Health Services Disruption Survey did not include questions on the reasons for not visiting a health provider or not taking malaria medication in response to symptoms. As the primary objective of this study was to assess trends in malaria-related service utilization over time, the absence of this information in the 2021 survey does not affect the core analysis or the main findings. To account for the repeated-measures structure of the data, associations between repeated measures of the outcome variables and the explanatory factors were evaluated for statistical significance by fitting a regression model to each outcome variable using a Generalized Estimating Equation (GEE) approach. The GEE model used a logit link function with a binomial family and an exchangeable working correlation structure. This method combined the repeated measures of each explanatory and outcome variable across three phases (‘Pre-COVID-19 Phase,’ ‘Initial Phase of the COVID-19 Outbreak,’ and ‘1 Year After the COVID-19 Outbreak’) to estimate the population-averaged association between these time periods and the outcome variables. Since the outcome variables in this study are binary, a logistic regression model was applied with the following structure:

$$\text{logit}\left(Y_{it}\right) = B_{0} + B_{1}X_{it}$$

In this model, logit(Yit) represents the log odds of the outcome Y for the i-th respondent at time period t, where t indexes the three time periods: ‘Pre-COVID-19 Phase,’ ‘Initial Phase of the COVID-19 Outbreak,’ and ‘1 Year After the COVID-19 Outbreak.’ For dichotomous outcomes, the GEE model was structured as follows [20]:

$$\text{logit}\left(Y_{it}\right) = B_{0} + B_{1}X_{it} + B_{m}Z_{imt} + B_{n}Z_{in}$$

In this equation, Zimt represents time-varying covariates, while Zin represents fixed covariates. In this study, the fixed covariates, that is, those assumed not to vary over time, were age, gender, educational status, employment status, financial status, geographic location, and region. The coefficient B1 indicates the effect of the different time periods on the outcome, accounting for the influence of the additional covariates. Because the GEE models for binary outcomes involved missing covariate values, a technique that substitutes consistent estimates for the missing values was used. In specific cases, this approach simplifies to a weighted GEE model using inverse probability weights. As an extension of the mean score method, it yields consistent and asymptotically normal estimates under regularity conditions.

To assess the effect of the COVID-19 pandemic on malaria healthcare utilization while accounting for repeated measurements from the same individuals across survey periods, both bivariable (crude) and multivariable GEE models were fitted for the three outcome variables: being tested for malaria, visiting a healthcare provider for malaria symptoms, and receiving treatment for malaria. In the multivariable models, the following covariates were included to control for potential confounding: age group, gender, educational status, employment status, financial status, residential location, and geographic region. The same covariates were adjusted for in all three outcome models. Variables with a p-value of less than 0.05 in the multivariable GEE model were considered significantly associated with the outcome variables.
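To illustrate how a coefficient on the logit scale, such as B1, is read, the sketch below converts a linear predictor into a probability and a coefficient into an odds ratio. The coefficient values are invented purely for illustration and are not estimates from this study; the function names are likewise hypothetical.

```python
import math


def inv_logit(eta):
    """Convert a linear predictor on the log-odds scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def odds_ratio(coef):
    """Convert a logit-scale coefficient to an odds ratio."""
    return math.exp(coef)


# Invented values for illustration only, NOT results of this study.
B0 = 0.4         # log odds of the outcome in the reference (pre-COVID-19) period
B1_ONSET = -0.5  # change in log odds during the initial outbreak phase

p_pre = inv_logit(B0)
p_onset = inv_logit(B0 + B1_ONSET)

print(f"probability, reference period: {p_pre:.3f}")
print(f"probability, onset period:     {p_onset:.3f}")
print(f"odds ratio, onset vs reference: {odds_ratio(B1_ONSET):.3f}")
```

A negative coefficient corresponds to an odds ratio below 1, meaning lower odds of the outcome in that period than in the reference period. In a GEE model this is a population-averaged comparison, not a statement about any individual respondent.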

Ethical considerations

Ethical approval was not sought for this study because the Premise Malaria COVID-19 Health Services Disruption Survey 2020 and 2021 datasets are publicly available from the IHME and are openly accessible for research purposes. In addition, the data have been de-identified and contain no personally identifying information.