DOI 10.22394/1726-1139-2022-7-163-184
The Impact of Health Expenditure on COVID-19 Mortality
Mariia A. Ovsiannikova
HSE Campus in St. Petersburg, Saint Petersburg, Russian Federation; maovsyannikova_1@edu.hse.ru
ABSTRACT
The present study investigates the degree to which countrywide health expenditures as a measure of pandemic preparedness reduce mortality from COVID-19, using data on 96 countries of the world. A statistically significant negative effect of higher health expenditure on expected mortality is found for low-income countries. This effect for middle- and high-income countries is insignificant. Leading threats to the internal validity of this study are omitted variable bias and sample selection bias. Some ways in which this study can be built upon are suggested.
Keywords: pandemic, health expenditure, behavior, mortality, countries, income level.
For citing: Ovsiannikova M. A. The Impact of Health Expenditure on COVID-19 mortality // Administrative consulting. 2022. N 7. P. 163-184.
The Impact of Health Expenditure on COVID-19 Mortality
Ovsiannikova M. A.
National Research University Higher School of Economics, Saint Petersburg, Russian Federation; maovsyannikova_1@edu.hse.ru
ABSTRACT
This article investigates the degree to which nationwide health expenditure (as a measure of pandemic preparedness) reduces mortality from COVID-19. Data on 96 countries are used to test the hypotheses. In low-income countries, a statistically significant negative effect of higher health expenditure on expected mortality is found. For middle- and high-income countries this effect is insignificant. The main threats to the internal validity of this study are omitted variable bias and sample selection bias. Some ways in which further research can build on this study are suggested.
Keywords: pandemic, health expenditure, behavior, mortality, countries, income level
For citing: Ovsiannikova M. A. The Impact of Health Expenditure on COVID-19 mortality // Administrative consulting. 2022. N 7. P. 163-184.
1. Introduction
In this research we aim to evaluate, using a multiple linear regression model, the degree to which health expenditure prevents deaths in a global emergency such as the COVID-19 pandemic. The relevance of this research question is hard to overstate: country-level and international policy-makers, as well as the general public, have an interest in knowing whether spending on health is really worth it.
We use publicly available data in our calculations and control for a range of variables that may affect mortality from COVID-19, including measures of availability and quality of healthcare, public attitudes and behaviors during the pandemic, and several demographic characteristics. The full list of variables, with their sources and the intuition behind including them in the model, is presented in Table 1. In short, mortality from COVID-19 (cumulative total per 100,000 population as of October 2021) is the dependent variable, and che (current health expenditure, % of GDP, 2018) is the explanatory variable of interest. Along with beds, doctors, nurses, and dghe, it could indicate greater preparedness for the pandemic, a better healthcare system, and, as such, lower mortality. Higher measures of citizens' behavior (beh_*), attitudes (fob_*), and government trust in the first months of the pandemic could also mean lower mortality from COVID-19, unlike higher shares of urban population (urban) and population over 65 (pop65), which might lead to increased mortality rates. Complete cases are available for 96 countries; the data are as recent as possible. Admittedly, there are still variables that could be included in the model but for which we were unable to find adequate data, for example, the availability of medical emergency training for medical personnel, the quality of ambulance services, etc.
One concern would be inadvertently replicating existing research in both research question and methodology. To the best of our knowledge, there is no such issue with our current specifications. Khan et al., despite an overlap in some (but not all) of the data, use negative binomial regression and put more of an emphasis on healthcare capacity [5, p. 347]. Oshinubi et al. use both linear and exponential models and analyze the impact of current health expenditure on the reproduction number R0 of COVID-19 instead of the mortality rate [6, p. 1247]. Kapitsinis uses multiple linear regression to study mortality from COVID-19 across regions and explain its underlying factors, but limits his sample to the regions of nine EU countries [4, p. 1027-1045]. Elola-Somoza et al. consider only Spain and Europe and calculate only Pearson's correlation coefficient between public health expenditure per capita and the mortality rate due to COVID-19 [1, p. 400-403].
Table 1
Variables and Definitions
| Variable | Definition | Type | Intuition | Source |
|---|---|---|---|---|
| Mortality | Deaths, cumulative total per 100,000 population (retrieved 13.10.2021) | Continuous | Dependent variable | WHO Coronavirus (COVID-19) Dashboard¹ |
| Che | Level of current health expenditure expressed as a percentage of GDP (2018). Estimates of current health expenditures include healthcare goods and services consumed during each year. This indicator does not include capital health expenditures such as buildings, machinery, IT, and stocks of vaccines for emergencies or outbreaks | Continuous (between 0 and 100) | Variable of interest. As it is the source of financing for the country's health system, higher health expenditure could mean a better health system, greater preparedness for the pandemic, and, consequently, lower mortality | World Bank Open Data² |
| Beds | Hospital beds (per 1,000 people) (2017) | Continuous | More beds could mean greater preparedness for the pandemic (and lower mortality). It is likely both to be correlated with che and to have an effect on mortality | World Bank Open Data |
| Pop65 | Population ages 65 and above (% of total population) (2019). Population is based on the de facto definition of population, which counts all residents regardless of legal status or citizenship | Continuous (between 0 and 100) | As elderly people face a greater risk of severe COVID-19 cases and comorbidities [7, p. 16-25], a greater proportion of people over 65 could mean greater mortality rates | World Bank Open Data |
| Popdens | Population density (people per sq. km of land area) (2019) | Continuous | Higher population density could mean a greater risk of contagion, more COVID-19 cases, and greater mortality | World Bank Open Data |
| Urban | Urban population (% of total population) (2019). Urban population refers to people living in urban areas as defined by national statistical offices. The data are collected and smoothed by the United Nations Population Division | Continuous (between 0 and 100) | People in cities may face a higher risk of contagion (and higher mortality) due to population density; however, people in rural areas are more vulnerable in terms of access to timely healthcare | World Bank Open Data |
| Dphe | Domestic private health expenditure (% of current health expenditure) (2018). Domestic private sources include funds from households, corporations, and nonprofit organizations. Such expenditures can be either prepaid to voluntary health insurance or paid directly to healthcare providers | Continuous (between 0 and 100) | Private health expenditure making up a relatively large share of current expenditure (and government health expenditure making up a small share) could mean that households have to cover many health expenses out of pocket, i.e., not many health services are guaranteed by the government. That is, there are barriers to healthcare access, and preparedness for the pandemic is relatively low | World Bank Open Data |
| Dghe | Domestic general government health expenditure (% of current health expenditure) (2018) | Continuous (between 0 and 100) | See Dphe above. The third category in current health expenditure is external expenditure | World Bank Open Data |
| Tobacco | Prevalence of current tobacco use (% of adults) (2018) | Continuous (between 0 and 100) | As tobacco is a well-recognized cause of severe COVID-19 cases [3, p. 106233], higher prevalence of tobacco use could mean higher mortality | World Bank Open Data |
| Procur | Procurement of medical devices carried out at the national level (latest year) | Binary (Yes=1, No=0) | A measure of the quality of the national health system. Countries that procure medical devices at the national level are probably better prepared for the pandemic | The Global Health Observatory Indicators³ |
| Doctors | Medical doctors (per 10,000) (latest year) | Continuous | A measure of health system capacity. More doctors could mean more adequate care and fewer fatalities | The Global Health Observatory Indicators |
| Nurses | Nursing and midwifery personnel (per 10,000) (latest year) | Continuous | A measure of health system capacity. More nurses could mean more adequate care and fewer fatalities | The Global Health Observatory Indicators |
| Beh_stayhome | Question asked to individuals in spring 2020: "To what extent do the following statements describe your behavior for the past week? [0=Does not apply at all; 100=Applies very much] I stayed at home." We took the average of responses by country and selected countries with no fewer than 20 respondents | Continuous (between 0 and 100) | The success of emergency measures depends on the nation's attitudes and behaviors: the better people followed recommendations, the fewer people could have died from COVID-19 | Global Behaviors and Perceptions in the COVID-19 Pandemic |
| Beh_socgathering | ...I did not attend social gatherings | Continuous (between 0 and 100) | The same intuition as for "beh_stayhome" | Global Behaviors and Perceptions in the COVID-19 Pandemic |
| Beh_distance | ...I kept a distance of at least two meters to other people | Continuous (between 0 and 100) | The same intuition as for "beh_stayhome" | Global Behaviors and Perceptions in the COVID-19 Pandemic |
| Beh_tellsymp | ...If I had exhibited symptoms of sickness, I would have immediately informed the people around me | Continuous (between 0 and 100) | The same intuition as for "beh_stayhome" | Global Behaviors and Perceptions in the COVID-19 Pandemic |
| Beh_handwash | ...I washed my hands more frequently than the month before | Continuous (between 0 and 100) | The same intuition as for "beh_stayhome" | Global Behaviors and Perceptions in the COVID-19 Pandemic |
| Fob_social | "What do you think: should people in your country cancel their participation at social gatherings because of the coronavirus right now? [No=0; Yes=1]" We took the average of responses by country (getting the percentage of people who said Yes) and selected countries with no fewer than 20 respondents | Continuous (between 0 and 1) | A way to look not at people's actions but at their beliefs about whether the recommendations are reasonable | Global Behaviors and Perceptions in the COVID-19 Pandemic |
| Fob_handshake | "What do you think: should people in your country not shake other people's hands because of the coronavirus right now? [No=0; Yes=1]" | Continuous (between 0 and 1) | The same intuition as for "fob_social" | Global Behaviors and Perceptions in the COVID-19 Pandemic |
| Fob_stores | "What do you think: should all shops in your country other than particularly important ones, such as supermarkets, pharmacies, post offices, and gas stations, be closed because of the coronavirus right now? [No=0; Yes=1]" | Continuous (between 0 and 1) | The same intuition as for "fob_social" | Global Behaviors and Perceptions in the COVID-19 Pandemic |
| Fob_curfew | "What do you think: should there be a general curfew in your country (with the exception of grocery shopping, necessary family trips, and the commute to work) because of the coronavirus right now? [No=0; Yes=1]" | Continuous (between 0 and 1) | The same intuition as for "fob_social" | Global Behaviors and Perceptions in the COVID-19 Pandemic |
| Perceivedreaction_d | "Do you think the reaction of your country's government to the current coronavirus outbreak is appropriate, too extreme, or not sufficient? [5-point scale; 1=The reaction is much too extreme; 2=The reaction is somewhat too extreme; 3=The reaction is appropriate; 4=The reaction is somewhat insufficient; 5=The reaction is not at all sufficient]" We converted the categorical variable into binary (4,5=1; 1,2,3=0) and aggregated as before | Continuous (between 0 and 1) | Stronger civic responsibility and greater public trust in the government's actions when emergency measures are taken could mean fewer deaths from COVID-19 | Global Behaviors and Perceptions in the COVID-19 Pandemic |
| Govtrust_d | "How much do you trust your country's government to take care of its citizens? [5-point scale; 1=Strongly distrust; 2=Somewhat distrust; 3=Neither trust nor distrust; 4=Somewhat trust; 5=Strongly trust]" Aggregated as above | Continuous (between 0 and 1) | The same intuition as for "perceivedreaction_d" | Global Behaviors and Perceptions in the COVID-19 Pandemic |
| Govfact_d | "How factually truthful do you think your country's government has been about the coronavirus outbreak? [5-point scale; 1=Very untruthful; 2=Somewhat untruthful; 3=Neither truthful nor untruthful; 4=Somewhat truthful; 5=Very truthful]" | Continuous (between 0 and 1) | The same intuition as for "perceivedreaction_d" | Global Behaviors and Perceptions in the COVID-19 Pandemic |
| Perceivedeffectiveness_d | "What do you think: How effective are social distancing measures (e.g., through a general curfew) to slow down the spread of the coronavirus? [5-point scale; 1=Not at all effective; 2=Not effective; 3=Neither effective nor ineffective; 4=Effective; 5=Very effective]" | Continuous (between 0 and 1) | The same intuition as for "perceivedreaction_d" | Global Behaviors and Perceptions in the COVID-19 Pandemic |
| Region | WHO Region (Americas, Europe, Western Pacific, Eastern Mediterranean, South-East Asia, Africa) | Categorical | Extra control variable to account for geographical influences | WHO Coronavirus (COVID-19) Dashboard |
| Population | Total population, based on the de facto definition of population, which counts all residents regardless of legal status or citizenship (2019). The values shown are midyear estimates | Continuous | Extra control variable to help with the relative data described above (percentages, etc.) | World Bank Open Data |
| Incomelvl | Economies are divided among income groups according to 2019 gross national income (GNI) per capita, calculated using the World Bank Atlas method. The groups are: low income (LIC), $1,035 or less; lower middle income (LMC), $1,036 to $4,045; upper middle income (UMC), $4,046 to $12,535; and high income (HIC), $12,536 or more | Categorical | Extra control variable to approximate the countries' level of development | World Bank DataBank⁴ |
¹ World Health Organization. WHO Coronavirus (COVID-19) Dashboard (2021) [Electronic source]. URL: https://covid19.who.int/table (accessed: 12.04.2022).
² World Bank. World Bank Open Data (n. d.) [Electronic source]. URL: https://data.worldbank.org/indicator (accessed: 12.04.2022).
³ World Health Organization. The Global Health Observatory Indicators (n. d.) [Electronic source]. URL: https://www.who.int/data/gho/data/indicators/indicators-index (accessed: 12.04.2022).
⁴ World Bank DataBank. List of economies (2020) [Electronic source]. URL: https://databank.worldbank.org/data/download/site-content/CLASS.xls (accessed: 12.04.2022).
2. Exploratory Data Analysis
We now turn to discussing the collected data in more detail. First, we describe the data processing along with the features of the data themselves. Then, we look at how well these data can answer our research question.
(1) Data
The data used in this research come from their respective sources (see Table 1). Because these sources are maintained by separate entities, there was some mismatch in the country names and/or the lists of countries available in different sources. Whenever possible, the retrieved datasets were merged by three-letter country codes; otherwise, the inconsistencies in country names were adjusted manually before merging. As also noted in Table 1, the years selected were generally the latest available, with two notable conditions: (1) the features of health systems were taken as of before the pandemic, even if more recent data were available (the reason being our focus on pandemic preparedness rather than pandemic response); (2) several variables (e.g. nurses) consist of the latest observations for their respective countries (not necessarily of the same year), in the interest of maximizing the list of countries for which such information was obtainable. Note, however, that the most recent data on che, the variable of interest, are for 2018.
Then, only complete observations were selected, and only those for which there were no fewer than 20 respondents in the Global Behaviors and Perceptions survey¹ (the prior aggregation of the survey data is detailed in Table 1) [2, p. 77]. The resulting dataset contains data on 60 countries.
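The processing steps described above (merging by three-letter country code, keeping complete observations, and requiring at least 20 survey respondents) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's actual pipeline: the country codes, values, and the helper names `merge_by_code` and `complete_cases` are all hypothetical.

```python
# Hypothetical toy records keyed by three-letter country code (illustrative values only).
who = {"AAA": {"mortality": 120.5}, "BBB": {"mortality": 30.1},
       "CCC": {"mortality": 75.0}, "DDD": {"mortality": 50.0}}
world_bank = {"AAA": {"che": 9.1, "pop65": 20.3},
              "BBB": {"che": 4.2, "pop65": None},  # missing value
              "CCC": {"che": 6.0, "pop65": 11.0}}
survey = {"AAA": {"beh_stayhome": 84.0, "respondents": 512},
          "BBB": {"beh_stayhome": 79.5, "respondents": 45},
          "CCC": {"beh_stayhome": 80.0, "respondents": 12}}

MIN_RESPONDENTS = 20  # threshold stated in the text


def merge_by_code(*sources):
    """Inner-join several dictionaries on the country codes they all share."""
    common = set(sources[0]).intersection(*sources[1:])
    merged = {}
    for code in sorted(common):
        row = {}
        for src in sources:
            row.update(src[code])
        merged[code] = row
    return merged


def complete_cases(rows, min_respondents=MIN_RESPONDENTS):
    """Keep countries with no missing values and enough survey respondents."""
    return {code: row for code, row in rows.items()
            if all(value is not None for value in row.values())
            and row["respondents"] >= min_respondents}


merged = merge_by_code(who, world_bank, survey)
sample = complete_cases(merged)
print(sorted(sample))
```

In this toy example DDD is lost in the merge (it appears in only one source), BBB is dropped for a missing value, and CCC for having fewer than 20 respondents.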
(2) Statistical Analysis
To check for possible outliers caused by errors in variables, one may look at the descriptive statistics presented in Table 2. For example, domestic general government health expenditure (dghe) on average makes up more than half of current health expenditure; 52% of the countries in the sample procure medical devices at the national level (procur); the scores for residents' behavior and attitudes are usually very high (possibly because respondents exaggerate, which is not, however, the subject of this study), unlike the scores for trust in government, which are middling. Overall, all variables are within their expected ranges.
Turning to the only categorical variable in our analysis, region, it is easy to see that while only the African region is absent from the analysis, there are very few observations in three of the remaining regions (Table 3, Fig. 1). It is logical to aggregate them into Other (Table 4). Still, the distribution of mortality varies depending on the category, as is evident from Fig. 1. By contrast, the medians of the mortality distributions for the two values of the only binary variable, procur, seem close enough, though the variances differ (Fig. 2).
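The aggregation of sparse regions into Other can be reproduced from the counts in Table 3. The sketch below is illustrative: the cutoff `MIN_OBS = 10` and the helper name `collapse_sparse` are assumptions chosen so that the three small regions are merged; the article does not state an explicit threshold.

```python
from collections import Counter

# Observations by WHO region in the n = 60 sample (Table 3).
region_counts = Counter({
    "Americas": 12,
    "Eastern Mediterranean": 7,
    "Europe": 33,
    "South-East Asia": 2,
    "Western Pacific": 6,
})

MIN_OBS = 10  # assumed cutoff, not taken from the article


def collapse_sparse(counts, min_obs=MIN_OBS, other_label="Other"):
    """Merge every category with fewer than min_obs observations into one label."""
    collapsed = Counter()
    for region, n in counts.items():
        key = region if n >= min_obs else other_label
        collapsed[key] += n
    return collapsed


collapsed = collapse_sparse(region_counts)
print(dict(collapsed))
```

With this cutoff the result matches Table 4: Americas 12, Europe 33, Other 15.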
Extra attention should be paid to the variables that correlate with mortality, the dependent variable, and with che, the variable of interest. As we face a trade-off between the bias and the variance of the coefficient of interest, we will likely include only some of the variables in our model, so each inclusion needs a solid justification.
We made scatter plots of all other numerical variables against mortality, and they paint an alarming picture: the observable relationship is very nebulous, if it exists at all. We presumed that the reason no linear, let alone nonlinear, relationship could be observed was the noise attributable to the small number of observations in our dataset (n = 60). The only remedy is to return to the drawing board and reconstruct the dataset from scratch, trying to recover more observations in the process.
¹ Fetzer T., Witte M., Hensel L., Jachimowicz J. M., Haushofer J., Ivchenko A., Caria C., Reutskaja E., Roth C., Fiorin F., Gomez M., Kraft-Todd G., Goetz F., Yoeli E. Global Behaviors and Perceptions in the COVID-19 Pandemic [Electronic source]. URL: https://osf.io/3sn2k (accessed: 12.04.2022).
Table 2
Descriptive Statistics (n=60)
| Variable | Mean | Sd | Min | Q1 | Median | Q3 | Max |
|---|---|---|---|---|---|---|---|
| mortality | 134.62 | 106.71 | 0.39 | 50.64 | 125.66 | 204.73 | 605.68 |
| che | 7.62 | 2.58 | 2.50 | 5.69 | 7.54 | 9.26 | 16.89 |
| beds | 3.87 | 2.51 | 0.63 | 2.20 | 3.12 | 5.13 | 13.05 |
| pop65 | 14.78 | 5.96 | 1.52 | 9.13 | 15.59 | 19.76 | 28.00 |
| popdens | 296.44 | 1043.79 | 3.58 | 34.63 | 99.85 | 218.33 | 8044.53 |
| urban | 75.39 | 16.11 | 18.59 | 66.55 | 79.73 | 87.03 | 100.00 |
| dphe | 35.83 | 14.43 | 13.67 | 25.91 | 34.44 | 48.15 | 70.60 |
| dghe | 63.96 | 14.56 | 28.73 | 51.75 | 64.81 | 74.09 | 85.32 |
| tobacco | 24.25 | 8.77 | 7.90 | 18.32 | 23.50 | 28.88 | 44.70 |
| procur | 0.52 | 0.50 | 0.00 | 0.00 | 1.00 | 1.00 | 1.00 |
| doctors | 33.81 | 16.05 | 4.65 | 23.62 | 32.37 | 43.60 | 80.13 |
| nurses | 69.25 | 50.03 | 2.80 | 26.46 | 61.65 | 102.38 | 216.70 |
| beh_stayhome | 83.46 | 7.33 | 58.68 | 81.24 | 84.83 | 87.91 | 94.41 |
| beh_socgathering | 92.33 | 4.99 | 75.30 | 90.95 | 94.35 | 95.56 | 99.00 |
| beh_distance | 78.76 | 9.12 | 47.87 | 74.07 | 81.32 | 85.27 | 90.56 |
| beh_tellsymp | 92.85 | 4.28 | 78.45 | 92.82 | 94.26 | 95.09 | 97.72 |
| beh_handwash | 91.69 | 2.83 | 83.72 | 90.47 | 92.06 | 93.75 | 96.57 |
| fob_social | 0.98 | 0.04 | 0.79 | 0.98 | 0.99 | 0.99 | 1.00 |
| fob_handshake | 0.97 | 0.04 | 0.74 | 0.96 | 0.98 | 0.99 | 1.00 |
| fob_stores | 0.81 | 0.17 | 0.20 | 0.77 | 0.87 | 0.91 | 0.97 |
| fob_curfew | 0.71 | 0.19 | 0.16 | 0.59 | 0.74 | 0.88 | 0.99 |
| perceivedreaction_d | 0.40 | 0.23 | 0.00 | 0.23 | 0.36 | 0.56 | 0.91 |
| govtrust_d | 0.57 | 0.24 | 0.09 | 0.38 | 0.58 | 0.80 | 0.96 |
| govfact_d | 0.63 | 0.24 | 0.09 | 0.49 | 0.71 | 0.82 | 0.98 |
| perceivedeffectiveness_d | 0.89 | 0.05 | 0.70 | 0.85 | 0.90 | 0.93 | 0.97 |
| population (mln.) | 64.587 | 187.688 | 0.361 | 5.428 | 10.730 | 47.935 | 1397.715 |
Table 3
Region: number of observations by category
| Region | Number of obs. |
|---|---|
| Americas | 12 |
| Eastern Mediterranean | 7 |
| Europe | 33 |
| South-East Asia | 2 |
| Western Pacific | 6 |
It is important to keep in mind that while the various sources provide similar lists of countries, the number of observations available varies. The cause of the shrinking number of complete observations is therefore twofold: at least one of the 27 variables having no data for a country (causing the country to be omitted), and our selection of countries for which there were no fewer than 20 respondents in the Global Behaviors and Perceptions survey [2, p. 77]. While we believe the latter to be reasonable, the former we can inspect more closely, as we have plenty of variables. As a result of this inspection, the variables causing the most "shrinking", namely beds, tobacco, and procur, were dropped, which increased the sample by more than 50% (n = 96). At this point, the number of observations is greater than in Khan et al., for example (they study 86 countries), which we deemed satisfactory [5, p. 347].
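One way to identify which variables cause the most "shrinking" is to count how many complete observations are gained when each variable is dropped in turn. The sketch below uses a hypothetical five-country dataset and hypothetical helper names (`n_complete`, `gain_from_dropping`); it illustrates the idea only and is not the procedure actually used in the study.

```python
def n_complete(rows, variables):
    """Number of rows with no missing value in any of the given variables."""
    return sum(all(row.get(v) is not None for v in variables) for row in rows)


def gain_from_dropping(rows, variables):
    """Extra complete rows obtained by dropping each variable on its own."""
    base = n_complete(rows, variables)
    return {
        v: n_complete(rows, [w for w in variables if w != v]) - base
        for v in variables
    }


# Hypothetical data: None marks a missing value.
rows = [
    {"che": 7.0, "beds": 3.1, "tobacco": 24.0},
    {"che": 5.5, "beds": None, "tobacco": 18.0},
    {"che": 9.2, "beds": None, "tobacco": 30.0},
    {"che": 6.1, "beds": 2.0, "tobacco": None},
    {"che": None, "beds": 4.4, "tobacco": 21.0},
]
variables = ["che", "beds", "tobacco"]

gains = gain_from_dropping(rows, variables)
print(gains)
```

Here dropping beds recovers the most rows, so it would be the first candidate for removal, provided it is not the variable of interest.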
Fig. 1. Mortality distributions by WHO Region
Fig. 2. Mortality distributions by Procurement of medical devices carried out at the national level (1 = Yes, 0 = No)
Table 4
Region: number of observations by category (with Other)
| Region | Number of obs. |
|---|---|
| Americas | 12 |
| Europe | 33 |
| Other | 15 |
We will now provide a short overview of the exploratory data analysis we repeated for the new sample. Judging by the descriptive statistics (Table 5), all variables are again within their expected ranges. There are a few improvements in the new dataset, however: (1) there are now data on Africa available, completing the set of the WHO Regions (Table 6, Fig. 3); (2) the conclusions about the relationship between mortality and the explanatory variables, despite possibly contradicting our expectations in some cases (Fig. 4-5), are expected to hold up better due to the asymptotic nature of various hypothesis tests. We plan to use the updated dataset from this point onwards.
Table 5
Descriptive Statistics (n=96)
| Variable | Mean | Sd | Min | Q1 | Median | Q3 | Max |
|---|---|---|---|---|---|---|---|
| mortality | 118.11 | 101.07 | 0.39 | 30.93 | 105.14 | 184.11 | 605.68 |
| che | 7.00 | 2.52 | 2.34 | 5.28 | 6.88 | 8.66 | 16.89 |
| pop65 | 12.20 | 6.67 | 1.16 | 6.43 | 12.11 | 18.75 | 28.00 |
| popdens | 268.42 | 856.98 | 3.30 | 46.36 | 100.05 | 220.68 | 8044.53 |
| urban | 68.47 | 20.49 | 17.31 | 57.00 | 71.19 | 83.75 | 100.00 |
| dphe | 39.80 | 16.24 | 11.96 | 26.60 | 39.48 | 49.72 | 77.27 |
| dghe | 57.86 | 18.33 | 14.87 | 45.25 | 59.59 | 73.16 | 88.04 |
| doctors | 27.95 | 17.93 | 0.60 | 12.66 | 26.29 | 40.52 | 80.13 |
| nurses | 58.15 | 46.94 | 2.80 | 19.03 | 51.52 | 74.27 | 216.70 |
| beh_stayhome | 82.68 | 8.71 | 48.36 | 79.06 | 84.24 | 88.50 | 95.72 |
| beh_socgathering | 91.26 | 5.79 | 70.30 | 88.16 | 93.19 | 95.55 | 99.30 |
| beh_distance | 76.33 | 9.91 | 47.87 | 69.79 | 77.60 | 84.31 | 92.07 |
| beh_tellsymp | 92.31 | 4.67 | 78.45 | 90.60 | 93.77 | 95.01 | 99.29 |
| beh_handwash | 91.49 | 3.29 | 80.60 | 90.18 | 91.97 | 93.75 | 97.92 |
| fob_social | 0.98 | 0.03 | 0.79 | 0.98 | 0.99 | 0.99 | 1.00 |
| fob_handshake | 0.96 | 0.04 | 0.66 | 0.96 | 0.97 | 0.98 | 1.00 |
| fob_stores | 0.81 | 0.14 | 0.20 | 0.78 | 0.86 | 0.90 | 0.97 |
| fob_curfew | 0.75 | 0.18 | 0.16 | 0.65 | 0.78 | 0.89 | 1.00 |
| perceivedreaction_d | 0.40 | 0.24 | 0.00 | 0.21 | 0.37 | 0.56 | 0.95 |
| govtrust_d | 0.54 | 0.25 | 0.04 | 0.37 | 0.52 | 0.77 | 0.96 |
| govfact_d | 0.60 | 0.24 | 0.09 | 0.40 | 0.65 | 0.80 | 0.98 |
| perceivedeffectiveness_d | 0.87 | 0.07 | 0.62 | 0.84 | 0.89 | 0.92 | 1.00 |
| population (mln.) | 69.54 | 201.818 | 0.361 | 5.658 | 12.161 | 53.932 | 1397.715 |
Table 6
Region: number of observations by category (n = 96)
| Region | Number of obs. |
|---|---|
| Africa | 7 |
| Americas | 18 |
| Eastern Mediterranean | 13 |
| Europe | 43 |
| South-East Asia | 6 |
| Western Pacific | 9 |
Fig. 3. Mortality distributions by WHO Region (n=96)
Fig. 4. Mortality vs. population density (over a limited range of density) (n=96)
Fig. 5. Mortality vs. population (over a limited range of population) (n=96)
3. Model Estimation
We will use multiple specifications in order to estimate the expected effect of a (hypothetical) change in current health expenditure (measured as a percentage of a country's GDP in 2018) on mortality from COVID-19 (measured as cumulative total per 100,000 population) in a country, holding all else constant. As there are numerous variables that potentially affect mortality and are correlated with the level of current health expenditure (see Table 1), it is necessary to include such control variables in the model to avoid omitted variable bias. For this reason, our base specification includes che as well as six control variables:
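$$\text{mortality} = \beta_0 + \beta_1 \cdot \text{che} + \beta_2 \cdot \text{pop65} + \beta_3 \cdot \text{urban} + \beta_4 \cdot \text{doctors} + \beta_5 \cdot \text{nurses} + \beta_6 \cdot \text{dghe} + \beta_7 \cdot \text{popdens} + u$$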
The descriptions of the variables included can be found in Table 1. The control variables are those that both economic intuition (occasionally shared by the authors of the articles considered in the Introduction) and significant correlation coefficients observed in the data suggest as remedies for omitted variable bias.
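To make the estimation step concrete, the following sketch fits a linear specification of the same form by ordinary least squares, using only the Python standard library. It is an illustration under stated assumptions: the data are synthetic and noise-free, only two regressors (che and pop65) are used instead of the full set, and the helper names `solve` and `ols` are hypothetical. It does not reproduce the article's estimates or its robust standard errors.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(y, columns):
    """OLS coefficients (intercept first) from the normal equations X'X b = X'y."""
    x = [[1.0] + list(values) for values in zip(*columns)]
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic data generated from mortality = 10 + 2*che + 3*pop65 (no noise).
che = [4.0, 6.5, 8.0, 10.0, 5.0, 7.5]
pop65 = [5.0, 12.0, 18.0, 20.0, 9.0, 15.0]
mortality = [10.0 + 2.0 * c + 3.0 * p for c, p in zip(che, pop65)]

beta = ols(mortality, [che, pop65])
print([round(b, 6) for b in beta])
```

Because the synthetic data contain no noise, the fitted coefficients recover the generating values; with real data one would normally use a statistics package that also reports heteroskedasticity-robust standard errors.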
We must also construct specifications that take into account regional variation as well as behavioral variables from the Global Behaviors and Perceptions survey, some of which are clearly correlated with current health expenditure [2, p. 77]. Finally, nonlinear effects are always worth considering. It should be noted at the outset that both mortality and che (as well as several other variables measured as percentages) are already measured in relative terms; therefore, we do not take logarithms of them.
4. Results and Discussion
(1) Multiple regression results
Table 7 summarises the results of OLS regressions of mortality on various sets of regressors, of which che is the regressor of interest. All the other regressors are controls used to minimize potential bias in the OLS estimate of the expected effect of pre-pandemic current health expenditure on mortality, ceteris paribus. As such, whether the coefficients on the control variables are significantly different from zero is not our main concern. The effect of adding more (relevant) control variables on the value and significance of the coefficient on che, by contrast, interests us very much.
Table 7
OLS Regression results (n = 96)
Dependent variable: mortality
| Variable | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| che | 12.060*** (2.992) | 6.468 (4.319) | 0.623 (6.196) | -4.451 (7.279) | -19.426** (9.694) |
| pop65 | | 2.919 (2.525) | 4.255 (2.977) | 6.052* (3.318) | 8.243** (3.352) |
| urban | | 0.537 (0.755) | 0.019 (0.683) | 0.424 (0.738) | -0.346 (0.761) |
| doctors | | 0.979 (0.966) | 0.167 (0.851) | 0.383 (0.801) | 0.568 (0.678) |
| nurses | | -0.622** (0.291) | -0.415 (0.266) | -0.142 (0.269) | 0.283 (0.233) |
| dghe | | 0.377 (0.602) | 0.196 (0.527) | -0.474 (0.601) | -0.351 (0.547) |
| popdens | | -0.018** (0.008) | -0.004 (0.006) | -0.008 (0.007) | -0.002 (0.005) |
| region (base: Africa) | | | | | |
| Americas | | | 138.996*** (41.076) | 109.000*** (39.809) | 69.798** (31.886) |
| Eastern Mediterranean | | | 37.900 (32.463) | -16.688 (40.335) | -30.269 (26.359) |
| Europe | | | 77.383** (37.860) | 42.148 (43.650) | -11.910 (42.038) |
| South-East Asia | | | -2.281 (25.909) | -35.179 (37.191) | -58.488 (39.326) |
| Western Pacific | | | -36.447 (35.140) | -57.323 (38.304) | -87.725** (35.681) |
| beh_stayhome | | | | 3.339* (1.792) | 3.769** (1.858) |
| beh_socgathering | | | | -4.828** (2.184) | -4.392** (2.232) |
| beh_distance | | | | 1.515 (1.080) | 1.729 (1.244) |
| beh_tellsymp | | | | 1.267 (1.609) | 0.057 (1.549) |
| beh_handwash | | | | -4.285* (2.389) | -4.426* (2.458) |
| fob_curfew | | | | 70.427 (73.735) | 74.182 (58.388) |
| incomelvl (base: LIC) | | | | | |
| LMC | | | | | -141.410** (58.606) |
| UMC | | | | | -73.648 (69.091) |
| HIC | | | | | -40.095 (77.346) |
| che x incomelvlLMC | | | | | 25.838** (12.279) |
| che x incomelvlUMC | | | | | 23.690** (11.772) |
| che x incomelvlHIC | | | | | 8.988 (9.929) |
| Observations | 96 | 96 | 96 | 96 | 96 |
Note: *p < 0.1; **p < 0.05; ***p < 0.01. Robust standard errors in parentheses.
Consider regression (1). The estimated effect of che on mortality is, unexpectedly, positive and statistically significant at the 1% level. Recall that our intuition suggested that a higher share of current health expenditure in GDP, translating into better quality of health systems, would in fact reduce future mortality from COVID-19. Adding the controls included in our base specification (regression (2)), however, cuts that effect almost in half, rendering it insignificantly different from zero. Controlling for regional differences (regression (3)) further reduces the absolute value of the coefficient of interest, though it remains positive. Regression (4), which uses behavioral variables that correlate highly with che and mortality as controls, shows a change in the sign of the coefficient on che to the one expected. This negative effect of current health expenditure on expected mortality, all else being equal, is still insignificantly different from zero.
There is a case to be made that the coefficient $\beta_1$ on che is overestimated (less so once extra control variables are introduced, yet some overestimation may persist). Suppose that mortality numbers are affected by the quality of reporting of cases and deaths, which developed countries might be able to track more accurately owing to the superior quality of their institutions. Suppose also that developed countries have higher shares of current health expenditure in GDP. If so, che will be positively correlated with the error term, and the estimate of $\beta_1$ will be biased upward. The endogeneity of che stems from che and mortality being "choice" variables of the same country (the choice of mortality is, of course, not literal). This is similar to the widely known models regressing wage on education (with unobserved ability).
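The direction of this bias can be made explicit with the standard textbook expression for omitted variable bias in the single-regressor case. This is offered as a sketch only: the multi-regressor analogue is more involved, and the symbols $\rho_{Xu}$, $\sigma_u$, and $\sigma_X$ (the correlation between che and the error term, and the standard deviations of the error term and of che) are notation introduced here for illustration, not taken from the article.

```latex
% Large-sample limit of the OLS estimator when the regressor X (here, che)
% is correlated with the error term u. Requires amsmath for \xrightarrow.
\hat{\beta}_1 \;\xrightarrow{p}\; \beta_1 + \rho_{Xu}\,\frac{\sigma_u}{\sigma_X}
```

If better reporting raises measured mortality and is positively related to che, then $\rho_{Xu} > 0$ and the estimate converges to a value above $\beta_1$, which is consistent with the positive coefficient in regression (1).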
One solution would be to use instrumental variables regression, but in our case, the quality of a country's institutions and/or the country's development is at least somewhat observable and can be approximated. The broad categories of region are a possible, yet imprecise, measure of development: e.g., the countries of Europe are known to vary in their levels of development. Instead, we use the World Bank country classification by income level as the simplest proxy for development (incomelvl). Table 1 has been updated with the description of this new control variable; Table 8 and Fig. 6-7 provide some exploratory data analysis, which, incidentally, supports the hypothesis that $\beta_1$ could be overestimated (see above).
Regression (5) incorporates a nonlinear effect into the model, namely the interaction of che and incomelvl, in addition to the linear effect of incomelvl. The resulting equation indicates that increasing the share of current health expenditure in GDP by 1 percentage point would reduce the expected mortality per 100,000 population by 19.426 in low-income countries and by 10.438 in high-income countries, all else being equal. By contrast, the same effect on mortality in lower- and upper-middle-income countries turns out to be positive, with absolute values of 6.412 and 4.264, respectively.
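These group-specific effects follow directly from the coefficients of regression (5) in Table 7: the effect for the base group (LIC) is the coefficient on che, and for every other group it is that coefficient plus the corresponding interaction term. The short sketch below reproduces the arithmetic; the names `BETA_CHE`, `INTERACTIONS`, and `marginal_effect` are illustrative.

```python
# Point estimates from regression (5) in Table 7.
BETA_CHE = -19.426  # coefficient on che (effect in the base group, LIC)
INTERACTIONS = {    # coefficients on che x incomelvl; zero for the base group
    "LIC": 0.0,
    "LMC": 25.838,
    "UMC": 23.690,
    "HIC": 8.988,
}


def marginal_effect(income_level):
    """Effect of a 1 p.p. increase in che on expected mortality for one group."""
    return round(BETA_CHE + INTERACTIONS[income_level], 3)


effects = {level: marginal_effect(level) for level in INTERACTIONS}
for level, effect in effects.items():
    print(f"{level}: {effect:+.3f}")
```

The output is -19.426 for LIC, +6.412 for LMC, +4.264 for UMC, and -10.438 for HIC, matching the values quoted in the text. These are point estimates only; their significance has to be tested separately.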
To determine the significance levels, three additional Wald tests were carried out to test the hypothesis: "Holding all else constant, the effect of increasing the share of current health expenditures in GDP by 1 percentage point on mortality from COVID-19 in [insert income level] countries is significantly different from zero." (note that for low-income countries, the value of the effect is equal to the coefficient on che, so its significance can be inferred directly from Table 1). The results of the tests are summarized in Table 9.
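A test of this kind, for the hypothesis that the sum of two coefficients is zero, can be sketched as follows. The function name and the numbers in the usage example are hypothetical; the variances and covariance of the actual estimates are not reported in the text, so the values below do not reproduce Table 9. In practice the inputs would come from the (robust) covariance matrix of the fitted model.

```python
import math

def wald_test_sum(b1, b2, var1, var2, cov12):
    """Wald test of H0: b1 + b2 = 0 for two estimated coefficients.

    Returns the Wald statistic and its p-value from a chi-square
    distribution with 1 degree of freedom.
    """
    estimate = b1 + b2
    variance = var1 + var2 + 2.0 * cov12  # Var(b1 + b2)
    w = estimate ** 2 / variance
    # For 1 degree of freedom, P(chi2 > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value

# Hypothetical inputs, for illustration only.
w, p = wald_test_sum(b1=-2.0, b2=1.0, var1=0.5, var2=0.5, cov12=-0.25)
print(f"Wald statistic = {w:.3f}, p-value = {p:.4f}")
```

With a single restriction the Wald statistic is the square of the corresponding z-statistic, so the two-sided z-test used for the low-income effect is the same test applied to one coefficient.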
We found that a greater proportion of health expenditure in GDP may reduce mortality in low-income countries. This is a step up from our results in regressions (1)-(4), yet this verdict is a tentative one because so few low-income countries are included in the sample. The effect of current health expenditure on expected mortality in middle- and high-income countries, all else being equal, remains insignificant.
Table 8
incomelvl: number of observations by category (n = 96)
incomelvl | Number of obs.
LIC | 4
LMC | 18
UMC | 28
HIC | 46
Fig. 6. Mortality distributions by income level (horizontal axis: incomelvl, with categories LIC, LMC, UMC, HIC)
Fig. 7. Che distributions by income level (horizontal axis: incomelvl, with categories LIC, LMC, UMC, HIC)
Table 9
Hypothesis Tests Results
Income level | H0 | Test | p-value | Conclusion
LIC | che = 0 | z-test (two-sided) | 0.0451 | The effect is statistically significant at the 5% level.
LMC | che + che × incomelvlLMC = 0 | Wald test | 0.6143 | The effect is not statistically different from zero.
UMC | che + che × incomelvlUMC = 0 | Wald test | 0.7376 | The effect is not statistically different from zero.
HIC | che + che × incomelvlHIC = 0 | Wald test | 0.1064 | The effect is not statistically different from zero, although it is close to being significant at the 10% level.
The insignificance of these latter effects (despite their large absolute values) could be explained by the small number of observations (n = 96), which is typical of studies whose objects are countries. Their insignificance relative to the significant effect in low-income countries could also be explained as follows: suppose the effect of che on mortality (both expressed in relative terms) depends on the initial value of che; then in middle- and high-income countries, where che is usually larger, this marginal effect is smaller. The unexpectedly different signs of these effects, though, suggest that there might be some as yet unexplained patterns in these countries that the model in its current form does not pick up. Here, not much is accomplished by controlling for incomelvl only: perhaps accounting for the varying reactive measures taken in these countries (in addition to the state of preparedness) or choosing a more multifaceted proxy of development would help quantify these patterns more adequately.
(2) Discussion of internal and external validity
There is limited opportunity to analyze the external validity of this study due to the nature of the objects studied (countries of the world). Two immediate things can be done, however. First, we can replicate regressions (1)-(3) on the larger sample of 160 countries. Regressions (4)-(5) are impossible to replicate, as they use behavioral variables as controls, which in turn rely on the number of participants of the Global Behaviors and Perceptions survey in each country being no smaller than 20. Nevertheless, if the results of regressions (1)-(3) for the larger sample are similar enough to the results presented in Table 7, we could theorize that the results of (4)-(5) would also be similar, were the data on behavioral variables available. Indeed, Table 10 demonstrates a similar pattern of estimates of the coefficient on che for a wider "population" of countries. This significantly boosts our confidence in the external validity of our study.
Second, it is possible to compare our results to those obtained by Khan et al., whose paper is plausibly the closest to this study in terms of the themes and variables discussed [5, p. 347]. The authors report a surprising finding of a significant positive relationship between national expenditure on healthcare and COVID-19 fatalities. This finding resembles the result of regression (1); we found the relationship to be positive in regressions (2) and (3) as well. It is possible that the authors did not try to address the upward bias in the coefficient on current health expenditure, as it was not their main variable of interest. All in all, this may be a further indirect argument in favour of external validity.
Table 10
OLS Regression results (n = 160)
Dependent variable: mortality
Variable | (1) | (2) | (3)
che | 10.475*** (2.643) | 2.538 (2.207) | 0.985 (2.085)
pop65 | | 5.677*** (1.989) | 4.398** (1.798)
urban | | 0.771* (0.425) | 0.466 (0.405)
doctors | | 0.727 (0.751) | 0.237 (0.737)
nurses | | 0.487* (0.252) | 0.396* (0.239)
dghe | | 0.321 (0.304) | 0.464 (0.306)
popdens | | 0.019*** (0.005) | 0.008* (0.005)
region (base: Africa): Americas | | | 97.138*** (26.916)
region: Eastern Mediterranean | | | 30.730** (15.455)
region: Europe | | | 77.697*** (26.378)
region: South-East Asia | | | 7.412 (15.139)
region: Western Pacific | | | 26.567* (14.525)
Observations | 160 | 160 | 160
Note: *p < 0.1; **p < 0.05; ***p < 0.01. Robust standard errors in parentheses.
Turning to internal validity, we will discuss each possible threat in turn.
1. Omitted variable bias. The multiple regressions discussed above control for a wide range of country-level characteristics reflecting the availability and quality of healthcare, public attitudes and behaviors during the pandemic, and geographical, demographic and economic influences. Admittedly, some variables could still be omitted, e.g. the availability of emergency medical training for medical personnel, the quality of ambulance services, etc., in which case some omitted variable bias would remain. This bias would also remain if a country's income level fails to pick up all the information on its development (and the quality of its records). Lastly, the pandemic response by the government (e.g. the stringency of lockdown and even "che after the start of the pandemic") may be correlated with che and affect mortality (even as these measures suffer from simultaneous causality). This last example, however, puts us in the territory of time series analysis and is beyond the scope of this study.
2. Misspecification of the functional form. There is no evidence of a glaring misspecification of the functional form that we can think of (especially considering the already relative nature of many variables included in the model). Further functional form analysis could conceivably be carried out in the future.
3. Errors in variables. It is rather likely that the quality of reporting COVID-19 cases and fatalities varies across countries. One could hope, however, that there is less ambiguity in calculating the number of fatalities than in determining the number of cases, which is why mortality was chosen as the dependent variable. Another measurement error could arise in the responses to the Global Behaviors and Perceptions survey if respondents chose not to answer truthfully.
4. Sample selection. We can say with confidence that there was no biased sample selection on our part, as we strove to include as many countries as possible (the sample was constrained mostly by the availability of data). The most obvious source of sample selection bias would be the sampling methodology of the Global Behaviors and Perceptions survey: people who willingly participated in the online survey could have been more concerned about COVID-19 or have had better access to the Internet than the general population. Sample selection bias could also arise if the data missing from other sources were missing systematically for some countries due to conflict, lack of statistical capacity, or other nonrandom reasons. As such, we must advise caution when generalizing our findings.
5. Simultaneous causality. In principle, there should be little to no simultaneous causality, for while the data on mortality are rather recent, the data on most other variables (and specifically che) were collected before the pandemic even started.
6. Heteroskedasticity and correlation of the error term across observations. All standard errors in this study are heteroskedasticity-robust; the sampling, however, was not random (the data were collected for all countries, not counting other sampling issues discussed above), so there might be some degree of correlation of the regression errors across observations, especially for adjacent countries, despite our controlling for geographical influences via region.
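To illustrate what heteroskedasticity-robust standard errors change, the sketch below compares the classical and a robust standard error of the slope in a simple regression on simulated data. It assumes the common HC1 variant; the text does not state which robust estimator was used, and the data, function name and true slope of 2.0 are hypothetical.

```python
import math
import random

def slope_standard_errors(x, y):
    """OLS slope of y on x (with intercept) and its classical and HC1 standard errors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (b - my) for d, b in zip(dx, y)) / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    # Classical formula: assumes a constant error variance.
    s2 = sum(e * e for e in resid) / (n - 2)
    classical = math.sqrt(s2 / sxx)
    # HC1: each squared residual is weighted by its squared deviation in x,
    # with an n / (n - 2) degrees-of-freedom correction.
    meat = sum((d * e) ** 2 for d, e in zip(dx, resid))
    robust = math.sqrt(n / (n - 2) * meat / sxx ** 2)
    return slope, classical, robust

random.seed(1)
n = 2000
x = [random.gauss(0, 1) for _ in range(n)]
# Hypothetical data: the error spread grows with |x| (heteroskedasticity).
y = [2.0 * xi + abs(xi) * random.gauss(0, 1) for xi in x]

slope, classical_se, robust_se = slope_standard_errors(x, y)
print(f"slope = {slope:.3f}")
print(f"classical SE = {classical_se:.4f}, HC1 robust SE = {robust_se:.4f}")
```

In this example the classical formula understates the uncertainty of the slope. Robust standard errors address unequal error variances only; they do not correct for correlation of errors across neighbouring countries, which is the remaining concern raised in this item.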
Bringing together the suggestions outlined earlier throughout the study, further eliminating the endogeneity in the variable of interest che (and possibly obtaining sensible estimates of its effect in middle- and high-income countries) requires yet again rebuilding the dataset to augment the model with new measures of a country's institutional quality as well as the data on government response to the pandemic. This future development would also entail reconsidering the functional form of the model with respect to the newly added regressors.
5. Conclusion
This paper adds to the body of literature (rather sparse at the time of writing) studying the role of pandemic preparedness, measured by pre-pandemic national spending on health, in alleviating some of the adverse outcomes of the COVID-19 pandemic. It estimates that an increase in the share of current health expenditures in GDP by 1 percentage point would have decreased mortality from COVID-19 per 100,000 population by 19.426 in a low-income country in the event of a global emergency such as the pandemic in question, ceteris paribus. This effect for middle- and high-income countries is statistically insignificant, although there are some grounds to believe that this result is due to the imperfections of the model rather than the real state of things; as such, it can be built upon and improved. The answer to our research question and the policy implication combined, therefore, is that spending on health can save lives.
References / Литература
1. Elola-Somoza F. J., Bas-Villalobos M. C., Pérez-Villacastín J., Macaya-Miguel C. Public healthcare expenditure and COVID-19 mortality in Spain and in Europe. Revista Clínica Española (English Edition). 2021. Vol. 221. N 7. P. 400-403.
2. Fetzer T., Witte M., Hensel L., Jachimowicz J. M. et al. Global Behaviors and Perceptions in the COVID-19 Pandemic. PsyArXiv Preprints. 2021. P. 77.
3. Gupta A. K., Nethan S. T., Mehrotra R. Tobacco use as a well-recognized cause of severe COVID-19 manifestations. Respiratory Medicine. 2020. N 176. P. 106233.
4. Kapitsinis N. The underlying factors of the COVID-19 spatially uneven spread. Initial evidence from regions in nine EU countries. Regional Science Policy Practice. 2020. Vol. 12. N 6. P. 1027-1045.
5. Khan J. R., Awan N., Islam M., Muurlink O. Healthcare capacity, health expenditure, and civil society as predictors of COVID-19 case fatalities: a global analysis. Frontiers in public health. 2020. N 8. P. 347.
6. Oshinubi K., Rachdi M., Demongeot J. Analysis of Reproduction Number R0 of COVID-19 Using Current Health Expenditure as Gross Domestic Product Percentage (CHE/GDP) across Countries. Multidisciplinary Digital Publishing Institute. 2021. Vol. 9. N 10. P. 1247.
7. Zheng Z., Peng F., Xu B., Zhao J. et al. Risk factors of critical mortal COVID-19 cases: A systematic literature review and meta-analysis. Journal of infection. 2020. Vol. 81. N 2. P. 16-25.
About the author:
Mariia A. Ovsiannikova, student of the Department of Economics, HSE Campus in St. Petersburg (Saint Petersburg, Russian Federation); maovsyannikova_1@edu.hse.ru
Об авторе:
Овсянникова Мария Алексеевна, студент департамента экономики Национального исследовательского университета «Высшая школа экономики» — Санкт-Петербург (Санкт-Петербург, Российская Федерация); maovsyannikova_1@edu.hse.ru