Thank you for this clear — and somewhat frustrating, as by nature we prefer definitive single answers — explanation and reminder that an excess deaths number is subject to the selection of prior data range and methods. In the end, regular shmoes like me rely on analysts’ expertise and careful attention to these concerns to identify, as you say, the most plausible, most supportable. In the end, as with so much, trust plays a huge role.
Trouble is, experts have shown no more ability to analyze things like this than mere mortals. Experts can't learn things by observation either. They have to learn from experiment.
Physicist Richard Feynman was clear on this point: "science is the belief in the ignorance of experts". Statistics from observations are useful in coming up with a hypothesis, but useless in proving it.
I think Adam Kucharski represents a good rebuttal to your first statement. His knowledge and experience told him that regular mortals (journalists, social media influencers — ugh to that term — and ordinary people posting opinions on Facebook) were mistakenly flinging about unitary excess death numbers, not understanding the perils of inadequate analysis/review of the charts’ underlying selections of data and method.
Scientists and other experts do learn from observations. Not everything is amenable to experiment, unfortunately. Observational studies are useful in testing hypotheses, subject to their own particular cautions and qualifiers.
Feynman’s aphorism is, like most, oversimplified and arguably inaccurate. His point, I believe, was that scientists should not *fear* to question experts and established science in order to test a plausible hypothesis, but as with similar statements that a good scientist doubts everything and everyone, science would never progress if each individual had to begin anew every time, doubting all current accepted science.
Doubting that experts have any value to add, discounting their education, experience and expertise, is a grave mistake. This leads me to rebut Feynman’s quote with one from Isaac Asimov, who warned against thinking that “[…]my ignorance is just as good as your knowledge.” 😉
As to your last statement, statistics from observational studies absolutely are applied to test hypotheses.
Much of this seems to me off-topic, though interesting.
Kucharski was illustrating why we should be very careful in promoting a statistical trend presented to us without questioning the data and methodology choices made by the originator, a great reminder that charts are dangerous beasts. His expertise absolutely should carry more weight in identifying weaknesses and suggesting/illustrating specific alternatives than my relative ignorance.
My comment did stray from Adam Kucharski's comment a bit, but I did want to object to the emphasis you made on expertise. Whether one is an amateur or an expert is not really important. What matters is your ideas and your data to support them. That's the point Richard Feynman was making when he spoke to the group of high school science teachers and told them that science is the belief in the ignorance of experts. Even they have to produce their data or they have nothing.
I get tired of reading things like this: "Ultra-processed foods are danger to global public health, experts warn" (https://www.msn.com/en-us/health/other/ultra-processed-foods-are-danger-to-global-public-health-experts-warn/ar-AA1QH8Ty?ocid=TobArticle) People passing off political policy positions as science.
We saw that all the time during the pandemic. Tony Fauci was a major offender. He had the impudence to say that when people attacked him they were attacking science, because he represented science. Gavin Newsom was the same, saying he was following the science when he put into place silly restrictions that had no facts to support them.
We see it too even with things like Charles Darwin's theory of evolution. There's a theory which is accepted by experts in biology as gospel, but do they have data or empirical tests to back that up? The idea that random mutations and natural selection could shape, for example, the trillions of synapses in the human brain is a fantastic notion.
Adam Kucharski gets into some of this in his excellent book Proof: The Art and Science of Certainty. But though he cites people like Karl Popper and Judea Pearl, I don't think he gives enough emphasis to their ideas of rigor in proof. I'm a computer scientist and a lawyer, with special interest in modeling complex adaptive systems and in the scientific method, as well as in legal standards and burdens of proof. I think we give too much credence to experts based on their credentials, rather than demanding their data.
I wrote a review of Proof as a comment to a post by Adam Kucharski, but it's lost now in the bowels of Substack. I'm not sure that anyone besides me read it. So I pop up now and then with comments on posts like this one to raise these issues and hear the views of others on them, like yours.
Thank you for this easily understandable explanation. Hopefully will be widely read. Maybe even Bhattacharya will read it 🤷
I guess 2019 being an outlier year could signal some recent innovation in public health, or it could be a chance fluctuation where we are likely to see a rebound in subsequent years (is there a name for a reverse harvesting effect?). The choice of baseline might depend on which seems more plausible.
I wouldn't be surprised if it is a chance fluctuation.
Long story, but I got involved in debunking a brain protection device known as the Q-collar, which, among many things, claimed to mimic the supposedly protective effects of "higher altitude" from concussions.
The research team found that in the NFL, there was a significantly reduced risk of concussion at "higher altitudes" which they defined as >600ft above sea level (not anywhere close to an elevation sufficient for physiologic effect).
But in their data set (the 2012 and 2013 seasons), there WAS a significant reduction in concussion incidence at these "higher altitudes" in one season. When we independently analyzed 8 total seasons, that effect was still only seen in that one season. It was a complete anomaly: that one season showed the relationship purely by random chance.
Natural variability causes all sorts of havoc (and incorrect conclusions)!
Thanks for your write-up on this. I can see that as we get further from 2020 it gets harder to see the "burden of Covid" with certainty.
The extrapolations of mortality trends show confidence intervals, which should also have been shown for the modelled excess mortality figures. In general, the longest possible available dataset should be used for the models; three years is far too short.
I agree with Entropy. Oh dear! I guess we are just left with common sense.
Ultimately the challenge is that these methods aren’t really designed to look at excess mortality over half a decade. They were designed to answer questions like ‘is this year worse than the preceding years?’ E.g. ‘is this an unusually bad flu year?’ So it's reasonable to compare 2020 to 2015-19, for example, but not 2020-25 to 2017-19.
Good point. Looking at this analysis (which I need to do more carefully), how can science guide society in preparing itself for future outbreaks?
It is frustrating to see how unprepared we are even at a small scale. Take the recent floods, e.g. Monmouth. These impacts could be predicted (e.g. buildings in certain areas will be flooded), warnings made (take valuables upstairs), contingency plans formulated, etc.
Science ought to be used to adapt society to disease outbreaks and environmental changes, to protect it, to formulate scientific theories to apply, and to discover objective reality.
Science generally is a successful activity.
I hope this point is relevant to the issue.
I'm not much of a statistician. But I will say that I find the fourth chart ("Baseline: Linear Regression 2017-2019") highly amusing, because the underlying assumption implies that in 2045 mortality rates turn negative.
I think Max Brooks has written several books about this phenomenon.
I agree we shouldn’t extrapolate as far as 2045 (for any model!) and, as the piece notes, even 2025 is arguably too far. That being said, any improving trend will look like a linear slope in the short term. Probably the most reasonable assumption is to treat 2019 as an outlier; then 2023 and 2024 land back on the trend.
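To make the negative-mortality point concrete, here is a minimal sketch that fits an ordinary least-squares line to a three-year window and extrapolates it; the rates are invented for illustration, not the article's data:

```python
# Hypothetical age-standardised mortality rates (per 1,000) -- NOT the article's data.
years = [2017, 2018, 2019]
rates = [9.2, 9.0, 8.4]

# Ordinary least-squares fit by hand (stdlib only).
n = len(years)
x_mean = sum(years) / n
y_mean = sum(rates) / n
slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(years, rates))
         / sum((x - x_mean) ** 2 for x in years))
intercept = y_mean - slope * x_mean

def predict(year):
    """Extrapolate the fitted line to an arbitrary year."""
    return intercept + slope * year

# A steadily falling straight line has to cross zero eventually.
print(predict(2045))
```

The absurdity of a negative predicted rate is exactly why a three-year linear baseline shouldn't be pushed decades ahead.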
I’m sure we can all believe the pull-forward effect is significant when it comes to Covid. The number of deaths in aged care tells us there are many whose expected future life was quite short. In clusters they would create visible holes in actual deaths.
The holes are read as negatives and reduce excess death totals.
Perhaps Sweden has benefited from this mechanism with their high and early deaths.
Today I realised something.
If the trend line is set too high, some true excess deaths are missed and not counted. Later, when the time these people would otherwise have died arrives, they still make a hole in actual deaths, creating a negative to add to the missed positive.
So an incorrect trend line has an effect that is doubled when probable life length falls within the period of interest.
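The pull-forward (mortality displacement) mechanism described in the comments above can be sketched with a toy deterministic model: a pandemic week pulls forward deaths that would otherwise have occurred over the following weeks, so the spike is later cancelled by weekly "holes" (all numbers invented for illustration):

```python
baseline = 100.0        # expected deaths per week without the pandemic
pulled_forward = 300.0  # extra deaths concentrated in the pandemic week
horizon = 30            # weeks over which those people would otherwise have died

# Week 0: spike. Weeks 1..30: a hole of pulled_forward / horizon per week.
observed = [baseline + pulled_forward] + \
           [baseline - pulled_forward / horizon] * horizon

excess = [d - baseline for d in observed]

cumulative = []
total = 0.0
for e in excess:
    total += e
    cumulative.append(total)

# The weekly holes read as negative excess and eventually cancel the spike.
print(cumulative[0], cumulative[-1])  # 300.0 then back to 0.0
```

In this stylised case cumulative excess returns exactly to zero; in reality the cancellation is only partial, which is why the choice of baseline and window matters so much.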
It would be very beneficial for the model to take the age distribution of the population into account, as mortality (with and without COVID-19) differs significantly by age.
The mortality estimates above are age standardised I think.
Apologies for typos
One observation I made during the pandemic is that infectious disease epidemiologists seem generally unaware of chronic disease methodologies such as age-standardisation, time-series modelling, confounding and so on. There are quite sophisticated tools out there, which go well beyond simple descriptive studies, to deal with questions like this, which are routine in, for instance, cancer epidemiology.
Do you have an example in mind? Those methods are all commonly used in infectious disease epi.
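For what it's worth, the direct age-standardisation mentioned in this exchange is simple to compute: weight each age group's death rate by a standard population's share of that age group. A sketch with invented rates and weights:

```python
# Hypothetical age-specific death rates (per 100,000) and hypothetical
# standard-population age weights (which must sum to 1).
age_specific_rates = {"0-39": 50.0, "40-69": 300.0, "70+": 5000.0}
standard_weights   = {"0-39": 0.6,  "40-69": 0.3,   "70+": 0.1}

def direct_asr(rates, weights):
    """Directly age-standardised rate: weighted average of age-specific rates."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(rates[group] * weights[group] for group in rates)

print(direct_asr(age_specific_rates, standard_weights))  # per 100,000
```

Because the same standard weights are applied to every year (or country), changes in the standardised rate reflect changes in age-specific mortality rather than an ageing population.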
This is a fascinating article and a perfect example of why I love Substack.
We hear all about excess mortality rates as an indicator of success in fighting the early stages of the COVID pandemic… but this outcome is a bit fragile, highly sensitive to what the baseline is.
By choosing the “right” baseline, any conclusion can be reached.
Lies, damned lies and statistics. As a computer scientist/lawyer for decades I have found how malleable "facts" can be. Rather than arguing in the abstract, we learn most by running experiments in the real world.
That's where people like Elon Musk and Donald Trump excel. What they say is not important. What they do is. They don't argue, they try. They don't look back but forward.
To model expected deaths, would it not be best practice to fit an ARIMA model and an exponential smoothing model to the time series of actual deaths, choose the best-fit model, and use it to forecast the hypothetical future without Covid?
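As a rough illustration of the exponential-smoothing half of that suggestion, here is a hand-rolled simple exponential smoothing (not a tuned ARIMA, which in practice one would fit with a library such as statsmodels and compare by AIC); the death counts are invented:

```python
def simple_exp_smoothing(series, alpha):
    """Simple exponential smoothing: the smoothed level after the last
    observation is also the flat forecast for all future periods."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

# Hypothetical pre-pandemic annual deaths (thousands).
deaths = [91.0, 90.5, 92.0, 88.8]
forecast = simple_exp_smoothing(deaths, alpha=0.5)
print(forecast)  # counterfactual 'expected deaths' for the next year
```

Note that simple exponential smoothing forecasts a flat line, so trend-capable variants (Holt's method, or ARIMA with differencing) would be needed if mortality has a sustained downward trend.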
I think I follow this when I’m reading it, but could I explain it to someone else? No😕 - luckily I’ve always had a deep suspicion that data can be massaged to prove a point, and I’m afraid a lot of it went on with our government (UK) for sure.
You need to do a diff-in-diff right?
Or a synthetic control study?
Also if the aggregated mortality figures are available at the regional (or even better sub regional level) it would be interesting to consider a border strategy with Norway or other neighboring countries: use mortality right across the border as a control to estimate the causal impact of Sweden-specific factors (presumably policy). This may have been attempted already in the literature. Needless to say there’s going to be lots of researcher degrees of freedom there too.
I guess the main challenge is that nowhere can be the ‘no pandemic’ control… Probably the most reasonable assumption for Sweden is to treat 2019 as an outlier; then 2023 and 2024 land back on the trend - could even fit to these to get a better handle on the 2020-21 impact.
I agree, though pandemic control measures varied from country to country with Sweden adopting a different approach to many otherwise similar countries so perhaps there's a natural experiment in there?
These are two slightly different questions - excess mortality in Sweden vs impact of transmission reducing interventions - but this paper (and one before it) uses variation across countries to estimate effect of different NPIs on transmission: https://www.nature.com/articles/s41467-021-26013-4
The problem with a simple linear diff-in-diff is that transmission is exponential (and measured outcomes often lagged and spread out over time)
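One standard workaround for that problem is to run the diff-in-diff on log counts, so the comparison becomes multiplicative rather than additive. A toy two-period, two-group example (all numbers invented):

```python
import math

# Hypothetical weekly case counts (pre- and post-intervention averages).
control = {"pre": 100.0, "post": 200.0}  # doubles without the intervention
treated = {"pre": 50.0,  "post": 80.0}   # grows only 1.6x with it

# Diff-in-diff on raw levels is dominated by the groups' different baselines.
did_levels = (treated["post"] - treated["pre"]) - (control["post"] - control["pre"])

# Diff-in-diff on logs estimates the multiplicative effect of the intervention.
did_logs = (math.log(treated["post"]) - math.log(treated["pre"])) - \
           (math.log(control["post"]) - math.log(control["pre"]))

print(did_levels, math.exp(did_logs))  # exp(did_logs) = 0.8: a 20% relative reduction
```

The lagged and smeared-out outcomes are a separate problem; the log transform only fixes the exponential-growth part.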