Dan Garisto

The Higgs Discovery Did Take Place

Let’s get this out of the way: In 2012, two independent teams of particle physicists discovered the Higgs boson.

Why bother with the obvious? Earlier this week, Ben Recht, a computer scientist at the University of California, Berkeley, wrote an article “The Higgs Discovery Did Not Take Place,” a self-consciously contrarian yet serious polemic. In it, he basically argues that the discovery as we think of it, a revelation about the way our universe works, did not happen. After some initial praise, there’s been a fair amount of pushback. Kyle Cranmer (an experimental physicist directly involved with the discovery) has a very nice rebuttal worth reading in full. I had a few thoughts as well, so I took a stab at fleshing them out.

Recht’s argument, after a breezy introduction to quantum field theory, goes something like this:

there are lots of uncertainties in a very complicated theory
there are lots of uncertainties in very complicated collider data

(A brief interlude to say that so far, this is fine. From here, things go off the rails.)

physicists searching for the Higgs engaged in questionable research practices and dubious statistics
no one understands the entire experiment
this means physicists cannot objectively probe reality
therefore the discovery of Higgs was a matter of social convention

Recht writes:

The CERN collaboration establishes parliamentary rules to decide upon scientific truth. Reality is validated by majority vote. Physicists love to talk a lot about how they are probing the very nature of the universe, but they do this by a lot of boring committee meetings.

Where to begin. First, there is no ‘CERN collaboration’ nor are there ‘parliamentary rules’ by which to establish scientific truth. CERN is the European Organization for Nuclear Research. There are multiple experiments which have detectors at CERN’s Large Hadron Collider, two of which, ATLAS and CMS, discovered the Higgs boson. There are thousands of physicists on these experiments (most of whom are based at labs and universities elsewhere around the world). Accordingly, there are decisions to be made—about funding, about logistics, about whether to publish an analysis.

Boring committee meetings are part and parcel of any large-scale scientific effort, but this detail seems to come as a surprise to Recht, whose stance is that bureaucracy is in contradiction with the ability to actually discover anything fundamental about the nature of the universe.

In the late ’70s, as postmodernism began to hit its stride, sociologists of science asked: “How socially constructed is science?” Works like Laboratory Life (which I have not read), and Constructing Quarks (some of which I have read) did more than just cast doubt on the purported objectivity of science—in some cases, sociologists rejected scientists’ ability to ascertain facts about reality.

This was a bit of an overreach and made a lot of people quite upset.

It was, however, a useful corrective to the myth that sciences are purely objective. There is no doubt that even the most abstract and fundamental searches in physics are affected by these human factors. Kent Staley has a very nice historical account of this in The Evidence for the Top Quark. He traces the discussions that led CDF to a particular detector design, the implementation of novel silicon vertex tech, internal debates about which tagging algorithm to use, and the raucous disagreements about the interpretation of the data. About the “struggle within CDF as physicists developed strategies for finding the top quark”, Staley writes:

Political intrigues and social forces featured prominently, but not simply by making some approaches look convincing and others not convincing, as a strictly “social constructivist” reading might suggest

Staley’s careful history shows the incompleteness of a simple narrative about the discovery of the top quark. The path to saying “We establish the existence of the top quark” was fraught with human subjectivity and strongly shaped by social forces. And yet it was a discovery all the same: at the end of the day, physicists had wrested forth another elementary particle from reality.

Some speculation on my part: When simple narratives about physics are complicated in this way, people can feel understandably disillusioned. They may crow about this feeling. ‘Gotcha! Not so pure after all, huh?’ Fine, no harm done. But sometimes the reaction goes too far, becomes a recalcitrant rejection of the results—of, well, reality.

-§-

On the subject of reality, Recht writes:

Whether or not the Higgs exists has no bearing outside the insular world of particle physics. If you don’t have a four teraelectronvolt supercollider, you can’t make a Higgs. The Higgs field has no bearing on any physics at any scale anyone would ever care about. So I don’t care either way if physicists think they found a Higgs. It has zero bearing on my existence.

A brief review of the history and physics of the Higgs boson may be in order.

The year is 1963 and quarks are mathematical figments; it will be another five years before the prediction of the W and Z bosons and a decade before the Standard Model exists. Some theorists are pondering an abstruse question: can massive vector bosons exist? In the span of several months, three independent groups totalling 6 physicists arrive at roughly the same conclusion. Symmetry demands vector bosons be massless, but by clever means, symmetry can be suppressed, and the bosons made massive.

All that is needed is a symmetrical system that allows a transition to a stable (apparently) asymmetrical state. In the canonical example, a marble rests on the center of the dome of a sombrero. The marble falls randomly down the sombrero’s slope and comes to rest stably in the trough of the hat. The hat doesn’t change shape, so it remains symmetrical, but the marble has created an apparent asymmetry. (This is only apparent; there is still symmetry across all possible marble configurations in the trough.)

The mathematical equivalent of the sombrero is the addition of a scalar field to existing equations. With this field in place, vector bosons can acquire mass. This is, in fact, quite important for Recht’s continued existence. Without the Higgs, elementary particles would not have mass, massless electrons would never be confined to orbitals, and atoms could not form. (n.b. Non-vector bosons also get mass from the Higgs field, but through a slightly different mechanism. Additionally, most—99%—of the mass in, say, a proton, results from the gluons, not quark masses from the Higgs mechanism).

One of those theorists, the recently departed Peter Higgs, realized the mechanism implied the existence of an observable:

It is worth noting that an essential feature of the type of theory which has been described in this note is the prediction of incomplete multiplets of scalar and vector bosons.

It was his paper that became synonymous with the mass-giving mechanism. For the first few years there was little reason for anyone to care because gauge theories were not renormalizable. But as physicists worked out the kinks in gauge theories, the importance of Higgs’ work became undeniable. It’s easy to see this progress through citations. From 1964 to 1971, Higgs’ paper received no more than 4 citations a year. Then, in 1972—following ‘t Hooft’s work on renormalizability—there were 12. The paper received 36 in 1973, and the number hovered in the few dozens for the next thirty years. In 2006, as LHC efforts were ramping up, citations spiked to 97. The paper now receives several hundred citations each year (and there are far more studies written about or depending on the Higgs than cite the original paper).

Part of the trouble was that no one knew what mass such a scalar boson would have. A 1976 profile of the then-hypothetical boson even considered a mass under 0.3 GeV. As the pieces of the Standard Model fell into place, the Higgs boson became an increasingly attractive answer to the question of mass generation.

When the Superconducting Super Collider went belly up in the ’90s, a lonely nation (of physicists) turned its eyes to Europe. CERN committed to the LHC, which had support in part because of a ‘no-lose’ theorem; something had to contribute to WW scattering, so there was something to be discovered at that energy range below about 1000 GeV. By the time 2000 rolled around, there were even bumps in the data that seemed to hint at the Higgs. LEP, the precursor to the LHC, ended up with a tantalizing excess around 115 GeV.

But that something was not clear. Many physicists working at the LHC doubted the existence of the Higgs, and Higgsless models remained popular alternatives until July 4, 2012. (Due to the time delay in academic publishing, some papers with Higgsless models continued to be published throughout the following year.)

All this is to say that when physicists eventually gathered enough evidence to say that a Higgs-like particle was lurking at 125 GeV in LHC data, it was part of a half-century-long hunt. Unless you are some sort of hardcore frequentist, this context matters.

-§-

The sticking point here seems to be that Recht is unconvinced by the statistical rigor of physicists, and accuses them of “accepting what the open science community would deride as Highly Questionable Research Practices (lots of p-hacking, HARKing, and multiple hypothesis testing).” He asserts that 5 sigma is “another mindless convention needed for a well-functioning collaboration.” It’s true 5 sigma is an arbitrary line in the sand, but it is not a random or capricious one. I’ve written about some of the origins of the 5 sigma standard and the bumpy (pun intended) ride to standardization in particle physics, if you’re curious.

Recht is particularly concerned about the fact that CERN has a statistical committee, and galled that Science would write that 5 sigma does not literally mean there is a 1 in 3 million chance the result is wrong: “Should not be interpreted literally? Are you serious?” Here’s how I put it in an article about Muon g-2 (emphasis retroactive):

The finding has a statistical significance of 3.1 sigma, which meets the standard baseline for evidence in particle physics. Precisely speaking, 3.1 sigma means that in the absence of new physics, statistical fluctuations would still lead the researchers to see a discrepancy between electrons and muons of 15 percent or more once every 740 times they performed the experiment. Although this would seem to suggest the observed muon-electron discrepancy is almost certainly more than a mirage, the three-sigma effect, in fact, falls well short of the gold standard of discovery in particle physics: five sigma, which works out to running the experiment 3.4 million times before seeing a statistical fluke that large. (These figures are subtly but importantly different from a one-in-740 or one-in-3.4-million chance of being wrong.)
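The conversion from a sigma value to "once every N experiments" can be sketched in a few lines. This is a minimal illustration, assuming the one-sided Gaussian tail convention; the function names are hypothetical and are not taken from any collaboration's software.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Probability that a standard normal variable exceeds n_sigma."""
    # erfc(z / sqrt(2)) is the two-sided tail probability of a standard
    # normal; halving it gives the one-sided tail (assumed convention).
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def one_in_n(n_sigma: float) -> float:
    """Average number of repeated experiments per fluctuation this large."""
    return 1.0 / one_sided_p_value(n_sigma)


if __name__ == "__main__":
    for n_sigma in (3.0, 5.0):
        print(f"{n_sigma} sigma: about 1 in {one_in_n(n_sigma):,.0f}")
```

Under this assumed convention, 3 sigma works out to roughly one in 740 and 5 sigma to roughly one in 3.5 million. These are rates of fluctuation when there is no new physics, not probabilities that the result is wrong.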

Tommaso Dorigo has a very nice blog from 2022 about how to think about these kinds of statistics in light of disappearing LHCb anomalies. The gist is this though: Where strong priors like the Standard Model are concerned, you must compare the odds of a 3 sigma result (not uncommon) with the odds of actually seeing new physics (very rare!). What should be clear is this: particle physicists have been wrestling with questions of data and statistics for decades and there is a lot of thought and rigor that goes into these analyses. That doesn’t mean you can’t find people abusing statistics, getting excited over tiny anomalies. But there are very serious deliberations about the right way to treat the data, especially over anything as important as a claimed discovery.
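The comparison between the odds of a fluke and the odds of new physics can be made concrete with Bayes' theorem. The sketch below is only illustrative: the prior is a made-up number, the function names are hypothetical, and treating the Gaussian tail probability as the likelihood of the data under the Standard Model is a crude simplification rather than how a real analysis is done.

```python
import math


def tail_probability(n_sigma: float) -> float:
    """One-sided Gaussian tail probability (assumed convention)."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def posterior_new_physics(prior: float, n_sigma: float, power: float = 1.0) -> float:
    """Bayes' theorem for 'new physics' given an excess of n_sigma.

    prior: assumed probability of new physics before seeing the data.
    power: assumed probability that real new physics yields an excess
           at least this large (1.0 is a generous simplification).
    """
    fluke = tail_probability(n_sigma)  # chance of the excess with no new physics
    evidence = prior * power + (1.0 - prior) * fluke
    return prior * power / evidence


if __name__ == "__main__":
    hypothetical_prior = 1e-3  # illustrative only, not a measured quantity
    for n_sigma in (3.0, 5.0):
        post = posterior_new_physics(hypothetical_prior, n_sigma)
        print(f"{n_sigma} sigma -> posterior {post:.3f}")
```

With these made-up inputs, a 3 sigma excess still leaves new physics less likely than not, while a 5 sigma excess makes it very likely. A different prior shifts the numbers but not the qualitative point.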

Recht derisively writes that the Higgs is “an ‘object’ that has been ‘observed’ at a single location on Earth,” and he issues the following tragic dismissal:

See the bump that goes outside of their green error bars? That’s the Higgs Boson. (insert shrug emoji)

Here are the ATLAS and CMS plots with their shrug-worthy bumps.

Viewed from another perspective to better visualize the statistical significance, the bumps are anything but humble:

Recht writes that, “It requires well over 6 years of graduate study to fully understand what was supposed to be seen in the most ideal experimental situation.” Perhaps a full understanding is a lot to ask, but we can try to make sense of the bumps all the same. Cribbing from Cranmer’s rebuttal:

This bump or resonance is intimately tied to what physicists mean when they say ‘particle’. … under the null hypothesis, a certain distribution should be smoothly falling, while the alternate hypothesis would have an additoinal [sic] bump corresponding to a new particle.

What happens is that proton collisions at the LHC produce a certain number of events as a function of energy. Both ATLAS and CMS found an excess—both saw about 200 more events than they should have seen, right around 125 GeV. Even more compellingly, a good portion of this excess occurred as something decaying into two photons, for which the background is very low. (n.b. There was also 3.2 sigma evidence from the combined result of two collaborations at an entirely different collider, the Tevatron, using a completely different kind of collision (proton-antiproton).)

In the years since, the Higgs boson has been discovered over and over and over again as ATLAS and CMS have accumulated orders of magnitude more data. We know, for example, the boson’s mass to a precision of about 0.1%.

If we are visited by aliens and they demand evidence that we are a sentient species which understands the cosmos (with the traditional ultimatum of destruction should we fall short of their expectations) I might put my hopes on plots of the Higgs mass.

-§-

Recht concludes his blog by asserting that, “The Higgs Discovery is a celebration of modern bureaucracy, not a revelation about material reality.”

There is absolutely bureaucracy and democracy, as any complete sociological account of ATLAS and CMS would show. But the analysis itself (which Recht does not engage with) is deeply connected to a physical reality. Real particles hit those detectors. Real analog signals were captured and turned into electronic data. That data was filtered, processed, analyzed, and interpreted to discover a Higgs-like particle.

The discovery of the Higgs boson was a revelation about material reality enabled by modern bureaucracy. Below a certain energy scale, much of the universe is discoverable with only modest tools and techniques. Before the dominance of colliders, physicists relied on cosmic rays from the skies. From these particles—individual events—they pieced together incredible details about the nature of reality: antimatter (positrons), a second generation of particles (muons), nuclear forces (pions), symmetry violations (kaons), even evidence of a second generation of quarks.

The Earth receives a small number of ultra high energy cosmic rays—ones which, conceivably, contain evidence of a Higgs boson decaying. The backgrounds are too large; the signals too small. One could, however, imagine an alien world, floating at the distal end of a quasar’s jet, with access to a collimated stream of highly relativistic particles. Perhaps there a lone experimenter armed with little more than a bubble chamber could discover the entire Standard Model.

But on this world, on our Earth and its comparatively meager supply of high energy cosmic rays, we must resort to colliders and the bureaucracy inherent to collaborations to probe this stage of reality, to discover the Higgs boson.

The Labors of J. Robert Oppenheimer

For all of Christopher Nolan’s notorious attention to detail, Oppenheimer is a superficial biopic. During the course of the three-hour drama, we learn remarkably little about the life of its eponymous subject. There is no real mention of his childhood; his on-screen relationship with his brother is abbreviated to the point of acquaintanceship, and nearly a third of the movie is occupied with his benefactor-turned-nemesis, Lewis Strauss (Robert Downey Jr.). The film’s inquiry into J. Robert Oppenheimer, the person, rests primarily on the ability of Cillian Murphy’s haunted eyes and sharp cheekbones to sell moments of quiet torture.

Nolan’s stated ambition is to tell the story of, as he boasts, “the most important man who ever lived.” What he seems to want, however, is to retell the mythic story of Prometheus with Oppenheimer the demigod who brought nuclear fire to mortals, and for his sin, suffered. During the infamous security hearing, Murphy’s gaunt Oppenheimer is stripped down (at one point, quite literally) and the legal eagles of the government peck away at his liver.

At this mythic remove, Oppenheimer the movie uses the man as a synecdoche for how the Atomic Age came to be: the empyrean physics in his head; the strictures of military governance; and the rivers of politics, ideology, and intrigue he swam through. I’m sympathetic to Nolan’s mythopoeic ambition. For most subjects, it would be far too much, but the destructive alchemy unleashed by the first fission bombs is about as consequential as an event in human history can be.

The trouble is that mythologizing Oppenheimer leads to myopia about a question that should be central to a biopic about a scientist, a bureaucrat, and a persecuted government employee: What, exactly, was his work?

-§-

Hollywood tends to vacillate between treating scientists as helpless nerds and as helpful wizards. These negative or positive caricatures leave little room for life outside of the lab. In Oppenheimer, an ensemble of scientists with their own storied histories (Bohr, Heisenberg, Fermi, etc.) dart in and out of the picture, essentially as celebrity cameos. The reveal of Einstein (Tom Conti), via a gust of wind that blows off his hat and uncovers the iconic shock of white hair, is almost charming; I laughed. To his credit, Nolan’s scientists act much as civilian (i.e. non-scientist) characters do in his other films. They speak in vernacular and not just jargon-filled babble, have personal and political dramas, and emote, like, well, humans. And yet, although Nolan does not treat the scientists like inhuman wizards, the science—the work—remains wizardry out of frame.

As a premonition of his later agony over the bomb, a younger Oppenheimer is tormented by hallucinations of quantum mechanics as he carves a bright career through Europe. The visions are prophetic and abstract: incandescent particles erupt onto the dark screen; braids of light spin dizzyingly; there are vortices of fire, both cosmic creation and earthly destruction—all set to Ludwig Göransson’s plangent score. It’s a deft visual, and Nolan clearly takes care to at least make the sciency stuff look sleek.

In his review for Scientific American, Charles Seife laments the state of science in Oppenheimer:

…the science is minimal and appears mostly in passing—with no hint of process—and is only mentioned when necessary for a future plot point … There is simply not enough setup to explain the distinction between the atomic fission weapons developed during the Manhattan Project and the thermonuclear fusion weapons … The film doesn’t make clear at all why they’re different scientifically, technically or morally.

The cost of eliding the science at the heart of the Manhattan Project is not just one of context, Seife argues, but of character:

it’s a tired old trope … they give up a piece of themselves—their relationships, their sanity, even their humanity—for their transcendent understanding. “I was tortured by visions of a hidden universe,” Oppenheimer tells the audience, as stars and abstract flashes of light representing the quantum realm flit across the screen, moments before, as a young man, he briefly functionally loses his sanity … Nolan sacrifices the hope of truly helping the audience understand a scientist as a person by instead making him otherworldly.

I think these critiques are, by and large, true. Even as it is reverently accurate about the silence between the light and the blast wave of the Trinity Test, the film does not concern itself with the particulars of the science Oppenheimer grappled with.

Your mileage may vary on the extent to which the portrayal of Oppenheimer as a genius tormented by the ethereal world of quantum mechanics obscures his character (I think Murphy does a fine job humanizing him), but the choice to gloss over the substance of his life’s work—including work at Los Alamos—certainly presents an obstacle to a full picture of the man.

-§-

Oppenheimer’s work as a scientist-bureaucrat does feature in several scenes during the second act of the movie: trips with Gen. Leslie Groves (Matt Damon) on cross-country trains recruiting physicists, managing the ego of his then-employee Edward Teller (Benny Safdie), and laboratory discussions (complete with the audience-friendly visual aid of marbles) about the plutonium and U-235 the Manhattan Project has piled up.

And yet, as Seife points out, the film does not explore the process of the science Oppenheimer worked on. The crisis that a gun-based mechanism would not work for a plutonium bomb is omitted entirely, though it is perhaps the quintessential example of science in the Manhattan Project.

By way of illustration:

Berkeley, 1943. Physicists, including a bespectacled Emilio Segrè, crowd around a detector in a private home also used by the Music Department, measuring decays from plutonium produced by Lawrence’s cyclotron. But noise, noise, noise in the detector. Is daily cello practice next door the problem? No, the detector doesn’t work during the night either. Trial and error. A revelation! The detector needs light to work; they leave a flashlight on overnight. Finally, in late June, data suggest cyclotron-produced plutonium is stable enough.

But then, a creeping doubt: What about plutonium produced by radioactive pile? Oppenheimer summons Segrè to Los Alamos. A moving van full of electronics rumbles into the desert. At the same time, a telegram from occupied France arrives: Joliot has seen spontaneous emission of neutrons in polonium. The implication: pile-produced plutonium might not just emit alpha particles, but also neutrons, making it unstable.

Oppenheimer orders Segrè to investigate. In a cabin, deep into the remote Pajarito Canyon, Segrè and his young assistants set up shop to re-measure cyclotron-produced plutonium. The experiment is sensitive, so sensitive that they attempt to record just six events over six months. A few more blips in the detector and plutonium would be too unstable. Again, though, the cyclotron-produced plutonium appears stable.

Finally, in April 1944, pile-produced plutonium arrives. Within days, it is clearly at least five times more fissionable than the cyclotron-produced plutonium. Fermi later takes measurements on re-irradiated plutonium and confirms it: The bomb will fizzle prematurely. Information about the results is beyond classified, but it spreads like wildfire through Los Alamos. Groves fights to withhold it from the other labs. When Compton, in Chicago, finally learns of the news, he turns white as a sheet.

July 4, 1944. Oppenheimer officially delivers the news about plutonium to Los Alamos. In desperation, they consider other options: faster guns, electromagnetic removal of the highly radioactive isotope Pu-240. Neither is feasible. They will either have to find a way to detonate the plutonium quickly or all of Hanford’s work will have been for naught.

An older idea, discounted initially, soon captures everyone’s attention: rapid detonation via implosion. On July 20th, Oppenheimer gives the order to discontinue plutonium gun operations: “All possible priority should be given to the implosion program.”

Now consider the weight the implosion design would have acquired for the audience with this context: the desperation of scientists working against time, the monumental choice to reverse course, and the understanding of the scientific labor involved to make these crucial decisions. Seen in this light, when the first atomic bomb is detonated—a second, successive sunrise on the white desert sands of New Mexico—it is not just spectacular scientific wizardry, but the result of a laborious process.

Following the discovery of spontaneous fission in plutonium might seem like a technical rabbit hole for a filmmaker to go down, one that has little precedent in films about scientists, which are mostly concerned with dazzling the audience with their subject’s genius. It needn’t be that way. There are films whose very story is the process. In Spotlight, we follow journalists tracking down sources, combing through old records, and even planning how they want to frame the story (not one of individual priests, but systematic rot in the diocese). When the story finally hits print, as we watch the pre-dawn newspaper trucks begin to distribute the paper, the payoff is real. The connection between all that labor and the finished product is clear.

Unfortunately, the closest we get to process in Oppenheimer is an inquiry into a borderline-pseudoscientific yarn about the possibility that a nuclear explosion would set off a chain reaction in the atmosphere, which, of course, involves an apocryphal trip to see a wizened Einstein. The scientific work which was Oppenheimer’s life at Los Alamos and could have been a throughline for the movie is sacrificed for superficial celebrity scientist cameos.

-§-

Oppenheimer is a tantalizing film because it suggests the possibility of telling serious stories about the work of science for mainstream audiences. There is a curious scene, in the early part of the film, which functions as a part of the thread of Oppenheimer’s flirtation with communism (politically and physically, in the form of a vampish Pugh), in which the scientists—workers—in Ernest Lawrence’s (Josh Hartnett) lab are organizing under a banner that reads F.A.E.C.T. (Federation of Architects, Engineers, Chemists, and Technicians—I confess I don’t remember if the acronym is explained in the film). Lawrence, a conservative, curses: “They won’t let me bring you on the project because of this,” and Nolan smoothly segues away to Oppenheimer’s induction into the Manhattan Project.

But there’s something more here, more than just the (surprisingly charitable) depiction of Oppenheimer’s leftist bona fides. This is not Murphy flirtatiously quipping to Pugh that he’s read Marx; it’s a work event and there are dozens of other scientists in that room—why?

At the height of the Great Depression, one in four research chemists in New York City was unemployed. Even physicists, insulated more by academia than other scientific professions, felt enough strain to set up working groups to deal with homelessness and unemployment. During the 1930s, with fascism and communism on the rise, scientists too were radicalized. Some, like Oppenheimer, joined radical organizations. Not just political parties, but groups like the American Association of Scientific Workers. Partway between a trade union and a professional organization, the AASW was designed to “promote an understanding of the relationship between science and social problems”—the ills of poverty, racism, war. It—even in just its title—was a nod to something crucial: the idea of the scientist as a worker.

The AASW was a short-lived effort. Like many leftist organizations, it splintered when Stalin made an alliance with Hitler. Within a year, many who had called for science as a means of peace and diplomacy were off using cutting-edge science to make weapons of war. Its brief existence was a barometer for the degree of radical thought, a demonstration of scientists’ ability to organize, and a sign that their politics were not separate from their science; the two were entwined. If scientific work was labor, was even a well-heeled professor a worker too? And if one was a worker, with whom did one’s allegiances lie—a capitalist democracy?

In a 2002 oral history, G. Rossi Lomanitz (played by Josh Zuckerman in the film) was asked how Lawrence saw the unionization effort of FAECT:

“After the war, I went to his office, and he had a talk with me in which he seemed to be rather fatherly in his attitude. He said, ‘Look, it’s really important. Don’t consider yourself a scientific worker. Consider yourself a scientist.’ I said, ‘Thank you for the advice.’”

The corollary to the conceit that science is a process is that science is work, the scientist a worker, not a wizard. In Nolan’s cosmology, there is no connection between work, science, and ideology. Oppenheimer the employee exists separately from Oppenheimer the scientist, and Oppenheimer the red. He is merely political because of a quirk in his personality, an idiosyncrasy more related to his taste in women than his work. He is a scientist not because of the work he does but because that is simply who his character is.

Compartmentalized like this, the film loses its ability to say something important. It cannot explain why Oppenheimer had the dalliances he did with leftist thought; they are inexplicable vices of an otherwise devoted public servant. Likewise, it cannot mount a convincing explanation for why a government would so mistreat its Prometheus, and must rely on Strauss as the villain. Finally, it must distort the historical record. Only Oppenheimer can have lonely hallucinations of ashen corpses and future nuclear horrors to come; his colleagues must remain largely oblivious and insensate because in Nolan’s portrayal, they did not share in the scientific work to create the weapons.

There is an argument that Murphy’s Oppenheimer implicitly realizes what Nolan has missed. As Fran Hoepfner puts it in her review for Bright Wall/Dark Room:

It is not through these endless hearings that Oppenheimer gains perspective on himself; he rather gains perspective on the people for whom he worked. He was not a genius wunderkind acting alone. He was an employee.

This reading suggests not a Promethean tragedy, but a quotidian one. Oppenheimer wasn’t punished by the gods for stealing their nuclear fire; he was punished because he was a worker at the mercy of his employer.

Science as a live sporting event

The clearest précis of the room-temperature superconductivity ruckus about LK-99 (no, not that other one)—a viral, meme-laden extravaganza of semi-scientific speculation run by revelers with the patience and rigor of highly-caffeinated three-year-olds—is that it is science as a live sporting event.

For proponents, the drama is the draw. Many are veterans of hype, enthusiasts of crypto and blockchain and quantum computing, etc.—a sort of generically tech-interested crowd capable of moving from one fad to the next. The ups and downs of early replication attempts, rumors about authorial squabbles, harebrained hypotheses (can I interest you in one about type-III non-Meissner superconductivity?): All of this is part of the great rollercoaster ride, a blow-by-blow without pay-per-view (unless you have Twitter Blue, which many do). It is by design, and it is exciting. This is science as a live sporting event. Many are, quite literally, gambling on its odds.

Naturally, efforts to suppress the hype around LK-99 have been received about as well as a red card on the home team’s favorite player. Why blow the whistle when there is so much good that comes out of this? Hype, so the argument goes, brings a quicker confirmation and public attention to an often underappreciated branch of physics. For many onlookers, it is a bit of real exposure to physics they would not otherwise have had (a far better use of time than the latest UFO hearings in Congress—that I do not dispute). Why ruin the fun with gatekeeping? “It’s all supposed to be fun, this whole deal was quite fun.” What could be more fun than the first room-temperature superconductor replication by a Russian catgirl?

The problem with hype is that it is not harmless fun. Hype distorts the research landscape, leading substandard research to receive disproportionate attention while important results languish. Often, hype can result in eventual blowback when its promises fail to materialize, damaging trust in the scientific community. Areas of research plagued by hype may become politically toxic for researchers to get funding and work in. On a fundamental level, if the goal of science is (very reductively) to systematically apply rigor to objectively understand the world, hype is opposed to that. Hype is a subjective lens applied to science; it distorts our view and damages the accuracy of our understanding. But that funhouse view is, well, fun.

This is what I think people are mad about when the hype is interrupted or called out. This room temperature superconductivity claim is their shiny new thing, and they resent anyone who would take that fun away from them. Curiously, that has not meant an aversion to criticism of the claim. In fact, some of the most attention-grabbing, lauded tweets were those that pointed out flaws or problems with the paper. When I posted about an issue with the data that Doug Natelson and I spotted, I naïvely thought it might temper some excitement and reduce the fervid attention. It became, instead, the latest clue in a great detective hunt for the ever-more eager spectators.

I should have known. Few good matches are drubbings. A good match has twists and turns, setbacks and triumphant victories, all of which should be entertaining enough to have a pint or three over. But the things which make for an electrifying match—conflict, drama, a clean narrative—are anathema to scientific needs of prudence and patience. To disentangle complex and uncertain results, professional scientists, consciously or not, tend to exhibit these traits. (When news of the persistent muon g−2 anomaly came out in 2021, particle physicists popularized a daring publicity campaign: #cautiouslyexcited.) Consequently, most researchers are unprepared for this kind of spectacle and would rather simply avoid drama. Those willing to comment publicly tend to do so with the kind of judiciousness that creates an asymmetry: The furthest detractors will go is a measured skepticism. Non-scientist proponents have no such restraint; some have taken to writing superconductivity fanfiction.

One of the biggest difficulties in reporting on this is being able to speak bluntly about the bullshit which is driving so much of the attention. Scientists and the staid publications that cover science do not have the sardonic sense to tackle this kind of crap head-on or the dexterity to follow this kind of live and evolving story. There is no Gawker for science. There is no “live” science news team in the world, much less one for physics.

As a result, the action for this is on Twitter. It is on pubpeer, and in the comments on reddit. It is not in newspapers or magazines.

Now, after this somewhat unforgiving, polemical intro, I’m going to say two things that may surprise you:

Science journalism could learn a thing or two from the dynamism and speed of this discourse
Science journalism has long treated science as a sporting event

The peddling of science as entertainment is not new to science journalism; there is no high horse here. Since its inception in 19th century periodicals, publishing patents of electrical gizmos and updates on theories about the aether, to contemporary reportage of quantum computers and dark matter, science journalism has always depended on its entertainment value.

It’s true that there are essential things—straight news—that the public should know about science. They should know about the climate as it warms; they should know whether the water they drink is clean; they should know about fraudulent studies of Alzheimer’s disease. Likewise, science journalism is often news-news for its practitioners (especially researchers in adjacent fields).

But what should the public know about our universe and our study of some of its most esoteric and abstruse properties? I would certainly like the public to know what a superconductor is, how electrons behave so oddly under certain conditions that they are, all of them, somehow jolted into a macroscopic quantum state. It is good to know such things about our universe. But it is often a luxury. It is entertainment to know that quantum mechanics holds the answers to a problem Euler dreamt up, not news. It is not the kind of knowledge that will help on the day-to-day, or form the basis for a worldview.

Seen from its material basis, science journalism is a form of entertainment. Publications must maintain subscribers and must attract the clicks of curious readers for ad revenue. All of this conspires to push cool, “gee-whiz” stories to the forefront. It’s difficult to even imagine what a constituency for hard, just-the-facts science news would look like. Unlike politics or business, science is highly specialized and the familiar readership is much, much smaller. What we have instead is science journalism for model-plane enthusiasts: that is, for interested amateurs.

There’s much more to be said about the kind of perverse incentives this creates in reportage (outright ignoring some important scientific results, staying beholden to single-study reporting and the journal publication cycle, etc.) but right now I want to focus on what this means for the superconductivity drama.

The question is therefore not whether science journalism should debase itself by seeking to entertain, but how it should entertain responsibly. Because if science journalists do not do so, it’s clear that others will entertain irresponsibly. As I write this, at about 2:30 in the morning, people have seized upon one of three new preprints to confidently proclaim: “WE ARE OFFICIALLY BACK.”

The news about LK-99 has made it far outside the typical circles reserved for musing about superconductivity, and the onus is on scientists and journalists to engage, to get in the arena now and play the game. If they don’t blow the whistle for caution, who will?

Note: Added a graf about hype because in my sleep-deprived state I hadn’t really explained why it is a priori harmful to science.

What we talk about when we talk about qubits

Ed. note: This story was originally commissioned by Scientific American. After a difference in editorial vision, we came to an agreement and decided to kill the piece. I think there are still some valuable conversations to be had about the topic, so I’m posting a mostly-unchanged version from the second round of edits. The title, for those unfamiliar, is a riff on Raymond Carver’s What We Talk About When We Talk About Love.

Update (2/17/23): A preprint by Kobrin, Schuster, and Yao finds that three key signals claimed as evidence by Jafferis et al. are mirages. In particular, the signals for thermalization and teleportation* match the SYK model only because Jafferis et al. take the average over their data—a fact they do not clarify in the Nature paper. At the level of individual functions, their signal is not in agreement with the SYK model. I went into more detail here, but the upshot is that the case for Jafferis et al. being deliberately misleading is now even stronger.

Correction: The signal for teleportation in Jafferis et al. is not an average from the data. Clarification here.

Alice is talking. She is a physicist, and so Bob, a computer scientist, listens. “Do you remember my neighbor—you know, the one who’s always asking about quantum mechanics,” she asks. Bob nods over the phone, even though she can’t see him, and Alice continues. “Well, the other day he brought up all this—” she gestures inarticulately “—stuff about wormholes and quantum computers. I tried to explain, but I got stuck just describing a qubit.” Bob finishes his drink and shakes his head. “You didn’t want to trot out the standard line and tell him it’s both 0 and 1,” he asks. Alice bites her lip. “No. No, I thought I’d tell him it’s neither 0 nor 1 before we measure it,” she says. “State’s not definite, right? So we shouldn’t say it is both.” Bob scratches his chin, following the logic but not necessarily its conclusion. “Sure, but then you haven’t answered the question of how to plainly say what it is, only what it isn’t,” he prods. Alice frowns. “But I can’t very well tell him the qubit’s a linear combination of complex vectors in a two-dimensional Hilbert space.”

Here, Alice and Bob, perennial participants of thought experiments, demonstrate a typical challenge in talking about quantum mechanics—its rejection of the familiar logic of possibilities. Finding plain, accurate language requires navigating a minefield of quantum conundrums. Unsurprisingly, as knowledge of quantum computing has spilled out of academic journals and labs to a curious public, miscommunications have abounded about the nature of this strange new tech.
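For readers who do want Alice's mouthful unpacked, here is a minimal sketch in plain Python. It assumes the textbook picture of a single qubit as two complex amplitudes, one for 0 and one for 1, whose squared magnitudes give the odds of each measurement outcome. The function names and the example state are hypothetical, chosen only for illustration.

```python
import math


def normalize(alpha: complex, beta: complex) -> tuple[complex, complex]:
    """Rescale two amplitudes so the outcome probabilities sum to 1."""
    norm = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    return alpha / norm, beta / norm


def measurement_probabilities(alpha: complex, beta: complex) -> tuple[float, float]:
    """Born rule: the chance of reading 0 or 1 is the squared magnitude."""
    return abs(alpha) ** 2, abs(beta) ** 2


# Hypothetical example state: equal weights on 0 and 1, with a relative phase of i.
alpha, beta = normalize(1 + 0j, 1j)
p0, p1 = measurement_probabilities(alpha, beta)
print(f"P(0) = {p0:.2f}, P(1) = {p1:.2f}")
```

The sketch gives odds for each outcome, but the relative phase between the amplitudes does not show up in those odds at all. That hidden information is part of why neither "both 0 and 1" nor "neither 0 nor 1" quite does the job.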

The latest quantum drama began late in November with a heavily discussed Nature article, tantalizingly entitled “Traversable wormhole dynamics on a quantum processor.” How could research so redolent with science fiction potential—legitimated by prestigious institutions including Google and Caltech—not lead to immediate attention?

Quanta declared, in a 4,000-word article: “Physicists Create a Wormhole Using a Quantum Computer.” (The headline was subsequently changed to clarify the wormhole was “holographic.”) The New York Times also stated that the scientists made a wormhole, but qualified the study as “Physicists Create ‘the Smallest, Crummiest Wormhole You Can Imagine’.” Subsequent pushback led to headlines such as “No, physicists didn’t make a real wormhole. What they did was still pretty cool.” The “wormhole” study quickly became the talk of physics twitter, and frequently came up, in this reporter’s experience, at the Q2B conference in Santa Clara.

To be clear: outside of very speculative mathematics, there is no evidence whatsoever that wormholes actually exist in our universe. Even if they did, current theories of gravity strongly suggest such spacetime portals are unlikely to ever be stable enough to be traversable.

I should tell you, when my editor at Scientific American sent me the embargoed press release on Thanksgiving with the subject line “Hmmm,” I responded dubiously. “They used 9 qubits! What the hell could that possibly tell you?” We chose not to cover it—initially.

That skepticism was shared by many expert onlookers. Harvard theoretical physicist Matt Strassler ran through a list of objections on his blog: “Did physicists create a wormhole in a lab? No. Did physicists create a baby wormhole in a lab? No. Did physicists manage to study quantum gravity in a lab? No. Did physicists simulate a wormhole in a lab? No.” Peter Woit, a mathematician at Columbia University, accused the authors of organizing a “publicity stunt.”

In a field deluged by claims of breakthroughs and electrified by the promise that esoteric advances in physics can eventually be converted to profit, the clamor about a putative “wormhole” on a quantum computer is not a shock. Rather, it’s a representative—albeit extreme—case of the trouble with talking about quantum computing.

As one of Raymond Carver’s discursive, slightly inebriated characters might have put it, “I was going to tell you about something. I mean, I was going to prove a point. You see, this happened a few months ago, but it’s still going on right now, and it ought to make us feel ashamed when we talk like we know what we’re talking about when we talk about [quantum computers].”

The trouble begins on a fundamental level: We do not know how things happen in the quantum realm. We can describe the results of measurements, but the ultimate nature of reality remains out of grasp and up for grabs because of this uncertainty.

An adherent of the many worlds interpretation, which asserts that every possible outcome of any quantum measurement unfolds across infinite parallel universes, might say that a calculation in a quantum computer involves the interference of wavefunctions throughout all those universes. Whereas a proponent of the traditional Copenhagen interpretation—which is agnostic about what can be known prior to measurement—might simply shrug. Because quantum mechanics is without a clear mechanism, our best descriptions of “quantumness” are by necessity riddled with equivocations and half-truths.

Consider that even a colloquial term like “computer” presents difficulties. The most rudimentary classical computers inside “smart” refrigerators and coffee machines can store memory and carry out—without error—lengthy instructions via silicon-based transistors. But the analogous parts of a quantum computer, which has qubits instead of bits, quantum logic gates, and uses quantum algorithms, only offer superficial similarities that conceal vast differences in architecture and ability.

Unlike their classical counterparts, quantum computers store information for a fraction of a second with ephemeral pairs of electrons, or ions suspended in magnetic traps, or even light bouncing between mirrors. Moreover, to protect the sensitive quantum states of qubits from computation-scuttling thermal interference and other environmental errors, most devices have to be chilled with liquid helium inside fridges a few degrees above absolute zero. At this stage in their development, quantum computers rely on tangles of wires and a bulky copper anatomy that makes them look like futuristic chandeliers rather than anything most people would recognize as a computer.

Appearances aside, quantum computers do indeed compute, but not as your computer does. They do not send email or stream video or store documents. If anything, they are more like the bespoke apparatuses of the 19th century, such as Charles Babbage’s Difference Engine. A remarkable feat for its time, Babbage’s computer used thousands of hand-cranked mechanical gears to do one thing: calculate basic mathematical functions. A “quantum difference engine” is a more accurate description of what today’s quantum computers are capable of, but even that requires a caveat—Babbage’s machine would far outstrip them at addition or subtraction.

Even without the intention of hype, it is extremely easy to perpetuate misconceptions about quantum computing. Researchers who command quantum technology thus almost inevitably are seen as techno-wizards summoning eldritch forces from some spooky realm. For many readers, invoking “quantum” is not so different from invoking magic; more D&D than R&D.

What to do? One possibility, proposed by IBM researcher Charles Bennett, is to reverse the typical thought pattern, and think about the classical world in quantum terms because the world is fundamentally quantum, and classical reality is just an approximation. Consider the inverse, Bennett asks. “A classical bit is a qubit which can only take the values of 0 or 1,” he explained during a talk at IBM. From this perspective, it’s not quite so eerie.

Quantum message discipline is sorely needed. Clarifying their headline change, Quanta noted that “The researchers behind the new work — some of the best-respected physicists in the world — frequently and consistently described their work to us as ‘creating a wormhole.’”

When reached for comment, the study’s co-leader Maria Spiropulu pointed to a Caltech FAQ in her team’s defense. “Did we claim to have produced or observed a wormhole deformation of 3+1-dimensional spacetime?” the FAQ asks. The answer it then offers is a firm “No.” The FAQ further elaborates that what the researchers saw was only “consistent with the dynamics of a traversable wormhole.” The Nature paper is less absolute. There, Spiropulu and her co-authors describe their research as “the experimental realization of traversable wormhole dynamics on nine qubits” and discuss how they used a quantum teleportation protocol to “insert another qubit across the wormhole from right to left.”

And in a video produced ahead of publication by Quanta, several researchers spoke gushingly, saying, “This is a wormhole” and “I think we have a wormhole, guys.” This is, at best, deeply misleading.

What the researchers actually did was use nine qubits to compute a crude version of the Sachdev–Ye–Kitaev (SYK) model, which is used to calculate the dynamics of some quantum systems, like graphene. Simulating the full SYK model would require too many qubits, so the researchers substituted a barebones, simplified version that would fit on just nine qubits. The SYK model is interesting to theorists because it is thought to be mathematically equivalent to descriptions of two-dimensional gravity in a universe with the opposite curvature of space from our own. This equivalence, theorists hope, may mean the SYK model has something to tell us about quantum gravity in our own universe. But the fact remains that the supposed bombshell dropped by the Nature paper is a calculation of an approximation of a model conjectured to be equivalent to a lower-dimensional gravity for a universe that isn’t ours.

None of which is to say the study is without value. John Preskill, a widely-respected theorist at Caltech uninvolved with the work, tweeted that, “Because only 9 qubits are involved, this same computation can be done easily (and more accurately) with a conventional computer. Nevertheless, it’s interesting and rather unexpected that such a simple model can capture some novel features of quantum gravity.”

“The physical world is quantum mechanical,” Richard Feynman wrote in 1981, “and therefore the proper problem is the simulation of quantum physics.” But exactly simulating an atom with just a few dozen electrons is impossible for classical computers because the computing requirement grows exponentially with each additional electron. This restriction does not apply to a quantum computer because it follows the same laws as the quantum systems it may simulate. The distinction between simulation, experiment, and computation becomes treacherously blurry, especially when one’s “computer” consists of a bunch of ions tasked with calculating the behavior of ions.
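To put rough numbers on that exponential growth, here is a back-of-the-envelope sketch. It assumes the simplest possible stand-in for the problem: each particle is treated as a two-level system (a qubit), and a classical computer stores the full quantum state as one double-precision complex number per possible configuration. Real atoms and real simulation methods are more complicated, so the figures are illustrative only, and the function name is hypothetical.

```python
BYTES_PER_AMPLITUDE = 16  # assumption: one double-precision complex number


def state_vector_bytes(n_qubits: int) -> int:
    """Memory needed to store every amplitude of an n-qubit state."""
    return (2 ** n_qubits) * BYTES_PER_AMPLITUDE


# Each added qubit doubles the memory a brute-force classical simulation needs.
for n in (9, 30, 50):
    print(f"{n} qubits: {state_vector_bytes(n):,} bytes")
```

Under these assumptions nine qubits fit in 8,192 bytes, which squares with Preskill's remark that the wormhole computation is easy for a conventional computer. The trouble only starts several dozen qubits later.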

Last year, the mathematician Richard Borcherds responded to this quandary with a deceptively simple challenge for the most devout acolytes of quantum computing: Do better than his teapot.

Borcherds’ snarky argument is that dropping a teapot and calculating how it shatters is ultimately a quantum problem because the teapot is made of quantum bits and bobs like protons and electrons. Teapot shattering is not abundantly useful, but no worse than the contrived problems various research teams have “solved” with their quantum computers to claim quantum advantage. Because today’s quantum computers have no more than a few hundred qubits, they’re unable to simulate the teapot—yet the teapot, by virtue of being itself, is extremely good at simulating the “quantum” problem of its shattering.

Borcherds’ “teapot challenge” bluntly asks a question that strikes at the heart of the wormhole drama and much of today’s quantum “breakthroughs”: What precisely are today’s quantum computers doing? The authors of the Nature study call it a “quantum experiment” because they “measured observables of the physical system”—that is, they measured the qubits of the quantum computer.

Natalie Wolchover, a Pulitzer Prize-winning science writer at Quanta, argues that when a quantum computer simulates a toy model of quantum matter, such as the SYK model, it is really “creating” the quantum system it asks about. “It’s profound but somehow I can’t put my finger on what it means about the difference between ‘real’ and ‘simulated,’” she wrote in an email. (Full disclosure: Wolchover has been my editor at Quanta.)

No consensus on that distinction yet exists within the quantum computing community, but last year a trio (two physicists and a philosopher, respectively)—Dominik Hangleiter, Jacques Carolan, and Karim Thébault—published a book suggesting a potential framework for thinking about the question. “The question is, what kinds of inferences can you draw?” says Hangleiter, a quantum researcher at the University of Maryland. “The term ‘simulation’ sort of implies that you have some other system in mind.” Under this framework, there are important differences to distinguish between two kinds of simulation: emulation and computation.

Emulations aim to simulate a physical system in nature, like a water droplet, while computations attempt to simulate theories that describe nature, like fluid dynamics equations. Both simulations can be “validated” or “unvalidated” depending on what we can learn from them. Validated simulations tell us directly how the world actually is, whereas unvalidated simulations rely on analogies that merely tell us how the world might be. For instance, although bouncing oil droplets mimic certain quantum phenomena, they cannot tell us how the quantum world actually works. But they can tell us how raindrops actually bounce.

Under this framework, Spiropulu and her coauthors’ use of nine qubits to study a scaled-down approximation of the SYK model is an “unvalidated computation”. In the Quanta video, the researchers claimed their research was “the first sign that you could see gravity on a quantum computer.” To explain what they saw in the qubits, they claimed the appearance of a wormhole. “It’s not clear to me how either of those claims is substantiated,” Hangleiter says.

Even if we lived in a different universe, in which the SYK model was equivalent to a description of gravity, this research still would not be evidence for the existence of wormholes. At best, it would be consistent with wormholes; how nature could be, not how it is.

Ironically, a more prudent description of the Nature research might have been “Teleportation on a quantum processor.” Unlike wormholes, quantum teleportation is a well-established physical phenomenon which the researchers used in their experiment to ferry quantum information across qubits.

Suppose Alice (Remember Alice? It’s a story about Alice) is in San Francisco and she wants to send the state of a qubit to Bob in New York. So she asks Charlie in St. Louis to send them a pair of entangled qubits. Then she measures her two qubits together and, calling Bob on the telephone, gives him instructions so he can properly adjust his qubit. When he does this, the state of Alice’s qubit is teleported to Bob.

“You know, Bob,” she begins over the phone. “We never did figure out how to tell my neighbor about qubit states. How in the world are we going to tell him we just performed a quantum teleportation protocol?”

Bob presses the phone between his ear and shoulder as he prepares dinner. “Well, it’s not what he’s thinking. It’s just transporting information, and it needs a classical channel, so there’s no faster-than-light communication,” he muses over a bubbling pot of pasta. “How about we tell him there were ‘no wormholes necessary’?”
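For the neighbor who would rather see it than hear it, the protocol Alice and Bob just ran can be sketched in a few dozen lines of plain Python. This is a toy simulation of the standard textbook version, not of the Nature experiment: Alice entangles her message qubit with her half of Charlie's pair, measures both, and phones Bob two ordinary bits that tell him which corrections to apply. The function name, the qubit labels q0, q1, and q2, and the example amplitudes are all hypothetical.

```python
import math
import random


def teleport(alpha: complex, beta: complex, rng: random.Random):
    """Teleport the state alpha|0> + beta|1> from Alice (q0) to Bob (q2)."""
    s = 1 / math.sqrt(2)

    # Start: message qubit q0, plus Charlie's entangled pair on q1 (Alice) and q2 (Bob).
    state = {}
    for q0 in (0, 1):
        for q1 in (0, 1):
            for q2 in (0, 1):
                message = alpha if q0 == 0 else beta
                pair = s if q1 == q2 else 0.0
                state[(q0, q1, q2)] = message * pair

    # Alice applies a CNOT: flip q1 whenever q0 is 1.
    state = {(q0, q1 ^ q0, q2): amp for (q0, q1, q2), amp in state.items()}

    # Alice applies a Hadamard gate to q0.
    mixed = {key: 0j for key in state}
    for (q0, q1, q2), amp in state.items():
        mixed[(0, q1, q2)] += s * amp
        mixed[(1, q1, q2)] += s * amp * (-1 if q0 == 1 else 1)
    state = mixed

    # Alice measures q0 and q1, getting two classical bits at random.
    outcomes = [(m0, m1) for m0 in (0, 1) for m1 in (0, 1)]
    weights = [
        abs(state[(m0, m1, 0)]) ** 2 + abs(state[(m0, m1, 1)]) ** 2
        for m0, m1 in outcomes
    ]
    m0, m1 = rng.choices(outcomes, weights=weights)[0]
    norm = math.sqrt(weights[outcomes.index((m0, m1))])
    bob = [state[(m0, m1, 0)] / norm, state[(m0, m1, 1)] / norm]

    # The phone call: Bob applies corrections chosen by Alice's two bits.
    if m1:
        bob = [bob[1], bob[0]]   # bit flip (X gate)
    if m0:
        bob = [bob[0], -bob[1]]  # phase flip (Z gate)
    return (m0, m1), bob


# Hypothetical example message state.
bits, bob_state = teleport(0.6 + 0j, 0.8j, random.Random(2023))
print("bits sent by phone:", bits)
print("Bob's qubit:", bob_state)
```

Whichever two bits Alice happens to measure, Bob ends up holding the original amplitudes, but only after the phone call. Until those bits arrive he cannot know which corrections to apply, which is the sense in which teleportation needs a classical channel and cannot outrun light.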

Manufacturing wonder

There are, in the end, only two types of scientific results worth reporting: those that have self-evident value/importance/beauty/wonder, and those that do not.

The vast, vast majority of scientific research worth covering is not self-evident.

So it falls on the writer to grab the reader and interpret the graphs and jargon within which the study’s significance hides. Often that means an enumeration of the tantalizing possibilities: a new theory of nature, an application for combatting climate change, a seismic shift in how we understand ourselves in the universe, etc. In an oversaturated attention economy, it is nearly impossible to win by simply reporting the result.

Another way of putting this is that the task is to manufacture wonder.

Sometimes this is relatively easy; the scientists themselves are excited, the result is a good one and merely needs to be explained lucidly. Other times it’s a stretch. The reader has to be cajoled and enticed with juicy details of the “gee whiz” variety. For the talented manufacturer, almost no result is too mundane or too insignificant. A result about, say, a spike in a graph of iron levels for seafloor sediment can be transformed into a tapestry of the Triassic Age that depicts dinosaurs who (somehow) benefitted from increased levels of oxidation. New theories about dark photons (speculative) coupling to a composite Higgs boson (also speculative) which together would solve 5% of the missing mass of the universe can receive front page billing if the writer is a clever enough wordsmith.

None of this is to say that these stories are wrong or false. Many have the required caveats (“the new theory hasn’t been proven”), but they are written in the key of excitement (“…but if they were, they would be foundation rattling”).

Manufacturing wonder is also necessary from a financial standpoint. Readers want “cool” things, publications want readers, writers want to get paid. It is difficult to get paid when you advertise that what you are writing is not wondrous, but just news. Any old news, like a town hall, like the weather, like a crime blotter.

Astronomy is in the privileged position of having much more direct, visual access to wonder than nearly any other physical science. The new photos from JWST are staggering. They are, in the archaic sense of the word, awful. Gaze upon the cliffs of the Carina Nebula. Stare upon the galactic dervish in Stephan’s Quintet. Witness the light from galaxies born before the Earth.

Words may help, but the wonder is pre-packaged, delivered from the photons to the telescope to the processing software to a press conference to your computer to your eyeballs to your brain.

Anyhow, thank goodness it’s not all this straightforward, or else I’d be out of a job.

Volatile ignorance

I’ve been trying to better familiarize myself with Marxist history/philosophy of science. Explicitly for part of a long-term project; honestly, because I’m curious about it and dislike feeling ignorant. To that end, today I skimmed some of Nikolai Bukharin’s introduction to Science at the Cross Roads (a collection of papers submitted to a 1931 conference) and read most of Boris Hessen’s (apparently) famous paper, “The Socio-Economic Roots of Newton’s Principia.” I want to write a little about my reaction to the paper, a response to the paper, and a response to the response.

Sparing you the details, which are heavily accented in the argot of 1930s-era Soviet dialectical materialism, Hessen’s point is this: Foolish capitalist histories of Newton that can only conceive of the individual rely on the “great man” theory of history and are stuck reciting facile facts, e.g. Newton, as we well know, was born the year that Galileo died. [I was surprised to learn this is not even true; Galileo died 8 January 1642 while Newton was born 4 January 1643. The confusion seems to have been born out of the switch between Julian and Gregorian calendars during Newton’s lifetime, and a copious degree of wish-fulfillment.] Compare with a proper Marxist view of scientific history: “the development of theories and the individual work of a scientist are affected by various superstructures such as political forms of class war and the results, the reflection of these wars on the minds of the participants, political, juridical, philosophic theories, religious beliefs and their subsequent development into dogmatic systems.”

Vulgar myth (persistent to this day!) still places Newton as a thinker apart from the world, inspired only by Nature in the form of a falling apple. Hessen alleges that “[d]espite the abstract mathematical character of exposition adopted in the ‘Principia’ Newton was not only not a learned scholastic divorced from life, but in the full sense of the word was in the centre of the physical and technical problems and interests of his time.” Hessen demonstrates this by contextualizing Newton among the ‘shoulders of giants’ Newton credited, specifically how these other researchers responded to technical problems of the day—questions of ballistics and geography and hydrostatics. The Principia, Hessen claims, was also influenced by Newton’s background and his relationship to the politics and economics of 17th century England. It is shocking now, as it must have been then, to read the withering assessment that “Newton was the typical member of the rising bourgeoisie and in his philosophy he embodies the characteristic features of his class.” No matter their backgrounds, ‘great men’ of science are rarely so ruthlessly reduced to an average constituent of an economic class. Richard Westfall, perhaps the foremost 20th century Newton biographer, and a later critic of Hessen, wrote this in the preface of Never At Rest (which I have not read):

The more I have studied him, the more Newton has receded from me. It has been my privilege at various times to know a number of brilliant men, men whom I acknowledge without hesitation to be my intellectual superiors. I have never, however, met one against whom I was unwilling to measure myself, so that it seemed reasonable to say that I was half as able as the person in question, or a third or a fourth, but in every case a finite fraction. The end result of my study of Newton has served to convince me that with him there is no measure. He has become for me wholly other, one of the tiny handful of supreme geniuses who have shaped the categories of the human intellect, a man not finally reducible to the criteria by which we comprehend our fellow beings.

There is a grandeur in Westfall’s view of Newton, and as with so much, I really should read his work. But Hessen’s paper excited me. The idea that science is, to some extent, socially contingent (on funding, on ideology, on the material of textbooks, etc.) is far from surprising, but reading Hessen’s argument—which brings in not one or two, but a host of social factors, all explicitly coded with an ideological lens—was new to me.

But I am not a Marxist. I also try to be suspicious of my own uninformed, initial reactions, especially when they’re positive, so I sought out some responses. I stumbled across Loren Graham’s rather meta paper “The socio-political roots of Boris Hessen: Soviet Marxism and the history of science.” It’s always reassuring when you have an idle critical thought (e.g. ‘What if someone subjected Hessen’s paper to the same critique he made of Newton?’) only to find out that it was a reasonable line of inquiry which had been pursued. I have actually previously read something from Graham (Lysenko’s Ghost), and knew him as a reputable scholar of Soviet science. By subjecting Hessen to the same epistemic procedures he used on Newton, Graham unveiled an oddity: far from being the consummate Marxist, Hessen was actually at odds with many back at home in the USSR who deplored Einstein and quantum mechanics. At the time, both relativity and QM were viewed as threats by many radical Soviet physicists and academics who objected both to the substance (QM’s probabilism was distressing to committed determinists) and origins (capitalist, anti-materialist thinkers) of the ideas. For his defenses of Einstein’s bourgeois theory and other ideological crimes, Hessen was apparently in hot water with Soviet authorities. He had also seemingly, in 1927, written an “internalist” historiography of science which treated science as an “abstract intellectual enterprise insulated from social, political, and economic circumstances”. But his 1931 paper took an “externalist” view, in which “social, political, and economic circumstances affect the pursuit of knowledge of Nature” to explain Newton. Why would his views have changed so radically in four years?

Graham argues Hessen was carefully trying to thread the needle by both putting himself in good standing as a true Marxist and simultaneously defending Einstein. If, Graham speculated, Hessen could demonstrate, with externalist analysis, that the ideological commitments of a thinker like Newton were separate from the substance of the (unquestionably valuable) theory, he could extend that same analysis to Einstein. There’s more to say here, but the bottom line is that I was fascinated by the subversion of my understanding of Hessen’s paper. No longer was he simply an ideologue with a novel (to me) perspective; he was, in fact, in danger because of those more extreme figures! Just a few years after the conference, like several of his fellow Soviet attendees including Bukharin, Hessen was executed by firing squad after a quiet sham trial.

Relatively confident in my bettered understanding, I looked for more modern scholarship and found Sean Winkler’s 2020 article “The Materialist Dialectic in Boris Hessen’s Newton Papers (1927 and 1931).” Winkler puts a finer point on one of Graham’s contentions: Why did Hessen seemingly contradict himself, writing against externalism in 1927, but for it in 1931? Winkler summarizes Graham’s argument thus:

Graham maintains that in both texts, ‘Hessen wished to differentiate between the social origins of science and its cognitive value.’ That is, Hessen espouses an internalist approach in both works – in an explicit manner in 1927 and in an implicit manner in 1931 –, remaining committed through and through to the defence of the natural sciences from ‘ideological perversion’.

Winkler goes on to say that Graham is essentially wrong to say that Hessen advances two different ideological claims. Both papers, Winkler claims, are facets of a unified “dialectical materialist approach that accounts for the ‘unity in opposition’ of the external and internal dimensions of natural-scientific theory.” To bolster his claim, Winkler points to a recently discovered talk from 1930 in which Hessen explicitly makes externalist claims about science. Additionally, he points out that the differing audiences of the 1927 and 1931 papers, respectively, other Marxists and wide-eyed capitalists, explain not just stylistic but ideological differences. In the former Hessen was nitpicking and challenging more extreme colleagues; in the latter he was proselytizing.

Because we are being honest here about ignorance, I will fully admit I don’t really grasp the purported unification here, have not read the 1927 paper, and skimmed much of Winkler’s. But! I’ve been forced to update my perhaps too-rosy view of Graham’s assessment, and am, at this point, suspicious that a shoe will drop on Winkler’s own paper. I believe that’s how the dialectic materializes, да?

Anyhow, I wrote this because I wanted to capture what I felt today, which I’ve felt at other times but failed to commit to words and thus memory. My opinion swung back and forth because I lacked a solid base of knowledge and I was unduly persuaded—despite my better efforts to keep some sort of intellectual equanimity! Moreover, I think this volatility is probably an inevitable part of any learning experience where one dives deep enough to encounter serious scholarly debate. I’m sure others have talked about this difficulty, but I haven’t seen it, and I wanted to share my own experience being led a bit by the nose. Like so many aspects of intellectual work that are embarrassing and therefore not discussed (e.g. writing emails), I wanted to air my own reasonably soiled laundry.

Calculating COVID-19

How researchers are using math to predict and understand a global pandemic

Note: I researched, interviewed, and wrote the bulk of this in mid-March 2020. Unable to get it published, overwhelmed by the pandemic and the need to finish other work, I left it sitting, untouched from March 22, 2020 until about last week. I’ve fleshed out parts of the draft that were incomplete (relying solely on my notes from the time) but tried to avoid injecting future knowledge, so as to preserve it in some kind of digital amber. It is my hope that the reader finds this a useful addition to the record of epidemiological knowledge of that time.

Forty-eight confirmed cases of coronavirus in the entire world existed on January 17, 2020. The bulk—45 cases—were in Wuhan, one was in Japan, and two were in Thailand. What was there to glean from a few dozen cases inside China and three lonely data points in nearby countries? For epidemiologists, that meager information was more than enough.

Using the three cases abroad and travel data from Wuhan, two teams of researchers worked backwards to estimate the size of the epidemic. They calculated likely infections in Wuhan based on the odds of exporting three cases to Japan and Thailand. Conservative estimates for Wuhan using the method predicted 1,250 and 1,700 infections, respectively. Worst-case scenarios ranged from 4,000 to 5,000. By the time testing in Wuhan ramped up at the end of January, the estimates proved prescient as thousands of people were discovered to have been infected.
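The back-calculation described above can be sketched in a few lines. This is a minimal illustration of the general idea, not the teams' actual code: the function name is hypothetical, and the inputs (daily outbound international travelers, the airport's catchment population, and the number of days during which a case could be detected) are assumed values chosen only to show the arithmetic.

```python
def estimate_total_cases(exported_cases, daily_international_travelers,
                         catchment_population, detection_window_days):
    """Back-calculate epidemic size from cases detected abroad.

    Assumes an infected person is as likely to travel as anyone else in
    the catchment area, and that every exported case is detected.
    """
    # Chance that any one resident flies abroad on a given day.
    daily_travel_probability = daily_international_travelers / catchment_population
    # Chance that an infected person travels while still detectable.
    probability_exported = daily_travel_probability * detection_window_days
    return exported_cases / probability_exported


# Illustrative inputs (assumptions, not figures from this article).
estimate = estimate_total_cases(
    exported_cases=3,
    daily_international_travelers=3_300,
    catchment_population=19_000_000,
    detection_window_days=10,
)
print(round(estimate))
```

Because the calculation divides by a very small probability, each additional case found abroad shifts the estimate by hundreds of infections, which is why three data points could say so much.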

In the two months that have passed, COVID-19 has become a full-blown pandemic. Countries like Italy and Iran have buckled under the strain of thousands of cases; others, like Singapore, South Korea, and Taiwan, seem to have wrestled the virus into an uneasy submission. As of this writing, more than 180 countries around the world have confirmed cases.

Armed with data and a bevy of epidemiological approaches, scientists around the world have worked at a frenetic pace, publishing hundreds of papers about COVID-19. The research ranges from quantifying basic characteristics of the virus to creating complex models that can predict the impact of specific interventions, like school closures.

Their findings, typically reserved for dusty journals and rarely downloaded pdfs, are now widely discussed on platforms like Twitter. Once obscure epidemiological terms like R₀ now make headlines. Epidemiological research shapes how we understand the outbreak and guides the moves of policymakers around the world.

Like everything else, misinformation about the epidemiology of COVID-19 is rampant. Some non-experts perform their own data analyses and self-publish on sites like Medium where they garner millions of views; others simply turn to the op-ed pages of the New York Times. Claims that 40 or 70 percent of the world will be infected are tossed out with little to no context.

But the research isn’t incomprehensible. It’s more than possible to get a clear grasp on what epidemiologists know about COVID-19 so far and how they learned it. Even the calculations they make and the math they use can be made lucid.

Compared to the obvious role of drugs and medical techniques, math is an unlikely hero in an outbreak. The patterns mathematical models identify can help researchers make predictions that have real practical value, such as how many cases there are likely to be, or what the efficacy of a quarantine is.

“There hasn’t been time for the sort of wet lab biological experiments to figure out all the characteristics of the virus itself,” says Kimberlyn Roosa, an epidemiological researcher at the Georgia State School of Public Health. “These studies can inform things like how long we think isolation should be, and how testing could affect when we should try and test individuals and isolate them to reduce the number of transmission.”

Mathematical models are often a dry and—at least in the media—overlooked area of research. Theory rarely draws the attention that experiment does. But when it comes to forecasting early in an epidemic, these models are often all there is to work from. Theory provides the only measure of predictability in an otherwise chaotic situation. By reducing the messy human world and viral distribution to simpler models and equations, by shrinking the aggregate of activity in a city to a parameter, a virus can be understood and eventually tamed.

It is not a hopeless endeavor. In many cases, the math seems to work unreasonably well. Laws of disease are not hard-coded into the fabric of our universe, as are, say, the speed of light or Newton’s laws of motion. But patterns turn up unerringly, and the course of an epidemic can be tracked, explained, and even predicted with math ranging from simple growth equations to complex, multivariable models.

This is not just a story about what the predictions made by researchers are; it is also about how the researchers made those predictions and which mathematical tools they used. In that sense, it is a story that aims to help the reader grasp the calculation of COVID-19.

And there has never been a more important time to understand the math.

The Data

COVID-19 is the first truly 21st century pandemic. Nowhere is that clearer than in the way data about the outbreak has been tracked down, collected, disseminated, visualized, and stored.

One of the first transmissions of data about COVID-19 occurred on December 30, after Ai Fen, a doctor at Wuhan Central Hospital, received a test result for a patient. Ai informed her hospital and sent a photo of the test to colleagues with the critical information circled in red pen: “SARS coronavirus, Pseudomonas aeruginosa, 46 types of oral / respiratory colonization bacteria.” The image made its way to another doctor, Li Wenliang, who wrote “There are 7 confirmed cases of SARS at Huanan Seafood Market” to a group of 150 medical school classmates on WeChat, a popular messaging app in China. Screenshots of Li’s message circulated rapidly online, and on January 3, he was threatened with prosecution for “making false comments on the internet” by Chinese officials.

News about the infections quickly spread beyond China. The World Health Organization’s office in China was alerted to “pneumonia of unknown origin” on December 31, and by January 3, national authorities in China informed the WHO that there were 44 patients with symptoms.

At that same time, local officials ordered the destruction of virus test samples and China’s National Health Commission (NHC) prohibited publication of information about the virus. The Chinese government would sit on information about the virus’ genetic origins for a week, and it would be over two weeks before the government admitted that the virus spread from person-to-person. Starting on January 21, the NHC began to aggregate regional reports and publish plaintext daily updates with epidemiological information including confirmed infections, recoveries, deaths, and the number of close contacts tracked.

Data is the basis for all predictions about COVID-19. No pre-existing model, no matter how sophisticated, can work in the absence of data. Previous outbreaks can guide researchers, but they aren’t a substitute for what’s happening on the ground.

“The data we use is from those published online by the National Health Commission of China,” says Roosa. “So they publish daily and we have someone who daily goes on there and gets the data and sends it to us.”

When a team from Johns Hopkins University (JHU) launched the first global coronavirus tracker on January 22, the world map had only a few splotches of red—mostly in China. Now, enormous red circles blot out much of the globe. By mid-February, the site had over 140 million views, and due to its popularity, copycat sites with malware on them had popped up.

At first, the team updated the site manually, adding data to a Google spreadsheet, primarily from DXY.cn, a social network of physicians that has been tracking cases, aggregating reports from hospitals and municipal governments to provide a real time map of COVID-19 in China.

Data continued to pour in from the WHO, the European Centre for Disease Prevention and Control, and more, so the JHU team automated most of the system to update every 15 minutes and moved the data to GitHub, a website typically used by programmers to host their code. This automation allowed it to give more timely updates than the WHO and Chinese CDC. On GitHub, the data is stored in spreadsheets that can be easily accessed by other researchers. Though the site is primarily used by coders and researchers handling data, people in China have even used GitHub as a censor-proof archive for news articles and personal accounts.

In the U.S., critical data on the amount of testing has been difficult to track down due to the lack of centralization, which has led to efforts like the COVID Tracking Project.

Demand for data comes not just from policymakers and researchers, but everyone who wants to stay informed about the spread of the pandemic.

“In terms of who is using this dashboard, as far as I can tell it’s pretty much everybody,” said JHU epidemiologist Lauren Gardner in a presentation. “I think this really speaks to this huge demand for reliable, trustworthy, objective information—especially around situations like these.”

Uncertainties about data have been around since the dawn of epidemiology. When the Swiss physicist Daniel Bernoulli developed the first mathematical model to predict the spread of smallpox in 1766, he ran into problems with data provided by the British astronomer, Edmund Halley. Though Halley listed the number of children in Breslau who made it to one year old as 1,000, he did not provide information about how many had died in their first year. “It would appear that M. Halley wished to start with a round number,” Bernoulli wrote, evidently frustrated at the lack of precision.

Though data collection is now modernized, organized in Excel spreadsheets and updated in real time, modern researchers still face many of the same uncertainties Bernoulli did. For example, many researchers are questioning whether the cases we’re seeing are really representative of the true number of infections.

“The data collected so far on how many people are infected and how the epidemic is evolving are utterly unreliable,” Stanford data scientist John Ioannidis wrote in an op-ed. “We don’t know if we are failing to capture infections by a factor of three or 300.”

Additionally, China’s report of zero new domestic cases has caused some to wonder aloud if Chinese officials are suppressing data, in part because of the initial cover-up.

But every country, not just China, is undercounting cases because there are a large number of asymptomatic and mild cases; estimates range from 18 percent to 50 percent. Many researchers believe that without universal testing, we will miss roughly half of all infections because the virus is mild or asymptomatic in many infected individuals. Paradoxically, this mildness has made the virus more deadly overall: If those who were infectious were incapacitated, the virus would be much worse at spreading.

When testing was minimal in China, prior to January 23, researchers estimate that only 14 percent of cases were documented. Once testing ramped up in February, that number jumped to about 65 percent. In other words, minimal testing only saw the tip of the iceberg, and even rigorous testing left massive gaps.

Uncertainties about the data remain, but we are far from ignorant. With more testing, the data we have will only get better and so will our understanding of the virus.

Modeling pt. 1

A quote often attributed to the Danish physicist Niels Bohr goes something like this: “It is difficult to make predictions, especially about the future.” In no small irony, even the quote’s origins are murky. Compared to Bohr’s work making predictions in the subatomic realm, modeling outbreaks is substantially trickier, though epidemiologists emphasize that the field has advanced by leaps and bounds in the past decade, much like weather forecasting.

“At one time hurricanes just arrived in Miami with little notice and there were no opportunities to prepare,” says Glenn Webb, a mathematical epidemiologist at Vanderbilt University. Just as meteorologists track the possible paths of a hurricane, Webb says his goal is to “predict the path of an epidemic.”

What those paths can look like depends a great deal on how infectious a disease is. Epidemiologists keep track of epidemics with a number they call R₀, pronounced “R-naught.”

“R₀ is just a very traditional parameter we use to estimate the number of infected cases generated by the primary cases,” says Qiao Fan, an epidemiologist at the Center for Quantitative Medicine in Singapore.

For example, an R₀ of 2 means that on average, one person infects two others; an R₀ of 1 means that one person usually infects only one other person. Below one, you reduce spread; above it, the disease can spiral out of control, into an epidemic. It is not a magic number, but a statistical proxy for contagiousness, and exceptions do exist.
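To make the threshold at 1 concrete, here is a small sketch of how the expected number of new cases changes from one generation of infection to the next. It assumes, purely for illustration, that every case infects exactly R₀ others on average and that nothing else changes over time; the function name is hypothetical.

```python
def expected_cases_by_generation(r0, generations, initial_cases=1):
    """Expected new cases in each generation if every case infects r0 others."""
    return [initial_cases * r0 ** n for n in range(generations + 1)]


# Below 1 the chain of infection shrinks, at 1 it holds steady,
# and above 1 it grows geometrically.
for r0 in (0.5, 1.0, 2.0):
    print(r0, expected_cases_by_generation(r0, 5))
```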

Determining what R₀ is is critical to understanding the virus. But calculating the value isn’t so straightforward—there’s nothing in the virus’ genetic code that gives an unambiguous answer. R₀ also depends on circumstantial factors, such as time of year and location. There are no lab experiments where infected volunteers cough near susceptible volunteers. Instead, researchers have to peer into the data and draw out a value for R₀.

One way of doing this is by looking at how quickly the total number of cases grows and working backward to R₀. In the early days of an outbreak, researchers do this by relying on messy and limited data. The problem is that the approach lacks precision. Especially when there are many unreported cases, the growth of the total number of cases is a poor approximation. More targeted approaches to sussing out R₀ can use contact tracing, where researchers look at infected individuals and count how many times they spread the virus on average.
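As a rough sketch of the first approach, a growth rate can be converted into R₀ with a textbook shortcut. The relation used here, R₀ = 1 + r × D, holds for the simplest epidemic model, in which people stay infectious for an average of D days; it is an assumption for illustration, not necessarily the method used by the researchers quoted in this piece. The doubling time and infectious period below are likewise made-up inputs.

```python
import math


def growth_rate_from_doubling_time(doubling_time_days):
    """Daily exponential growth rate r for a given doubling time."""
    return math.log(2) / doubling_time_days


def r0_from_growth_rate(growth_rate_per_day, infectious_period_days):
    """Approximate R0 as 1 + r * D (simple SIR-style assumption)."""
    return 1.0 + growth_rate_per_day * infectious_period_days


# Illustrative inputs (assumed): cases double every 5 days and people
# are infectious for 10 days on average.
r = growth_rate_from_doubling_time(5.0)
r0 = r0_from_growth_rate(r, 10.0)
print(f"growth rate: {r:.3f} per day, R0: {r0:.2f}")
```

The weakness described above shows up directly: if reporting lags or testing expands, the apparent doubling time changes and the inferred R₀ moves with it, even if the virus itself has not changed.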

So, what is R₀ for SARS-CoV-2? There are dozens of estimates, but most settle around 2.5 and modelers tend to use values around there as a baseline when making predictions.

At the beginning of the outbreak, multiple teams measured the outbreak in Wuhan to have R₀ closer to 4 or 5. This could have been due to inadequate data, but values that high are not totally out of the question. Under conditions like a 40,000-person potluck, the virus could spread far more prolifically. Notably, after China implemented the lockdown, R₀ went down, as low as 0.3, as each infected individual cut off routes of transmission. Some of the highest R₀ calculations were found for a cruise ship, the Diamond Princess. Estimates averaged around 6 to 7 prior to intervention. With ultra-dense conditions of roughly 24,000 people per square kilometer, the disease spread furiously.

Growth is also asymmetric, according to David Fisman, an epidemiologist at the University of Toronto. The characteristic shape of the number of cases in an epidemic has a steep slope upward, until a peak, and then a drop with a long tail.

“The level you get to before you shut this down is critically important not just because you had more cases getting there, but because the journey to the finish line is now that much longer, from that much higher a peak,” Fisman says.

At the beginning of an outbreak, when answers about the nature of the virus transmission and detailed data are scarce—the incubation period and number of asymptomatic infections are as yet undetermined—researchers rely on phenomenological models which seek mainly to reproduce the growth patterns present in the data.

“In the absence of sufficient data, it would be wise to have a model that has less model parameters involved,” says Jianhong Wu, an applied mathematician at York University in Toronto. “The phenomenological model is useful, especially when you don’t know or know much about the mechanism of transmission.”

More complicated models that rely on assumptions could go wrong, Wu argues, so fewer parameters means a smaller chance for error. These models can be as simple as an exponential curve with one parameter governing growth over time.
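A one-parameter exponential model of this kind can be fit with nothing more than a straight line through the logarithm of the case counts. The sketch below is a minimal illustration using made-up counts that grow by about 20 percent per day; the function name and data are hypothetical, and real phenomenological models are usually more flexible than this.

```python
import math


def fit_exponential_growth(daily_counts):
    """Least-squares fit of log(count) = log(c0) + r * day; returns r."""
    days = range(len(daily_counts))
    logs = [math.log(c) for c in daily_counts]
    n = len(logs)
    mean_day = sum(days) / n
    mean_log = sum(logs) / n
    covariance = sum((d - mean_day) * (l - mean_log) for d, l in zip(days, logs))
    variance = sum((d - mean_day) ** 2 for d in days)
    return covariance / variance


# Synthetic counts (not real data), growing roughly 20 percent a day.
counts = [100, 120, 144, 173, 207, 249, 299]
growth_rate = fit_exponential_growth(counts)
print(f"growth rate: {growth_rate:.3f} per day")
print(f"doubling time: {math.log(2) / growth_rate:.1f} days")
```

The single fitted number, the growth rate, is the whole model. That is its appeal when little is known, and also its limit: it says nothing about why cases are growing.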

Some more complex phenomenological models try to fit the data to stochastic, or random, processes that mimic the spread of a virus. In Victorian England, a major concern of the ruling class was that the aristocracy was dying out. “Surnames that were once common have since become scarce or have wholly disappeared,” the statistician and founder of eugenics, Francis Galton, wrote in 1875.

To determine if this was a result of fertility problems or part of some sort of mathematical inevitability, he enlisted the help of a mathematician, the Reverend Henry William Watson, to make the calculation. Watson found that the nefarious force was math itself. If surnames were passed down patrilineally—from father to son—and each family had some chance of not having a son reach adulthood, the family tree of surnames would be continually pruned, leading to fewer surnames over time.

This branching process turns out to apply to far more than just English surnames. It has uses for describing the survival of a genetic mutation, the start of a nuclear chain reaction, and even the spread of viruses. For an epidemic, the branching process models how each individual creates a new generation of infections. When the branch goes extinct, instead of a surname going extinct, it means that the viral transmission is cut off. Key questions about how contagious a disease is can be rephrased in terms of this branching process.

“We do the same in epidemics—just how many of these branches will go extinct?” says Gergely Rost, a mathematical epidemiologist at the University of Szeged in Hungary.
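The extinction question has a compact numerical answer in the simplest version of the branching process. The sketch below assumes, for illustration only, that the number of people each case infects follows a Poisson distribution with mean R₀; real outbreaks are often more uneven than that, and the function name is hypothetical. Under that assumption, the chance that a chain started by one case dies out on its own is the smallest solution of q = exp(R₀(q − 1)), which repeated substitution starting from zero finds.

```python
import math


def extinction_probability(r0, iterations=1000):
    """Chance that an outbreak seeded by one case dies out by itself.

    Assumes Poisson-distributed secondary infections with mean r0 and
    solves q = exp(r0 * (q - 1)) by fixed-point iteration from q = 0.
    """
    q = 0.0
    for _ in range(iterations):
        q = math.exp(r0 * (q - 1.0))
    return q


for r0 in (0.8, 1.5, 2.5):
    print(r0, round(extinction_probability(r0), 3))
```

Under these assumptions, a single introduced case with an R₀ of 2.5 fizzles out by chance only about one time in ten, while any R₀ below 1 guarantees that the branch eventually goes extinct.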

Using basic assumptions like R₀, epidemiologists can calculate what the branching spread of an outbreak looks like. Because phenomenological models are simple, or rely on random spreading processes to guide them, they “just kind of let the data do the talking,” according to Roosa.

“The downside to that is with the phenomenological models, we can’t assess different intervention strategies,” she says. “For forecasting purposes, the simple models are good, but for generating scenarios, not so much.”

Modeling pt. 2

Enter the compartmental model—so termed because it segments the population to better account for the mechanisms of spread.

“The fundamental idea is try to partition or stratify the entire population into different disjoint groups according to their epidemiological status,” says Wu. “Being susceptible, being infected, being recovered.”

Different models use slightly different variations on the basic SIR (susceptible, infected, recovered) partition; some, for instance, add an exposed category to make an SEIR model. These models are critical for generating long term predictions, far beyond what can easily be extrapolated with a simple phenomenological model.
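A bare-bones SIR model fits in a short function. The sketch below is a minimal illustration under textbook assumptions (a well-mixed population, a fixed R₀, no interventions, and simple fixed-step integration); it is not any of the quoted researchers' models, and the 5-day infectious period is an assumed value.

```python
def simulate_sir(r0, infectious_days, population, initial_infected,
                 days, dt=0.1):
    """Integrate the SIR equations with simple Euler steps.

    Returns (susceptible, infected, recovered, peak_infected) at the end.
    """
    gamma = 1.0 / infectious_days      # recovery rate per day
    beta = r0 * gamma                  # transmission rate per day
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    peak = i
    for _ in range(round(days / dt)):
        # People move from susceptible to infected, then to recovered.
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return s, i, r, peak


# Illustrative run: R0 of 2.5, a 5-day infectious period (assumed),
# one initial case in a population of one million, no interventions.
s, i, r, peak = simulate_sir(2.5, 5.0, 1_000_000, 1, days=365)
print(f"share ever infected: {(i + r) / 1_000_000:.0%}")
print(f"peak simultaneous infections: {peak:,.0f}")
```

Left entirely alone, this toy epidemic infects the large majority of the population before it runs out of susceptible people. Adding compartments or changing the transmission rate partway through is how modelers represent interventions.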

“The usual way an epidemic is controlled is that so many people get infected, they have immunity,” says Webb. “There’s not enough susceptible people out there to support the epidemic.”

This is the calculus that has been filtered down to predictions like “40 to 70 percent of the world will be infected” or California governor Gavin Newsom’s claim that 25 million people in California will be infected, or those which fueled the UK’s initial strategy. These sky high numbers are not a realistic scenario, according to Roosa, because they assume little to no intervention.

But they make a certain kind of sense. From an epidemiological perspective, a virus only stops if there are no more susceptible people. For that to happen, it usually requires a huge reduction in the susceptible numbers either due to acquired immunity or a vaccine. In countries where coronavirus has been relatively controlled, like China, Taiwan, South Korea, and Japan, that is not the case. The vast majority of the population is naive, or not immune. Massive interventions have changed the course from the usual to a sort of artificially tamped-down level.
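One textbook relation that yields figures in this range is the herd immunity threshold. It is offered here as a hedged illustration, assuming a well-mixed population with no interventions, and not as the method behind any specific headline number. If a fraction s of the population is still susceptible, each case infects about R₀ × s others on average, and spread declines once that product falls below 1:

```latex
R_0 \, s < 1
\quad\Longleftrightarrow\quad
1 - s > 1 - \frac{1}{R_0},
\qquad
R_0 = 2.5 \;\Rightarrow\; 1 - \frac{1}{2.5} = 0.6
```

With an R₀ of 2.5, the immune share has to pass 60 percent before the outbreak recedes on its own; values of R₀ between roughly 1.7 and 3.3 give thresholds of about 40 to 70 percent.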

By subdividing the population, compartmental models offer a chance to understand the mechanics behind these interventions and more.

“We’re focused on the unreported cases, and the pre-symptomatic cases that are infectious,” says Webb. “There’s no doubt that there’s people who are not showing symptoms that are infectious. And also there are many unreported cases. Both are a little mysterious, but we wanted to include them because they’re definitely a factor in the transmission.”

Many compartmental modelers feel that their goal is not to pinpoint equations that may perfectly forecast the outbreak, but to capture the broader dynamics of the system.

“It’s not really about being as correct as possible. It’s more about like, ‘which mechanisms are sufficient to describe the data?’” says Ben Maier, an epidemiologist at Humboldt University in Germany. Getting the mechanisms right can provide better long term forecasting than a short term prediction which is right for the wrong reasons.

Some of these compartments have subdivisions. Wu’s model, for example, adds a “quarantine” compartment within the exposed compartment, and subdivides an exposed component into identified, confirmed, and hospitalized compartments. The goal is to reflect strategies being employed by countries like South Korea and Singapore. By creating compartments for “quarantined” and “identified” individuals, Wu’s model aims to capture the results of contact tracing, in which everyone who possibly came in contact with the infected is placed into quarantine until their test results come back.

So, which model should be used? Phenomenological? SIR? SEIR? Here, scientists tend to agree in unsatisfying consensus: No one model is necessarily the best; each has different strengths and weaknesses—as Wu puts it, the best model “depends on what types of issues you’re trying to address.”

The modeling situation is dynamic; predictions based on data only a few days old may be thousands of cases behind and a poor reflection of the current situation. Other models perform with remarkable accuracy weeks later.

We know this largely due to the way the research is being done and published. Normally, epidemiological research goes through a publication process that includes months of peer review. But to accelerate their research and conversations, scientists have been using preprint servers like medRxiv to rapidly publish papers prior to peer review and the typical publication cycle. For Fisman, who remembers how slow publication led to a lagged response to the 2014 Ebola crisis, adoption of preprints has been a “godsend.”

As the latest round of predictions rolls out, it is providing even more rigorous and sophisticated models of China, which has the most data, but also, critically, estimates about the outbreak in other countries: Singapore, South Korea, Japan, Italy, Iran, and the United States.

One conservative estimate for the U.S. based only on direct air travel from Wuhan to the U.S. found that by March 1, there could be as many as 9,484 cases. At the time of the preprint’s release on March 8, the U.S. was reporting only 500 cases.

Another key result: Interventions are massively important. A February 28 preprint estimated that the epidemic would level off around 81,000 total cases by March 21, but if interventions had been implemented one week earlier, the number would be about 6,000 total cases, more than ten times fewer. Conversely, if China had waited another week to intervene, due to the virus’ exponential spread, the total number of cases would be about 1.2 million.

There are, at this point, dozens and dozens of papers and preprints about the epidemiology of the pandemic, but two conclusions are inescapable: interventions must be sufficiently restrictive and they must come early. If they fail in either regard, many, many more people will become sick and die.

China

What we know about the beginnings of SARS-CoV-2, the technical name for the virus, is that it originated in a non-human animal, and that it has great genetic similarity to coronaviruses found in regional bats. At some point in November, the infection was passed on to a human, and it spread from the now infamous Huanan Market, in Wuhan.

In December, as people began to fall sick, local doctors noticed something odd. They began to put the pieces together and send out warnings to their colleagues. At the end of December, eight doctors including Li Wenliang attempted to get the word out. Their attempts were silenced, and the response was stalled. A month later, Li would die from coronavirus.

Xi Jinping was well aware of the epidemic early on, which he discussed in an internal meeting on January 7, though it would be 13 days until he made public comments. As of January 23, there were still fewer than 1,000 confirmed cases in Hubei province. Wuhan was then placed under lockdown, and severe travel restrictions were imposed across China.

As of this writing, China has had a total of 80,000 confirmed cases, and many of those cases have resolved with the patients recovering. China therefore offers an almost complete trajectory of COVID-19 through a country, and there is much to learn.

We know now that the very first estimates of spread in China by Imai et al. and Chinazzi et al., which predicted thousands of untested cases, were largely correct: Under minimal testing and no interventions, the virus had been spreading, well, virulently.

Over the next few weeks, multiple teams attempted to predict the cumulative number of cases in China. Some used a phenomenological model while others used a mechanistic model. Almost all teams ended up being off, in no small part due to the fact that China changed the way it counted cases from February 13 to February 19. The change caused an unexpected spike of about 15,000 cases, mostly centered in Hubei.

“Originally they were reporting only lab-confirmed as their confirmed cases,” says Roosa. “And then on February 13, they changed it to include all those who also had clinical symptoms and hadn’t been confirmed yet.” Less than a week later, they went back to counting only lab-confirmed cases. Roosa’s model, which was accurate for the rest of China, estimated the number of cases in Hubei by Feb. 24 would be 37,000. The actual number of reported cases was 65,000.

Several models by non-epidemiologists attempting to predict the final total ended up lowballing the number of cases around 45,000–50,000. Estimates by epidemiologists were closer: 63,600 for Maier’s model, about 63,000 for Wu’s model.

It is worth reiterating that these curves which level off, suggesting an end to the epidemic in China, are not natural. Only when China implemented a lockdown on Wuhan and quarantine on the rest of China did the situation change. Wu found that prior to lockdown, R₀ in Hubei was about 6.4; a week later it was 1.6; a week after that it was 0.4. Maier saw similar results: the number of infected grew exponentially until growth hit the brick wall of quarantine.

Lockdowns and quarantines were initially mocked or criticized by many Western researchers who felt them infeasible. How could such measures really work? How could a full quarantine work on 1.4 billion people, and what would be the point if the population would still be naive afterwards? Meanwhile, many Chinese epidemiologists had worked on these papers while quarantined inside their apartments.

Wu, who was born in China, credits the success of the interventions to a culture of resilience. Other countries with democratic governments like South Korea and Taiwan have also implemented their own version of the quarantine/suppression and contact tracing to keep numbers low.

“These measures were extreme and we viewed this as a very drastic change in transmission that occurred at a certain date,” says Webb. “Before that, the transmission was constant, because there was an exponential growth phase.”

When the measures were implemented mattered most, as Webb’s Feb. 28 paper showed. Roughly speaking, interventions a week earlier would have resulted in 1/10th the number of cases, but waiting a week longer would have resulted in 10x the number of cases. Other effects, like the ratio between the number of reported and unreported cases, were much less consequential, only doubling the total cases. (More unreported cases implies a hard-to-track population that is nevertheless infectious.) Similarly, changing the incubation period, that is, lessening the amount of time between when people were infected and when they started being infectious themselves, produced little change in the cumulative cases.
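The tenfold-per-week rule of thumb can be unpacked with a little arithmetic. The sketch below treats unchecked spread as pure exponential growth, which is a simplification rather than the model used in the paper, and asks what daily growth rate and doubling time a tenfold weekly change implies. The function names are hypothetical; the 81,000 baseline is the leveling-off figure quoted earlier in this piece.

```python
import math


def implied_daily_growth_rate(fold_change, days):
    """Exponential growth rate that produces fold_change over days."""
    return math.log(fold_change) / days


def cases_if_shifted(baseline_cases, shift_days, growth_rate_per_day):
    """Final cases if interventions start shift_days later (negative = earlier)."""
    return baseline_cases * math.exp(growth_rate_per_day * shift_days)


rate = implied_daily_growth_rate(10, 7)
doubling_time = math.log(2) / rate
print(f"growth rate: {rate:.3f} per day, doubling every {doubling_time:.1f} days")
print(f"one week earlier: {cases_if_shifted(81_000, -7, rate):,.0f} cases")
print(f"one week later:   {cases_if_shifted(81_000, 7, rate):,.0f} cases")
```

A tenfold change per week corresponds to cases doubling roughly every two days, which is why a delay that sounds short on a political timescale is enormous on an epidemiological one.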

Getting to these measures is no small matter. It requires an immense amount of political capital. When I asked Fisman about the feasibility of implementing quarantines on March 8, he emphasized the difficulty.

“Oh my God, you can’t do that unless people are terrified,” he says. Conditions often already have to be bad enough that “they’re ventilating people in hallways, their hospitals are overfilled and people are dying. What you’re looking at with this disease is situations where you literally cannot care for people.”

More contagious than the flu, SARS-CoV-2 is also far deadlier. If the epidemic peaked in Canada, Fisman estimates that roughly 0.7 percent of the adult population would require a ventilator. Canada has roughly 30 million adults, so 210,000 would require a ventilator. No country in the world has close to that many ventilators—a 2018 report estimated that at best, the U.S. could ventilate 160,000 patients.
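The arithmetic behind that comparison is short enough to check by hand, but here it is as a small sketch. All three numbers come from the paragraph above; the variable and function names are just illustrative.

```python
def ventilators_needed(adults, fraction_needing_ventilation):
    """Number of adults who would need a ventilator at the epidemic's peak."""
    return round(adults * fraction_needing_ventilation)

canada_adults = 30_000_000   # roughly 30 million adults
fraction = 0.007             # Fisman's estimate of 0.7 percent
us_best_case = 160_000       # 2018 estimate of U.S. ventilation capacity

needed = ventilators_needed(canada_adults, fraction)
print(f"ventilators needed in Canada: {needed:,}")
print(f"gap versus best-case U.S. capacity: {needed - us_best_case:,}")
```

In other words, Canada's estimated peak need alone would exceed the best-case capacity of the far larger U.S. system by about 50,000 patients.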

The case is clear: Our only option to avert this dire scenario is massive suppression measures to reduce R₀ and force the outbreak’s growth down to manageable levels, at which point it is possible to switch to contact tracing and containment.

“It requires massive social distancing. That’s no football matches. That’s no concerts. That’s no school. That’s no working from the office. Anyone who can work from home is working from home and you’re shutting places down,” says Fisman. “The difficulty for Italy and for all of us, is you can do that—you can mobilize political capital to do that, when you’re in the middle of the crisis, and people are seeing deaths around them. When it’s more effective is before. That’s the problem.”

If suppression is achieved, then contact tracing and containment strategies can kick in. As Wu describes it, the strategy relies on teams who do quick tests to identify the infected, who are put into isolation. Then, investigators search for anyone the infected individual might have been in contact with while contagious. A February 11 preprint found that, given R₀ of 2.5 to 3.5, roughly 70 to 90 percent of contacts would have to be traced in order to keep outbreaks suppressed.

However, contact tracing also depends on the number of presymptomatic and asymptomatic cases and how much these cases—which are temporarily or completely unreported—drive infections.

Within China, differences between provinces are stark. While Wuhan City in Hubei was the epicenter of the crisis, the virus spread throughout China, necessitating intervention measures nationwide. Unlike in Hubei, the virus did not get the chance to grow exponentially in other provinces. Interventions prevented that, and the growth remained relatively low.

These infections in other provinces did not necessarily act in expected ways. For instance, Heilongjiang, a province in the northeast of China—far away from Wuhan—had the highest infection growth rate of all provinces. Conversely, Beijing, an ultra dense city, had the lowest growth rate. According to Fan, this was a result of stricter measures in Beijing than other areas.

Tianyi Li, a Ph.D. candidate at MIT’s Sloan School of Management, used his experience modeling complex systems to come up with a network that looked at how transportation modes led to spread across China.

“Especially in China, where the public flow is massive, you have to consider the local dynamics,” says Li. “On different transportation media, the transmissivity is different.” For example, on planes, fewer people talk, leading to less transmission. But on trains, talking amongst neighbors is common, which could lead to high transmission.

There are other differences between modes of transit. Notably, neither cars nor planes have path overlap. But on a train, say, from Shanghai to Beijing, passengers will board at stops along the way. Path overlap greatly increases the chance of cross-infection, allowing the virus to disseminate widely and quickly. Li found that big cities accounted for 15 percent of transit-driven infections, while small cities contributed nearly zero cases—it was all driven locally after an infection was imported.

One other important transit factor is the time of year: around January, Chinese New Year, hundreds of millions travel. By many estimates, it is the largest holiday travel on the globe. While some of that travel was cut short by restrictions, some of it had already happened in January before things were locked down.

An important value to determine from the data is the incubation period, or the time during which a person is infected but does not yet show symptoms or spread the virus. For the coronavirus, it has proven quite short, at roughly 5 days.

Elsewhere

“There’s this obsession early in an epidemic with where the cases came from,” says Fisman.

When King Charles VIII of France invaded Italy in 1494, his soldiers were stricken with a plague. After they returned home, they spread the disease far and wide. The French called it the Italian disease; the Italians called it the French disease. For the Dutch, it was the Spanish disease, for the Russians, the Polish disease, and for the Ottoman Turks, it was the Christian disease. We know it today as syphilis.

Enmity can spread much like a virus, but there’s much to learn from other countries. In particular, exported cases can be a powerful indicator of what the spread is from the originator country. This is what Imai et al. and Chinazzi et al. did to predict cases early on in China, and it’s what Fisman’s team has been able to do for Italy and Iran. From 46 exported cases abroad and travel data, the researchers estimated that the true number of cases in Italy on Feb. 29 was roughly 4,000, not the 1,128 that were counted. For Iran, they found a much larger discrepancy: 18,000 estimated cases against only 43 reported cases, as of February 23. A Hong Kong-led team came to a similar conclusion, estimating around 16,500 cases on February 25. Both countries then underwent an explosion of confirmed cases and deaths.

“Someone’s termed it ‘forensic epidemiology,’ which I quite like,” says Fisman. “When it’s a new outbreak and people are catching up with testing and they don’t know what the extent of the outbreak is inside the country, you can indirectly estimate what the size must be if you have access to travel volumes.”

It’s a little like figuring out how much confetti was dropped at a party by counting the number of people at nearby restaurants with paper stuck to them. For Fisman, the takeaway is not that there are exactly 4,000 cases in Italy, it’s that the exported cases suggest Italy is missing a ton—maybe 75 percent of their cases.
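The "maybe 75 percent" figure can be roughly recovered from the reported and estimated counts quoted earlier. This is only a back-of-the-envelope sketch using those published numbers. It does not reproduce the travel-volume modeling the researchers actually did, and the function name is my own.

```python
def fraction_missed(reported, estimated):
    """Share of estimated cases that never showed up in the official count."""
    return 1 - reported / estimated

italy = fraction_missed(reported=1_128, estimated=4_000)   # as of Feb. 29
iran = fraction_missed(reported=43, estimated=18_000)      # as of Feb. 23

print(f"Italy: about {italy:.0%} of cases missed")
print(f"Iran: about {iran:.1%} of cases missed")
```

Italy comes out at a bit over 70 percent, in line with Fisman's rough figure, while Iran's official count would have captured well under 1 percent of the estimated cases.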

Some countries, like Singapore, have proven remarkably resilient, and thus far avoided exponential growth, which several researchers attribute to diligent contact tracing, among other measures. One odd data point: Singapore has not required masks, which has surprised a number of researchers, including Jin Cheng, a mathematician at Fudan University in Shanghai. Cheng attributes a substantial amount of China’s success at quashing the epidemic to masks, but admits that Singapore seems to be succeeding with a different strategy.

What’s the best way to reduce a country’s risk of an outbreak? A team of researchers at Szeged University in Hungary, led by mathematical biologist Gergely Röst, estimated the best way to reduce risk while the virus was mainly in China.

“You achieve the largest reduction of your risk with the smallest effort,” says Röst. “If you have very low connectivity to China, for example, but very high local transmission potential, then the best thing to do is you reduce your connection more with China.”

He gives the example of a soccer player who has strengths and weaknesses. According to Röst, it’s better for countries to focus on their strengths—low connectivity to countries with infection, or low local transmission potential—than try to shore up weaknesses.

South Korea has proven its strength at reducing transmission. The country was on the way to out-of-control growth at the end of February, when nearly 1,000 cases were reported on a single day. But by mid-March, South Korea had flattened the curve, with fewer than 100 cases reported per day. How? One strategy has been to use high-tech contact tracing, leveraging masses of mobile data to track where infected people have been and whom they might have been in contact with. The approach has raised concerns about privacy, but the results are hard to argue with.

Transmission may also be impacted by demographic-level differences. A February 27 preprint by researchers from the University of Warwick predicted reduced transmission across Africa, Central America, the Middle East and India, but high rates in Europe and Japan due to the age of the population, because older individuals are more susceptible to SARS-CoV-2.

Another factor can be temperature. The more time people spend indoors, the more likely they are to catch a virus, and cold weather is known to lower the strength of an immune system. Even so, it’s not everything. Countries like Singapore have done well primarily because of their intervention measures, says Wenbin Chen, a researcher on the Fudan University team.

A paper by a Portuguese team found that temperature and humidity contributed only 18 percent of the epidemic’s variance. That is, the difference in viral spread between regions in China was only partially due to differences in temperature. A standard doubling time—the duration it takes for confirmed cases to double—at 20 Celsius might be 5 days, but at a steamy 40 Celsius, the doubling time should increase to roughly 7 days, slowing down growth.
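To see what a shift from a 5-day to a 7-day doubling time means in practice, here is a small sketch. It assumes uninterrupted exponential growth, and the 30-day horizon is an arbitrary choice of mine for illustration, not a figure from the paper.

```python
import math

def daily_growth_rate(doubling_days):
    """Continuous daily growth rate implied by a given doubling time."""
    return math.log(2) / doubling_days

def growth_factor(days, doubling_days):
    """How many times cases multiply over `days` at a fixed doubling time."""
    return 2 ** (days / doubling_days)

horizon = 30  # days; hypothetical horizon chosen for illustration
for label, doubling in (("20 Celsius", 5), ("40 Celsius", 7)):
    rate = daily_growth_rate(doubling)
    factor = growth_factor(horizon, doubling)
    print(f"{label}: {rate:.1%} per day, {factor:.1f}x after {horizon} days")
```

Under these assumptions, a month of growth multiplies cases 64-fold at a 5-day doubling time and about 20-fold at 7 days. That is slower, but nowhere near enough to stop an outbreak on its own, which fits the point that temperature is not everything.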

Fatalities

There are now thousands of people dying from the virus every day. Many are old and infirm, or immunocompromised. Others are young and healthy, like Li Wenliang. When patients die, they are converted into a numeral, shifted from one column to the next. A journey in human suffering across three compartments: susceptible, infected, deceased.

In general, epidemiologists don’t make predictions based on deaths.

“This mortality rate is not necessarily just a scaled down rate of the infection with certain delay—it’s not that simple. It’s much more complicated,” says Wu.

Comorbidities have powerful effects on fatalities, as do age and even gender. A February 27 preprint tracking 1,590 patients in China found that nearly a quarter of patients had comorbidities such as smoking or hypertension. Men accounted for nearly 60 percent of patients with comorbidities, and patients with a comorbidity were about 15 years older than those without.

Differences exist between countries as well. China’s total fatalities have been passed by Italy, though Italy has fewer confirmed cases. According to Wu, there are two main reasons for this much higher mortality rate. First, the number of seniors is far higher in Italy than in China. The average age in Italy is about 8 years older, which has led to devastating numbers of octogenarians succumbing.

The other is the availability of public health. In China, healthcare providers poured into Wuhan and temporary hospitals were constructed in a week. Doctors in Italy are being forced to make terrible choices, triaging patients. In some places, there are so few ventilators that if a patient is over 65, they will not receive one.

If the virus is not controlled with intervention measures, a March 16 Imperial College study found, 2.2 million people in the U.S. would die.

Conclusion

At this point, it is time to address a caveat. Mathematical models, even when they succeed, will not capture or predict many important intangible outcomes. There are social and economic costs of quarantines that no SEIR model is capable of approximating. Neither can they assess the emotional toll from the loss of life.

Even factors that are, in principle, capable of being captured—varying degrees of industrialization, healthcare payment models—complicate matters and prevent easy mathematical generalizations.

In 2003, the SARS epidemic hit Toronto after an elderly woman contracted the disease in Hong Kong. The city implemented strict protocols in hospitals and isolated infectious individuals, allowing it to control the virus. But when the virus seemed to be gone, and the city released the protocols, the outbreak came roaring back, eventually infecting several hundred people.

For Fisman, a Torontonian, 2003 is not the proper historical precedent. “None of us were around in 1918. I think these are unchartered waters,” he says. The number of cases countries are currently missing is a strong indicator of how difficult the virus will be to control.

Other researchers have similarly dire outlooks.

“In the very long term, you somehow have to reduce the number of susceptibles. Either by vaccination or they just contract the disease,” Röst says.

“My feeling and again—keep in mind I’m a mathematician—my feeling is this problem cannot be solved until we have the vaccine widely available,” Wu says.

How to write an email to a researcher you’ve never spoken to before

Since I’ve gotten asked about careers in science writing/journalism twice in the past week, I’ve been hunting down basic resources (what is science writing, how to pitch, where a science writing career starts) from excellent sites like The Open Notebook to help get folks started.

But this is a particularly basic question—so basic that people usually don’t ask it and (IMO) it doesn’t get a lot of good answers. Here’s my take.

SUBJECT LINE

Media Inquiry: Interesting Research

You want to clearly label your email as a media email—ideally from a specific publication, but if you’re a freelancer and not sure where it will appear, “Media” is just fine. You also want to make the topic of the email clear. Specific keywords that are relevant to their specific research are often helpful. For example, it might be better to include “Penrose process” than just “black hole” in the subject line. A more specific topic is more relevant to them and means your email is more likely to be read.

GREETING

Dear Dr. So and So,

Titles can be tricky. On first contact, I always use Dr. (as opposed to Prof.) unless I am positive they don’t have a PhD. If there are three people or fewer, use Drs. If for some reason there are more than three you can address it to “all.” Keep in mind that you generally want to avoid sending a single email to more than 3 or so researchers—things get messy. (One or two really is best.) Make sure to double check that you have spelled their name(s) correctly before sending.

INTRODUCTION

My name is Dan Garisto and I’m a freelance science journalist currently on assignment with Such and Such publication writing about [topic of interest].

You want to convey who you are and what you’re knocking on their door about, generally within a sentence or two. Often you’ll want to add a clarifying sentence about the article you’re writing.

In particular, I’m hoping to give readers a glimpse of [topic] from [relatively under-reported angle].

Sometimes, but not always, you’ll want to prove your credentials upfront with the appropriate links.

I’ve previously written about [topic] here, here, and here.

REASON FOR CONTACT

I’m reaching out because of your work on [topic of interest], especially [somewhat recent paper].

In some ways, this is the most important sentence of your entire email. It’s one thing to receive a cold email from a science writer asking to talk; it’s another if they link to a highly specific (and relevant!) paper you published 18 months ago which has 3 citations. Linking to their relevant research demonstrates that you’ve actually done your homework. It’s an investment of your time into them; it shows you have genuine interest. They are so much more likely to respond if you do this.

Another possible reason:
I’m emailing because So and So said you were the expert to talk to about [topic].

Slightly less good:
Your university bio said you had expertise in [topic] and [related topic].

ASK

I was hoping to speak with you about [topic].

This is maybe the least important sentence of the entire email. Don’t spend too much time on it. That you want their time is implicit; how you explicitly state that you want it is somewhat less important. That said, a couple variants to keep in mind:

I was wondering if you’d be willing to look over [forthcoming paper from another researcher] and share your thoughts with me.

Would you or someone in your lab/one of your coauthors have time to chat?

Rather than emailing multiple people, it’s often easier to put this request in the ask. Also a good way to diversify your sources.

LOGISTICS
My schedule is pretty flexible later this week and I’m available via Skype/Zoom/phone. Could you let me know if there are any times that work for you?

Be clear about your availability, but on the first email, don’t list every time that you’re available. It’s messy and presumes a bit too much. Sometimes you’re in a crunch. Be upfront about that too.

Unfortunately I’m on deadline and I really need to get a draft to my editor by tomorrow morning or she’ll have my hide. I know this is a tough ask, but would you have time later today?

There are dozens of other permutations here, but the important thing is to remember to be gracious. Nobody owes you their time.

CLOSING

Looking forward to hearing from you.

This one is totally up to you. “Thanks for your time” works just as well.

Best/Sincerely/Cheers/Regards/Toodlepip,

Dan

PUTTING IT ALL TOGETHER

Media Inquiry: Interesting Research

Dear Dr. So and So,

My name is Dan Garisto and I’m a freelance science journalist currently on assignment with Such and Such publication writing about [topic of interest]. In particular, I’m hoping to give readers a glimpse of [topic] from [relatively under-reported angle].

I’m reaching out because of your work on [topic of interest], especially [somewhat recent paper].

I was wondering if you’d be willing to look over [forthcoming paper from another researcher] and share your thoughts with me.

My schedule is pretty flexible later this week and I’m available via Skype/Zoom/phone. Could you let me know if there are any times that work for you?

Looking forward to hearing from you.

Cheers,
Dan

I’ll update this later if I think of stuff. But for now, that’s it.

democracy_final2_final.pdf

This was not the election of a healthy democracy. A complete list of the undemocratic measures taken before, during, and after this election would stretch for pages. The proximate cause for so many—including the attempted coup—was Donald Trump. Following his lead, Republican voters and officials, all the way up to sitting senators, have conjured up a phantasmagoria where the only explanation for defeat is a conspiracy of rampant voter fraud around every corner. Where democracy is defined as a subset of the population; a herrenvolk. Theirs is a fever dream motivated by an explicit rejection of both democratic values and the intransigent reality of the ballot box. At roughly 31 cases in a billion ballots, voter fraud is astronomically rare. Voter suppression of minorities, meanwhile, remains commonplace.

It would not be unreasonable, given the efforts to throw out votes, intimidate election officials, and spread lie after lie after lie until blood was spilled on the Capitol steps, to believe that this was a more undemocratic election than most in the nation’s history.

I want to first be clear about what I am not saying. I’m not trying to issue a panglossian polemic, in the style of Steven Pinker. I am not here to acknowledge the wrongs of American electoral history only insofar as they serve to illustrate a vague ideal of neoliberal progress. I am, frankly, not abundantly optimistic about the future of American democracy, which is likely to remain yoked to undemocratic institutions like the Senate and Electoral College for the foreseeable future.

That said: I’d like to make the case that in spite of it all, this was in fact the most democratic general election in the history of the United States.

A caveat of sorts: I’m not a historian, and my work as a journalist is mostly restricted to science—particularly physics. In short, this is not my expertise. But seeing as the data bears it out, and nobody else seems to have forcefully articulated the point, I thought I’d try my hand at it.

If you wanted to tell the story of democracy in America over the past 231 years—all of its fits and starts, flaws and virtues, banalities and oddities, triumphs and tragedies—you could do worse than the following graph.

Total vote as a percentage of population (blue) and winner’s vote as a percentage of population (red) for U.S. president. Note the increases around 1820, 1865, and 1920.

What this chart illustrates, perhaps reductively, is that American democracy has not always been so. Elections today bear little resemblance to those a century ago, let alone two centuries ago. We the people were not we the voters.

With absentee ballots finally counted, President Joe Biden has topped 81 million votes, which is not only the highest total ever, but also—by a small margin—the highest total as a percentage of population. Perhaps more importantly, the total vote in 2020 nearly scraped 50% of the population, about 5 percentage points higher than the previous record in 2008.

In the sense that democracy refers to “a system of government by the whole population or all the eligible members of a state, typically through elected representatives,” elections closest to the democratic ideal are those in which more of the population votes, not less. 2012 was a more democratic election than 1912; 1912 was more democratic than 1812. This is a simple, mathematically ineluctable definition without caveat or context, but it works.
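For anyone who wants to check the measure themselves, it is a single division. In the sketch below, only the 81 million figure comes from this post. The population and total-vote numbers are rounded values I am assuming for illustration, so treat the output as approximate.

```python
def share_of_population(votes, population):
    """Votes cast as a percentage of the whole population, not just eligible voters."""
    return 100 * votes / population

# Rounded 2020 figures. Only winner_votes comes from the post; the other
# two are assumptions for illustration and should be checked against
# official counts before being relied on.
population = 331_000_000
total_votes = 158_000_000
winner_votes = 81_000_000

total_share = share_of_population(total_votes, population)
winner_share = share_of_population(winner_votes, population)

print(f"total vote: {total_share:.1f}% of population")
print(f"winner's vote: {winner_share:.1f}% of population")
```

Dividing by the whole population instead of by eligible or registered voters is the point of the measure: it folds who is allowed to vote into the same number as who actually does.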

And it works because increases in participation are so deeply entwined with expansion of the franchise that we can see, in the dips and rises of that graph, milestones of suffrage: universal white male suffrage in the 1820s, the partial success of the 15th Amendment after the Civil War, the resounding increase on the passage of the 19th Amendment, and the impact of Civil Rights legislation with which the U.S. first became a multiracial democracy.

There are plenty of confounding variables, and enormous setbacks remain. Roughly 5 million Americans remain disenfranchised due to a felony. Shelby County v. Holder needs to be counteracted with a new VRA. Automatic voter registration and the repeal of voter I.D. laws are a must.

Electoral politics are not the be-all end-all of a democracy, and maybe not a “lifeblood” (or even hemolymph) but they are a sort of transmission fluid—a substance that allows for the continued maintenance and survival of the system. Better that it flow freely.

I’ll hopefully update this later with some more historical context to flesh out the point, but for now, it’s democracy.in.america_final2_final.pdf.

Lorem ipsum yadda yadda

The first blog post on a new website is probably more akin to the befouling of a pristine litterbox than the Platonic ideal of writing: the crisp scrawl of a black pen across ecru pages of a notepad. (If I knew more about typewriters I’d throw their aficionados a bone, but alas.)

And yet, it must be done. Especially if your new website comes with it all set up ahead of time like some sort of Calvinist imperative to blog.

Anyways, here’s to future words.

Oh, and I took this picture of a ground story window somewhere near Stuyvesant Village on the Lower East Side. I’m afraid there’s no context for it, but I am fond of it.