A focus on accuracy in geopolitical forecasting

Penn Integrates Knowledge University Professors Philip Tetlock and Barbara Mellers have spent decades researching the myriad complexities inherent in forecasting future events. Their interdisciplinary work has come to fruition in a model that not only yields the most accurate geopolitical forecasting on record, but is also the only one supported by empirical evidence.

“We didn’t know what to do to improve forecasts, so we did what came naturally, and that was to run experiments,” says Mellers, who, like her husband Philip Tetlock, holds faculty appointments in the School of Arts & Sciences and the Wharton School.

Though running experiments may sound like a conventional first step, it was actually a novel thing to do, as no one had submitted forecasting techniques to any kind of testing before. This lack of rigor has led to the prevalence of biased, vague forecasts made by pundits who evade accountability—and as Mellers points out, these are the very forecasts on which politicians base decisions that put billions of dollars and thousands of lives at stake.

In response to this inadequacy, the Intelligence Advanced Research Projects Activity (IARPA), the research arm of the Office of the Director of National Intelligence, sponsored a five-year forecasting tournament to identify the factors that measurably reduce bias and improve accuracy—two sides of the same coin.

Using findings that Tetlock published in his first book, “Expert Political Judgment,” to organize the tournament, IARPA objectively scored thousands of participant predictions on topics ranging from potential military conflicts to the movements of refugees and the spread of pandemics. Tetlock and Mellers’ team, dubbed the Good Judgment Project (GJP), outperformed every other team by a margin so wide that, after two years, IARPA dropped the rest from the tournament.
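
Objective scoring in tournaments of this kind typically relies on a proper scoring rule; the Brier score, which “Superforecasting” discusses at length, is the standard example. The sketch below is an illustrative calculation only, not the tournament’s actual code: it implements the simple binary variant of the Brier score (the original formulation sums over all outcome categories and ranges from 0 to 2), and the forecast probabilities and outcomes are hypothetical.

```python
# Illustrative sketch: scoring probabilistic forecasts with a simple
# binary Brier score (mean squared error between stated probabilities
# and what actually happened). Lower is better: 0.0 is perfect,
# 0.25 is what an always-50/50 forecaster earns, 1.0 is maximally wrong.

def brier_score(forecasts, outcomes):
    """Average squared gap between forecast probabilities and outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical forecaster: probabilities assigned to three yes/no questions...
forecasts = [0.9, 0.2, 0.7]
# ...and the resolved outcomes (1 = the event occurred, 0 = it did not).
outcomes = [1, 0, 1]

print(round(brier_score(forecasts, outcomes), 4))  # 0.0467
```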

Tetlock’s new book, co-written with journalist Dan Gardner, “Superforecasting: The Art and Science of Prediction,” recounts how they did it and describes the discoveries they made, which not only are poised to transform forecasting, but also, according to Harvard psychology professor Steven Pinker, “have radical implications for politics, policy, journalism, education, and even epistemology—how we can best gain knowledge about the world.”

The book gets its title from the term GJP uses to refer to the top 2 percent of its performers—“superforecasters”—an elite subset that outperformed the average tournament forecaster by more than 50 percent. Not only were their predictions significantly more accurate, Mellers says, but they expressed themselves with more precision—and assessments showed them to be much more open-minded, as well.

Tetlock and Mellers’ findings about achievable levels of accuracy and the critical importance of open-mindedness raise the bar for forecasters and hold considerable promise for depolarizing political debates. Tetlock calls this application of their research “a key part of our agenda for future work.”

The “radically experiential” nature of forecasting tournaments, as Tetlock calls it, also has much to offer higher education. Mellers notes, “Forecasting is an important part of many jobs—a physician giving a prognosis or a lawyer anticipating courtroom questions, for example—but not a part for which there’s direct training. This tournament suggests that there are ways we can become better at forecasting and apply what we learn to other disciplines.”

She and Tetlock believe that building forecasting tournaments into the curriculum could transform higher education, giving graduate students an opportunity to integrate the deep discipline-based and broad interdisciplinary knowledge they acquired as undergraduates and to put this multidimensional learning to the test by applying it to questions about pressing geopolitical issues.

Tetlock and Mellers are quick to point out their debt to colleagues from different disciplines for GJP’s success in the tournament. These include Penn Associate Professor of Political Science Mike Horowitz, who generated questions; Project Manager Terry Murray, who is now the CEO of Good Judgment, Inc., a commercial spinoff of the GJP; and the faculty who developed the statistical algorithms upon which the forecasts were based—Penn Psychology Professor Jonathan Baron, Penn Computer and Information Science Associate Professor Lyle Ungar, and Rice University statistics professor David Scott.

Tetlock and Mellers also credit the tournament’s two IARPA managers, Jason Matheny and his successor, Steve Rieber, for bucking “the basic laws of Washington bureaucracy” to not only pursue an innovative approach to geopolitical forecasting, but also advocate for GJP in the intelligence community and the federal government.