Tech Insider

  • Economists, hedge fund investors, and tech executives compete in a forecasting contest each year.
  • OpenAI's ChatGPT participated in the 2025 game for the first time.
  • The competition tested AI's ability to make predictions without clear online content as a guide.

The ability to forecast the future is a valuable sign of intelligence and a good test of AI's capabilities. How good is ChatGPT at prediction?

An answer to this fascinating question emerged recently when economist David Seif wrapped up an annual forecasting contest he runs for a secret group of economists, hedge fund investors, and tech executives.

Now in its seventh year, the challenge requires contestants to predict roughly 30 events. The 2025 game kicked off in late 2024, when Seif sent out the list of events to predict in fields such as politics, business, science, economics, pop culture, and sports.

One question asked the contestants to forecast whether Taylor Swift and Travis Kelce would announce their engagement by April 1. Another: Would Bulgaria adopt the euro as its official currency on or before July 1?

Sam Leffell, a director at a hedge fund firm, was filling out his probabilities in December and had an idea.

"When I was answering the questions, I had the ChatGPT screen up. I wondered what it would say to these questions," he recalled in a recent interview.

ChatGPT had to learn complex rules

Leffell reached out to Seif to ask if ChatGPT could take part, and Seif said, go for it. So, Leffell got started by pasting the game's rules into ChatGPT.

These are complex rules, covering multiple pages. Contestants must assign a percentage based on the likelihood of each event happening. As the results come in over the year, these predictions are scored a bit like golf. The lowest score wins.

"You get points equal to the square of the difference between what you put and the results," Seif said.

For example, if you assign a 90% chance to something that does happen, your error is the difference between 90 and 100, which is 10. That number is squared, resulting in a total of 100 points. Excellent work.

The opposite is more painful. If your 90% probability event doesn't occur, you are stuck with the difference between 90 and zero. That 90 score is then squared for a total of 8,100 points. Ouch.
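
The scoring rule Seif describes can be sketched in a few lines of Python. This is my own illustration of the rule as reported, not the contest's actual code, and the function name is mine.

```python
def score(probability_pct, happened):
    """Squared-error score for one question, per the contest's rule:
    the square of the difference between your stated probability
    (in percentage points) and the outcome (100 if the event
    happened, 0 if not). Lower is better."""
    outcome = 100 if happened else 0
    return (probability_pct - outcome) ** 2

# A confident 90% that comes true costs little...
print(score(90, True))   # 100
# ...but the same 90% on an event that fizzles is painful.
print(score(90, False))  # 8100
```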

And this is only the scoring system. There are whole pages of rules on other aspects of the game. Leffell pasted all this into ChatGPT.

A few seconds later, the AI chatbot responded, "Thank you for providing the detailed rules of the forecasting contest. Please share the clean list of prompts for which you need a probability estimate, and I will provide a single number for each as per the contest's guidelines."

Leffell pasted in all 30 questions at once, and ChatGPT quickly replied with its percentage probabilities for each event. Leffell sent those to Seif, who entered the responses on ChatGPT's behalf.

Even while setting this machine-prediction experiment up, Leffell noticed something intriguing.

"For one question, related to an NFL wild card outcome, it gave a mathematical response that was statistically correct," he said. "It was doing math rather than qualitative stuff. That was notable because ChatGPT, at the time, was not supposed to be good at math."

ChatGPT makes predictions

As 2025 began, 160 contestants had submitted their predictions and began waiting for the future to unfurl.

This is when I first heard about the game through friends of mine who were participating. One is a hedge fund manager. The other two are a chief marketing officer and a lawyer.

They became insufferable at parties, discussing their various forecasts, along with the intricacies of the scoring system and other rules.

It's the type of conversation that bores me to death. However, when one friend mentioned that ChatGPT was taking part for the first time, I got hooked.

Could a machine outperform 160 humans in predicting all these events? AI models are great when there's existing data. When the future's involved, there's a lot less information to lean on.

I'd recently tested ChatGPT's stock market forecasting ability. Could it excel at this more complex challenge, or are humans uniquely adept at foreseeing the future through experience, extrapolation, and intuition?

As the year progressed, some events occurred, and others didn't. Some happened too late, while others developed in weird, unexpected ways. As life does.

Each time a question was resolved, Seif updated a central spreadsheet and sent a ranking to all the contestants.

My friends seized on every update. Who was winning? Who was lagging? And most of all, where was ChatGPT ranked?

Strange symmetry

The game wrapped up on November 13.

"For the first time in the seven years we've run the contest, I pulled off the win myself," Seif wrote in his final email update of the 2025 competition.

ChatGPT came 80th, he wrote, "and we had 160 players."

Strange symmetry. I immediately texted my friends: This means ChatGPT is no better than the average human! Not very impressive.

One of my buddies, the CMO, replied: No, this means ChatGPT is as good as the average human. Incredible!

ChatGPT missed a benchmark

I asked Seif about this, and he had a different way of measuring ChatGPT's predictive power, or lack thereof.

If you'd put a 50% probability for each event happening, you'd have gotten 75,000 points. That's Seif's benchmark for whether contestants added value or not.

ChatGPT got 82,925. So it missed that benchmark, essentially adding negative value, according to Seif.
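
Seif's 75,000-point baseline follows directly from the scoring rule: a flat 50% answer costs 2,500 points per question whichever way the event resolves, and 30 questions times 2,500 is 75,000. A quick check, using the figures reported above:

```python
# A flat 50% scores the same whether the event happens or not:
# (50 - 100)**2 == (50 - 0)**2 == 2500 points per question.
baseline = 30 * (50 - 100) ** 2
print(baseline)  # 75000

# ChatGPT's reported total came in above (i.e., worse than) that baseline.
chatgpt_total = 82925
print(chatgpt_total > baseline)  # True
```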

When there was a lot of existing data to help with forecasting and calculating probabilities, ChatGPT did better, he said.

For instance, the chatbot analyzed this event well, giving it a 70% chance of happening: The winning team of the FIFA Club World Cup is from the European Union.

ChatGPT performed worse when there was a lack of data, or it missed new information that altered the likelihood of an event occurring.

For example, the chatbot assigned a 95% chance of this happening: Astronauts Suni Williams and Butch Wilmore safely return to Earth by March 1.

By the end of 2024, news announcements made it clear that this rescue mission was highly unlikely to happen by March 1, 2025, Seif said.

"ChatGPT just wasn't up with the news on that one," he added.

Maybe ChatGPT won?

Leffell, the hedge fund manager who entered ChatGPT in the game, drew different conclusions and shared some important caveats.

He asked ChatGPT to make these predictions in December 2024. OpenAI's chatbot has improved since then, so its forecasting ability may be better now. Better prompting may have also helped ChatGPT perform better.

Leffell also said that ChatGPT took only a few minutes to understand the complex rules of the game and make 30 predictions—a lot faster than most human contestants.

Leffell himself spent many hours over several days understanding the questions and researching the events to come up with his own probabilities.

"It did better than half the people, and it spent a lot less time than everyone else on the challenge," he told me. "If you look at results per minute of work, maybe ChatGPT won?"

As an investor, he's in the business of assessing as many probabilities as possible, so ChatGPT and similar AI tools have become essential, he said.

"What if you are not having to predict 30 events quickly, but 30,000 events instead? What if it's good enough at making all these predictions quickly?" Leffell said.

"It's become ubiquitous in everything I do, in my personal life and at work," he added. "We're using it a lot. ChatGPT is table stakes at this point."

Reach out to me via email at abarr@businessinsider.com.

Read the original article on Business Insider