The real research behind the wild rumors about OpenAI’s Q* project (2024)


On November 22, 2023, a few days after OpenAI fired (and then re-hired) CEO Sam Altman, The Information reported that OpenAI had made a technical breakthrough that would allow it to “develop far more powerful artificial intelligence models.” Dubbed Q* (and pronounced “Q star”), the new model was “able to solve math problems that it hadn’t seen before.”

Reuters published a similar story, but details were vague.

Both outlets linked this supposed breakthrough to the board’s decision to fire Altman. Reuters reported that several OpenAI staffers sent the board a letter “warning of a powerful artificial intelligence discovery that they said could threaten humanity.” However, “Reuters was unable to review a copy of the letter,” and subsequent reporting hasn’t connected Altman’s firing to concerns over Q*.

The Information reported that earlier this year, OpenAI built “systems that could solve basic math problems, a difficult task for existing AI models.” Reuters described Q* as “performing math on the level of grade-school students.”

Instead of immediately leaping in with speculation, I decided to take a few days to do some reading. OpenAI hasn’t published details on its supposed Q* breakthrough, but it has published two papers about its efforts to solve grade-school math problems. And a number of researchers outside of OpenAI—including at Google’s DeepMind—have been doing important work in this area.

I’m skeptical that Q*—whatever it is—is the crucial breakthrough that will lead to artificial general intelligence. I certainly don’t think it’s a threat to humanity. But it might be an important step toward an AI with general reasoning abilities.

In this piece, I’ll offer a guided tour of this important area of AI research and explain why step-by-step reasoning techniques designed for math problems could have much broader applications.

The power of reasoning step by step

Consider the following math problem:

John gave Susan five apples and then gave her six more. Susan then ate three apples and gave three to Charlie. She gave her remaining apples to Bob, who ate one. Bob then gave half his apples to Charlie. John gave seven apples to Charlie, who gave Susan two-thirds of his apples. Susan then gave four apples to Charlie. How many apples does Charlie have now?

Before you continue reading, see if you can solve the problem yourself. I’ll wait.

Most of us memorized basic math facts like 5+6=11 in grade school. So if the problem just said, “John gave Susan five apples and then gave her six more,” we’d be able to tell at a glance that Susan had 11 apples.

But for more complicated problems, most of us need to keep a running tally—either on paper or in our heads—as we work through it. So first we add up 5+6=11. Then we take 11-3=8. Then 8-3=5, and so forth. By thinking step-by-step, we’ll eventually get to the correct answer: 8.
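That running tally can be written out explicitly. Here’s a minimal sketch in Python that tracks each person’s count one small operation at a time (the variable names are mine, not part of the problem):

```python
# Track each person's apples, one bite-sized step at a time.
susan = 5 + 6              # John gives Susan 5 apples, then 6 more -> 11
susan -= 3                 # Susan eats 3 -> 8
susan -= 3                 # Susan gives 3 to Charlie -> 5
charlie = 3
bob = susan                # Susan gives her remaining apples to Bob
susan = 0
bob -= 1                   # Bob eats one -> 4
charlie += bob // 2        # Bob gives half his apples (2) to Charlie -> 5
bob -= bob // 2
charlie += 7               # John gives 7 apples to Charlie -> 12
given = charlie * 2 // 3   # two-thirds of 12 is 8
charlie -= given           # Charlie keeps 4
susan += given             # Susan now has 8
susan -= 4
charlie += 4               # Susan gives 4 to Charlie -> 8
print(charlie)             # prints 8
```

Each line is a trivial two-operand fact; only the bookkeeping across lines makes the problem hard.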

The same trick works for large language models. In a famous January 2022 paper, Google researchers pointed out that large language models produce better results if they are prompted to reason one step at a time. Here’s a key graphic from their paper:

[Figure: example prompts from the chain-of-thought paper—direct answer on the left (wrong), step-by-step reasoning on the right (correct)]

This paper was published before “zero-shot” prompting was common, so they prompted the model by giving an example solution. In the left-hand column, the model is prompted to jump straight to the final answer—and gets it wrong. On the right, the model is prompted to reason one step at a time and gets the right answer. The Google researchers dubbed this technique chain-of-thought prompting; it is still widely used today.
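In practice, chain-of-thought prompting is just a matter of what goes into the prompt string. A minimal sketch (the tennis-ball example is the well-known one from the Google paper; the call that sends this string to a model is omitted, since it depends on the provider):

```python
# One worked example whose answer spells out its reasoning; the model is
# then expected to imitate that step-by-step style on the new question.
cot_prompt = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is
6 tennis balls. 5 + 6 = 11. The answer is 11.

Q: John gave Susan five apples and then gave her six more. Susan then
ate three apples and gave three to Charlie. [...] How many apples does
Charlie have now?
A:"""

# The trailing "A:" invites the model to continue with its own
# step-by-step reasoning rather than a bare final answer.
print(cot_prompt)
```

The `[...]` stands in for the rest of the apple problem above; nothing else about the technique changes for longer questions.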

If you read our July article explaining large language models, you might be able to guess why this happens.

To a large language model, numbers like “five” and “six” are tokens—no different from “the” or “cat.” An LLM learns that 5+6=11 because this sequence of tokens (and variations like “five and six make eleven”) appears thousands of times in its training data. But an LLM’s training data probably doesn’t include any examples of a long calculation like ((5+6-3-3-1)/2+3+7)/3+4=8. So if a language model is asked to do this calculation in a single step, it’s more likely to get confused and produce the wrong answer.

Another way to think about it is that large language models don’t have any external “scratch space” to store intermediate results like 5+6=11. Chain-of-thought reasoning enables an LLM to effectively use its own output as scratch space. This allows it to break a complicated problem down into bite-sized steps—each of which is likely to match examples in the model’s training data.
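The scratch-space idea can be illustrated by unrolling the long calculation above into a sequence of two-operand steps, each simple enough to resemble facts in the training data. This is only a sketch of the principle, not how an LLM actually computes:

```python
import operator

ops = {"+": operator.add, "-": operator.sub, "/": operator.truediv}

# ((5+6-3-3-1)/2+3+7)/3+4, unrolled into bite-sized steps
steps = [("+", 6), ("-", 3), ("-", 3), ("-", 1), ("/", 2),
         ("+", 3), ("+", 7), ("/", 3), ("+", 4)]

tally = 5
for op, operand in steps:
    tally = ops[op](tally, operand)
    # Each printed line plays the role of the model's "scratch space":
    # an intermediate result written down, then read back for the next step.
    print(f"tally {op} {operand} -> {tally}")

print(int(tally))  # 8
```

Every individual step here is the kind of small arithmetic fact that appears many times in training data; only the chain as a whole is rare.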
