Bloomsbury Intelligence & Security Institute (BISI)

Project Strawberry: How Microsoft and OpenAI Are Transforming AI Reasoning

Shree Priya Thakur | 30 July 2024


Summary

  • OpenAI, with Microsoft, is developing “Strawberry”, a confidential project aimed at enhancing AI reasoning skills and achieving Artificial General Intelligence (AGI).

  • Strawberry aims to enable AI to conduct autonomous deep research, anticipate user needs, and perform complex tasks over extended periods.

  • Strawberry is designed to address weaknesses of current AI models, which struggle with common-sense problems and sometimes generate nonsensical answers.


OpenAI, in collaboration with Microsoft, is working on a confidential project codenamed “Strawberry”, designed to enhance AI reasoning skills. The ChatGPT maker is developing a next-generation “reasoning” artificial intelligence (AI) model capable of acting autonomously. The aim of Strawberry is to achieve Artificial General Intelligence (AGI) by enabling AI to independently conduct “deep research” on the internet. A recent Reuters investigation reveals that Project Strawberry, an enhanced version of the earlier Q* project, aims to improve AI’s reasoning abilities and autonomous internet navigation. This would improve response accuracy and enable AI to anticipate user needs. By gathering relevant information, reasoning-based AI models can plan and execute complex tasks over time, bringing them into closer alignment with human intelligence. This is a substantial advance over Q*, which, despite demonstrating impressive capabilities in answering complex science and maths questions (scoring over 90% on a MATH dataset), lacked the broader reasoning skills targeted by Strawberry.

OpenAI’s vision with Strawberry is to push the boundaries of AGI, though specifics about the project, such as the underlying large language model (LLM), the timeline, and potential public access, are tightly controlled. However, Strawberry is reported to improve on existing chatbots in three areas:

  1. Long-Horizon Tasks: The AI model focuses on planning ahead and carrying out actions over extended periods.

  2. Deep Research: Strawberry is envisioned for deep, autonomous research, processing and analysing cross-sectional data.

  3. Post-Training Enhancement: Strawberry reportedly relies on a specialised “post-training” process, akin to fine-tuning, in which base models that have already been trained on generalised datasets are adapted further and improved through human feedback on their responses.

Strawberry aims to solve problems that current generative AI models face. While existing LLMs excel at summarising dense texts and writing poetry, they struggle with common-sense problems, such as recognising logical fallacies or playing simple games like Connect Four or tic-tac-toe. When faced with such tasks, existing LLMs become confused or “hallucinate”, generating incorrect or nonsensical information. Strawberry focuses on improving reasoning by developing a model capable of planning ahead, understanding the world, and solving complex, multi-step problems. This approach is inspired by Stanford’s Self-Taught Reasoner (STaR) method, which allows AI models to create their own training data and potentially surpass human intelligence.
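The idea of a model creating its own training data can be illustrated with a toy loop in the spirit of STaR. This is only a sketch under stated assumptions, not OpenAI’s or Stanford’s implementation: ToyModel, its skill counter, and star_round are hypothetical names invented for this example, the “model” is a stub that adds small numbers, and the rule that each round of fine-tuning raises the skill by one is a crude stand-in for real generalisation. The published STaR method also includes a “rationalisation” step for problems the model gets wrong, which is omitted here.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for a language model; not a real API."""
    skill: int = 2  # largest operand the toy model adds correctly
    seen: list = field(default_factory=list)

    def generate(self, a, b):
        # Return a (rationale, answer) pair; deliberately wrong beyond the skill level.
        answer = a + b if max(a, b) <= self.skill else a + b + 1
        return f"{a} plus {b} makes {answer}", answer

    def finetune(self, examples):
        # Assumption: training on filtered, self-generated examples
        # extends the toy skill by one (a stand-in for generalisation).
        self.seen.extend(examples)
        if examples:
            self.skill += 1


def star_round(model, problems):
    """One round: generate rationales, keep the correct ones, fine-tune on them."""
    kept = []
    for (a, b), truth in problems:
        rationale, answer = model.generate(a, b)
        if answer == truth:  # filter: only rationales that reach the right answer
            kept.append(((a, b), rationale, answer))
    model.finetune(kept)
    return kept


# 25 addition problems with known answers, operands from 1 to 5.
problems = [((a, b), a + b) for a in range(1, 6) for b in range(1, 6)]
model = ToyModel()
kept_per_round = [len(star_round(model, problems)) for _ in range(3)]
print(kept_per_round)  # [4, 9, 16]
```

In this toy run the number of self-generated examples that pass the filter grows each round, which is the bootstrapping effect the method relies on. Whether such a loop scales to open-ended reasoning is precisely what remains unconfirmed about Strawberry.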

The development of Strawberry indicates OpenAI’s commitment to overcoming the limitations of current AI models. According to internal sources and a Bloomberg report, OpenAI recently demonstrated a research project with claimed human-like reasoning skills, though it is unclear whether this is linked to Project Strawberry. By focussing on sophisticated reasoning, Strawberry could tackle more complex tasks and enter the realm of decision making, ultimately raising the question of whether artificial intelligence will soon surpass our own.

Hal Gatewood/Unsplash


Forecast

  • Short-term

    • Strawberry is likely part of OpenAI's reported five-level roadmap for AI development: Chatbots (such as ChatGPT) with conversational abilities, Reasoners (Strawberry) for human-like problem-solving, Agents that can act autonomously, Innovators that can help create new technologies, and Organisations that can do the work of an entire organisation. If successfully launched, Strawberry could mark OpenAI's progress to Level 2 (Reasoners), potentially transforming education and public policy.

  • Medium-term

    • Strawberry brings OpenAI closer to achieving Artificial General Intelligence (AGI), which could impact academia, new industries, personalised healthcare, and governance. OpenAI and similar companies must ensure AGI aligns with human values to prevent harmful outcomes. As AGI approaches superintelligence, the global community needs legislative and practical frameworks to manage its use, especially in defence and other high-risk areas.