Reports and discussions of a decline in the quality of ChatGPT's responses have been growing. To investigate, a team of researchers from Stanford and UC Berkeley conducted a study to quantify the degradation, and the study confirmed that the drop in ChatGPT's quality was real.
The research paper, titled "How Is ChatGPT's Behavior Changing Over Time?", was authored by three academics: Lingjiao Chen, Matei Zaharia, and James Zou. Zaharia, a Computer Science professor at UC Berkeley, shared the findings on Twitter, highlighting a startling result: GPT-4's success rate at identifying prime numbers fell drastically from 97.6% in March to 2.4% in June.
GPT-4, recently released and acclaimed as OpenAI's most advanced model, had been eagerly anticipated by developers for its potential to power innovative AI products. The study's results, however, showed disappointing performance, especially on straightforward queries.
The research team designed tasks to evaluate the quality of responses from two large language models (LLMs), GPT-4 and GPT-3.5: solving math problems, answering sensitive questions, generating code, and visual reasoning. A chart in the paper summarized the performance of both models across their March and June 2023 releases.
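The paper's approach of sending fixed prompts to a model and scoring the answers against ground truth can be sketched in miniature. Below, `query_model` is a hypothetical stand-in for a call to a pinned model snapshot (not an actual OpenAI client); the primality task mirrors the benchmark behind the 97.6%-to-2.4% headline figure.

```python
def is_prime(n: int) -> bool:
    """Ground-truth primality check used to score model answers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def score_primality_answers(numbers, query_model) -> float:
    """Fraction of numbers for which the model's yes/no answer matches
    the ground truth. `query_model` takes a prompt string and returns
    the model's reply (hypothetical; one API call per prompt in practice)."""
    correct = 0
    for n in numbers:
        answer = query_model(f"Is {n} a prime number? Answer yes or no.")
        model_says_prime = answer.strip().lower().startswith("yes")
        if model_says_prime == is_prime(n):
            correct += 1
    return correct / len(numbers)

# Toy responder that always answers "yes" -- mirroring the failure mode
# where a model stops reasoning and defaults to a single answer.
always_yes = lambda prompt: "yes"
accuracy = score_primality_answers([7, 10, 13, 15], always_yes)  # 2 of 4 correct
```

Running the same fixed benchmark against each dated snapshot of a service is what makes the March-versus-June comparison possible at all.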
The data clearly showed that the same LLM service gave different answers over time, with significant performance differences within this short period. How these LLMs are updated remains unclear, as does whether changes that improve one aspect of performance can degrade others. Notably, the June version of GPT-4 performed worse than the March version in three testing categories, improving only slightly in visual reasoning.
While some may not be concerned about variable quality across the "same versions" of these LLMs, both GPT-4 and GPT-3.5 have been widely adopted by individual users and businesses thanks to ChatGPT's popularity, so the information these models generate can significantly affect people's lives.
The researchers intend to continue assessing GPT versions in a longer-term study. They suggest that OpenAI monitor and publish regular quality checks for its paying customers; otherwise, business or governmental organizations may need to track basic quality metrics for these LLMs themselves to avoid commercial and research harms.
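The kind of monitoring suggested here could be as simple as re-running a fixed benchmark on each model release and flagging regressions. A minimal sketch follows; the "math" numbers are the paper's headline result, while the other task names and scores are purely illustrative.

```python
def find_regressions(baseline: dict, current: dict, tolerance: float = 0.05) -> dict:
    """Compare per-task accuracy between two evaluation runs and return
    the tasks whose accuracy dropped by more than `tolerance`, mapped to
    their (baseline, current) scores."""
    return {
        task: (baseline[task], current[task])
        for task in baseline
        if task in current and baseline[task] - current[task] > tolerance
    }

# Scores for two snapshots of the same service. Only "math" reflects the
# paper's reported 97.6% -> 2.4% drop; the rest are hypothetical.
march = {"math": 0.976, "code": 0.60, "visual": 0.25}
june = {"math": 0.024, "code": 0.30, "visual": 0.27}

regressions = find_regressions(march, june)
# Flags "math" and "code"; "visual" improved, so it is not reported.
```

A watchdog like this, run on a schedule by a customer or a third party, would surface silent behavior changes without any cooperation from the model provider.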
The AI and LLM technology domain has had its share of surprising issues, and with data privacy concerns and other public relations challenges, it currently seems like the “wild west” frontier of connected life and commerce.