Science/Tech

Title: Visualizing AI vs. Human Performance In Technical Tasks
Source: [None]
URL Source: https://www.zerohedge.com/technolog ... an-performance-technical-tasks
Published: Apr 29, 2025
Author: Tyler Durden
Post Date: 2025-04-29 07:05:05 by Horse
Keywords: None
Views: 4

The gap between human and machine reasoning is narrowing...and fast.

Over the past year, AI systems have continued to see rapid advancements, surpassing human performance in technical tasks where they previously fell short, such as advanced math and visual reasoning.

This graphic, via Visual Capitalist's Kayla Zhu, visualizes AI systems’ performance relative to human baselines for eight AI benchmarks measuring tasks including:

Image classification

Visual reasoning

Medium-level reading comprehension

English language understanding

Multitask language understanding

Competition-level mathematics

PhD-level science questions

Multimodal understanding and reasoning

This visualization is part of Visual Capitalist’s AI Week, sponsored by Terzo. Data comes from the Stanford University 2025 AI Index Report.

An AI benchmark is a standardized test used to evaluate the performance and capabilities of AI systems on specific tasks.

AI Models Are Surpassing Humans in Technical Tasks

Below, we show how AI models have performed relative to the human baseline in various technical tasks in recent years.

Year   Performance relative to the human baseline (100%)   Task
2023   47.78%                                              PhD-level science questions
2023   93.67%                                              Competition-level mathematics
2023   96.21%                                              Multitask language understanding
2023   71.91%                                              Multimodal understanding and reasoning
2024   108.00%                                             PhD-level science questions
2024   108.78%                                             Competition-level mathematics
2024   102.78%                                             Multitask language understanding
2024   94.67%                                              Multimodal understanding and reasoning
2024   101.78%                                             English language understanding
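The year-over-year change for each task can be read off the table as a simple difference. The following is a minimal Python sketch with the table values typed in by hand; the names `SCORES` and `yearly_gain` are illustrative assumptions, not part of the AI Index data.

```python
# Relative performance (% of human baseline) copied from the table above,
# stored as (2023 value, 2024 value). English language understanding is
# left out because the table has no 2023 entry for it.
SCORES = {
    "PhD-level science questions": (47.78, 108.00),
    "Competition-level mathematics": (93.67, 108.78),
    "Multitask language understanding": (96.21, 102.78),
    "Multimodal understanding and reasoning": (71.91, 94.67),
}


def yearly_gain(scores):
    """Return the 2023 -> 2024 change in percentage points for each task."""
    return {
        task: round(y2024 - y2023, 2)
        for task, (y2023, y2024) in scores.items()
    }


if __name__ == "__main__":
    # Print tasks from largest to smallest gain.
    gains = yearly_gain(SCORES)
    for task, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
        print(f"{task}: +{gain:.2f} points")
```

On these figures, PhD-level science questions show the largest jump and multitask language understanding the smallest.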

From ChatGPT to Gemini, many of the world’s leading AI models are surpassing the human baseline in a range of technical tasks.

The only task where AI systems still haven’t caught up to humans is multimodal understanding and reasoning, which involves processing and reasoning across multiple formats and disciplines, such as images, charts, and diagrams.

However, the gap is closing quickly.

In 2024, OpenAI’s o1 model scored 78.2% on MMMU, a benchmark that evaluates models on multi-discipline tasks demanding college-level subject knowledge.

This was just 4.4 percentage points below the human benchmark of 82.6%. The o1 model also has one of the lowest hallucination rates out of all AI models.

This was a major jump from the end of 2023, when Google Gemini scored just 59.4%, highlighting the rapid improvement of AI performance in these technical tasks.
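The MMMU figures quoted here appear to line up with the multimodal row of the table if relative performance is taken to be the model score divided by the human score. A short Python check under that assumption (the constant and function names are illustrative, not from the source):

```python
HUMAN_MMMU = 82.6   # human benchmark quoted in the article
O1_2024 = 78.2      # OpenAI o1 score, 2024
GEMINI_2023 = 59.4  # Google Gemini score, end of 2023


def relative_to_baseline(model_score, human_score):
    """Model score expressed as a percentage of the human score."""
    return round(100 * model_score / human_score, 2)


def gap_in_points(model_score, human_score):
    """Absolute shortfall versus the human score, in percentage points."""
    return round(human_score - model_score, 1)


print(relative_to_baseline(O1_2024, HUMAN_MMMU))      # 94.67
print(relative_to_baseline(GEMINI_2023, HUMAN_MMMU))  # 71.91
print(gap_in_points(O1_2024, HUMAN_MMMU))             # 4.4
```

Both ratios match the table's multimodal entries for 2024 and 2023, which suggests the table normalizes each raw benchmark score by its human baseline.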



Poster Comment:

X AI is the newest entry into AI. It is less biased politically and is superior to all others except ChatGPT, which is much older. X AI is getting better daily and will soon be the best.
