On August 7, OpenAI released its latest LLM, GPT-5, which landed with more of a whimper than a bang.
OpenAI’s previous models, GPT-3 and GPT-4, had each improved dramatically on their predecessors, raising hopes that GPT-5 would prove a similar leap. Instead, GPT-5 seemed to be more of an incremental improvement. It still hallucinated, and it still failed basic reasoning tasks: The model that Sam Altman likened to having a “team of Ph.D.-level experts in your pocket” couldn’t tell you the number of b’s in “blueberry.”
The GPT-5 fizzle was the start of a rough month for generative AI’s reputation. In a New Yorker article, Cal Newport wondered whether the technology was ever going to get that much better. The New York Times reported on McKinsey research showing that ~80% of companies were seeing “no significant bottom-line impact” from GenAI, despite having spent billions on it. And a paper by MIT’s NANDA showed that 95% of companies’ pilots of enterprise AI tools fail.
All of that is hard to square with the bold claims GenAI proponents made early on: that the technology would be a “revolution” on par with the advent of the internet, that it was a step that would lead humanity to artificial general intelligence (and quickly!), and that it would eliminate a wide swath of white-collar jobs. Almost three years after the public release of ChatGPT, where is the technology really headed?
I’m in my productivity era: GenAI sentiment has had about as many eras as Taylor Swift. There was the “free your workforce up for higher-value tasks” phase, followed by the “replace human workers with bots” cycle. Now, we may be entering the decidedly less-sexy era of modest efficiency gains.
For instance, JPMorgan’s employees use an AI assistant to help them with tasks such as research and report writing, the New York Times reported. About half of the 200,000 staff who use it save up to four hours a week, the bank said. As CFO Brew previously reported, Salesforce and PayPal’s CFOs both said that AI has let them “reallocate” staff away from redundant tasks.
Outside the office, AI can help with productivity and even safety. Technicians at Johnson Controls were able to shave 10–15 minutes off hour-long calls by using an AI that suggests repair options. Metals and mining company Phoenix Global installed AI dashcams and apps on tablets that can coach drivers and alert them to drowsiness or inattention. The system cut unsafe driving incidents by 40% in just its first month in operation. Given the high cost of an accident, the system could potentially save the company millions, CFO Jeff Suellentrop previously told CFO Brew.
What about agents?: Much of the buzz around AI this year has centered around agents—programs that can perform tasks autonomously—such as sending reminder emails, generating PowerPoints and Excel spreadsheets, categorizing expenses, or booking flights. Theoretically, one day, agents could be linked together to complete entire workflows.
But don’t clean out your desk just yet. Developers give agents mixed reviews. Researchers at Carnegie Mellon University tested agents’ performance in a model company and found that even the best of them independently completed office tasks just 30.3% of the time. (The worst had a 99.9% failure rate.) The agents could be stymied by something as simple as pop-up ads on a website. A Salesforce study similarly found that agents have only a 35% success rate in “multi-turn interactions,” in which they respond to more than one prompt.
Agents aimed at consumers are likewise still pretty wonky. A ChatGPT agent took almost an hour to order cupcakes online, and 50 minutes to curate a list of five Japanese-style lamps sold on Etsy.
That’s not even getting into the security risks. The Salesforce study noted that agents “demonstrate near-zero inherent confidentiality awareness.”
Is AI really taking jobs?: Agents might become a threat to jobs someday (though there’s always that pesky compounding error rate to contend with), but what’s happening right now?
There, the picture is murkier. Individual companies have made headlines with announcements about replacing people with AI, and large employers like Amazon and JPMorgan have claimed AI-related layoffs are coming. A study by Revelio Labs found that job postings for roles with “high exposure” to AI disruption, such as database administrators, data engineers, and IT professionals, dropped 31% over the past three years, compared with a 25% drop for jobs with “low exposure,” such as restaurant managers and mechanics.
But it’s hard to disentangle the effects of AI on the job market from other macroeconomic factors, like inflation, tariffs, reversing pandemic-fueled hiring sprees, and the changes to the tax treatment of R&D that have led to around half a million tech layoffs since 2023. Tech and media/communications are the only two sectors that have seen “clear signs of AI disruption,” the MIT report said. The “jobs most impacted were already low-priority or outsourced,” Aditya Challapally, research contributor for NANDA, told Axios.
Instead of laying people off, companies seem to be holding off on backfilling roles to see whether they can be performed by AI. More of the gains companies are seeing come from “replacing BPOs [business process outsourcing] and external agencies, not cutting internal staff,” the MIT report stated. Those are changes, to be sure, but hardly a “white-collar bloodbath,” as one Axios headline said.
Even Altman, who once asserted that AI would “create phenomenal wealth” and drive the value of labor “toward zero,” is walking back some of his claims. On a recent episode of CNBC’s Squawk Box, he said AGI is “not a super useful term.”