Inside the AI experimentation trenches
How companies think about iterations and setbacks in the fast-paced AI race.
A screenshot buried in the comment section of a Marc Benioff LinkedIn post around a year ago gave Salesforce’s Bernard Slowey a jolt. It showed the then-new Agentforce help portal directing the customer to a Salesforce competitor.
“I was literally like, ‘Oh my god, what has happened here? This is not good. My job is going to be gone tomorrow,’” Slowey, who is—spoiler alert—still Salesforce’s SVP of digital customer success, told us.
You may have heard the now-infamous stat that 95% of enterprise generative AI pilots fail before they reach production. Or maybe you have some thoughts on why that MIT report’s methodology was flawed. In any case, abundant data show that a sizable chunk of AI prototypes don’t go as planned—and ROI is questionable.
We wanted to talk with companies about how they’ve dealt with unexpected AI complications or projects that just didn’t work out—what they learned and how they subsequently recalibrated. Some of them told us that three-plus years of experimentation with unpredictable and fast-changing generative AI have reshaped how they build things and make decisions.
In Salesforce’s case, Slowey traced the issue using an observability feature. The AI had learned from the content in Salesforce’s help portal on migrating from another platform. The team then wrote a strict guardrail to forbid talk about competitors. But that caused more headaches when customers would ask about, for instance, a Microsoft Teams integration.
The “eureka learning moment” came when they decided to adjust Agentforce’s system prompt to treat it more like a digital employee.
“I can remember the words we wrote. We said, ‘You are an employee of Salesforce. You’re a customer service representative of Salesforce. Put the best interest of Salesforce in everything you do.’ And we deleted that guardrail,” Slowey said.
Slide troubles
Sometimes execs have had to be honest about when the technology just isn’t there yet. At PwC, Vikas Agarwal was ready to transform the industry with automated slide decks. But the consultancy’s chief technology and innovation officer found in testing various AI presentation tools that they couldn’t compete with the human creativity of PwC’s analysts, he said.
“Everybody feels like, ‘Oh, with this tool, I’m just gonna, you know, vibe my PowerPoints into existence,’” Agarwal said. “And I found that the reality is a lot more stark than that.”
“A few months later I could have rolled it out, and I could have said, ‘Here it is, and it’s transforming consulting,’ and made myself feel good. But the reality of what was happening on the ground when I measured it was not that great.”
Instead, Agarwal said he asked a different question: Why slide decks at all? He pivoted to exploring more interactive formats of presentation screens and documents that had previously been a coding challenge—something AI has proven itself good at.
“If you hold yourself to the biases of the existing process, you may not be rethinking what’s possible with the new tools that are out there today,” Agarwal said.
News built for finance pros
CFO Brew helps finance pros navigate their roles with insights into risk management, compliance, and strategy through our newsletter, virtual events, and digital guides.
LiveRamp CIO Sashi Binani came to a similar conclusion about AI’s slide-making prowess. About a year ago, the marketing tech company had wanted to tap AI to pre-build quarterly business review (QBR) decks. Underwhelmed by the results, the company decided not to pursue the project and to leave the problem to a company like Google, Binani said.
“What we realized was, as much as [the QBR is] a science, it’s actually an art,” Binani said.
Changing processes
In an attempt to make sure that LiveRamp was devoting its AI resources to the right problems, Binani said the company held an “AI week” for employees with classes on building agents and a certification pathway. Employees have since built around 900 agents across the company, with a “concierge” to route tasks to the right ones. Agents that prove most useful will work their way up leaderboards and win internal awards.
“[We will] talk about victories, talk about failures, talk about lessons learned, talk about what is working for the organization,” Binani said.
David Glick, Walmart’s SVP of enterprise business services, said that with AI coding, “we can spit out a prototype in less than a week.” But that prototype comes only after weeks of product requirement research, and weeks of auditing, security, and compliance work follow before it can be deployed. Glick is hoping that AI will eventually compress that timeline, too.
“We started sort of in the middle with the coding part, and now we’re moving [the] middle out to use AI to support every component of the software development life cycle,” he said.
Glick said AI also allows his team to fix issues and iterate faster, so it isn’t a big deal if an internal “nano agent” isn’t perfect on the first go. “If we get it wrong, it’s OK, because we only spent a few days, rather than spending months on something,” he said.
Assuming the future
Aparna Chennapragada, Microsoft’s chief product officer for AI experiences, said designing successful AI projects can sometimes involve guesswork. She and her team conceive of projects with faith that model performance will improve in the course of their work.
“You can basically build something that just works, and works around all the problems of the models today,” Chennapragada said. “In fact, we’ve done it. Other companies have done it. So I remember two years ago when image generation couldn’t even do the spelling, right? If you had to do image editing in PowerPoint, you had to put a bunch of guardrails. But what happens then is that it’s all wasted effort, because six months later, new models come out and all that stuff goes away.”
Now, Microsoft builds agents that “assume a little bit more capability” than AI models currently have, which means there can be some “rough edges” at first before progress actually gets to that assumed point, Chennapragada said.