The story there is very different from what's in the article.
Some data points from the report:
- 50% of the budgets (the ones that failed) went to marketing and sales
- the authors still project that AI could offer automation equaling $2.3 trillion in labor value, affecting 39 million positions
- the top barriers cited are unwillingness to adopt new tools and lack of executive sponsorship
Lots of people here are jumping to the conclusion that AI does not work. I don't think that's what the report says.
didibus 1 hours ago [-]
> affecting 39 million positions
Wow, that is crazy. There are 163 million working Americans, so that's close to a quarter of the workforce at risk.
baal80spam 1 hours ago [-]
> Lots of people here are jumping to conclusions. AI does not work. I don't think that's what the report says.
Well...
"It is difficult to get a man to understand something when his salary depends upon his not understanding it"
jawns 2 hours ago [-]
Full disclosure: I'm currently in a leadership role on an AI engineering team, so it's in my best interest for AI to be perceived as driving value.
Here's a relatively straightforward application of AI that is set to save my company millions of dollars annually.
We operate large call centers, and agents were previously spending 3-5 minutes after each call writing manual summaries of the calls.
We recently switched to using AI to transcribe and write these summaries. Not only are the summaries better than those produced by our human agents, they also free up the human agents to do higher-value work.
It's not sexy. It's not going to replace anyone's job. But it's a huge, measurable efficiency gain.
dsr_ 2 hours ago [-]
Pro-tip: don't write the summary at all until you need it for evidence. Store the call audio at 24Kb/s Opus - that's 180KB per minute. After a year or whatever, delete the oldest audio.
There, I've saved you more millions.
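Back-of-the-envelope on the storage, in case anyone wants to check the numbers (a quick Python sketch; the call volume and retention figures are made-up placeholders, only the bitrate math comes from above):

    # 24 kbit/s Opus: 24,000 bits/s * 60 s / 8 = 180,000 bytes, i.e. ~180 KB per minute
    bitrate_kbps = 24
    bytes_per_minute = bitrate_kbps * 1000 / 8 * 60

    # Hypothetical volume: 5,000 calls/day, 10 minutes each, kept for one year
    calls_per_day, minutes_per_call, retention_days = 5_000, 10, 365
    total_gb = bytes_per_minute * minutes_per_call * calls_per_day * retention_days / 1e9
    print(f"~{bytes_per_minute / 1000:.0f} KB/min, ~{total_gb:,.0f} GB kept at any time")

Even at that made-up volume it's a few terabytes of cold storage, which is cheap compared to paying for summaries up front.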
doorhammer 2 hours ago [-]
Sentiment analysis, nuanced categorization by issue, detecting new issues, tracking trends, etc, are the bread and butter of any data team at a f500 call center.
I'm not going to say every project born out of that data makes good business sense (big enough companies have fluff everywhere), but ime anyway, projects grounded to that kind of data are typically some of the most straight-forward to concretely tie to a dollar value outcome.
la_fayette 1 hours ago [-]
Yes, those sound like important and useful use cases. However, these have been solved by boring old-school ML models for years...
williamdclt 1 hours ago [-]
I think what they're saying is that you need the summaries to do these things
esafak 1 hours ago [-]
It's easier and simpler to use an LLM service than to maintain those ad hoc models. Many replaced their old NLP pipelines with LLMs.
prashantsengar 42 minutes ago [-]
Where I work, we replaced our old NLP pipelines with LLMs because they are easier to maintain and reach the same level of accuracy with much less work.
We are not running a call centre ourselves; we are a SaaS offering call centre data analysis services.
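To give a feel for why it's less work, the whole "pipeline" is roughly a single prompt now. This is only a sketch, not our production code; the model name and label set are illustrative, and it assumes the openai Python client:

    # Sketch: one LLM call replacing a classification + sentiment pipeline.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    LABELS = ["billing dispute", "late delivery", "cancellation", "technical issue", "other"]

    def analyze_transcript(transcript: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[
                {"role": "system",
                 "content": f"Classify this call into one of {LABELS} and give the overall "
                            "sentiment (positive/neutral/negative). Answer as 'label | sentiment'."},
                {"role": "user", "content": transcript},
            ],
        )
        return resp.choices[0].message.content

    # analyze_transcript("Hi, my package never arrived and I want a refund...")

Swapping the label set or adding a new field is a prompt change rather than retraining a model, which is most of the maintenance win for us.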
doorhammer 1 hours ago [-]
So, I wouldn't be surprised if someone in charge of a QA/ops department chose LLMs over similarly effective existing ML models in part because the AI hype is hitting so hard right now.
Two things _would_ surprise me, though:
- That they'd integrate it into any meaningful process without having done actual analysis of the LLM based perf vs their existing tech
- That they'd integrate the LLM into a core process their department is judged on knowing it was substantially worse when they could find a less impactful place to sneak it in
I'm not saying those are impossible realities. I've certainly known call center senior management to make more harebrained decisions than that, but barring more insight I personally default to assuming OP isn't among the harebrained.
shortrounddev2 45 minutes ago [-]
My company gets a bunch of product listings from our clients and we try to group them together (so that if you search for a product name you can see all the retailers who are selling that product). Since there aren't reliable UPCs for the kinds of products we work with, we need to generate embeddings (vectors) for the products by their name/brand/category and do a nearest-neighbor search. This problem has many, many "old school" ML solutions, and when I was asked to design this system I came up with a few implementations and proposed them.
Instead of doing any of those (we have the infrastructure to do it) we are paying OpenAI for their embeddings APIs. Perhaps OpenAI is just doing old-school ML under the hood, but there is definitely an instinct among product managers to reach for shiny tools from shiny companies instead of considering more conservative options.
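For the curious, the matching step is conceptually about this much code whether the vectors come from OpenAI or an "old school" model. A rough sketch (the embed() function here is a stand-in so the example runs; real vectors would come from whatever embedding service or model you pick, and the listings are hypothetical):

    # Sketch: embed name/brand/category strings, then nearest-neighbor search by cosine distance.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def embed(texts):
        # Placeholder embedding: random vectors, just to make the sketch runnable.
        rng = np.random.default_rng(0)
        return rng.normal(size=(len(texts), 256))

    catalog = [
        "Acme Widget Pro 3000 | Acme | tools",       # hypothetical listings
        "Widget Pro 3000 (Acme) | Acme | hardware",
    ]
    index = NearestNeighbors(n_neighbors=1, metric="cosine").fit(embed(catalog))

    distances, indices = index.kneighbors(embed(["acme widget pro 3000"]))
    print(catalog[indices[0][0]], distances[0][0])

The interesting design question is the embedding model, not the search; the nearest-neighbor part is commodity either way.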
aaomidi 6 minutes ago [-]
Sentiment analysis was not solved and companies were paying analyst firms shit tons of money to do that for them manually.
adrr 46 minutes ago [-]
Those have been done for 10+ years. We were running sentiment analysis on email support to determine prioritization back in 2013. Also ran Bayesian categorization to offer support reps quick responses/actions. Don't need expensive LLMs for it.
doorhammer 21 minutes ago [-]
Yeah, I was a QA data analyst supporting three multi-thousand agent call-centers for an F500 in 2012 and we were using phoneme matching for transcript categorization. It was definitely good enough for pretty nuanced analysis.
I'm not saying any given department should, by some objective measure, switch to LLMs and I actually default to a certain level of skepticism whenever my department talks about applications.
I'm just saying I can imagine plausible realities where an intelligent and competent person would choose to switch toward using LLMs in a call center context.
There are also a ton of plausible realities where someone is just riding the hype train gunning for the next promotion.
I think it's useful to talk about alternate strategies and how they might compare, but I'm personally just defaulting to assuming the OP made a reasonable decision and didn't want to write a novel to justify it (a trait I don't suffer from, apparently), vs assuming they just have no idea what they're doing.
Everyone is free to decide which assumed reality they want to respond to. I just have a different default.
andix 2 hours ago [-]
Imagine a follow-up call from a customer. They refer to earlier calls and the call center agent needs to check what those were about, so they can skim/read the transcripts while talking to the customer. I guess it's really hard to listen to the original recordings while you're on the phone.
ethagknight 1 hours ago [-]
I'm imagining my actual experience of being transferred for the 3rd or 4th time, repeating my name and address for the 3rd or 4th time, restating my problem for the 3rd or 4th time... feels like there's an implementation problem, not a technological problem.
Quick and accurate routing and triage of inbound calls may be more fruitful and far easier than summarizing hundreds of hours of "ok now plug the router back into the wall." I'm imagining AI identifying a specific technical problem that sounds a lot like a problem that a specific technician successfully solved previously.
0x457 1 hours ago [-]
Also the hold music being interrupted every minute to tell me:
1) my call is very important to them (it's not)
2) listen carefully because options changed (when? 5 years ago?)
3) they have a website where I can do things (you can't, otherwise why would I call?)
4) please stay at the end of call to give them feedback (sure, I will waste more of my time)
dsr_ 1 hours ago [-]
That would be awesome!
But in fact, customer call centers tend not to even know that you called in yesterday, three days ago, and last week.
This is why email-ticketing call centers are vastly superior.
Jolter 1 hours ago [-]
Perhaps doing this suggested auto-summarizing would be what finally solves that problem?
josefx 47 minutes ago [-]
Is doing that going to be cheaper than not doing it?
tomwheeler 59 minutes ago [-]
> But in fact, customer call centers tend not to be able to even know that you called in yesterday, three days ago and last week.
Nor what you told the person you talked to three minutes earlier, during the same call, before they transferred you to someone else. Because their performance is measured on how quickly they can get rid of you.
fifilura 1 hours ago [-]
I am sorry about your bad experience. Maybe the ones you called did not have AI transcribed summaries and were not managed by GP?
ssharp 1 hours ago [-]
I've always guessed that they are able to tell when you called/what you called about, but they simply don't give that level of information to their frontline folks.
Imustaskforhelp 49 minutes ago [-]
It might be because it's in their interest to do so.
It is our problem that needs fixing, so we can just wait until they redirect us to the right person with the right knowledge, who might be one of the higher-ups in the call center.
Or we just hang up. Either way, it doesn't matter to the company.
Plus, they don't have to teach the frontline customer service more details, and it could be easier for them to onboard new people / fire old employees.
Also they would have to pay less if they require very low qualifications.
Man, I remember the "0.001 cents = 0.001 dollars" video/meme from Verizon.
Still makes more sense to do the transcription and analysis lazily rather than ahead of time (assuming you can do it relatively quickly). If that person never calls in again the transcription was a waste of money.
alooPotato 2 hours ago [-]
you want to be able to search over summaries so you need to generate them right away
deadbabe 1 hours ago [-]
Do you want to search summaries, or do you want to save millions of dollars per year?
tene80i 51 minutes ago [-]
Product teams analyse call summaries at scale to guide the roadmap to reduce future calls. It’s not just about case management.
morkalork 1 hours ago [-]
I can assure you that people care very much about searching and mining calls, especially for compliance and QA reasons.
deadbabe 1 hours ago [-]
What’s the ROI?
morkalork 1 hours ago [-]
Transcription cost is a race to the bottom because there's so many vendors competing, same with embeddings. It's positive. Gets better every year.
krainboltgreene 2 hours ago [-]
Pro-tip: You won't ever do that.
ch4s3 2 hours ago [-]
I would imagine OP is probably mining service call summaries to find common service issues, or at least that's what I would do.
krainboltgreene 4 minutes ago [-]
That's what everyone says they'll do and then it never gets touched again.
alooPotato 2 hours ago [-]
we do
ninininino 2 hours ago [-]
Advanced organizations (think not startups, but companies that have had years or decades of profit in the public market) might have solved all the low-hanging-fruit problems and have staff doing things like automated quality audits (search summaries for swearing, abusive language, etc).
krainboltgreene 3 minutes ago [-]
I've worked at both. It is extremely rare that anyone ever does it.
morkalork 1 hours ago [-]
And you could save a bunch of money by replacing the staff that do that with LLMs!
anoojb 2 hours ago [-]
Also entrenches plausible deniability and makes legal contests way more cumbersome for plaintiffs to resolve.
paulddraper 2 hours ago [-]
This works unless you want to automate something with the transcripts, stats, feedback.
Spivak 2 hours ago [-]
Why wouldn't it? Once you actually have that project, you have the raw audio to generate the transcripts. Only spend the money at the last second when you know you need it.
Edit: Tell me more about how preemptively spending five figures to transcribe and summarize calls, in case you might want to do some "data engineering" on it later, is a sound business decision. What if the model is cheaper down the road? YAGNI.
kenjackson 1 hours ago [-]
This is the bread and butter of call centers and the companies that use them. The transcripts and summaries are used for everything from product improvement to agent assessment. This data is used continuously. It's not like they only use the transcript for the one rare time someone sues because they claim an agent lied. That rarely happens.
thfuran 1 hours ago [-]
A company that could save millions by not having staff write up their own call notes almost surely is already doing that.
sillyfluke 1 hours ago [-]
You'll also have saved them the cost of all the AI summaries that are incorrect.
The parent states:
>Not only are the summaries better than those produced by our human agents...
Now, since they have not mentioned what it took to actually verify that the AI summaries were in fact better than the human agents', I'm sceptical they did the necessary due diligence.
Why do I think this? Because I have actually tried to do such a verification. In order to verify that an AI summary is actually correct you have to engage in the incredibly tedious task of listening to the original recording literally second by second and making sure that what is said does not conflict with the AI summary in question. Not only did the AI summary fail this test, it failed in the first recording I tested.
The AI summary stated that "Feature x was going to be in Release 3, not 4" whereas in the recording it is stated that the feature will be in Release 4, not 3, literally the opposite of what the AI said.
I'm sorry, but the fact that the AI summary is nicely formatted and has not missed a major topic of conversation means fuck all if the details that are discussed are spectacularly wrong from a decision-tracking perspective, as in literally the opposite of what is stated.
And I know "why" the AI summary fucked up: in that instance the topic of conversation was about how there was some confusion about which release that feature was going to be in; that's why the issue was a major item of the meeting agenda in the first place. Predictably, the AI failed to follow the convoluted discussion and "came to" the opposite conclusion.
In short, no fucking thanks.
doorhammer 1 hours ago [-]
Again, not the OP, so I can't speak to exactly their use-case, but the vast majority of call center calls fall into really clear buckets.
To give you an idea: Phonetic transcription was the "state of the art" when I was a QA analyst. It broke call transcripts apart into a stream of phonemes and when you did a search, it would similarly convert your search into a string of phonemes, then look for a match. As you can imagine, this is pretty error prone and you have to get a little clever with it, but realistically, it was more than good enough for the scale we operated at.
If it were an ecom site you'd already know the categories of calls you're interested in because you've been doing that tracking manually for years. Maybe something like "late delivery", "broken item", "unexpected out of stock", "missing pieces", etc.
Basically, you'd have a lot of known context to anchor the LLM's analysis, which would (probably) cover the vast majority of your calls, leaving you freed up to interact with outliers more directly.
At work as a software dev, having an LLM summarize a meeting incorrectly can be really really bad, so I appreciate the point you're making, but at a call center for an f500 company you're looking for trends and you're aware of your false positive/negative rates. Realistically, those can be relatively high and still provide a lot of value.
Also, if it's a really large company, they almost certainly had someone validate the calls, second-by-second, against the summaries (I know because that was my job for a period of time). That's a minimum bar for _any_ call analysis software so you can justify the spend. Sure, it's possible that was hand-waved, but as the person responsible for the outcome of the new summarization technique with LLMs, you'd be really screwing yourself to handwave a product that made you measurably less effective. There are better ways to integrate the AI hype train into a QA department than replacing the foundation of your analysis, if that's all you're trying to do.
Imustaskforhelp 35 minutes ago [-]
I genuinely don't think the GP is actually making someone listen to the recording and check whether the summary is wrong.
I almost have a gut feeling that's the case (I may be wrong though).
Like, imagine this: if the agent could just spend 3 minutes writing a summary, why would you use AI to create a summary and then have some other person listen to the whole audio recording and check if the summary is right?
It would take an agent 3 minutes out of, let's say, a 1-hour-long conversation/call.
On the other hand you have someone listen to the whole 1-hour recording and then check the summary?
That's now 1 hour compared to 3 minutes.
Nah, I don't think so.
Even if we assume that multiple agents are contacted in the same call, they can all simply write a summary of what they did and whom they redirected to, and the next agent can just follow that line of summaries.
And given this, I think your read that they are really screwing themselves is accurate.
Kinda funny how the GP comment was the first thing I saw in this post and how even I was kinda convinced they were one of the smarter ones integrating AI, but your comment made me realize they're actually just screwing themselves.
Imagine the irony: a post about how AI companies are screwing themselves by burning a lot of money, and then the people using them don't get any value out of it either.
And then the one on HN that sounded like it finally made sense for them also doesn't make sense... and they are screwing themselves over.
The irony is just ridiculous. So funny it made me giggle
doorhammer 3 minutes ago [-]
They might not be, and their use-case might not be one I agree with. I can just imagine a plausible reality where they made a reasonable decision given the incentives and constraints, and I default to that.
I'm basically inferring how this would go down in the context I worked under, not the GP, because I don't know the details of their real context.
I think I'm seeing where I'm not being as clear as I could, though.
I'm talking about the lifecycle of a methodology for categorizing calls, regardless of whether or not it's a human categorizing them or a machine.
If your call center agent is writing summaries and categorizing their own calls, you still typically have a QA department of humans that listen to a random sample of full calls for any given agent on a schedule to verify that your human classifiers are accurately tagging calls. The QA agents will typically listen to them at like 4x speed or more, but mostly they're just sampling and validating the sample.
The same goes for _any_ automated process you want to apply at scale. You run it in parallel to your existing methodology and you randomly sample classified calls, verifying that the results were correct and you _also_ compare the overall results of the new method to the existing one, because you know how accurate the existing method is.
But you don't do that for _every_ call.
You find a new methodology you think is worth trying and you trial it to validate the results. You compare the cost and accuracy of that method against the cost and accuracy of the old one. And you absolutely would often have a real human listen to full calls, just not _all_ of them.
In that respect, LLMs aren't particularly special. They're just a function that takes a call and returns some categories and metadata. You compare that to the output of your existing function.
But it's all part of the: New tech consideration? -> Set up conditions to validate quantitatively -> run trials -> measure -> compare -> decide
Then on a schedule you go back and do another analysis to make sure your methodology is still providing the accuracy you need, even if you haven't changed anything.
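If it helps make the comparison step concrete, it's basically this shape (a simplified sketch; the sample size and agreement math are placeholders, and human_review stands for the QA analyst actually listening to the sampled calls):

    # Sketch: validate a new categorization method against the existing one
    # using a human-reviewed random sample.
    import random

    def compare_methods(calls, new_method, old_method, human_review, sample_size=200):
        sample = random.sample(calls, min(sample_size, len(calls)))
        truth = [human_review(c) for c in sample]   # QA analyst labels the sampled calls
        new_acc = sum(new_method(c) == t for c, t in zip(sample, truth)) / len(sample)
        old_acc = sum(old_method(c) == t for c, t in zip(sample, truth)) / len(sample)
        return new_acc, old_acc  # weigh accuracy alongside cost before switching

Whether new_method is an LLM or anything else doesn't change the process; it's just another function whose output you measure.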
sillyfluke 38 minutes ago [-]
Thanks for the detailed domain-specific explanation. If we assume that some whale clients of the company will end up in the call center, isn't it more probable that more competent human agents would be responsible for those calls, whereas in the alternative scenario it's pretty much the same AI agent addressing the whale client as the regular customers?
roywiggins 1 hours ago [-]
In the context of call centers in particular I actually can believe that a moderately inaccurate AI model could be better on average than harried humans writing a summary after the call. Could a human do better carefully working off a recording, absolutely, but that's not what needs to be compared against.
It just has to be as good as a call center worker with 3-5 minutes working off their own memory of the call, not as good as the ground truth of the call. It's probably going to make weirder mistakes when it makes them though.
sillyfluke 1 hours ago [-]
>in the context of call centers in particular I actually can believe that a moderately inaccurate AI model could be better on average than harried humans
You're free to believe that of course, but you're assuming the point that has to be proven. Not all fuck-ups are equal. Missing information is one thing, but writing literally the opposite of what is said is way higher on the fuck-up list. A human agent would be achieving an impressive level of incompetence if they kept on repeating such a mistake, and would definitely have been jettisoned from the task after at most three strikes (assuming someone notices). But firing a specific AI agent that repeats such mistakes is out of the question for some reason.
Feel free to expand on why no amount of mistakes in AI summaries will outweigh the benefits in call centers.
trenchpilgrim 1 hours ago [-]
Especially humans whose jobs are performance-graded on how quickly they can start talking to the next customer.
Imustaskforhelp 29 minutes ago [-]
Yeah, maybe that's fair in the current world we live in.
But the solution isn't to use AI because we don't trust the agents / customer service reps whose performance is graded on how quickly they can start talking to the next customer.
The solution is to change the economics so that the workers are incentivized to write good summaries; maybe paying them more and not grading them in such a way would help.
I'm imagining some company saying AI is good enough because they themselves are using the wrong grading technique and AI is the best option under that metric. So in that sense, AI just benchmark-maxxed, if that makes sense. Man, I am not even kidding, but I sometimes wonder how economies of scale can work so differently from common sense. It doesn't make sense at this point.
shafyy 60 minutes ago [-]
Well, in my own experience, the LLMs that summarize video meetings at work are not at all 100% accurate. The issue is if you have not participated in the call, you can't say which part is accurate and which is not. Therefore, they are utterly useless to me.
FirmwareBurner 1 hours ago [-]
>Store the call audio at 24Kb/s Opus - that's 180KB per minute
Why Opus though? There are dedicated audio codecs in the VoIP/telecom industry that are specifically designed for the best size/quality for voice call encoding.
pipo234 1 hours ago [-]
Opus is one of those codecs.
Older codecs like g711 have better latency and steady bitrate, but they compress terribly. (Essentially just bandwidth and amplitude remapping).
Opus is great for a lot of things and realtime speech over sip or webrtc is just one.
andrepd 1 hours ago [-]
Opus pretty much blows all those codecs out of the water, in every conceivable metric. It's actually pretty impressive that a media codec is able to universally exceed (or match) every previous one in every axis.
Still, it's based on ideas from those earlier codecs of course :)
lotsofpulp 2 hours ago [-]
The summaries can help automate performance evaluation. If the employee disputes it, I imagine they pull up the audio to confirm.
Imustaskforhelp 18 minutes ago [-]
The amount of false positives coming from wrong AI summaries, plus having to pull up the audio to confirm, is so much more hassle than not using AI and evaluating on some different metric in the first place.
Seriously not kidding, but the more I read these comments, the more horrified I become, realizing wtf: the only reason I can think of for integrating AI is because you wish to integrate AI. Nothing wrong with that, but unless proven otherwise through some benchmarks there is no way to justify AI.
So it's like an experiment: they use AI and if it works / saves time, great.
If not, then it's time to roll it back.
But we do need to think about experiments logically, and the way I am approaching it, it's maybe good considering what customer service is now, but man, that's such a low standard that as customers we shouldn't really stand for it. Call centres need to improve, period. AI can't fix it. It's like, man, they'll do anything to save some $ for the shareholders.
Only to then "invest" it proudly into AI so that they can say they have integrated AI and have their valuations increased, since VCs / the stock market react differently to the sticker known as AI.
Man... so saying that you use AI should be a negative indicator instead of a positive one in the market, and the whole bubble is gonna come crashing down when people realize it.
It physically hurts me thinking about it once again. This loop of squeezing humans to save money, using that money for an inferior product, using that inferior product only because you want the AI sticker, because shareholders want the valuation increase, and the company is willing to do all this because they are rewarded for it by people who will buy anything AI-related thinking it's gold, or that more people will buy it from them at an even higher valuation because of the AI sticker, and so on...
Almost sounds like a pyramid.
smohare 2 hours ago [-]
[dead]
jordanb 2 hours ago [-]
We use Google meet and it has Gemini transcriptions of our meetings.
They are hilariously inaccurate. They confuse who said what. They often invert the meaning "Joe said we should go with approach x" where Joe actually said we should not do X. It also lacks context causing it to "mishear" all of our internal jargon to "shit my iPhone said" levels.
nostrademons 20 minutes ago [-]
I also use Gemini notes for all my meetings and find them quite helpful. The key insight is: they don't have to be particularly accurate. Their primary purpose is to remind me (or the other participants) of what was discussed, what considerations were brought up, and what the eventual decision was. If it inverts the conclusion and forgets a "not", we're going to catch that, because we were all in the meeting too. It's there to jog our memory of what was said, because it's much easier to recognize correct information than recall it; it's not the authoritative source of truth on the meeting.
This gets to a common misconception when it comes to GenAI uses: it functions best as "augmented intelligence" rather than "artificial intelligence". Meaning that it's at its best when there's still a human in the loop and the AI supplements the parts the person is bad at rather than replacing the person entirely. We see this with coding, where AI is very good at writing scaffolding, large-scale refactoring, picking decent libraries, reading API docs and generating code that calls them appropriately, etc, but it still needs a human to give it very specific directions for anything subtle, and someone to review carefully for bugs and security holes.
rowanseymour 1 hours ago [-]
Same here. It's frustrating that it doesn't seem to have contextual awareness of who we are and the things we work on so things like names of our products, names of big clients, that we use repeatedly in meetings, are often butchered.
sigmoid10 41 minutes ago [-]
That's the difference between having real AI guys and your average LinkedIn "AI guys." The other post is a perfect example of a case where you could take a large but still manageable, cutting-edge transcription model like Whisper and fine-tune it using existing hand-made transcriptions as ground truth. A match made in heaven for AI engineers. Of course this is going to work way, way better for specific corporate settings than slapping a random closed-source general-purpose model like Gemini on your task and hoping for the best, just because it achieves X% on random benchmark Y.
thisisit 1 hours ago [-]
I found that if you have people with accents and they emphasize certain words, then it becomes very difficult to read. One example I find is that "th" is often transcribed as "d" because of how people pronounce it. Apart from that it is hit or miss.
ricardonunez 1 hours ago [-]
I don't know how it can confuse who said what, because the mic input is relatively straightforward to attribute. I use Fathom and others and they are accurate, better than manually taken notes. Interesting side effect: I don't memorize 100% of calls anymore since I rely on note takers. I only remember the major points, but when I read the notes, everything becomes clear.
orphea 1 hours ago [-]
Oh, that's what's happening. I thought my English was just terrible :(
vasco 2 hours ago [-]
I wonder if the human agents agree the AI summaries are better than their summaries. I was nodding as I read and then told myself "yeah but it wouldn't be able to summarize the meetings I have", so I wonder if this only works in 3rd person.
mbStavola 2 hours ago [-]
Part of me also wonders if people may agree that it's better simply because they don't actually have to do the summarization anymore. Even if it is worse by some %, that is an annoying task you are no longer responsible for; if anything goes wrong down the line, "ah, the AI must've screwed up" is your way out.
roflc0ptic 2 hours ago [-]
I’m inclined to believe that call center employees don’t have a lot of incentive to do a good job/care, so a lossy AI could quite plausibly be higher quality than a human
latexr 1 hours ago [-]
For many years now, every time I have to talk with someone on a call centre there has been a survey at the end with at least two questions:
1. Would you recommend us?
2. Was the agent helpful?
I have a friend who used to work at a call centre and would routinely get the lowest marks on the first item and the highest on the second. I do that when the company has been shitty but I understand the person on the line really made an effort to help.
Obviously, those ratings go back to the supervisor and matter for your performance reviews, which can make all the difference between getting a raise or being fired. If anything, call centre employees have a lot of incentive to do a good job if they have any intention of keeping it, because everything they do with a customer is recorded and scrutinised.
freehorse 1 hours ago [-]
Also it should be easy to correct some obvious mistakes in less convoluted discussions. And a support call is probably less complex than, e.g., a group meeting in many respects, and probably has a larger margin of acceptable error.
evereverever 1 hours ago [-]
That re-synthesis of information is incredibly valuable for storing it in your own memory.
Of course, we can just rely on knowing nothing and looking everything up, but I want more for thinking people.
jcims 2 hours ago [-]
I built a little solution to record and transcribe all of my own meetings. I have many meetings (30+ hours/week) and I can't keep pace with adequate note-taking while participating in them all.
I'm finding the summarization of individual meetings very useful. I'm also finding that the ability to feed in transcripts across meetings, departments, initiatives, whatever, is very effective at surfacing subtexts and common pain points, much more so than I can.
I'm also using it to look at my own participation in meetings to help me see how I interact with others a (little) bit more objectively and it has helped me find ways to improve. (I don't take its advice directly lol, just think about observations and determine myself if it's something that's important and worth thinking about)
mjcohen 42 minutes ago [-]
Make sure that is legal where you are and, if needed, you have their permission.
dymk 2 hours ago [-]
Have you tried having it summarize the meetings you have?
kenjackson 1 hours ago [-]
AI definitely summarizes meetings better than me and _almost_ anyone else I've seen do it (there is one exception -- one guy was a meeting note taker god. He was so good that he set up a mailing list because so many people wanted to read his meeting notes.) I could probably do better than AI if I really tried, but I've only ever done that a few times.
doubled112 2 hours ago [-]
At work we've tried AI summaries for meetings, but we spent so much time fixing those summaries that we started writing our own again.
Is there some training you applied or something specific to your use case that makes it work for you?
nsxwolf 2 hours ago [-]
We stopped after it kept transcribing a particular phrase of domain jargon as “child p*rn”, again and again.
cube00 2 hours ago [-]
Unless a case goes down the legal road, nobody is ever bothering to read old call summaries in a giant call center.
When was the last time you called a large company and the person answering was already across all the past history without you giving them a specific case number?
doubled112 2 hours ago [-]
Does an AI summary hold up in court? Or would you still need to review a transcript or recording anyway?
cube00 2 hours ago [-]
You can store low quality audio cheaply on cold storage so I suspect that's the real legal record if it got that far.
shawabawa3 2 hours ago [-]
My guess is that the summaries are never actually read, so accuracy doesn't actually matter and the AI could equally be replaced with /dev/null
mrweasel 52 minutes ago [-]
We tried Otter.ai, someone complained and asked: "Could you f-ing not? I don't trust them" and now Otter is accused of training their models on recorded meetings without permission. Yeah, I don't even care if it works, I don't trust any of these companies.
pedrocr 2 hours ago [-]
> agents were previously spending 3-5 minutes after each call writing manual summaries of the calls
Why were they doing this at all? It may not be what is happening in this specific case but a lot of the AI business cases I've seen are good automations of useless things. Which makes sense because if you're automating a report that no one reads the quality of the output is not a problem and it doesn't matter if the AI gets things wrong.
In operations optimization there's a saying to not go about automating waste, cut it out instead. A lot of AI I suspect is being used to paper over wasteful organization of labor. Which is fine if it turns out we just aren't able to do those optimizations anyway.
nulbyte 2 hours ago [-]
As a customer of many companies who has also worked in call centers, I can't tell you how frustrating it is when I, as a customer, have to call back and the person I speak with has no record or an insufficient record of my last call. This has required me to repeat myself, resend emails, and wait all over again.
It was equally frustrating when I, as a call center worker, had to ask the customer to tell me what should already have been noted. This has required me to apologize and to do someone else's work in addition to my own.
Summarizing calls is not a waste, it's just good business.
recallingmemory 2 hours ago [-]
So long as precision isn't important, I suppose. Hallucination within summaries is the issue I keep running into which prevents me from incorporating it into any of our systems.
thrown-0825 2 hours ago [-]
I have seen AI summaries of calls get people into trouble because the AI hallucinated prices and features that didn't exist.
Shank 2 hours ago [-]
Who reads the summaries? Are they even useful to begin with? Or did this just save everyone 3-5 minutes of meaningless work?
doorhammer 2 hours ago [-]
Not the op, but I did work supporting three massive call centers for an f500 ecom.
It's 100% plausible it's busy work but it could also be for:
- Categorizing calls into broad buckets to see which issues are trending
- Sentiment analysis
- Identifying surges of some novel/unique issue
- Categorizing calls across vendors and doing sentiment analysis that way (looking for upticks in problem calls related to specific TSPs or whatever)
- etc
False positives and negatives aren't really a problem once you hit a certain scale because you're just looking for trends. If you find one, you go spot-check it and do a deeper dive to get better accuracy.
Which is also how you end up with some schlepp like me listening to a few hundred calls in a day at 8x speed (back when I was a QA data analyst) to verify the bucketing. And when I was doing it everything was based on phonetic indexing, which I can't imagine touching LLMs in terms of accuracy, and it still provided a ton of business value at scale.
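Mechanically, the trend-spotting part is not much more than counting buckets period over period and flagging the spikes for a spot-check. A rough sketch (the threshold numbers here are made up):

    # Sketch: flag call categories that spiked versus the previous week;
    # the flagged buckets then get a manual spot-check / deeper dive.
    from collections import Counter

    def flag_spikes(this_week, last_week, ratio=1.5, min_count=50):
        now, before = Counter(this_week), Counter(last_week)
        return [cat for cat, n in now.items()
                if n >= min_count and n > ratio * max(before.get(cat, 0), 1)]

    # flag_spikes(["late delivery"] * 120 + ["billing"] * 40,
    #             ["late delivery"] * 60 + ["billing"] * 45)   # -> ["late delivery"]

False positives and negatives mostly wash out at this level because you're confirming any flagged trend by hand anyway.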
vosper 2 hours ago [-]
AI reads them and identifies trends and patterns, or answers questions from PMs or others?
glimshe 2 hours ago [-]
This makes sense. AI is obviously useful for many things. But people wouldn't invest tens of billions to summarize call center calls and similar tasks. Replacing call center workers isn't where the money is - it's replacing 100K-200K/year workers.
generic92034 2 hours ago [-]
> It's not going to replace anyone's job.
Is it not, in the scenario you are describing? You are saying the agents are free now to do higher-value work. Why were there not enough agents before, especially if higher-value work was not done?
cube00 2 hours ago [-]
It's such a useless platitude. The "higher value work" is answering more calls so we can have fewer staff on the queue.
hobs 2 hours ago [-]
Because call centers are cost centers - nobody pays a dime more than they have to in these situations, and it's all commodity work.
generic92034 2 hours ago [-]
But that means the so-called "higher-value" work does not need to be done, so agents can be fired.
actsasbuffoon 2 hours ago [-]
That’s the thing. There’s value in AI, it’s just not worth half a trillion dollars to train a new model that’s 0.4% better on benchmarks. Meta is never going to get a worthwhile return on spending $100M on individual engineers.
But that doesn’t mean AI is without its uses. We’re just in that painful phase where the hype needs to die down and we treat LLMs as what they really are; an interesting new tool in the toolkit that provides some new ways to solve problems. It’s almost certainly not going to turn into AGI any time soon. It’s not worth trillions. It’s certainly worth something, though.
I think the financials on developing new frontier models are terrible. But I’ve already built multiple AI projects for my company that are making money and we’ve got extremely happy customers.
Investors thought one company was going to win the AI Wars and make a quadrillion dollars. Instead it’s probably going to be 10,000 startups that will build interesting products based on AI, and training new models won’t actually be a good financial move.
Imanari 1 hours ago [-]
Could you broadly describe the AI projects you have built?
trueismywork 54 minutes ago [-]
PINNs (physics-informed neural networks) for simulations.
doorhammer 1 hours ago [-]
I'm curious, have you noticed an impact on agent morale with this?
Specifically: Do they spend more time actually taking calls now? I guess as long as you're not at the burnout point with utilization it's probably fine, but when I was still supporting call centers I can't count the number of projects I saw trying to push utilization up not realizing how real burnout is at call centers.
I assume that's not news to you, of course. At a certain utilization threshold we'd always start to see AHTs (average handle times) creep up as agents got burned out and, consciously or not, started trying to stay on good calls.
Guess it also partly depends on if you're in more of a cust serv call center or sales.
I hated working as an actual agent on the phones, but call center ops and strategy at scale has always been fascinating.
lljk_kennedy 57 minutes ago [-]
Thank you, I came to say this too. You're mushing your humans harder, and they'll break. Those 5 mins of downtime post-call aren't 100% note taking - it's catching their breath, trying to re-compose after dealing with a nasty customer, trying to re-energise after a deep technical session etc.
I think AI in general is just being misused to optimise local minima to the detriment of the overall system.
ethagknight 1 hours ago [-]
This highlights the potentially unrealistic spend on AI, where the proposal is to spend 10s of millions to save... marginal millions... that could have also been saved by a change in process with limited additional spend.
I also would assume that there are far more significant behavioral or human factors that consume the time writing those minutes, i.e. an easy spot to kill 5-10 min before opening the line for the next inbound call, but the 5-10 minute break will persist anyway.
I fully believe AI will create a lot of value and is revolutionary, especially for industries where value is hidden within data. It's the pace of value creation that stands out to me (how long till it's actually useful and better and creates more value than it costs??), but the bubble factor is not ignorable in the near term.
didibus 1 hours ago [-]
Where is the money being saved? Are you reducing the number of agents? Otherwise, it should actually cost more, before you simply had customers wait longer to speak to the next agent no? Or do you sell "support calls" so you're able to sell more of them given the same number of agents?
tux3 2 hours ago [-]
> it's a huge, measurable efficiency gain.
> It's not going to replace anyone's job
Mechanically, more efficiency means fewer people required for the same output.
I understand there is no evidence that any other sentence can be written about jobs. Still, you should put more text in between those two sentences. Reading them so close together creates audible dissonance.
missedthecue 44 minutes ago [-]
"Mechanically, more efficiency means less people required for the same output."
Why can't it mean more output with the same number of people? If I pay 100 people for 8 hours of labor a day, and after making some changes to our processes, the volume of work completed is up 10% per day, what is that if not an efficiency gain? What would you call it?
It really depends on the amount of work. If the demand for your labor is infinite, or at least always more than you can do in a day's work, efficiency gains won't result in layoffs, just more work completed per shift. If the demand for the work is limited, efficiency gains will likely result in layoffs because there's no point in paying someone who was freed up by your new processes to sit around twirling a pen all day.
tux3 27 minutes ago [-]
All else equal, the demand for support calls doesn't go up as your support becomes more efficient.
I get that we're trying to look for positive happy scenarios, but only considering the best possible world instead of the most likely world is bias. It's Optimistic in the sense of Voltaire.
missedthecue 26 minutes ago [-]
What I'm saying is that if the volume of support is high enough, and never even changed, it's completely possible to improve throughput without reducing demand for labor. The result is simply that you improve response times.
tux3 16 minutes ago [-]
But I think this comes back to the same question of understaffing/overwork. We have to ask what strategic thinking led to accepting long response times in the past. And the answer is unequivocal.
Unless we're claiming there is an intractable qualified labor shortage in call centers, this is always the result of a much simpler explanation: it's much cheaper to understaff call centers
A company that wants to save money by adding more AI is a company that cares about cost cutting. Like most companies.
The strategy that caused the company to understaff has not changed. The result is that we go back to homeostasis, and fewer jobs are needed to reach the same deliberate target.
flkiwi 2 hours ago [-]
You're not accounting for teams already being understaffed and overtasked in many situations, with some AI tools allowing people to get back to doing their jobs. We aren't expecting significant headcount changes, but we are expecting significant performance and quality improvements for the resources we have.
tux3 2 hours ago [-]
The reason that caused the team to be understaffed and overtasked has not gone away because of AI. I am expecting the team to stay understaffed and overtasked, for the same reason it was before: it's less expensive. With or without an LLM summarizing phone calls.
tempodox 39 minutes ago [-]
That’s an important point. Real-life use cases are not sexy. And they don’t lend themselves to overblown hype generation and “creative marketing”.
chasd00 1 hours ago [-]
So you're saving 3-5 minutes per agent per call. I'm guessing calls come into a queue and then the next available agent starts to handle it. If an average call takes about 20 minutes until the agent hangs up and is free for another, then after about 5 calls they've saved enough time to take an extra call they wouldn't have before. 5 calls is 1.4 hrs on the phone; I'm guessing with breaks and call center reps not being 100% on the ball all the time, your agents will probably take maybe 3-4 more calls per day with the AI than without (assuming call volume is such that there are always more calls than agents can handle).
Is that really millions of savings annually? Maybe it is but I always hesitate when a process change that saves one person a few minutes is extrapolated all the way out to dollars/year. What you'll probably see is the agents using those 3-5 minutes to check their phone.
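Rough math on the claim, with made-up inputs (not OP's real figures), just to see what it takes for "millions" to be plausible:

    # Back-of-the-envelope: value of skipping the 3-5 minute post-call summary.
    # Every number below is a placeholder assumption.
    agents = 2_000
    calls_per_agent_per_day = 30
    minutes_saved_per_call = 4
    loaded_cost_per_hour = 25.0        # hypothetical wages + overhead
    workdays_per_year = 250

    hours_saved = agents * calls_per_agent_per_day * minutes_saved_per_call / 60 * workdays_per_year
    print(f"{hours_saved:,.0f} agent-hours/yr ~= ${hours_saved * loaded_cost_per_hour:,.0f}")

At that (hypothetical) scale the arithmetic does reach millions, but only if the freed minutes actually turn into more calls handled or fewer agents on the floor, which is exactly the question.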
amluto 2 hours ago [-]
I’d like to see a competent AI replace the time that doctors and nurses spend tediously transcribing notes into a medical record system. More time spent doing the actual job is good for pretty much everyone.
beart 2 hours ago [-]
But... that is the actual job. A clear medical history is very important, and I'm not ready yet to cut out my doctor from that process.
This reminds me of the way juniors tend to think about things. That is, writing code is "the actual job" and commit messages, documentation, project tracking, code review, etc. are tedious chores that get in the way. Of course, there is no end to the complaints of legacy code bases not having any of those things and being difficult to work with.
hinkley 50 minutes ago [-]
Not just juniors. Industry is full of senior and some staff engineers who see discipline as a waste of time.
The number of things I do in a day that half my coworkers see as a waste of time until they enjoy the outcomes is basically uncountable at this point.
If something is a “waste of time” it’s possible that you’re just lousy at it.
Self reflection is a rarer commodity than it should be. And most of the tasks you list either require or invite it.
wl 1 hours ago [-]
Charting is for billing. If the point were to have accurate medical records useful for facilitating diagnosis and treatment, we'd structure medical records way differently. Fishing clinically-useful bits of information out of encounter and progress notes is tedious and only done as a last resort.
hinkley 49 minutes ago [-]
I presume for malpractice suits as well.
nottorp 2 hours ago [-]
Competent, yes. But the current ones are likely to transcribe "recommend amputating the left foot" as "recommend amputating the right foot". Still want it?
simmerup 2 hours ago [-]
Until it hallucinates and the AI has written something wrong about you in your official medical record
pjmorris 2 hours ago [-]
I wonder how this change affects what the agents remember about the calls, and how that affects their performance on future calls.
And I wonder whether agent performance, as measured by customer satisfaction, will decline over time, and whether that will affect the bottom line.
ghalvatzakis 2 hours ago [-]
I lead an AI engineering team that automated key parts of an interviewing process, saving thousands of hours each month by handling thousands of interviews. This reduced repetitive, time-consuming tasks and freed human resources to focus on higher-value work
the_snooze 2 hours ago [-]
I'm under the impression that one of the most critical responsibilities a lead has is to establish and maintain a good working culture. Properly vetting new additions feeds directly into that. Why offload it to AI?
ghalvatzakis 1 hours ago [-]
Just to clarify, these aren’t interviews for job positions
Jolter 58 minutes ago [-]
What kind of interviews do HR do, apart from job interviews?
ghalvatzakis 15 minutes ago [-]
Not HR either. I work for an experts network firm
hinkley 46 minutes ago [-]
Clear as mud.
Terr_ 2 hours ago [-]
I think the biggest issue is accurately estimating the LLM failure risk, and what impacts the company is willing to tolerate in the long term. (As distinct from what the company is liable to permit through haste and ignorance.)
With LLMs the risk is particularly hard to characterize, especially when it comes to adversarial inputs.
trevor-e 2 hours ago [-]
This is a great use-case of AI.
However I strongly doubt your point about "It's not going to replace anyone's job" and that "they also free up the human agents to do higher-value work". The reality in most places is that fewer agents are now needed to do the same work as before, so some downsizing will likely occur. Even if they are able to switch to higher-value work, some amount of work is being displaced somewhere in the chain.
And to be clear I'm not saying this is bad at all, I'm just surprised to see so many deluded by the "it won't replace jobs" take.
It's also disappointing that MIT requires you to fill out a form (and wait) for access to the report. I read four separate stories based on the report, and they all provide a different perspective.
I wouldn't allow myself to be held accountable for anything in a summary I didn't write.
nuker 2 hours ago [-]
> recently switched to using AI to transcribe and write these summaries
Did users know that the conversation was being recorded?
creaturemachine 1 hours ago [-]
Yeah the standard "this call may be recorded for quality or training purposes" preamble shouldn't cover for slurping your voiceprint to further the butchering of client service that this call centre is here for.
prophesi 2 hours ago [-]
You would be hard-pressed to find a call center that _doesn't_ start every phone call with a warning that the conversation may be recorded.
watwut 1 hours ago [-]
A typical call center call is recorded and you are told so at the start of the conversation. I've had quite a few of those.
MangoToupe 1 hours ago [-]
> We operate large call centers, and agents were previously spending 3-5 minutes after each call writing manual summaries of the calls.
This is a tiny fraction of all work done. This is work people were claiming to have solved 15 years ago. Who cares?
positron26 2 hours ago [-]
Given that people skimp on work that is viewed as trash anyway, how were you getting value out of the summaries in the first place?
thrown-0825 2 hours ago [-]
they weren't
it's likely a checkbox for compliance or some policy a middle manager put in place that is now tied to a KPI
positron26 2 hours ago [-]
Could be CRM, leaving summaries for the next person. I suppose it would sound like I'm implying a prior.
croes 1 hours ago [-]
But I doubt it justifies the billions of dollars getting burned for training language models and building power plants.
And are full transcriptions not the better option?
Capricorn2481 2 hours ago [-]
> Not only are the summaries better than those produced by our human agents
We have someone using Firefly for note taking, and it's pretty bad. Frequently gets details wrong or extrapolates way too much from a one-off sentence someone said.
How do you verify these are actually better?
belter 2 hours ago [-]
Are the summaries reviewed by the agents? And if not, how do you handle hallucinations, or transcription of the wrong insurance policy ID, for example? Like, the customer wants to cancel insurance policy AB-2345D and the transcript says they want to cancel insurance policy AD-2345B.
hobs 2 hours ago [-]
That is a good thing, but that's also just a training gap. I worked with tech support agents for years in gigantic settings, and taking notes while you take an action is difficult to train but yields tangible results: clarity from the agent on what they are doing step by step, and a shared method of documenting things that, importantly, focuses on the important details and doesn't miss a piece which may be considered trivial by an outsider but (for instance) defines SOP for call escalation and the like.
apwell23 1 hours ago [-]
I was pretty sure you were going to say "meeting summaries" (which apparently is the poster child of LLM applications).
My guess was wrong, but not really.
varispeed 2 hours ago [-]
What’s the actual business value of a “summary” though? A transcript is the record. A tag or structured note (“warranty claim,” “billing dispute,” “out of scope”) is actionable. But a free-form blob of prose? That’s just narrative garnish - which, if wrong or biased, is worse than useless.
Imagine a human agent or AI summarises: “Customer accepted proposed solution.” Did they? Or did they say “I’ll think about it”? Those aren’t the same thing, but in the dashboard they look identical. Summaries can erase nuance, hedge words, emotional tone, or the fact the customer hung up furious.
If you’re running a call centre, the question is: are you using this text to drive decisions, or is it just paperwork to make management feel like something is documented? Because “we saved millions on producing inaccurate metadata nobody really needs” isn’t quite the slam dunk it sounds like.
pluc 2 hours ago [-]
You could do that offline 20 years ago with Dragon Naturally Speaking and a one-time licence.
dymk 2 hours ago [-]
You could get a transcript, not a summary
loloquwowndueo 2 hours ago [-]
Only if your audio was crystal clear, you spoke like a robot very slowly, and each customer had to do a 30-minute "please read this text slowly to train the speech recognition software" preamble before talking to the actual human.
butlike 2 hours ago [-]
How can you double-check the work? Also, what happens when the AI transcription is wrong in a way that would have gotten the employee terminated? You can't fire a model.
Finally, who cares about millions saved (while considering the above introduced risk), when trillions are on the line?
PaulRobinson 2 hours ago [-]
Having a human read a summary is way faster than getting them to write it. If they want to edit it, they can.
AI today is terrible at replacing humans, but OK at enhancing them.
Everyone who gets that is going to find gains - real gains, and fast - and everyone who doesn't, is going to end up spending a lot of money getting into an almost irreversible mistake.
butlike 2 hours ago [-]
"Reading a summary is faster, so enhancing humans with AI is going to receive boons or busts to the implementer."
Now, summary, or original? (Provided the summary is intentionally vague to a fault, for argument's sake on my end.)
throitallaway 2 hours ago [-]
I presume they're not using these notes for anything mission or life critical, so anything less than 100% accuracy is OK.
butlike 2 hours ago [-]
I disagree with the concept of affluvic notes. All notes are intrinsically actionable; it's why they're a note in the first place. Any note has unbounded consequence depending on the action taken from it.
wredcoll 2 hours ago [-]
You're being downvoted, I suspect for being a tad hyperbolic, but I think you are raising a really important point, which is the ever more gradual removal of a human's ability to disobey the computer system running everything. And the lack of responsibility for following computer instructions.
It's a tad far-fetched in this specific scenario, but an AI summary that says something like "cancel the subscription for user xyz" and then someone else takes action on that, and XYZ is the wrong ID, what happens?
JCM9 2 hours ago [-]
We are entering the “Trough of disillusionment.” These hype cycles are very predictable. GPT-5 being panned as a disappointment after endless hype may go down as GenAI’s “jump the shark” moment.
It’s all fun and games until the bean counters start asking for evidence of return on investment. GenAI folks better buckle up. Bumps ahead. The smart folks are already quietly preparing for a shift to ride the next hype wave up while others ride this train to the trough’s bottom.
Cue a bunch of increasingly desperate puff PR trying to show this stuff returns value.
highwaylights 2 hours ago [-]
I wouldn’t be surprised if 95% of companies knew this was a money pit but felt obligated to burn a pile of money on it so as not to hurt the stock price.
generic92034 2 hours ago [-]
In Germany there is the additional issue of companies only really starting to invest in the hype when, in other parts of the world, the hype cycle is already at its end. And do not imagine that this would lower the investment or shorten the time spent on the hype. The C level can never admit errors, and the middle management only sees a way to promotion by following the hype.
lenerdenator 2 hours ago [-]
I also wouldn't be surprised if bean counters were expecting a return in an unreasonable amount of time.
"Hey, guys, listen, I know that this just completely torched decades of best practices in your field, but if you can't show me progress in a fiscal year, I have to turn it down." - some MBA somewhere, probably, trying and failing yet again to rub his two brain cells together for the first time since high school.
Just agentic coding is a huge change. Like a years-to-grasp change, and the very nature of the changes that need to be made keep changing.
omnicognate 1 hours ago [-]
> Just agentic coding is a huge change
I've been programming professionally for > 20 years and I intend to do it for another > 20 years. The tools
available have evolved continually, and will continue to do so. Keeping abreast of that evolution is an important part of the job. But the essential nature of the role has not changed and I don't expect it to do so. Gen AI is a tool, one that so far to me feels very much like IDE tooling (autocomplete, live diagnostics, source navigation): something that's nice to have, that's probably worth the time, and maybe worth the money, to set up, but which I can easily get by without and experience very little disadvantage.
I can't see the future any more than anyone else, but I don't expect the capabilities and limitations of LLMs to change materially and I don't expect to be left in the dust by people who've learned to wrangle wonders from them by dark magics. I certainly don't think they've "torched decades of best practice in my field". I expect them to improve as tools and, as they do, I may find myself using them more as I go about my job, continuing to apply all of the other skills I've learned over the years.
And yes, I do have an eye-wateringly expensive Claude subscription and have beheld the wonders of Opus 4. I've used Claude Code and worked around its shitty error handling [1]. I've seen it one-shot useful programs from brief prompts, programs I've subsequently used for real. It has saved me non-zero amounts of time - actual, measurable time, which I've spent doodling, making tea and thinking. It's extremely impressive, it's genuinely useful, it's something I would have thought impossible a few years ago and it changes none of the above.
> "Hey, guys, listen, I know that this just completely torched decades of best practices in your field, but if you can't show me progress in a fiscal year, I have to turn it down."
I mean, this is basically how all R&D works, everywhere, minus the strawman bit about "single fiscal year", which isn't how it works in practice.
And this is a serious career tip: you need to get good at this. Being able to break down extremely ambitious, many-year projects into discrete chunks that prove progress and value is a fundamental skill to being able to do big things.
If a group of very smart people said "give us ${BILLIONS} and don't bother us for 15 years while we cook up the next world-shaking thing", the correct response to that is "no thanks". Not because we hate innovation, but because there's no way to tell the geniuses apart from the cranks, and there's not even a way to tell the geniuses-pursuing-dead-ends from the geniuses-pursuing-real-progress.
If you do want to have billions and 15 years to invent the next big thing, you need to be able to break the project up to milestones where each one represents convincing evidence that you're on the right track. It doesn't have to be on an annual basis, but it needs to be on some cadence.
beepbooptheory 2 hours ago [-]
"Actually its good we aren't making money, this actually proves how revolutionary the technology is. You really need to think about adapting to our new timeline."
You really set yourself up with a nice glass house trying to make fun of the money guys when you are essentially just moving your own goal posts. It was annoying two (or three?) years ago when we were all talking about replacing doctors and lawyers, now it just cant help but feel like a parody of itself in some small way.
Spivak 2 hours ago [-]
How dare the business ask for receipts of value being produced in actual
dollars! Those idiots don't know anything.
dingnuts 2 hours ago [-]
Sam Altman and company have been promising full on AGI. THAT'S the price shock.
Agents may be good (I haven't seen it yet, maybe it's a skill issue, but I'm not spending hundreds of dollars to find out and my company seems reluctant to spend thousands to find out), but they are definitely, definitely not the general superintelligence SamA has been promising, at all, and that is really sinking in.
These might be useful tools, yes, but the market was sold science fiction. We have a useful supercharged autocomplete sold as goddamn positronic brains. The commentariat here perhaps understood that (definitely not everyone), but it's no surprise that there's a correction now that GPT-5 isn't literally smarter than 95% of the population, when that's how it was being marketed.
wredcoll 2 hours ago [-]
It's real good for stock prices though. Reminds me of tesla.
lubesGordi 2 hours ago [-]
Agreed agentic coding is a huge change. Smart startups will be flying but aren't representative. Big companies won't change because the staff will just spend more time shopping online instead of doing more than what is asked of them. Maybe increased retail spend is a better measure of AI efficacy.
JCM9 2 hours ago [-]
FOMO on the way up to the peak is a powerful force. Now that we’re sliding down the other side FOMO turns into “WTF did we just spend all that money on again?”
pgwhalen 2 hours ago [-]
It's hard to define what it means for a company to know something, but as a person inside a company spending on gen AI efforts, I'm pretty confident that we're not investing in it just to maintain an elevated valuation (we're a mature, privately owned company).
empath75 2 hours ago [-]
There were similar headlines in the late 80s and early 90s as IT in general was widely seen to have been a money wasting bust. Most people who try to use new technologies, especially early adopters waste a shitload of money and don't accomplish very much.
runarberg 35 minutes ago [-]
My favorite conspiracy theory at the moment is that this is a way for the rich to literally burn the excess money to prevent it from getting back to the working classes in an effort to keep the exploitation machine running.
Now, I don’t believe this is an actual conspiracy, but rather a culture of hating the poor. The rich will jump on any endeavor—no matter how ridiculous—as long as the poor stay poor, even if they lose money in the process.
no_wizard 2 hours ago [-]
Gemini keeps being rather impressive though; even their iterative updates have improvements, although I'm seeing a significant slowdown in the improvements (both in quantity and in how much they improve), suggesting a wall may be approaching.
That said, technologies like this can also go through a rollercoaster pattern themselves: lots of innovation and improvement, followed by very little improvement but lots of research, which then explodes into more improvements.
I think LLMs have a better chance at following that pattern than computer vision did when that hype cycle was all the rage
chriskanan 1 hours ago [-]
Sam Altman way oversold GPT-5's capabilities, in that it doesn't feel like a big leap in capability from a user's perspective; however, the idea of a trainable dynamic router enabling them to run inference using a lot less compute (in aggregate) seems to me like a major win. Just not necessarily a win for the user (a win for the electric grid and for making OpenAI's models more cost competitive).
pseudosavant 59 minutes ago [-]
When OpenAI went from GPT-3.5-Turbo to GPT-4 it seemed massive, and there were no other steps in-between. And nobody else had a meaningful competitive model out yet.
When GPT-5 came out, it wasn't going from GPT-4 to GPT-5. Since GPT-4 there has been: 4o, o1, o3, o3-mini, o4-mini, o4-mini-high, GPT-4.1, and GPT-4.5. And many other models (Llama, DeepSeek, Gemini, etc) from competitors have been released too.
We'll probably never experience a GPT-3.5 to GPT-4 jump again. If GPT-5 was the first reasoning model, it would have seemed like that kind of jump, but it wasn't the first of anything. It is trying to unify all of the kinds of models OpenAI has offered, into one model family.
seatac76 2 hours ago [-]
Good point. I wonder if the Windsurf folks saw the writing on the wall and cashed out when they could.
herval 1 hours ago [-]
I was a big Windsurf fan. What they did with their team became a massive cautionary tale around the current wave of startups. It's creating a distrust culture that's gonna be very hard to repair (not that it matters for the handful of newly-minted billionaires)
mattlondon 1 hours ago [-]
Perhaps just a trough of disillusionment for OpenAI. Anthropic and Google keep delivering.
neom 52 minutes ago [-]
hahaaa! I lead a growth team for a genAI company and on Monday I said to the team "we need to start to put content together that proves our customers are having return and finding value, because we're entering the trough of disillusionment"
...I'll try not to sound desperate tho.
deadbabe 1 hours ago [-]
If GPT-5 had been released last year instead, they could have probably kept the hype manageable. But they waited too long and too greedily, and the launch fell flat. And in some cases even negative as I’m seeing some bad PR about people who got too attached to their GPT-4o lovers and hate the new GPT-5.
This is such a huge repeat of the early 2000s. All the bust startups spent billions on infrastructure. Everyone built their own datacenters, no matter what their business was.
We'll either see a new class of "AWS of AI" companies that'll survive and be used by everyone (that's part of the play Anthropic & OpenAI are making, despite the API generating a fraction of their current revenue), or Amazon + Google + Microsoft will remain as the undisputed leaders.
chasd00 56 minutes ago [-]
I remember the first dot com bust, you could find Herman Miller Aerons (the stereotypical inet-startup-guy chair) super cheap as well as fairly large Cisco 6509 routers and ...i think it was Sun Fire 15ks lol. I look forward to getting some nice GPUs at a discount.
idk what a person would do with a 6509 or a Sun Fire hah but they were all over craigslist iirc.
frozenport 2 hours ago [-]
Yo what’s the next hype cycle that smart folks like us should be working on?
rpcope1 16 minutes ago [-]
Probably defense and robotics.
Terr_ 2 hours ago [-]
The next hype-cycle might not relate to software. I'm thinking of the (smaller, shorter) phase where "nanotechnology" was getting slapped onto everything including laundry detergent.
zzzeek 46 minutes ago [-]
mRNA was set to be huge but US voters apparently didn't want it.
quotemstr 2 hours ago [-]
Defense will increasingly become a national priority over the next few decades. Pax Americana is teetering.
deepdarkforest 2 hours ago [-]
That and consumer robotics. The latter will explode if (big if) RL and LLM reasoning get combined into something solid. Lots and lots of smart people are working on it already of course; we are seeing great improvements but nothing really usable. I think we will finally get to a real hype stage in maybe 3-4 years.
belter 2 hours ago [-]
The US mix of government and enterprise. If you want capital, you need to be MAGA.
The smartest people will be working against that. You're thinking of opportunistic people with myopia.
herval 1 hours ago [-]
Pretty much the entire tech industry has bent the knee by now - they even gifted the new ruler with golden statues. It's not just a handful of people...
gmd63 1 hours ago [-]
Few people have the balls to do the right thing when the risks pass a certain limit. And yet it's most important to do the right thing at the largest scale.
belter 2 hours ago [-]
The law firms failed to do so.
gmd63 1 hours ago [-]
Some of them have certainly flagged themselves as opportunistic and myopic.
IgorPartola 2 hours ago [-]
What are the actual use cases that can generate revenue or at least save costs today? I can think of:
1. Generate content to create online influence. This is at this point probably way oversaturated and I think more sophisticated models will not make it better.
2. Replace junior developers with Claude Code or similar. Only sort of works. After all, you can only babysit one of these at a time no matter how senior you are so realistically it will make you, what, 50% more productive?
3. Replace your customer service staff. This may work in the long run but it saves money instead of making money so its impact has a hard ceiling (of spending just the cost of electricity).
4. Assistive tools. Someone to do basic analysis, double check your writing to make it better, generate secondary graphic assets. Can save a bit of money but can’t really make you a ton because you are still the limiting factor.
Aside: I have tried it for editing writing and it works pretty well but only if I have it do minimal actual writing. The more words it adds, the worse the essay. Having it point out awkward phrasing and finding missing parts of a theme is genuinely helpful.
5. AI for characters in video games, robot dogs, etc. Could be a brave new frontier for video games that don’t have such a rigid cause/effect quest based system.
6. AI girlfriends and boyfriends and other NSFW content. Probably a good money maker for a decade or so before authentic human connections swing back as a priority over anxiety over speaking to humans.
What use cases am I missing?
mbb70 12 minutes ago [-]
In healthcare, notes are directly correlated with $$$ for the hospital, because everything that is billed for must be documented with a mix of metrics (O2%, temp, lab results), events (orders, prescriptions, procedures) and notes (consultation notes, imaging interpretations, discharge summaries).
Billions get spent annually in administrative overhead focused on squeezing the most money out of these notes as possible. A tremendous expense can be justified to increase note quality (aka revenue, though 'accuracy/efficiency' is the trojan horse used to slip by regulators).
GenAI has a ton of potential there. Likewise on the insurance side, which has to wade through these notes and produce a very labor intensive paper trail of their own.
Eventually the AIs will just sling em-dashes at each other while we sit by the pool.
spogbiper 2 hours ago [-]
I am working on a project that uses an LLM to pull certain pieces of information from semi-structured documents and then categorize/file them under the correct account. It's about 95% accurate and we haven't even begun to fine tune it. I expect it will require human-in-the-loop checks for the foreseeable future, but even with a human approval of each item, it's going to save the clerical staff hundreds of hours per year. There are a lot of opportunities in automating/semi-automating processes like this, basically just information extraction and categorization tasks.
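(For anyone wondering what such a flow can look like, here is a minimal sketch under my own assumptions; the helper names, categories, and prompt are invented for illustration, not the commenter's actual system. The key property is that the model only proposes, and a human approves before anything is filed.)

    from dataclasses import dataclass

    # Invented example categories; a real system would load these from the ledger.
    ACCOUNTS = ["accounts_payable", "claims", "contracts", "unknown"]

    @dataclass
    class Proposal:
        fields: dict
        account: str
        rationale: str

    def call_llm_json(prompt: str) -> dict:
        """Hypothetical helper that calls an LLM and parses its JSON reply."""
        raise NotImplementedError

    def propose_filing(document_text: str) -> Proposal:
        result = call_llm_json(
            "Extract the reference number, date, and amount from the document below, "
            f"pick the best matching account from {ACCOUNTS}, and explain your choice. "
            "Reply as JSON with keys: fields, account, rationale.\n\n" + document_text
        )
        return Proposal(result["fields"], result["account"], result["rationale"])

    def file_with_approval(document_text: str, approve) -> dict | None:
        proposal = propose_filing(document_text)
        # The human sees the extracted fields plus the rationale and makes the final call.
        if approve(proposal):
            return {"account": proposal.account, **proposal.fields}
        return None  # rejected: fall back to the existing manual process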
systemerror 2 hours ago [-]
The big issue with LLMs is that they’re usually right — like 90% of the time — but that last 10% is tough to fix. A 10% failure rate might sound small, but at scale, it's significant — especially when it includes false positives. You end up either having to live with some bad results, build something to automatically catch mistakes, or have a person double-check everything if you want to bring that error rate down.
f3b5 33 minutes ago [-]
Depending on the use case, a 10% failure rate can be quite acceptable. This is of course for non-critical applications, like e.g. top-of-funnel sales automation. In practice, for simple uses like labeling data at scale, I'm actually reaching 95-99% accuracy in my startup.
spogbiper 2 hours ago [-]
yes, the entire design relies on a human to check everything. basically it presents what it thinks should be done, and why. the human then agrees or does not. much work is put into streamlining this but ultimately its still human controlled
wredcoll 1 hours ago [-]
At the risk of being obvious, this seems set up for failure in the same way expecting a human to catch an automated car's mistakes is. Although I assume mistakes here probably don't matter very much.
LPisGood 1 hours ago [-]
This reminds me of the issue with the old Windows access control system.
If those prompts pop up constantly asking for elevated privileges, this is actually worse because it trains people to just reflexively allow elevation.
spogbiper 36 minutes ago [-]
yes, mistakes are not a huge problem. they will become evident farther down the process and they happen now with the human only system. worst case is the LLM fails and they just have to do the manual work that they are doing now
whatever1 2 hours ago [-]
All of the AI projects promise that they just need some fine tuning to go from PoC to an actual workable product. Nobody has been able to fine tune them.
Sorry this is some bull. Either it works or it doesn’t.
LPisGood 1 hours ago [-]
> its going to save the clerical staff hundreds of hours per year
How many hundreds of hours is your team spending to get there? What is the ROI on this vs investing that money elsewhere?
spogbiper 31 minutes ago [-]
Can't speak to the financial benefit over other investment. Total dev/testing time looks to be fairly small in comparison to time saved in even one year, although with different salaries etc I cannot be too certain on the money ratio. Ultimately not my direct concern, but those making decisions are very happy with results so far and looking for additional processes to apply this type of system to.
kjkjadksj 2 hours ago [-]
Isn’t that something you can do with non-AI tooling to 100% accuracy?
spogbiper 2 hours ago [-]
in some similar cases yes, and this client has tried to accomplish that for literally decades without success. i don't want to be too detailed for reasons, but basically they cannot standardize the input to the point where anything non AI has been able to parse it very well.
beepbooptheory 2 hours ago [-]
How will you know in practice which 5% is wrong?
spogbiper 2 hours ago [-]
the system presents a summary that a human has to approve, with everything laid out to make that as easy as possible, links to all the sources etc
b8 2 hours ago [-]
AI customer service is very frustrating to work with as an end user.
siliconc0w 1 hours ago [-]
The problem is that a lot of friction is intentional. Companies want there to be friction to return an item or cancel a subscription. Insurance companies want there to be friction to evaluate policies or appeal denied claims. Companies create legal friction to make competition harder. The friction is the point so AI isn't a solution. If you were to add AI they'd just find a way to create new friction.
timeinput 45 minutes ago [-]
So are call trees where you have to answer yes / no to a decision tree (and can't press 0 / 1, you have to verbalize "yes" or "no"). Those continue to exist, so I expect AI customer 'service' to be very frustrating to work with for quite some time.
jmkni 1 hours ago [-]
Yep.
The thing is, you aren't contacting customer services because everything is going well, you are contacting them because you have a problem.
The last thing you need is to be gaslit by an AI.
The worst ones are the ones where you don't realise right away you aren't talking to a person, you get that initial hope that you've actually gotten through to someone who can help you (and really quickly too) only to have it dawn on you that you are talking to a ChatGPT wrapper who can't help you at all.
Spivak 1 hours ago [-]
I mean look, if the customer service department is just trying to frustrate you into not contacting them, then AI is just a new tool in the belt for that. Sad, but an improvement for them is explicitly worse for you.
But if you're actually trying to provide good customer service, because people are paying you for it and paying per case, then you wouldn't dare put a phone menu or AI chat bot between them and the human. The person handles all the interaction with the client and then uses AI where it's useful to speed up the actual work.
wedn3sday 2 hours ago [-]
One use case I'd love to see an easy plug-and-play solution for is a RAG built around a company's vast internal documentation/wikis/codebase to help developers onboard and find information faster. I would love to see less of people trying to replace humans with language models and more of people trying to use language models to make humans' jobs less frustrating.
We connect with slack/notion/code/etc so that you can do the following:
1. Ask questions about how your code/product works
2. Generate release notes instantly
3. Auto update your documentation when your code changes
We primarily rely on the codebase since it is never out of date
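(A bare-bones retrieval sketch of the kind of thing being described, assuming the sentence-transformers package and a plain numpy index; the Slack/Notion connectors, chunking, and the final answer-generation call are left out.)

    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")

    def build_index(chunks: list[str]) -> np.ndarray:
        # Embed every documentation/wiki/code chunk once, up front.
        return model.encode(chunks, normalize_embeddings=True)

    def top_k(question: str, chunks: list[str], index: np.ndarray, k: int = 5) -> list[str]:
        q = model.encode([question], normalize_embeddings=True)[0]
        scores = index @ q  # cosine similarity, since the vectors are normalized
        best = np.argsort(scores)[::-1][:k]
        return [chunks[i] for i in best]

    # The retrieved chunks are then pasted into an LLM prompt along with the question.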
OutOfHere 2 hours ago [-]
In all the companies I have worked at where I've looked at such docs, unfortunately this doesn't really work, because those internal documentation sites are statistically never up to date or even close. They are hilariously unclear or out of date.
As for relying on the code base, that is good for code, although not for onboarding/deployment/operations/monitoring/troubleshooting that have manual steps.
plantain 2 hours ago [-]
+50% productivity for $200/mo is outstanding value! Most countries have 0-2% productivity growth per year!
IgorPartola 15 minutes ago [-]
This is a one-time gain. Think of it like this: instead of doing the work yourself, you are pair programming with a junior developer whom you cannot trust to write secure and bug-free code. You can probably optimize your interactions such that this “intern” does work while you do other work. But if its work blocks what you are doing, it’s just that now you work differently.
I toyed with it and found it to be less frustrating to set up the latest layout for a VueJS project, but having it actually write code was… well I had to manually rewrite large chunks of it after it was done. I am sure it will improve but how long until you can tell it the specs, have it work for a few minutes or hours or days, and come back to an actual finished project? My bet is decades to never.
mrweasel 45 minutes ago [-]
Part of the issue with all this AI hype is that I can't tell if you're joking or not. Most of those suggestions are horrible; 4 and 5 make sense.
jaimebuelta 2 hours ago [-]
Stock images. I’ve already seen training courses (for compliance reasons) using AI videos. A bit cringey, but I imagine cheaper than shooting real people.
wredcoll 1 hours ago [-]
> but I imagine cheaper than shooting real people
How much does that cost these days? Do you still have to fly to remote islands?
jsmith99 1 hours ago [-]
MS Copilot is quite useful for meeting minutes and summaries etc. Still not nearly as useful as good handwritten notes but saves loads of time.
IgorPartola 14 minutes ago [-]
Sure, useful. But we are talking making-money useful, not just nice. Like, will it create 20% more revenue? 1% more? Or is it just a nice-to-have?
carlosjobim 43 minutes ago [-]
The most important use for AI is for translation. Now anybody can communicate and serve customers and clients from everywhere in the world, and you can also translate all your customer-facing material into any language in the world.
That means you expand from millions to billions of potential customers.
SalmoShalazar 2 hours ago [-]
Even “only sort of works” is too generous for point #2. A dozen Claude Code agents spamming out code is… something. But it still does not replace a human, at all, even a junior one. It’s something else entirely.
mapontosevenths 39 minutes ago [-]
I've spent the last week going through Claude generated code a line at a time and fixing it because a junior engineer thought he could just vibe code his way through a project and trust the machine. It would have been easier to write it myself from scratch.
Arguably, it's not the tools fault when someone uses it incorrectly, but my aching brain does not care whose fault it is right now, nor do the shareholders care why productivity cratered after we got shiny new tools.
empath75 1 hours ago [-]
> Replace junior developers with Claude Code or similar.
I don't know why everyone goes to "replacing". Were a bunch of computer programmers replaced when compilers came out that made writing machine code a lot easier? Of course not, they were more productive and accomplished a lot more, which made them more valuable, not less.
LPisGood 1 hours ago [-]
A lot of companies have upper limits on the value add of more programmers.
kingkawn 2 hours ago [-]
#5 is an enormous use case that when well implemented will permanently replace prescribed character arcs
It is uniquely susceptible because the gaming market is well acclimated to mediocre writing and one-dimensional character development that’s tacked on to a software product, so the improvement from adding “thinking” improvisational characters can be immense.
Another revenue potential you’ve missed is visual effects, where AI tools allow what were previously labor-intensive and expensive projects to be completed in much less time and with less, but not no, human input per frame.
shantara 2 hours ago [-]
>#5 is an enormous use case that when well implemented will permanently replace prescribed character arcs
I mostly disagree. Every gaming AI character demo I've seen so far just adds more irrelevant filler dialogue between the player and the game they want to play. It's the same problem that some of the older RPG games had, thinking that 4 paragraphs of text is always better than 1.
kingkawn 1 hours ago [-]
I agree the implementation isn’t great, but mostly it’s because the devs aren’t well versed yet in setting the parameters for the AI’s personality and how rapidly it gets to the point. That’s true of all chatbot AIs out of the box at the moment it seems, but is fixable with an eye to the artistry of the output
jampa 2 hours ago [-]
The biggest mistake people are making is treating AI as a product instead of a feature.
While people are doing their work, they don't think, "Oh man, I am really excited to talk with AI today, and I can't wait to talk with a chatbot."
People want to do their jobs without being too bored and overwhelmed, and that's where AI comes in. But of course, we cannot hype features; we sell products after all, so that's the state we are in.
If you go to Notion, Slack, or Airtable, the headline emphasizes AI first instead of "Text Editor, Corporate Chat etc".
The problem is that AI is not "the thing", it is the "tool that gets you to the thing".
ryandrake 2 hours ago [-]
I wouldn't even call it a feature. It's enabling technology. I've never once said "I would like AI in [some product]." I say: "I would like to be able to [do this task]." If the company adds that feature to a product, I'll buy it. I don't care if the company used AI, traditional algorithms, or sorcery to make the feature work--I just care that it does what I want it to do.
Too many companies are just trying to spoon AI into their product somehow, as if AI itself is a desired feature, and are forgetting to find an actual user problem for it to actually solve.
rpcope1 1 hours ago [-]
All true, but then there goes your stratospheric valuations and all the crazy hype. This come-to-Jesus moment may very well deflate one of the few remaining hot areas around software engineering. I could see people being reluctant to stop the hype train, as then we'd really have to come to terms with the fact that the "industry" as a whole is kind of in the shitter and it's a less good time to be a software engineer across the board than 5 or 10 years ago.
the_snooze 2 hours ago [-]
I wouldn't mind it if it were presented as yet another tool in the box. Maybe have a one-time popup saying "Hey, there's this thing, here's a cool use case, go check it out on your own terms."
In reality, AI sparkles and logos and autocompletes are everywhere. It's distracting. It makes itself the star of the show instead of being a backup dancer to my work. It could very well have some useful applications, but that's for users to decide and adapt to their particular needs. The ham-fisted approach of shoving it into every UI front-and-center signals a gross sense of desperation, neediness, and entitlement. These companies need to learn how to STFU sometimes.
rank0 1 hours ago [-]
Seriously! The product itself is supposed to be the valuable thing…regardless of the underlying technology.
biophysboy 2 hours ago [-]
1000% agree, so many AI "applications" right now are solutions looking for a problem.
TimCTRL 2 hours ago [-]
I like this take. In fact I feel a little uneasy when I see startups mention "MCP" on their landing pages! It's a protocol, and it's like saying "we use HTTP here."
I could be wrong but, all in all, buy a .com for your "ai" product, such that you survive the Dot-ai bubble [1]
Agreed. We’ve got the potential to build real bicycles for the mind here, and marketing departments are jumping right in, trying to sell people spandex cycling shorts.
Workaccount2 49 minutes ago [-]
How unfortunate would it be if people actually read the report
> While only 40% of companies say they purchased an official LLM subscription, workers from over 90% of the companies we surveyed reported regular use of personal AI tools for work tasks. In fact, almost every single person used an LLM in some form for their work. In many cases, shadow AI users reported using LLMs multiple times a day every day of their weekly workload through personal tools, while their companies' official AI initiatives remained stalled in pilot phase.
Corporate initiatives are failing, but people are using LLMs like crazy at work. This story is not the bombshell it's made out to be; in fact, it could even go in the other direction.
tqi 2 hours ago [-]
The link to the report pdf redirects to a landing page[1] that makes me think this "study" is run-of-the-mill content marketing. The CTA at the bottom of the page says "Join the Revolution / Be at the forefront of creating the agentic web. Get exposure and faster adoption for your AI products and tools," which certainly doesn't give the impression that this is an objective report. Either way I can't speak to the quality of the study directly because it is walled off behind a contact information form, which is another bad sign.
From the article though:
> But researchers found most use cases were limited to boosting individual productivity rather than improving a company’s overall profits.
From the article: "But for 95% of companies in the dataset, generative AI implementation is falling short. The core issue? Not the quality of the AI models, but the “learning gap” for both tools and organizations. While executives often blame regulation or model performance, MIT’s research points to flawed enterprise integration."
You could have a pile of cash on the table and 95% of companies would still fail to see a return because they don't know how to pick up the money.
eldenring 2 hours ago [-]
This is how America ends up being ahead of the rest of the world with every new technology breakthrough. They spend a lot of money, lose a lot of money, take risks, and then end up being too far ahead for others to catch up.
Trying to claim victory against AI/US Companies this early is a dangerous move.
pepinator 3 minutes ago [-]
That's an awfully simplistic way of seeing things. It always surprises me how disconnected from reality Americans are.
9dev 1 hours ago [-]
I’m not sure you can even generalise that much anymore. The US banking system is stuck in the dark ages compared to other countries; China is leading in electric cars, photovoltaics, and lots of other industries.
You could make that claim for the software industry, but I’m pretty sure a big part of the US moat is due to oligopolies, lock-in effects, or corruption in favour of billionaires and their ventures.
hiddencost 31 minutes ago [-]
Like solar power? Or electric cars? Or drones?
ta20240528 2 hours ago [-]
> This is how America ends up being ahead of the rest of world with every new technology breakthrough.
Too young to remember GSM?
eldenring 2 hours ago [-]
I don't think its controversial to say that in net, the US has had a good streak so far with new technologies. Yeah there have been some pretty big misses, but obviously the wins more than make up for them.
vdupras 2 hours ago [-]
[flagged]
dang 41 minutes ago [-]
"Eschew flamebait. Avoid generic tangents."
"Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize."
I have trouble understanding how that guideline applies here. The original article shows how it's possible that we're about to see an AI bubble pop, the parent comment shows generic American arrogance [1], and I come up with a historical example of such a mix of hubris and arrogance.
If my comment can be characterized as flamebait, it has to be to a lesser degree than the parent, right?
And I'm not even claiming that the situation applies. If you take the strongest plausible interpretation of my comment, it says that if indeed this whole AI bubble is hubris, if indeed there's a huge fallout, then the leaders of this merry adventure, right now, must feel like Napoleon entering Moscow.
But well, anyways, cheers dang, it's a tough job.
[1]: the strongest possible interpretation of "This is how America ends up being ahead of the rest of world with every new technology breakthrough" is arrogance, right?
onlypassingthru 2 hours ago [-]
Wasn't half of Napoleon's army gone by the time he reached Moscow?[0] It's the sunk cost fallacy writ large with dead bodies, no?
Certainly, and they lost the other half on the way home. On the way to Moscow, there was still something to look forward to. In Moscow, with no supply, dying like flies, all they had to look forward to was a long march through the Almighty Intense Winter, during which they'd be dying like flies too.
md3911027514 2 hours ago [-]
It’s interesting how self-reports of productivity can be wrong.
I suspect there's a lot of nuance you can't really capture in a study like this.
How you use AI will depend on the model, the tools (claude-code vs cursor vs w/e), your familiarity and process (planning phases, vibe coding, etc.), the team size (solo dev versus large team), your seniority and attention to detail, and hard-to-measure effects like an increased willingness to tackle harder problems you may have procrastinated on otherwise.
I suspect we're heading to a plateau. I think there's a ton of polish that can be done with existing models to improve the coding experience and interface. I think that we're being massively subsidized by investors racing to own this market, but by the time they can't afford to subsidize it anymore, it'll be such a commodity that the prices won't go up and might even go down regardless of their individual losses.
As someone who knows they are benefitting from AI (study shmuddy), I'm perfectly fine with things slowing down since it's already quite good and stands to be much better with a focus on polish and incremental improvements. I wouldn't invest in these AI companies though!
didibus 1 hours ago [-]
> As someone who knows they are benefitting from AI (study shmuddy)
XD
Look, I get it, I still use it, but you have to admit, people also think that various bogus home remedies totally help them get over a cold faster. There's absolutely a possibility it in no way makes us faster.
Now, you did say "benefit", which is broader, and you implied things like polish. I've seen others mention it just makes the work easier; that could be a win in itself (for the workers). Maybe it's about accessibility. Etc.
I do think though, right now, we're all in the "home remedy" territory, until we actually measure these things.
rapind 10 minutes ago [-]
I consider myself to be something of a cynic. That’s not to say I couldn’t be bamboozled in some circumstances... but I think it unlikely in this case, because I am simply OK with AI being a complete failure if indeed it is. I don’t have an irrational need for this to work. If anything I was very skeptical to start with and was pleasantly surprised to find it useful.
I’m not pushing Amway, I don’t own any crypto, and I’m bearish on the S&P right now due to the market cap concentration at the top. And yet I swear that Claude Code is working for me quite well.
> Now, you did say "benefit", that's more broad, and you implied things like polish, I've seen others mention it just makes the work easier, that could be a win in itself (for the workers). Maybe it's about accessibility. Etc.
Yes exactly, and this is the (ambiguous) metric that I actually care about. I suspect this study will go down in history as useless and flawed, not to be overly harsh :)
throitallaway 2 hours ago [-]
Sometimes it's like an autocomplete on steroids and reads my mind. Other times it suggests things that make no sense and completely gets in the way.
biophysboy 2 hours ago [-]
I sort of wonder if AI pushes people to stay "in the weeds". I've noticed that the speed of my software development hinges a lot on decisions that require me to take a step back (e.g. Am I using the right tool? Is this feature necessary? etc)
sdeframond 2 hours ago [-]
An interesting study indeed. Not enough data points but still way better than anecdata and self reporting!
rconti 2 hours ago [-]
I'm no AI apologist, but for one, this is how investments, particularly speculative investments, work. They're investing money now in the _hopes_ of a future return. It's pretty early days still. Secondly, of _course_ the huge AI players are doing everything they can to overpromise and convince corporations to throw cash at them to keep the party going.
I think the real problem is, it's just a bit too early, but every CEO out there dreams of being lauded for their visionary take on AI, and nobody wants to miss the bus. It's high-leverage tech, so if it (some day) does what it's supposed to do, and you miss making the investment at the right time, you're done.
Thrymr 1 hours ago [-]
It is rare that a third of the capital in the US stock market (by some recent estimates) is going for essentially the same speculative bet, though.
mushufasa 2 hours ago [-]
1 in 20 doesn't sound great. But you have to temper that with:
- everyone and their mother are doing a "generative ai program" right now, a lot of times just using the label to try to get their project funded, ai being an afterthought
- if the 1 out of 20 projects is game-changing, then you could argue right now people should actually be willing to spend even more on the opportunity, maybe the number should actually be 1 in 100. (The VC model is about having big success 1 in 10 times.)
- studies of ongoing business activities are inherently methodologically limited by the data available; I don't have a ton of confidence that these researchers' numbers are authoritative -- it's inherently impossible to truly report on internal R&D spend, especially at private companies, without inside information, and if you have the inside information you likely don't have the full picture.
spogbiper 2 hours ago [-]
Sounds like 95% of companies are potential clients for my consulting services
jryio 1 hours ago [-]
Started an entire consulting practice to get engineering teams and founders out of vibe coded pits. Even got a great domain for it - vibebusters
So far business is booming and clients are happy with both human interactions with senior engineers as well as a final deliverable on best practices for using AI to write code.
Curious to compare notes
nextworddev 2 hours ago [-]
Good thing consulting is one thing AI is decent at
uncircle 2 hours ago [-]
Do you sell overpriced generative AI solutions, or do you consult them on how to pivot away from idiotic generative AI?
ehutch79 2 hours ago [-]
YES!
theyinwhy 2 hours ago [-]
So much for "you can't have your cake and eat it too."
ehutch79 2 hours ago [-]
Sell them the cake and then get paid to eat it.
onlyrealcuzzo 2 hours ago [-]
Can't tell if you're in on it, but he's implying this waste is no different than average for consulting
uncircle 2 hours ago [-]
I take offence at comparing my consulting services writing real software by hand like we did in 2021 with generative AI spambots.
(I'm not really offended honestly. Startups will come crying to un-vibe the codebases soon enough.)
mannyv 2 hours ago [-]
My daughter, as an intern, created a whole set of prompts for a metal fab that extracted all the metal parts and their dimensions out of a CAD file (or PDF), so it's easier for them to bid.
Saved them hours of work.
Of course, they didn't spend on "AI" per se.
Most people don't know how to meta their job functions, so AI won't really be worth it. And the productivity gains may not be measurable ie: "I did this in 5 minutes instead of 500, so I was able to goof off more."
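(As an illustration only, not the intern's actual prompts: the core of such a workflow can be quite small, assuming text-based PDFs and the pypdf package. Scanned drawings would need an OCR step first, and call_llm is a hypothetical stand-in for whichever model API is used.)

    from pypdf import PdfReader

    def pdf_text(path: str) -> str:
        reader = PdfReader(path)
        return "\n".join(page.extract_text() or "" for page in reader.pages)

    def call_llm(prompt: str) -> str:
        """Hypothetical LLM call; swap in whichever API is actually in use."""
        raise NotImplementedError

    def parts_list(path: str) -> str:
        prompt = (
            "From the fabrication drawing text below, list every metal part as CSV "
            "with columns: part_name, material, quantity, dimensions.\n\n" + pdf_text(path)
        )
        return call_llm(prompt)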
chriskanan 1 hours ago [-]
Where is the actual paper that makes these claims? I'm seeing this story repeated all over today, but the link doesn't actually seem to go to the study.
I am not going to trust it without actually going over the paper.
Even then, if it isn't peer-reviewed and properly vetted, I still wouldn't necessarily trust it. The MIT study on AI's impact on scientific discovery that made a big splash a year ago was fraudulent even though it was peer reviewed (so I'd really like to know about the veracity of the data): https://www.ndtv.com/science/mit-retracts-popular-study-clai...
Workaccount2 54 minutes ago [-]
An alternative headline is "90% of employees report using LLMs regularly"
If everyone does it, then no one gets an advantage, because all see a productivity boost.
If you do not do it, you get left behind and cannot compete in the marketplace.
I took a business systems administration course like 20 years ago, and they knew this was the case. As far as we can tell it's always been the case.
IT doesn't create massive moats/margins because price competition erodes the gap. And yet if you do not keep up you lose.
It's definitely a boon for humanity though; in the industries where technology applies, things have been very obviously getting much cheaper over time.
(Most notably American housing has been very very resistant to technological change and productivity gains, a part of the story why housing has gone way up) - https://youtu.be/VfYp9qkUnt4?si=D-Jpmojtn7zV5E8T
Havoc 36 minutes ago [-]
What I found most interesting about the stat is how media spun it. At a glance they're similar...but the closer you look the more you notice writers are playing it fast & loose:
>only 5 percent of custom enterprise AI tools reach production
>95% of Enterprise AI Pilots Fail to Boost Revenues
>Why are 95% of GenAI pilot projects failing?
>95% of Companies See 'Zero Return'
>5% of integrated AI pilots are extracting millions in value
>95% of generative AI implementations in enterprise 'have no measurable impact on P&L'
aqme28 2 hours ago [-]
This is why it's so good to sell shovels, so-to-speak.
In this case, that's NVDA
lgats 2 hours ago [-]
or steel and wood for making shovels, TSMC
tovej 2 hours ago [-]
The shovel business is good as long as the gold rush lasts. Once the gold rush is over, you're going to have to deal with a significant decrease in volume, unless you can find other customers.
Crypto's over, gaming isn't a large enough market to fill the hole, the only customers that could fill the demand would be military projects. Considering the arms race with China, and the many military applications of AI, that seems the most likely to me. That's not a pleasant thought, of course.
The alternative is a massive crash of the stock price, and considering the fact that NVIDIA makes up 8% of everyone's favorite index, that's not a very pleasant alternative either.
It seems to me that an ultra-financialized economy has trouble with controlled deceleration; once the hype train is on, it's full throttle until you hit a wall.
taormina 2 hours ago [-]
There aren’t enough GPUs for average gamers to buy anything vaguely recent and they would love to be able to. Making the best GPUs on the planet is still huge and the market is quite large. Scalping might finally die at this rate, but NVDA wasn’t making any of the scalping money anyway so who cares? Data centers and gamers still need every GPU NVDA can make.
tovej 1 hours ago [-]
Oh there's definitely a market, but it's not as big or worth as much as the AI market. Gamers don't need GB200s or H100s, and AMD beats Nvidia on price in most segments. Nvidia isn't going to die, but gamers won't fill the demand.
Data centers might, but then they'll need something else to compute, and if AI fails to deliver on the big disruptive promises it seems unlikely that other technologies will fill those shoes.
I'm just saying that something big will have to change, either Nvidias story or share price. And the story is most likely to pivot to military applications.
Aperocky 57 minutes ago [-]
If there is zero return, then something is not being done right.
AI has led to significant operational improvements for us. The speed of root-causing has increased severalfold, and the time spent head-desking or chasing leads has dropped significantly; even if the AI is wrong the first few times, I can usually recognize that it's a rabbit hole I would have spent time in myself, likely a lot longer.
It obviously cannot do these things by itself, because it can arrive at the wrong conclusion and be pretty stubborn about it, but to say there is no benefit is like throwing a bike away because you fell riding it the first time and going back to walking.
cstejerean 2 hours ago [-]
Why do I have a feeling that most of that $30B was spent on paying for consultants, most of whom were also essentially making things up as they went along.
wood_spirit 2 hours ago [-]
Are there really that many consultants floating around still? I remember the heyday of the 2000s and kind of thought the outside consultants had largely disappeared, and that nowadays companies have trained themselves to jump on the hype train without paying consultants to push them?
bilbo0s 2 hours ago [-]
I don't know?
For some reason, I'm thinking most of the money went to either inferencing costs or NVidia.
AstroBen 2 hours ago [-]
The link to their referenced study doesn't seem to work?
This is confusing.. it's directly saying AI is improving employee productivity, but that's not leading to more business profit... how does that happen?
addaon 2 hours ago [-]
> AI is improving employee productivity, but that's not leading to more business profit... how does that happen?
One trivial way is that the increase of productivity is less than the added cost of the tools. Which suggests that (either due to their own pricing, or just mis-judgement) the AI companies are mis-pricing their tools. If the tool adds $5000 in productivity, it should be priced at $4999, eventually -- the AI companies have every motivation to capture nearly all of the value, but they need to leave something, even if just a penny, for the purchasing company to motivate adoption. If they're pricing at $5001, there's no motivation to use the tool at all; but of course at $4998 they're leaving money on the table. There's no stable equilibrium here where the purchasing companies end up with a /significant/ increase in (productivity - cost of that productivity), of course.
bilbo0s 2 hours ago [-]
> the AI companies are mis-pricing their tools
Sounds like the AI companies are not so much mispricing, as the companies using the tools are simply paying wayyy too much for the privilege.
As long as the companies keep paying, the AI companies are gonna keep the usage charges as high as possible. (Or at least, at a level as profitable to themselves as possible.) It's unreasonable to expect AI companies to unilaterally lower their prices.
AznHisoka 2 hours ago [-]
Maybe they’re measuring productivity by flawed metrics. One could write 10x as much code, but that doesn't mean it will equate to more profit.
macintux 2 hours ago [-]
Possible conclusion: most of the work that employees do has no direct impact on earnings.
rogerkirkness 3 hours ago [-]
AI is a deflationary technology, sort of like how most of the cost of TVs is the cost of shipping them from where they are made to where they are used. So wouldn't the returns show up in 'less work needs to be done' slowly over time?
lazide 2 hours ago [-]
Except generative AI is mostly an arms race, with content being generated being processed by some other AI on the other side. And any humans unfortunate enough to be in the middle regretting being born.
yalogin 58 minutes ago [-]
This is looking more like an enterprise cycle instead of a consumer one. I may be a bit jaded or oversimplifying things, but I see this AI cycle as redoing enterprise workflows using AI. Not all of them will benefit from it, and even then the incremental benefit or ROI from switching to a genAI automation is being questioned here.
GloriousMEEPT 2 hours ago [-]
One thing I ponder, remembering previous tech booms, is that they often left behind something extremely valuable (fiber). With all the GPU datacenters rolling out, what next bubble can take advantage of this boon, if it is one?
eulgro 2 hours ago [-]
It will probably all be used to mine bitcoin.
OutOfHere 1 hours ago [-]
For many years now, Bitcoin has been mined only with custom ASICs, no longer with GPUs.
ApeWithCompiler 2 hours ago [-]
For the future I will relabel "AI" as "Ain't Interested".
But despite the missing return on investment, it only takes a manager or a few, invested enough. They will push layoffs and restructuring regardless of better advice.
wheelerwj 2 hours ago [-]
My guess is that this title could also be written as, “The value of AI projects is being captured by just 5% of companies.”
It’s pretty clear to anyone who’s using this technology that it’s significant. Theres still tons to work out and the exact impact is still unknown. But this cat isn’t going back in the bag.
LPisGood 1 hours ago [-]
> It’s pretty clear to anyone who’s using this technology that it’s significant
I disagree entirely. It’s neat, and it’s a marginal improvement over current-year google, but significant is an overstatement.
nromiun 1 hours ago [-]
Well, companies were able to raise that 30B on AI hype. The executives and AI hardware companies got their cut. Shareholders got inflated share price by just shouting AI, AI. What more do you want?
bishnu 1 hours ago [-]
The illustrative example to me is the construction of railroads at the beginning of the Industrial Revolution. Unquestionably useful and unquestionably important, yet there were nevertheless at least 3 speculative financial bubbles centered on railway construction.
morelandjs 2 hours ago [-]
I’d have to guess that new startup founders are building leaner teams, leading to slower burn rates and longer runways. That has value. I think where skepticism is warranted is whether or not old behemoths can retrofit AI efficiency gains into their existing organizational structures.
lacy_tinpot 2 hours ago [-]
Why do people so desperately want to see AI fail?
lkramer 2 hours ago [-]
I think people want to see the mindless application of crappy LLM chatbots and AI summaries everywhere fail. At least that's my position. I would also like the notion that "development can be sped up 600-700% by applying AI" to go away.
rusted1 2 hours ago [-]
We don't. People are just tired of listening to the pipe dreams of these weirdo CEOs who sell AI.
lacy_tinpot 1 hours ago [-]
We do this every round for every new technology.
Here's the truth: NO ONE KNOWS.
What part of No One Actually Knows do people not understand? This applies to both the "AI WILL RULE THE WORLD MUAHAHA" and "AI is BIG BIG HOAX" crowd.
recallingmemory 2 hours ago [-]
Humans like knowing their next paycheck is safe. AI is a disruption to that feeling of security.
jurking_hoff 2 hours ago [-]
It’s not just that. The mask is off now and the intent is stated clearly: we are going to strip you of all your security and leave you high and dry.
exasperaited 1 hours ago [-]
[removed]
lacy_tinpot 1 hours ago [-]
The real evil people here are the "artists" that have this kind of attitude. I still remember the "Digital Art isn't real Art" people, which btw still exist among your snobby "Fine Arts" crowds.
I think we should actually ban all digital art platforms, no Photoshop, no special effects, all hand drawn. And I'll use some weird weaponized empathy calling out for the human soul and human creativity.
What a toxic bunch.
exasperaited 33 minutes ago [-]
[removed]
lacy_tinpot 16 minutes ago [-]
Now you bring out the straw man as a defense? How about you use that idea for what you're typing as well.
You're not standing up for art and culture. You're not asking for a "little reflection". You are however just being a cynic. And cynicism is toxic. It's bad for health. It's a weird affliction. Worse it's actively harmful to society.
Optimism is better. Tools that create abundance are better. Managing scarcity is dystopian, and ultimately harmful. It's a mindset that needs to be purged. Creating abundance is a far superior mindset.
jurking_hoff 10 minutes ago [-]
> Worse it's actively harmful to society.
Let me tell you what “actively harmful” for society is
Actively harmful to society is building platforms that extract value through the reward and encouragement of antisocial behavior on a scale never seen before.
> Managing scarcity is
Managing scarcity is reality. It’s strange how capitalists have suddenly thrown the concept out in favor of a “Muh Star Trek future is here”, all because of some impressive chatbots.
> Creating abundance is a far superior mindset.
What abundance has been created, besides the horde of garbage and slop?
Or perhaps you simply mean the abundance of money thrown around by gamblers?
snozolli 2 hours ago [-]
Because it's going to destroy knowledge work and the entire middle class.
lacy_tinpot 2 hours ago [-]
I thought we hated doing the menial office jobs though.
snozolli 2 hours ago [-]
Maybe spoiled brats did. I like "menial" office jobs a lot more than starving or doing humiliating acts on OnlyFans to pay my bills, which is the future we're barreling toward.
What's menial about knowledge work, anyway?
taormina 2 hours ago [-]
Why do people so desperately want to see AI succeed? The financial investment explains it for some.
lacy_tinpot 2 hours ago [-]
Why would you not want a technology that's able to automate human drudgery? Feels like an absurd question.
jurking_hoff 2 hours ago [-]
Because it doesn’t. The jobs that will be left will be exclusively drudgery.
lacy_tinpot 1 hours ago [-]
It doesn't because we already know it doesn't because just like think about it and it clearly doesn't. So everyone trying to make the technology is a big dumb dumb for even trying.
Like, is the conclusion that we shouldn't even try? This kind of thinking is ridiculous.
jurking_hoff 8 minutes ago [-]
> Like is the conclusion we shouldn't even try? This kind of thinking ridiculous.
Same reason we aren't looking for innovative ways of using sugary soft drinks as a building material. Just because there's a non-zero chance we could come up with something doesn't make it compelling enough by itself.
os2warpman 2 hours ago [-]
TIL 5% of companies are lying about their financials. (or 5% of companies are selling the pickaxes to the unfortunate miners)
seydor 2 hours ago [-]
I'm confused, which point of the hype cycle is this?
jameskilton 2 hours ago [-]
The Trough of Disillusion. Though we definitely aren't down there yet, it's coming.
m_fayer 2 hours ago [-]
You just wait, when we achieve agi it’ll drive hype cycles at a speed humans have no chance of keeping up with. We will always effectively be in all points of the hype cycle at once. Money will wash randomly through the economy like soap through your laundry. You ain’t seen nothing yet.
dr_dshiv 1 hours ago [-]
Man, if people aren’t on the plateau of productivity by now with genai, it’s their own damn fault. Trough of disillusionment was like, 2020 AI projects.
draw_down 2 hours ago [-]
[dead]
stephankoelle 2 hours ago [-]
In some products, certain AI features have become expected. If a product doesn’t include them, it risks losing customers, making it a net negative for the market. At this point, companies either invest in AI or risk falling behind.
Scarblac 2 hours ago [-]
Can you name some of those products?
TrackerFF 2 hours ago [-]
I'm guessing every VC firm out there is hoping that they'll be the ones that picked the winners - the AI companies that will rule the world in N years.
The report itself can only be viewed by filling out a form - this article is so detail-light as to be useless.
czhu12 52 minutes ago [-]
As much as I also want to jump on the AI-bubble bandwagon (who doesn’t love a good bit of pessimism), I’m still mind-blown daily by how these models operate.
I recently ported a FastAPI app over to Django with Claude Code, and it went at least twice as fast as I probably could have managed myself, and I only had to somewhat pay attention. What would’ve been a pretty intense few days turned into about 2 hours of mindless work while tens of thousands of lines were ported over, tested, and refactored.
manishsharan 2 hours ago [-]
I am not interested in the results of the 95%.
I want to know more about the 5% who got it right. What are their use cases ?
foxfired 2 hours ago [-]
Atlassian added this new AI feature to create audio summaries of Confluence documents. It's really impressive. No one reads documentation. I know because I'll send someone the documentation, then they'll ask me a question that is answered in the second paragraph, meaning they didn't read it.
So their feature is not just text to speech, but a reading of a summarized version of the articles. But here is the problem. The documentation has no fluff. You don't want a summary, you want the actual details. When you are reading the document that describes how the recovery fee is calculated, you want to know exactly how it is calculated.
I've run it on multiple documents and it misses key information. An unsuspecting user might take it at face value. So this feature looks impressive, but it misses the entire point of documentation, which is *preserving the details*.
swader999 40 minutes ago [-]
AI reads the docs. For the first time in twenty years I'm actually advocating for more documentation.
ozgung 1 hours ago [-]
So, on August 21, 2025, everyone in the world suddenly decided that AI was a bubble, no matter what they had said the day before. And they decided to start a PR campaign to announce this to the general public. CEOs were overselling; now they are underselling (their job is to sell).
What's going on? I find all of these pretty sus.
kingstnap 2 hours ago [-]
I mean of course they aren't. So many people are offering so much stuff for free or at a pittance.
I honestly don't think it matters though. Feel free to disagree with me but I think the money is irrelevant.
The only thing that actually matters in the long run is the attention, time, and brain space of other people. After all, that's where fiat currency actually derives its value. These GenAI companies have captured a lot of that extremely quickly.
OpenAI might have "burned" billions, but the way they have worked themselves into seemingly every university student's computer, every CEO's mind, the policy decisions of world leaders, and every other Hacker News post is nothing short of miraculous...
dsr_ 2 hours ago [-]
"I'm sure my company is among the 5% of superusers," said CEO Chad "Bro" Brochad, who later admitted he had not checked.
IMO 'Zero return' is really just saying that these companies never had a plan for their AI implementation in the first place, and so never knew what they were trying to achieve or measure.
The article does call out clear issues companies have with AI workflows etc. and those are likely real problems, but if you're saying *zero* return those aren't the root cause problems.
phplovesong 2 hours ago [-]
No shit. AI is not "AI", it's just a meme word for getting VC cash, because the original term (ML) did not raise any capital. Play stupid games, win stupid prizes.
paul7986 1 hours ago [-]
Present-day AI (ChatGPT 5) is not a complete graphic/web designer or front-end developer. My job and skills are safe from it, as it can't...
- Make edits to the solid-looking logos & web designs it spits out. Instead it creates brand-new logos and designs... not what I asked it to do!
- For front-end code, it doesn't spit out a zip file with all the images. It does speed up my design/development process (I use code to design), where I used to design/develop in the browser using a Bootstrap template.
Maybe it will finally figure it out, or maybe it's just the Wizard of Oz... a facade that grabs designs off the web and mixes some up but can never make edits.
pjmlp 2 hours ago [-]
I can hardly wait for the bubble to burst, now everyone is forced to meet OKRs on using AI at work, while at the same time, being forbidden to use the data that actually makes the tools usable.
turnsout 2 hours ago [-]
The AI wave is so reminiscent of the early days of the internet. Right now we're in about 1999, a time of tremendous hype. Businesses that weren't even in tech were wondering how much they needed to know about the HTTP spec. People from the marketing team were being pulled off their print ad campaigns, given a copy of Dreamweaver, and told to make the company website. We hadn't even figured out the roles.
It 100% turned out to be a bubble and yet, if anything, the internet was under-hyped. The problem in 1999 was that no one really knew how it was going to play out. Which investments would be shrewd in retrospect, and which ones would be a money pit?
When an innovation hits, it takes time to figure out whether you're selling buggy whips, or employing drivers who can drive any vehicle.
Plenty of companies sunk way too much money into custom websites back in 99, but would we say they were wrong to do it? They may have overspent at a time when a website couldn't justify the ROI within 12 months, but how could they know? A few short years later, a website was virtually required for every business.
So are companies really seeing "zero return" on their AI spend, or are they paying for valuable lessons about how AI applies to their businesses? There may be zero ROI today, but all you need to do is look at the behavior of normal people to see that AI is not going anywhere. Smart companies are experimenting.
llmllmkom 1 hours ago [-]
The internet is about sharing actual information. LLMs can't share real information, just a flimsy derivative. It's in their DNA. How so many supposedly smart people who understand machine learning and its fundamental entropy failed to acknowledge this reality from day zero, is beyond me.
turnsout 54 minutes ago [-]
Not every revolution is about sharing information. I have plenty of healthy skepticism of companies adding "AI sauce" to everything, but if you can't see the genuine utility of LLMs, it might just be a failure of imagination.
ChrisArchitect 2 hours ago [-]
Earlier discussion on the report:
AI is predominantly replacing outsourced, offshore workers
Can you really put a price on a good excuse for layoffs that deflects rancor from employees while sounding like positive news to investors? :p
... Well, probably yes, but I don't have the data to do it.
revskill 2 hours ago [-]
Because they are doomed to fail.
m3kw9 2 hours ago [-]
It's hard to measure return when you prompt a question and it comes back with a partial solution.
justinator 2 hours ago [-]
It HAS led to widespread enshittification of products and services though!
seatac76 2 hours ago [-]
Ed Zitron was right all along eh.
mwkaufma 1 hours ago [-]
Levels of cope in this thread many previously thought impossible.
varispeed 2 hours ago [-]
The funniest part isn’t that AI hasn’t delivered profits. It’s that the only “value” most people got from LLMs was accidentally rediscovering what Google used to be before it turned into an ad-riddled casino.
Executives mistook that novelty for a business revolution. After years of degraded search, SEO spam, and “zero-click” answers, suddenly ChatGPT spat out a coherent paragraph and everyone thought: my god, the future is here. No - you just got a glimpse of 2009 Google with autocomplete.
So billions were lit on fire chasing “the sliced bread moment” of finally finding information again - except this time it’s wrapped in stochastic parroting, hallucinations, and a SaaS subscription. The real irony is that most of these AI pilots aren’t “failing to deliver ROI” - they’re faithfully mirroring the mediocrity of the organisations deploying them. Brittle workflows meet brittle models, and everyone acts surprised.
The pitch was always upside-down. These things don’t think, don’t learn, don’t adapt. They remix. At best they’re productivity duct tape for bored middle managers. At worst they’re a trillion-dollar hallucination engine being sold as “strategy.”
The MIT study basically confirms what was obvious: if you expect parrots to run your company, you get birdshite for returns.
xyst 2 hours ago [-]
I am not surprised at all. The capabilities of AI have been vastly oversold. The only winners here are the billionaires that have had early equity/investment in genAI.
They got a majority of the country hooked into AI without truly understanding its current limitations. This is just like digital currency bubble/fad that popped a couple of years ago.
What most companies got out of it is a glorified chatbot (ie, something that was possible in 2014…) at 1000X the cost.
What a sad joke. Innovation in this country is based on a lie, fueled by FOMO.
Here's the source report, not linked to by this content farm's AI-written article: https://mlq.ai/media/quarterly_decks/v0.1_State_of_AI_in_Bus...
The story there is very different than what's in the article.
Some infos:
- 50% of the budgets (the one that fails) went to marketing and sales
- the authors still see that AI would offer automation equaling $2.3 trillion in labor value affecting 39 million positions
- top barriers for failure is Unwillingness to adopt new tools, Lack of executive sponsorship
Lots of people here are jumping to conclusions. AI does not work. I don't think that's what the report says.
Wow, that is crazy. There's 163 million working Americans, that's close to a quarter of the workforce is at risk.
Well...
"It is difficult to get a man to understand something when his salary depends upon his not understanding it"
Here's a relatively straightforward application of AI that is set to save my company millions of dollars annually.
We operate large call centers, and agents were previously spending 3-5 minutes after each call writing manual summaries of the calls.
We recently switched to using AI to transcribe and write these summaries. Not only are the summaries better than those produced by our human agents, they also free up the human agents to do higher-value work.
It's not sexy. It's not going to replace anyone's job. But it's a huge, measurable efficiency gain.
There, I've saved you more millions.
I'm not going to say every project born out of that data makes good business sense (big enough companies have fluff everywhere), but ime anyway, projects grounded to that kind of data are typically some of the most straight-forward to concretely tie to a dollar value outcome.
We are not running a call centre ourselves but we are a SaaS offering the services for call centre data analysis.
Two things _would_ surprise me, though:
- That they'd integrate it into any meaningful process without having done actual analysis of the LLM based perf vs their existing tech
- That they'd integrate the LLM into a core process their department is judged on knowing it was substantially worse when they could find a less impactful place to sneak it in
I'm not saying those are impossible realities. I've certainly known call center senior management to make more harebrained decisions than that, but barring more insight I personally default to assuming OP isn't among the harebrained.
Instead of doing any of those (we have the infrastructure to do it), we are paying OpenAI for their embeddings APIs. Perhaps OpenAI is just doing old-school ML under the hood, but there is definitely an instinct among product managers to reach for shiny tools from shiny companies instead of considering more conservative options.
I'm not saying any given department should, by some objective measure, switch to LLMs and I actually default to a certain level of skepticism whenever my department talks about applications.
I'm just saying I can imagine plausible realities where an intelligent and competent person would choose to switch toward using LLMs in a call center context.
There are also a ton of plausible realities where someone is just riding the hype train gunning for the next promotion.
I think it's useful to talk about alternate strategies and how they might compare, but I'm personally just defaulting to assuming the OP made a reasonable decision and didn't want to write a novel to justify it (a trait I don't suffer from, apparently), vs assuming they just have no idea what they're doing.
Everyone is free to decide which assumed reality they want to respond to. I just have a different default.
Quick and accurate routing and triage of inbound calls may be more fruitful and far easier than summarizing hundreds of hours of "ok now plug the router back into the wall." I'm imagining AI identifying a specific technical problem that sounds a lot like a problem that a specific technician successfully solved previously.
1) my call is very important to them (it's not)
2) listen carefully because options changed (when? 5 years ago?)
3) they have a website where I can do things (you can't, otherwise why would I call?)
4) please stay on at the end of the call to give them feedback (sure, I will waste more of my time)
But in fact, customer call centers tend not to be able to even know that you called in yesterday, three days ago and last week.
This is why email-ticketing call centers are vastly superior.
Nor what you told the person you talked to three minutes earlier, during the same call, before they transferred you to someone else. Because their performance is measured on how quickly they can get rid of you.
It is our problem that needs fixing, so we can just wait until they either redirect us to the right person with the right knowledge, who might be one of the higher-ups in the call center, or we just quit the call. Either way, it doesn't matter to the company.
Bonus points that they don't have to teach the frontline customer service reps as many details, and it could be easier for them to onboard new people / fire old employees. Also, they would have to pay less if they require very low qualifications.
Man, I remember the "0.001 cent = 0.001 $" video/meme of Verizon:
https://www.youtube.com/watch?v=nUpZg-Ua5ao
Edit: Tell me more how preemptively spending five figures to transcribe and summarize calls in case you might want to do some "data engineering" on it later is a sound business decision. What if the model is cheaper down the road? YAGNI.
The parent states:
>Not only are the summaries better than those produced by our human agents...
Now, since they have not mentioned what it took to actually verify that the AI summaries were in fact better than the human agents', I'm sceptical they did the necessary due diligence.
Why do I think this? Because I have actually tried to do such a verification. In order to verify that the AI summary is actually correct, you have to engage in the incredibly tedious task of listening to the original recording literally second by second and making sure that what is said does not conflict with the AI summary in question. Not only did the AI summary fail this test, it failed on the first recording I tested.
The AI summary stated that "Feature X was going to be in Release 3, not 4," whereas in the recording it is stated that the feature will be in Release 4, not 3, literally the opposite of what the AI said.
I'm sorry, but the fact that the AI summary is nicely formatted and has not missed a major topic of conversation means fuck all if the details that are discussed are spectacularly wrong from a decision-tracking perspective, as in literally the opposite of what is stated.
And I know "why" the AI summary fucked up: in that instance the topic of conversation was how there was some confusion about which release that feature was going to be in; that's why the issue was a major item on the meeting agenda in the first place. Predictably, the AI failed to follow the convoluted discussion and "came to" the opposite conclusion.
In short, no fucking thanks.
To give you an idea: Phonetic transcription was the "state of the art" when I was a QA analyst. It broke call transcripts apart into a stream of phonemes and when you did a search, it would similarly convert your search into a string of phonemes, then look for a match. As you can imagine, this is pretty error prone and you have to get a little clever with it, but realistically, it was more than good enough for the scale we operated at.
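For anyone curious, here is a minimal sketch of the phonetic-indexing idea described above. The tiny grapheme-to-phoneme table, the example phrases, and the similarity threshold are all invented for illustration; a real system would use a proper G2P model and an actual index rather than a linear scan.

```python
from difflib import SequenceMatcher

# Toy grapheme-to-phoneme table (hypothetical); a real system would use a
# proper G2P model trained for the language being transcribed.
G2P = {
    "recovery": "R IH K AH V ER IY",
    "fee": "F IY",
    "refund": "R IY F AH N D",
    "cancel": "K AE N S AH L",
}

def to_phonemes(text):
    """Flatten text into a stream of phonemes, skipping unknown words."""
    phones = []
    for word in text.lower().split():
        phones.extend(G2P.get(word, "").split())
    return phones

def phonetic_match(query, transcript, threshold=0.8):
    """Slide the query's phonemes across the transcript's phoneme stream
    and report a hit if any window is similar enough."""
    q, t = to_phonemes(query), to_phonemes(transcript)
    if not q or not t:
        return False
    for start in range(max(1, len(t) - len(q) + 1)):
        window = t[start:start + len(q)]
        if SequenceMatcher(None, q, window).ratio() >= threshold:
            return True
    return False

print(phonetic_match("recovery fee", "cancel refund recovery fee"))  # True
```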
If it were an ecom site you'd already know the categories of calls you're interested in because you've been doing that tracking manually for years. Maybe something like "late delivery", "broken item", "unexpected out of stock", "missing pieces", etc.
Basically, you'd have a lot of known context to anchor the LLM's analysis, which would (probably) cover the vast majority of your calls, leaving you freed up to interact with outliers more directly.
At work as a software dev, having an LLM summarize a meeting incorrectly can be really really bad, so I appreciate the point you're making, but at a call center for an f500 company you're looking for trends and you're aware of your false positive/negative rates. Realistically, those can be relatively high and still provide a lot of value.
Also, if it's a really large company, they almost certainly had someone validate the calls, second-by-second, against the summaries (I know because that was my job for a period of time). That's a minimum bar for _any_ call analysis software so you can justify the spend. Sure, it's possible that was hand-waved, but as the person responsible for the outcome of the new summarization technique with LLMs, you'd be really screwing yourself to handwave a product that made you measurably less effective. There are better ways to integrate the AI hype train into a QA department than replacing the foundation of your analysis, if that's all you're trying to do.
I almost have this gut feeling that it's the case (I may be wrong though).
Like, imagine this: if the agent could just spend 3 minutes writing a summary, why would you use AI to create a summary and then have some other person listen to the whole audio recording and check if the summary is right?
It would take an agent 3 minutes out of, let's say, a 1-hour-long conversation/call.
On the other hand, you have someone listen to the whole 1-hour recording and then check the summary? That's now 1 hour compared to 3 minutes. Nah, I don't think so.
Even if we assume that multiple agents are contacted in the same call, they can all simply write the summary of what they did and to whom they redirected and just follow that line of summaries.
And after this, I think your read that they are really screwing themselves over is accurate.
Kinda funny how the GP comment was the first thing I saw in this post, and how even I was kinda convinced that they are one of the smarter ones integrating AI, but your comment made me realize they are actually just screwing themselves.
Imagine the irony: a post about how AI companies are screwing themselves by burning a lot of money, and then the people using them don't get any value out of it.
And then the one on HN that sounded like it finally made sense for them is also not making sense... and they are screwing themselves over.
The irony is just ridiculous. So funny it made me giggle.
I'm basically inferring how this would go down in the context I worked under, not the GP, because I don't know the details of their real context.
I think I'm seeing where I'm not being as clear as I could, though.
I'm talking about the lifecycle of a methodology for categorizing calls, regardless of whether or not it's a human categorizing them or a machine.
If your call center agent is writing summaries and categorizing their own calls, you still typically have a QA department of humans that listen to a random sample of full calls for any given agent on a schedule to verify that your human classifiers are accurately tagging calls. The QA agents will typically listen to them at like 4x speed or more, but mostly they're just sampling and validating the sample.
The same goes for _any_ automated process you want to apply at scale. You run it in parallel to your existing methodology and you randomly sample classified calls, verifying that the results were correct and you _also_ compare the overall results of the new method to the existing one, because you know how accurate the existing method is.
But you don't do that for _every_ call.
You find a new methodology you think is worth trying and you trial it to validate the results. You compare the cost and accuracy of that method against the cost and accuracy of the old one. And you absolutely would often have a real human listen to full calls, just not _all_ of them.
In that respect, LLMs aren't particularly special. They're just a function that takes a call and returns some categories and metadata. You compare that to the output of your existing function.
But it's all part of the: New tech consideration? -> Set up conditions to validate quantitatively -> run trials -> measure -> compare -> decide
Then on a schedule you go back and do another analysis to make sure your methodology is still providing the accuracy you need it to, even if you haven't changed anything.
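A bare-bones sketch of that trial-and-compare loop might look like the following. `old_classifier`, `new_classifier`, and `human_verify` are placeholder callables standing in for whatever the real methods are (agent tagging, an older ML model, an LLM); none of these names come from the comment.

```python
import random

def evaluate_methods(calls, old_classifier, new_classifier, human_verify,
                     sample_size=200, seed=42):
    """Score the existing and candidate methodologies against human-verified
    labels on the same random sample of calls."""
    rng = random.Random(seed)
    sample = rng.sample(calls, min(sample_size, len(calls)))

    old_correct = new_correct = 0
    for call in sample:
        truth = human_verify(call)               # QA analyst listens and tags
        old_correct += old_classifier(call) == truth
        new_correct += new_classifier(call) == truth

    n = len(sample)
    return {"old_accuracy": old_correct / n,
            "new_accuracy": new_correct / n,
            "sample_size": n}

# The decision then weighs accuracy against cost per call, and the same
# evaluation gets re-run on a schedule even when nothing has changed.
```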
It just has to be as good as a call center worker with 3-5 minutes working off their own memory of the call, not as good as the ground truth of the call. It's probably going to make weirder mistakes when it makes them though.
You're free to believe that of course, but you're assuming the point that has to be proven. Not all fuck-ups are equal. Missing information is one thing, but writing literally the opposite of what is said is way higher on the fuck-up list. A human agent would be achieving an impressive level of incompetence if they kept on repeating such a mistake, and would definitely have been jettisoned from the task after at most three strikes (assuming someone notices). But firing a specific AI agent that repeats such mistakes is out of the question for some reason.
Feel free to expand on why no amount of mistakes in AI summaries will outweigh the benefits in call centers.
But the solution isn't to use AI because we don't trust the agents / customer service reps whose performance is graded on how quickly they can move on to the next caller.
The solution is to change the economics so that the workers are incentivized to write good summaries; maybe paying them more and not grading them that way will help.
I am imagining some company saying AI is good enough because they themselves are using the wrong grading technique and AI is the best option under it. So in that sense, AI just benchmark-maxxed, if that makes sense. Man, I am not even kidding, but I sometimes wonder how economies of scale can work so differently from common sense. Like, it doesn't make sense at this point.
Why Opus though? There are dedicated audio codecs in the VoIP/telecom industry that are specifically designed for the best size/quality for voice call encoding.
Opus is great for a lot of things, and realtime speech over SIP or WebRTC is just one of them.
Still, it's based on ideas from those earlier codecs of course :)
Seriously not kidding, but the more I read these comments, the more horrified I become, realizing that the only reason I can think of for integrating AI is because you wish to integrate AI. Nothing wrong with that, but unless proven otherwise through some benchmarks, there is no way to justify AI.
So it's like an experiment: they use AI, and if it works / saves time, great. If not, then it's time to roll it back.
But we do need to think about experiments logically, and the way I am approaching it, it's maybe good considering what customer service is now, but man, that's such a low standard that as customers we shouldn't really stand for it. Call centres need to improve, period. AI can't fix it. It's like, man, we can do anything to save some $ for the shareholders, only to then "invest" it proudly into AI so that they can say they have integrated AI, and so they can have their valuations increased, since VCs / the stock market react differently to the sticker known as AI.
Man... so saying that you use AI should be a negative indicator instead of a positive one in the market, and the whole bubble is gonna come crashing down when people realize it.
It physically hurts me now, thinking about it once again. This loop of making humans do worse work to save money, using that money on an inferior product, using that inferior product only because you want the AI sticker, because shareholders want the valuation increase, and the company is willing to do all this because they feel / are rewarded for it by people who will buy anything AI-related thinking it's gold, or maybe that more people will buy it from them at an even higher valuation because of the AI sticker, and so on...
Almost sounds like a pyramid.
They are hilariously inaccurate. They confuse who said what. They often invert the meaning: "Joe said we should go with approach X" where Joe actually said we should not do X. They also lack context, causing them to "mishear" all of our internal jargon at "shit my iPhone said" levels.
This gets to a common misconception when it comes to GenAI uses: it functions best as “augmented intelligence” rather than “artificial intelligence”. Meaning that it’s at its best when there’s still a human in the loop and the AI supplements the parts the person is bad at rather than replacing the person entirely. We see this with coding, where AI is very good at writing scaffolding, large-scale refactoring, picking decent libraries, reading API docs and generating code that calls it appropriately, etc., but still needs a human to give it very specific directions for anything subtle, and someone to review carefully for bugs and security holes.
1. Would you recommend us?
2. Was the agent helpful?
I have a friend who used to work at a call centre and would routinely get the lowest marks on the first item and the highest on the second. I do that when the company has been shitty but I understand the person on the line really made an effort to help.
Obviously, those ratings go back to the supervisor and matter for your performance reviews, which can make all the difference between getting a raise or being fired. If anything, call centre employees have a lot of incentive to do a good job if they have any intention of keeping it, because everything they do with a customer is recorded and scrutinised.
Of course, we can just rely on knowing nothing and looking things up, but I want more for thinking people.
I'm finding the summarization of individual meetings very useful. I'm also finding the ability to send in transcripts across meetings, departments, initiatives, whatever, to be very effective at surfacing subtexts and common pain points, much more effectively than I can.
I'm also using it to look at my own participation in meetings to help me see how I interact with others a (little) bit more objectively and it has helped me find ways to improve. (I don't take its advice directly lol, just think about observations and determine myself if it's something that's important and worth thinking about)
Is there some training you applied or something specific to your use case that makes it work for you?
When was the last time you called a large company and the person answering was already across all the past history without you giving them a specific case number?
Why were they doing this at all? It may not be what is happening in this specific case but a lot of the AI business cases I've seen are good automations of useless things. Which makes sense because if you're automating a report that no one reads the quality of the output is not a problem and it doesn't matter if the AI gets things wrong.
In operations optimization there's a saying to not go about automating waste, cut it out instead. A lot of AI I suspect is being used to paper over wasteful organization of labor. Which is fine if it turns out we just aren't able to do those optimizations anyway.
It was equally frustrating when I, as a call center worker, had to ask the customer to tell me what should already have been noted. This has required me to apologize and to do someone else's work in addition to my own.
Summarizing calls is not a waste, it's just good business.
It's 100% plausible it's busy work, but it could also be for:
- Categorizing calls into broad buckets to see which issues are trending
- Sentiment analysis
- Identifying surges of some novel/unique issue
- Categorizing calls across vendors and doing sentiment analysis that way (looking for upticks in problem calls related to specific TSPs or whatever)
- etc
False positives and negatives aren't really a problem once you hit a certain scale because you're just looking for trends. If you find one, you go spot-check it and do a deeper dive to get better accuracy.
Which is also how you end up with some schlepp like me listening to a few hundred calls in a day at 8x speed (back when I was a QA data analyst) to verify the bucketing. And when I was doing it, everything was based on phonetic indexing, which I can't imagine touching LLMs in terms of accuracy, and it still provided a ton of business value at scale.
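To make the "noisy labels are fine if you're only looking for trends" point concrete, here is a small sketch of surfacing surging call buckets; the lookback window and surge factor are arbitrary illustrative numbers, not anything from the thread.

```python
from collections import defaultdict

def weekly_counts(labeled_calls):
    """labeled_calls: iterable of (week_number, category) pairs, where the
    category came from whatever bucketing method is in use."""
    counts = defaultdict(lambda: defaultdict(int))
    for week, category in labeled_calls:
        counts[category][week] += 1
    return counts

def surging_buckets(counts, current_week, lookback=4, factor=2.0):
    """Flag categories whose current-week volume is well above the average
    of the previous `lookback` weeks; these get spot-checked by a human."""
    flagged = []
    for category, per_week in counts.items():
        baseline = sum(per_week.get(current_week - i, 0)
                       for i in range(1, lookback + 1)) / lookback
        now = per_week.get(current_week, 0)
        if baseline and now >= factor * baseline:
            flagged.append((category, now, baseline))
    return flagged
```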
Is it not, in the scenario you are describing? You are saying the agents are free now to do higher-value work. Why were there not enough agents before, especially if higher-value work was not done?
But that doesn’t mean AI is without its uses. We’re just in that painful phase where the hype needs to die down and we treat LLMs as what they really are; an interesting new tool in the toolkit that provides some new ways to solve problems. It’s almost certainly not going to turn into AGI any time soon. It’s not worth trillions. It’s certainly worth something, though.
I think the financials on developing new frontier models are terrible. But I’ve already built multiple AI projects for my company that are making money and we’ve got extremely happy customers.
Investors thought one company was going to win the AI Wars and make a quadrillion dollars. Instead it’s probably going to be 10,000 startups that will build interesting products based on AI, and training new models won’t actually be a good financial move.
Specifically: Do they spend more time actually taking calls now? I guess as long as you're not at the burnout point with utilization it's probably fine, but when I was still supporting call centers I can't count the number of projects I saw trying to push utilization up not realizing how real burnout is at call centers.
I assume that's not news to you, of course. At a certain utilization threshold we'd always start to see AHTs creep up as agents got burned out and consciously or not started trying to stay on good calls.
Guess it also partly depends on if you're in more of a cust serv call center or sales.
I hated working as an actual agent on the phones, but call center ops and strategy at scale has always been fascinating.
I think AI in general is just being misused to optimise local minima to the detriment of the overall system.
I also would assume that there are far more significant behavioral or human factors that consume the time writing those minutes, i.e. an easy spot to kill 5-10 min before opening the line for the next inbound call, but the 5-10 minute break will persist anyway.
I fully believe AI will create a lot of value and is revolutionary, especially for industries where value is hidden within data. It's the pace of value creation that stands out to me (how long until it's actually useful and better and creates more value than it costs?), but the bubble factor is not ignorable in the near term.
> It's not going to replace anyone's job
Mechanically, more efficiency means less people required for the same output.
I understand there is no evidence that any other sentence can be written about jobs. Still, you should put more text in between those two sentences. Reading them so close together creates audible dissonance.
Why can't it mean more output with the same number of people? If I pay 100 people for 8 hours of labor a day, and after making some changes to our processes, the volume of work completed is up 10% per day, what is that if not an efficiency gain? What would you call it?
It really depends on the amount of work. If the demand for your labor is infinite, or at least always more than you can do in a days work, efficiency gains won't result in layoffs, just more work completed per shift. If the demand for the work is limited, efficiency gains will likely result in layoffs because there's no point in paying someone who was freed up by your new processes to sit around twirling a pen all day.
I get that we're trying to look for positive happy scenarios, but only considering the best possible world instead of the most likely world is bias. It's Optimistic in the sense of Voltaire.
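To put toy numbers on the two outcomes being argued about here (all figures invented for illustration):

```python
# Toy numbers only: 100 agents, 8 hours a day, a 10% productivity gain.
headcount, hours_per_day, gain = 100, 8, 0.10
output_before = headcount * hours_per_day        # 800 "units" of work per day

# Demand effectively unlimited: same payroll, more output.
output_same_staff = output_before * (1 + gain)   # 880 units

# Demand fixed at the old level: same output, fewer people needed.
staff_needed = headcount / (1 + gain)            # ~90.9 people for 800 units

print(output_same_staff, round(staff_needed, 1))
```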
Unless we're claiming there is an intractable qualified labor shortage in call centers, this is always the result of a much simpler explanation: it's much cheaper to understaff call centers
A company that wants to save money by adding more AI is a company that cares about cost cutting. Like most companies.
The strategy that caused the company to understaff has not changed. The result is that we go back to homeostasis, and fewer jobs are needed to reach the same deliberate target.
Is that really millions of savings annually? Maybe it is but I always hesitate when a process change that saves one person a few minutes is extrapolated all the way out to dollars/year. What you'll probably see is the agents using those 3-5 minutes to check their phone.
This reminds me of the way juniors tend to think about things. That is, writing code is "the actual job" and commit messages, documentation, project tracking, code review, etc. are tedious chores that get in the way. Of course, there is no end to the complaints of legacy code bases not having any of those things and being difficult to work with.
The number of things I do in a day that half my coworkers see as a waste of time until they enjoy the outcomes is basically uncountable at this point.
If something is a “waste of time” it’s possible that you’re just lousy at it.
Self reflection is a rarer commodity than it should be. And most of the tasks you list either require or invite it.
With LLMs the risk is particularly hard to characterize, especially when it comes to adversarial inputs.
However I strongly doubt your point about "It's not going to replace anyone's job" and that "they also free up the human agents to do higher-value work". The reality in most places is that fewer agents are now needed to do the same work as before, so some downsizing will likely occur. Even if they are able to switch to higher-value work, some amount of work is being displaced somewhere in the chain.
And to be clear I'm not saying this is bad at all, I'm just surprised to see so many deluded by the "it won't replace jobs" take.
It's also disappointing that MIT requires you to fill out a form (and wait) for access to the report. I read four separate stories based on the report, and they all provide a different perspective.
Here's the original pdf before MIT started gating it: https://web.archive.org/web/20250818145714/https://nanda.med...
Did users know that the conversation was recorded?
This is a tiny fraction of all work done. This is work people were claiming to have solved 15 years ago. Who cares?
It's likely a checkbox for compliance, or some policy a middle manager put in place that is now tied to a KPI.
And are full transcriptions not the better option?
We have someone using Firefly for note taking, and it's pretty bad. Frequently gets details wrong or extrapolates way too much from a one-off sentence someone said.
How do you verify these are actually better?
my guess was wrong but not really.
Imagine a human agent or AI summarises: “Customer accepted proposed solution.” Did they? Or did they say “I’ll think about it”? Those aren’t the same thing, but in the dashboard they look identical. Summaries can erase nuance, hedge words, emotional tone, or the fact the customer hung up furious.
If you’re running a call centre, the question is: are you using this text to drive decisions, or is it just paperwork to make management feel like something is documented? Because “we saved millions on producing inaccurate metadata nobody really needs” isn’t quite the slam dunk it sounds like.
Finally, who cares about millions saved (while considering the above introduced risk), when trillions are on the line?
AI today is terrible at replacing humans, but OK at enhancing them.
Everyone who gets that is going to find gains - real gains, and fast - and everyone who doesn't, is going to end up spending a lot of money getting into an almost irreversible mistake.
Now, summary, or original? (Provided the summary is intentionally vague to a fault, for argument's sake on my end.)
It's a tad far-fetched in this specific scenario, but an AI summary that says something like "cancel the subscription for user xyz" and then someone else takes action on that, and XYZ is the wrong ID, what happens?
It’s all fun and games until the bean counters start asking for evidence of return on investment. GenAI folks better buckle up. Bumps ahead. The smart folks are already quietly preparing for a shift to ride the next hype wave up while others ride this train to the trough’s bottom.
Cue a bunch of increasingly desperate puff PR trying to show this stuff returns value.
"Hey, guys, listen, I know that this just completely torched decades of best practices in your field, but if you can't show me progress in a fiscal year, I have to turn it down." - some MBA somewhere, probably, trying and failing yet again to rub his two brain cells together for the first time since high school.
Just agentic coding is a huge change. Like a years-to-grasp change, and the very nature of the changes that need to be made keep changing.
I've been programming professionally for > 20 years and I intend to do it for another > 20 years. The tools available have evolved continually, and will continue to do so. Keeping abreast of that evolution is an important part of the job. But the essential nature of the role has not changed and I don't expect it to do so. Gen AI is a tool, one that so far to me feels very much like IDE tooling (autocomplete, live diagnostics, source navigation): something that's nice to have, that's probably worth the time, and maybe worth the money, to set up, but which I can easily get by without and experience very little disadvantage.
I can't see the future any more than anyone else, but I don't expect the capabilities and limitations of LLMs to change materially and I don't expect to be left in the dust by people who've learned to wrangle wonders from them by dark magics. I certainly don't think they've "torched decades of best practice in my field". I expect them to improve as tools and, as they do, I may find myself using them more as I go about my job, continuing to apply all of the other skills I've learned over the years.
And yes, I do have an eye-wateringly expensive Claude subscription and have beheld the wonders of Opus 4. I've used Claude Code and worked around its shitty error handling [1]. I've seen it one-shot useful programs from brief prompts, programs I've subsequently used for real. It has saved me non-zero amounts of time - actual, measurable time, which I've spent doodling, making tea and thinking. It's extremely impressive, it's genuinely useful, it's something I would have thought impossible a few years ago and it changes none of the above.
[1] https://github.com/anthropics/claude-code/issues/473
I mean, this is basically how all R&D works, everywhere, minus the strawman bit about "single fiscal year", which isn't functionally true.
And this is a serious career tip: you need to get good at this. Being able to break down extremely ambitious, many-year projects into discrete chunks that prove progress and value is a fundamental skill to being able to do big things.
If a group of very smart people said "give us ${BILLIONS} and don't bother us for 15 years while we cook up the next world-shaking thing", the correct response to that is "no thanks". Not because we hate innovation, but because there's no way to tell the geniuses apart from the cranks, and there's not even a way to tell the geniuses-pursuing-dead-ends from the geniuses-pursuing-real-progress.
If you do want to have billions and 15 years to invent the next big thing, you need to be able to break the project up to milestones where each one represents convincing evidence that you're on the right track. It doesn't have to be on an annual basis, but it needs to be on some cadence.
You really set yourself up with a nice glass house trying to make fun of the money guys when you are essentially just moving your own goalposts. It was annoying two (or three?) years ago when we were all talking about replacing doctors and lawyers; now it can't help but feel like a parody of itself in some small way.
Agents may be good (I haven't seen it yet, maybe it's a skill issue but I'm not spending hundreds of dollars to find out and my company seems reluctant to spend thousands to find out) but they are definitely, definitely not general superintelligence like SamA has been promising
at all
really is sinking in
these might be useful tools, yes, but the market was sold science fiction. We have a useful supercharged autocomplete sold as goddamn positronic brains. The commentariat here perhaps understood that (definitely not everyone) but it's no surprise that there's a correction now that GPT-5 isn't literally smarter than 95% of the population when that's how it was being marketed
Now, I don’t believe this is an actual conspiracy, but rather a culture of hating the poor. The rich will jump on any endeavor—no matter how ridiculous—as long as the poor stay poor, even if they lose money in the process.
That said, technologies like this can also go through a rollercoaster pattern themselves. Lots of innovation and improvement, followed by very little improvement but lots of research, which then explodes into more improvements.
I think LLMs have a better chance at following that pattern than computer vision did when that hype cycle was all the rage
When GPT-5 came out, it wasn't going from GPT-4 to GPT-5. Since GPT-4 there has been: 4o, o1, o3, o3-mini, o4-mini, o4-mini-high, GPT-4.1, and GPT-4.5. And many other models (Llama, DeepSeek, Gemini, etc) from competitors have been released too.
We'll probably never experience a GPT-3.5 to GPT-4 jump again. If GPT-5 was the first reasoning model, it would have seemed like that kind of jump, but it wasn't the first of anything. It is trying to unify all of the kinds of models OpenAI has offered, into one model family.
...I'll try not to sound desperate tho.
"Spending on AI data centers is so massive that it’s taken a bigger chunk of GDP growth than shopping" - https://fortune.com/2025/08/06/data-center-artificial-intell...
We'll either see a new class of "AWS of AI" companies that'll survive and be used by everyone (that's part of the play Anthropic & OpenAI are making, despite API generating a fraction of their current revenue), or Amazon + Google + Microsoft will remain as the undisputed leaders.
idk what a person would do with a 6509 or a Sun Fire hah but they were all over craigslist iirc.
"Donald Trump and Silicon Valley's Billionaire Elegy" - https://www.wired.com/story/donald-trump-and-silicon-valleys...
"Secret White House spreadsheet ranks US companies based on loyalty to Trump" - https://www.telegraph.co.uk/business/2025/08/15/secret-white...
1. Generate content to create online influence. This is at this point probably way oversaturated and I think more sophisticated models will not make it better.
2. Replace junior developers with Claude Code or similar. Only sort of works. After all, you can only babysit one of these at a time no matter how senior you are so realistically it will make you, what, 50% more productive?
3. Replace your customer service staff. This may work in the long run but it saves money instead of making money so its impact has a hard ceiling (of spending just the cost of electricity).
4. Assistive tools. Someone to do basic analysis, double check your writing to make it better, generate secondary graphic assets. Can save a bit of money but can’t really make you a ton because you are still the limiting factor.
Aside: I have tried it for editing writing and it works pretty well but only if I have it do minimal actual writing. The more words it adds, the worse the essay. Having it point out awkward phrasing and finding missing parts of a theme is genuinely helpful.
5. AI for characters in video games, robot dogs, etc. Could be a brave new frontier for video games that don’t have such a rigid cause/effect quest based system.
6. AI girlfriends and boyfriends and other NSFW content. Probably a good money maker for a decade or so before authentic human connections swing back as a priority over anxiety over speaking to humans.
What use cases am I missing?
Billions get spent annually on administrative overhead focused on squeezing as much money as possible out of these notes. A tremendous expense can be justified to increase note quality (aka revenue, though 'accuracy/efficiency' is the trojan horse used to slip by regulators).
GenAI has a ton of potential there. Likewise on the insurance side, which has to wade through these notes and produce a very labor intensive paper trail of their own.
Eventually the AIs will just sling em-dashes at each other while we sit by the pool.
If those prompts pop up constantly asking for elevated privileges, this is actually worse because it trains people to just reflexively allow elevation.
Sorry this is some bull. Either it works or it doesn’t.
How many hundreds of hours is your team spending to get there? What is the ROI on this vs investing that money elsewhere?
The thing is, you aren't contacting customer services because everything is going well, you are contacting them because you have a problem.
The last thing you need is to be gaslit by an AI.
The worst ones are the ones where you don't realise right away you aren't talking to a person, you get that initial hope that you've actually gotten through to someone who can help you (and really quickly too) only to have it dawn on you that you are talking to a ChatGPT wrapper who can't help you at all.
But if you're actually trying to provide good customer service because people are paying you for it and paying per case, then you wouldn't dare put a phone menu or AI chat bot between them and the human. The person handles all the interaction with the client and then uses AI where it's useful to speed up the actual work.
We connect with slack/notion/code/etc so that you can do the following:
1. Ask questions about how your code/product works
2. Generate release notes instantly
3. Auto-update your documentation when your code changes
We primarily rely on the codebase since it is never out of date
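As a rough, hypothetical illustration of the release-notes use case (item 2 above), and not how this particular product actually works: a sketch that feeds `git log` output between two tags to the OpenAI chat completions API. The tag names and model choice are placeholders.

```python
import subprocess
from openai import OpenAI  # assumes the openai package and OPENAI_API_KEY are set up

def release_notes(repo_path, prev_tag, new_tag):
    """Summarize the commits between two tags as user-facing release notes."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--oneline", f"{prev_tag}..{new_tag}"],
        capture_output=True, text=True, check=True,
    ).stdout

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Turn these commits into user-facing release notes, "
                        "grouped into Features, Fixes, and Internal changes."},
            {"role": "user", "content": log},
        ],
    )
    return response.choices[0].message.content

# e.g. print(release_notes(".", "v1.4.0", "v1.5.0"))
```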
As for relying on the code base, that is good for code, although not for onboarding/deployment/operations/monitoring/troubleshooting that have manual steps.
I toyed with it and found it to be less frustrating to set up the latest layout for a VueJS project, but having it actually write code was… well I had to manually rewrite large chunks of it after it was done. I am sure it will improve but how long until you can tell it the specs, have it work for a few minutes or hours or days, and come back to an actual finished project? My bet is decades to never.
How much does that cost these days? Do you still have to fly to remote islands?
That means you expand from millions to billions of potential customers.
Arguably, it's not the tools fault when someone uses it incorrectly, but my aching brain does not care whose fault it is right now, nor do the shareholders care why productivity cratered after we got shiny new tools.
I don't know why everyone goes to "replacing". Were a bunch of computer programmers replaced when compilers came out that made writing machine code a lot easier? Of course not, they were more productive and accomplished a lot more, which made them more valuable, not less.
It is uniquely susceptible because the gaming market is well acclimated to mediocre writing and one dimensional character development that’s tacked on to a software product, so the improvements of making “thinking” improvisational characters can be immense.
Another revenue potential you’ve missed is visual effects, where AI tools allow what were previously labor intensive and expensive projects to be completed in much less time and with less, but not no, human input per frame
I mostly disagree. Every gaming AI character demo I've seen so far just adds more irrelevant filler dialogue between the player and the game they want to play. It's the same problem that some of the older RPG games had: thinking that 4 paragraphs of text is always better than 1.
While people are doing their work, they don't think, "Oh man, I am really excited to talk with AI today, and I can't wait to talk with a chatbot."
People want to do their jobs without being too bored and overwhelmed, and that's where AI comes in. But of course, we cannot hype features; we sell products after all, so that's the state we are in.
If you go to Notion, Slack, or Airtable, the headline emphasizes AI first instead of "Text Editor, Corporate Chat etc".
The problem is that AI is not "the thing", it is the "tool that gets you to the thing".
Too many companies are just trying to spoon AI into their product somehow, as if AI itself is a desired feature, and are forgetting to find an actual user problem for it to actually solve.
In reality, AI sparkles and logos and autocompletes are everywhere. It's distracting. It makes itself the star of the show instead of being a backup dancer to my work. It could very well have some useful applications, but that's for users to decide and adapt to their particular needs. The ham-fisted approach of shoving it into every UI front-and-center signals a gross sense of desperation, neediness, and entitlement. These companies need to learn how to STFU sometimes.
I could be wrong but, all in all, buy a .com for your "ai" product, such that you survive the Dot-ai bubble [1]
I love LLMs though!! Amazing math and tech.
[1] - https://en.wikipedia.org/wiki/Dot-com_bubble
https://mlq.ai/media/quarterly_decks/v0.1_State_of_AI_in_Bus...
> While only 40% of companies say they purchased an official LLM subscription, workers from over 90% of the companies we surveyed reported regular use of personal AI tools for work tasks. In fact, almost every single person used an LLM in some form for their work. In many cases, shadow AI users reported using LLMs multiples times a day every day of their weekly workload through personal tools, while their companies' official AI initiatives remained stalled in pilot phase.
Corporate initiatives are failing, but people are using LLMs like crazy at work. This story is not the bombshell it's made out to be, in fact it could even go in the other direction.
From the article though:
> But researchers found most use cases were limited to boosting individual productivity rather than improving a company’s overall profits.
What does that even mean?
[1] Website: https://nanda.media.mit.edu/, FAQ: https://projnanda.github.io/projnanda/#/faq_nanda
HN is turning into reddit, where people look at the title, come to the comments, and post if they agree with the title or not.
The MIT report was created by project NANDA, a very pro-AI group. Read about them here: https://nanda.media.mit.edu/
The original Fortune article here: https://fortune.com/2025/08/18/mit-report-95-percent-generat... cites specifically generative AI pilot programs.
From the article: "But for 95% of companies in the dataset, generative AI implementation is falling short. The core issue? Not the quality of the AI models, but the “learning gap” for both tools and organizations. While executives often blame regulation or model performance, MIT’s research points to flawed enterprise integration."
You could hand out a pile of cash and 95% of companies would still see zero return because they don't know how to pick up the money.
Trying to claim victory against AI/US Companies this early is a dangerous move.
You could make that claim for the software industry, but I’m pretty sure a big part of the US moat is due to oligopolies, lock-in effects, or corruption in favour of billionaires and their ventures.
Too young to remember GSM?
"Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize."
https://news.ycombinator.com/newsguidelines.html
If my comment can be characterized as flamebait, it has to be to a lesser degree than the parent, right?
And I'm not even claiming that the situation applies. If you take the strongest plausible interpretation of my comment, it says that if indeed this whole AI bubble is hubris, if indeed there's a huge fallout, then the leaders of this merry adventure, right now, must feel like Napoleon entering Moscow.
But well, anyways, cheers dang, it's a tough job.
[1]: the strongest possible interpretation of "This is how America ends up being ahead of the rest of world with every new technology breakthrough" is arrogance, right?
[0] https://www.researchgate.net/figure/Napoleon-march-graphic-C...
For example, a study from METR found that developers felt that AI sped them up by 20%, but empirically it slowed them down by 19%. https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
How you use AI will depend on the model, the tools (claude-code vs cursor vs w/e), your familiarity and process (planning phases, vibe coding, etc.), the team size (solo dev versus large team), your seniority and attention to detail, and hard-to-measure effects like an increased willingness to tackle harder problems you may have procrastinated on otherwise.
I suspect we're heading to a plateau. I think there's a ton of polish that can be done with existing models to improve the coding experience and interface. I think that we're being massively subsidized by investors racing to own this market, but by the time they can't afford to subsidize it anymore, it'll be such a commodity that the prices won't go up and might even go down regardless of their individual losses.
As someone who knows they are benefitting from AI (study shmuddy), I'm perfectly fine with things slowing down since it's already quite good and stands to be much better with a focus on polish and incremental improvements. I wouldn't invest in these AI companies though!
XD
Look, I get it, I still use it, but you have to admit, people also think that various bogus home remedy totally helps them get over a cold faster. There's absolutely a possibility it in no way makes us faster.
Now, you did say "benefit", that's more broad, and you implied things like polish, I've seen others mention it just makes the work easier, that could be a win in itself (for the workers). Maybe it's about accessibility. Etc.
I do think though, right now, we're all in the "home remedy" territory, until we actually measure these things.
I’m not pushing Amway, I don’t own any crypto, and I’m bearish on the S&P right now due the market cap concentration at the top. And yet I swear that claude code is working for me quite well.
> Now, you did say "benefit", that's more broad, and you implied things like polish, I've seen others mention it just makes the work easier, that could be a win in itself (for the workers). Maybe it's about accessibility. Etc.
Yes exactly, and this is the (ambiguous) metric that I actually care about. I suspect this study will go down in history as useless and flawed, not to be overly harsh :)
I think the real problem is, it's just a bit too early, but every CEO out there dreams of being lauded for their visionary take on AI, and nobody wants to miss the bus. It's high-leverage tech, so if it (some day) does what it's supposed to do, and you miss making the investment at the right time, you're done.
- everyone and their mother is doing a "generative AI program" right now, a lot of times just using the label to try to get their project funded, with AI being an afterthought
- if 1 out of 20 projects is game-changing, then you could argue people should actually be willing to spend even more on the opportunity right now; maybe the number should actually be 1 in 100. (The VC model is about having a big success 1 in 10 times.)
- studies of ongoing business activities are inherently methodologically limited by the data available; I don't have a ton of confidence that these researchers' numbers are authoritative -- it's inherently impossible to truly report on internal R&D spend, especially at private companies, without inside information, and if you have the inside information you likely don't have the full picture.
So far business is booming and clients are happy with both the human interactions with senior engineers and the final deliverable on best practices for using AI to write code.
Curious to compare notes
(I'm not really offended honestly. Startups will come crying to un-vibe the codebases soon enough.)
Saved them hours of work.
Of course, they didn't spend on "AI" per se.
Most people don't know how to meta their job functions, so AI won't really be worth it. And the productivity gains may not be measurable, i.e.: "I did this in 5 minutes instead of 500, so I was able to goof off more."
I am not going to trust it without actually going over the paper.
Even then, if it isn't peer-reviewed and properly vetted, I still wouldn't necessarily trust it. The MIT study on AI's impact on scientific discovery that made a big splash a year ago was fraudulent even though it was peer reviewed (so I'd really like to know about the veracity of the data): https://www.ndtv.com/science/mit-retracts-popular-study-clai...
The story is a "Pick your narrative" one.
https://mlq.ai/media/quarterly_decks/v0.1_State_of_AI_in_Bus...
If you do not do it, you get left behind and cannot compete in the marketplace.
I took a business systems administration course like 20 years ago, and they knew this was the case. As far as we can tell it's always been the case.
IT doesn't create massive moats/margins because price competition erodes the gap. And yet if you do not keep up you lose.
It's definitely a boon for humanity though, in the industries where technology applies things have been very obviously getting much cheaper over time.
(Most notably, American housing has been very, very resistant to technological change and productivity gains, which is part of the story of why housing costs have gone way up) - https://youtu.be/VfYp9qkUnt4?si=D-Jpmojtn7zV5E8T
>only 5 percent of custom enterprise AI tools reach production
>95% of Enterprise AI Pilots Fail to Boost Revenues
>Why are 95% of GenAI pilot projects failing?
>95% of Companies See 'Zero Return'
>5% of integrated AI pilots are extracting millions in value
>95% of generative AI implementations in enterprise 'have no measurable impact on P&L'
In this case, that's NVDA
Crypto's over, gaming isn't a large enough market to fill the hole, the only customers that could fill the demand would be military projects. Considering the arms race with China, and the many military applications of AI, that seems the most likely to me. That's not a pleasant thought, of course.
The alternative is a massive crash of the stock price, and considering the fact that NVIDIA makes up 8% of everyone's favorite index, that's not a very pleasant alternative either.
It seems to me that an ultra-financialized economy has trouble with controlled deceleration; once the hype train is on, it's full throttle until you hit a wall.
Data centers might, but then they'll need something else to compute, and if AI fails to deliver on the big disruptive promises it seems unlikely that other technologies will fill those shoes.
I'm just saying that something big will have to change, either Nvidia's story or its share price. And the story is most likely to pivot to military applications.
AI has led to significant operational improvements. The speed of root-causing has increased several times over, and the time spent head-desking or chasing leads has dropped significantly - even if the AI is wrong the first few times, it's often recognizably a rabbit hole that I myself would likely have spent time in - a lot longer, at that.
It obviously cannot do these things by itself, because it can arrive at the wrong conclusion all the time and be pretty stubborn about it, but at the same time, to say there is no benefit is like throwing a bike away because you fell riding it the first time and going back to walking.
For some reason, I'm thinking most of the money went to either inferencing costs or NVidia.
This is confusing... it's directly saying AI is improving employee productivity, but that's not leading to more business profit... how does that happen?
One trivial way is that the increase in productivity is less than the added cost of the tools. Which suggests that (whether by deliberate strategy or just misjudgement) the AI companies are mispricing their tools. If the tool adds $5000 in productivity, it should be priced at $4999, eventually -- the AI companies have every motivation to capture nearly all of the value, but they need to leave something, even if just a penny, for the purchasing company to motivate adoption. If they're pricing at $5001, there's no motivation to use the tool at all; but of course at $4998 they're leaving money on the table. There's no stable equilibrium here where the purchasing companies end up with a /significant/ increase in (productivity - cost of that productivity), of course.
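To make that concrete, here's a minimal back-of-the-envelope sketch (Python; the $5000 gain and the price points are purely illustrative assumptions, not figures from the report):

    # Illustrative only: how the surplus from a tool that adds an assumed
    # $5000/year of productivity gets split between vendor and purchaser
    # at different price points.
    productivity_gain = 5000  # assumed annual value added per seat

    for price in (5001, 5000, 4999, 4998):
        purchaser_surplus = productivity_gain - price
        print(f"price ${price}: purchaser keeps ${purchaser_surplus}, vendor captures ${price}")

    # At $5001 the purchaser loses money and won't adopt; at $4999 the vendor
    # captures nearly all the value; any lower and the vendor leaves money on the table.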
Sounds like the AI companies are not so much mispricing, as the companies using the tools are simply paying wayyy too much for the privilege.
As long as the companies keep paying, the AI companies are gonna keep the usage charges as high as possible. (Or at least, at a level as profitable to themselves as possible.) It's unreasonable to expect AI companies to unilaterally lower their prices.
It’s pretty clear to anyone who’s using this technology that it’s significant. There’s still tons to work out and the exact impact is still unknown. But this cat isn’t going back in the bag.
I disagree entirely. It’s neat, and it’s a marginal improvement over current-year google, but significant is an overstatement.
Here's the truth: NO ONE KNOWS.
What part of No One Actually Knows do people not understand? This applies to both the "AI WILL RULE THE WORLD MUAHAHA" and "AI is BIG BIG HOAX" crowd.
I think we should actually ban all digital art platforms, no Photoshop, no special effects, all hand drawn. And I'll use some weird weaponized empathy calling out for the human soul and human creativity.
What a toxic bunch.
You're not standing up for art and culture. You're not asking for a "little reflection". You are however just being a cynic. And cynicism is toxic. It's bad for health. It's a weird affliction. Worse it's actively harmful to society.
Optimism is better. Tools that create abundance are better. Managing scarcity is dystopian, and ultimately harmful. It's a mindset that needs to be purged. Creating abundance is a far superior mindset.
Let me tell you what “actively harmful” to society is:
Actively harmful to society is building platforms that extract value by rewarding and encouraging antisocial behavior on a scale never seen before.
> Managing scarcity is
Managing scarcity is reality. It’s strange how capitalists have suddenly thrown out the concept in favor of a “Muh Star Trek future is here” narrative, all because of some impressive chatbots.
> Creating abundance is a far superior mindset.
What abundance has been created, besides the horde of garbage and slop?
Or perhaps you simply mean the abundance of money thrown around by gamblers?
What's menial about knowledge work, anyway?
Like, is the conclusion that we shouldn't even try? This kind of thinking is ridiculous.
Not sure where you read me saying that, but perhaps this could be a starting point to help you: https://www.ed.gov/adult-education-and-services/adult-educat...
I recently ported over a fastapi app to Django with Claude code and it was at least twice as fast as I probably would’ve been able to do myself, and I only had to somewhat pay attention. What would’ve been a pretty intense few days turned into about 2 hours of mindless work while tens of thousands of lines were ported over, tested, and refactored.
I want to know more about the 5% who got it right. What are their use cases ?
So their feature is not just text to speech, but a reading of a summarized version of the articles. But here is the problem. The documentation has no fluff. You don't want a summary, you want the actual details. When you are reading the document that describes how the recovery fee is calculated, you want to know exactly how it is calculated.
I've run it on multiple documents and it misses key information. An unsuspecting user might take it at face value. So this feature looks impressive, but it misses the entire point of documentation. Which is *preserving the details*.
What's going on? I find all of these pretty sus.
I honestly don't think it matters though. Feel free to disagree with me but I think the money is irrelevant.
The only thing that actually matters in the long run is the attention, time, and brain space of other people. After all, that's where fiat currency actually derives its value. These Gen AI companies have captured a lot of that extremely quickly.
OpenAI might have "burned" billions, but the way they have worked themselves into seemingly every university student's computer, every CEO's mind, the policy decisions of world leaders, and every other HackerNews post is nothing short of miraculous...
Incorrect. He did check, and decided to lie.
The article does call out clear issues companies have with AI workflows etc., and those are likely real problems, but if you're claiming *zero* return, those aren't the root-cause problems.
- Make edits to the solid-looking logos & web designs it spits out. Instead it creates brand-new logos and designs... not what I asked it to do!
- For front-end code, it doesn't spit out a zip file with all the images. It does speed up my design/development process (I use code to design), where I used to design/develop in the browser using a Bootstrap template.
Maybe it will finally figure it out, or maybe it's just the Wizard of Oz... a facade that grabs designs off the web and mixes them up but can never make edits.
It 100% turned out to be a bubble and yet, if anything, the internet was under-hyped. The problem in 1999 was that no one really knew how it was going to play out. Which investments would be shrewd in retrospect, and which ones would be a money pit?
When an innovation hits, it takes time to figure out whether you're selling buggy whips, or employing drivers who can drive any vehicle.
Plenty of companies sunk way too much money into custom websites back in 99, but would we say they were wrong to do it? They may have overspent at a time when a website couldn't justify the ROI within 12 months, but how could they know? A few short years later, a website was virtually required for every business.
So are companies really seeing "zero return" on their AI spend, or are they paying for valuable lessons about how AI applies to their businesses? There may be zero ROI today, but all you need to do is look at the behavior of normal people to see that AI is not going anywhere. Smart companies are experimenting.
AI is predominantly replacing outsourced, offshore workers
https://news.ycombinator.com/item?id=44940944
PDF report that was taken down/walled: https://web.archive.org/web/20250818145714/https://nanda.med...
... Well, probably yes, but I don't have the data to do it.
Executives mistook that novelty for a business revolution. After years of degraded search, SEO spam, and “zero-click” answers, suddenly ChatGPT spat out a coherent paragraph and everyone thought: my god, the future is here. No - you just got a glimpse of 2009 Google with autocomplete.
So billions were lit on fire chasing “the sliced bread moment” of finally finding information again - except this time it’s wrapped in stochastic parroting, hallucinations, and a SaaS subscription. The real irony is that most of these AI pilots aren’t “failing to deliver ROI” - they’re faithfully mirroring the mediocrity of the organisations deploying them. Brittle workflows meet brittle models, and everyone acts surprised.
The pitch was always upside-down. These things don’t think, don’t learn, don’t adapt. They remix. At best they’re productivity duct tape for bored middle managers. At worst they’re a trillion-dollar hallucination engine being sold as “strategy.”
The MIT study basically confirms what was obvious: if you expect parrots to run your company, you get birdshite for returns.
They got a majority of the country hooked on AI without truly understanding its current limitations. This is just like the digital currency bubble/fad that popped a couple of years ago.
What most companies got out of it is a glorified chatbot (i.e., something that was possible in 2014…) at 1000X the cost.
What a sad joke. Innovation in this country is based on a lie, fueled by FOMO.