this post was submitted on 13 May 2025
370 points (95.1% liked)

Technology

[–] potoo22@programming.dev 147 points 5 days ago (10 children)

No publisher is going to pay a professional to narrate their audiobooks when they can have AI do a shitty job for much less.

A shitty narrator can get me to hate a book I like. A great narrator can bring the characters to life, enhance the experience, and turn me from a listener into a fan. I've searched for books by narrators like Nick Podehl and Jeff Hays and bought audiobooks I wouldn't have otherwise.

[–] monkeyman512@lemmy.world 27 points 5 days ago (1 children)

That depends entirely on how profitable it is and how much they can get authors onboard.

I do agree that a good narrator delivers a performance that adds to the work. James Marsters will always be Harry Dresden in my head.

[–] empireOfLove2@lemmy.dbzer0.com 14 points 5 days ago (2 children)

That depends entirely on how profitable it is and how much they can get authors onboard.

A. Anything can be profitable when the cost of generation is counted in single dollars instead of multiple thousands for a good narrator. They don't even have to sell many copies to turn a profit.

B. You think authors are going to have a choice? Lmfao. It's the publishers that hold any real power and they will jump all over everyone's IP with AI slop to make an extra three cents.

[–] taladar@sh.itjust.works 2 points 5 days ago

It’s the publishers that hold any real power

It might be time to finally change that, especially considering what a piss poor job they have been doing for decades at their own part of the production of media.

[–] lemonskate@lemmy.world 16 points 5 days ago (1 children)

I tried, and failed, to get into audiobooks for years. Then I listened to Dungeon Crawler Carl narrated by Jeff Hays and what an absolute delight it was. There's no way I would've gotten even 10 minutes in if it was one of those soulless AI voices instead.

[–] tehn00bi@lemmy.world 2 points 4 days ago

Currently listening to the first book.

[–] Uli@sopuli.xyz 9 points 5 days ago (3 children)

I made some AI animated content that I never released because I don't have the rights to the voices I was using. Even though I was blending several voices together to make them unrecognizable, it made me uncomfortable.

But in the process I learned the capabilities and limitations of AI voices. If you're going purely from text to speech, it's horrendous (as far as I experienced). Very robotic. It's a bit better when melodic information is included (as in Suno) but still sounds like AI.

But when I recorded my own voice saying the lines and then converted it to another voice, it took all of the nuance of my line reads and converted it into the other voice.

So, would your opinion change if it turns out they're going to use purchased voice rights to have a single narrator perform the whole book and then use AI to turn the narrator's voice into a full voice cast?

I could see how it would allow lesser known books to have a better experience with a truly separate voice for each character, but I could also see how this might drive out lesser known/minority voice actors. Not advocating one way or another, just providing a piece of this conversation I think we should bear in mind.

[–] taladar@sh.itjust.works 4 points 5 days ago (1 children)

So, would your opinion change if it turns out they’re going to use purchased voice rights to have a single narrator perform the whole book and then use AI to turn the narrator’s voice into a full voice cast?

It would make me hate it even more, because I already hate the existing full-cast audio dramas performed by humans 99% of the time and actually prefer a single narrator (or a small number of narrators).

[–] Uli@sopuli.xyz 5 points 5 days ago

Completely fair. I kind of like them. They did it for Redwall and I listen to those books on long drives sometimes. It works for me. Now I guess the advantage could be to have both versions and get to choose which you listen to--but even I'm skeptical that a corporation would have that much regard for the preferences of its consumers.

[–] Kornblumenratte@feddit.org 3 points 5 days ago

Using different voices to read different parts of a book turns an audiobook into a bad audio play, and arguably a bad audio play is worse than a mediocre audiobook.

What Audible misses is that, while reading is a technique that can be automated, narrating is an art. They can use AI to read books; they cannot use AI to narrate books.

Your example of AI use is a good example of this: AI can read your content. AI can enhance your capabilities. But only you can narrate it.

[–] BlameTheAntifa@lemmy.world 3 points 5 days ago

Oh. That’s an interesting use-case I hadn’t considered.

[–] 48954246@lemmy.world 3 points 5 days ago

Nick Podehl is such an amazing narrator. The voices and performance are amazing.

I've been slowly getting through the Kel Kade books and the narration just makes it for me

[–] echodot@feddit.uk 4 points 5 days ago (1 children)

Honestly, Audible are terrible. They are constantly doing things that annoy me, like they must have a team somewhere that spends its days going, how can we kill this golden goose?

They are going through and replacing audiobooks recorded in the 1980s with new ones which in theory should improve their quality but they're getting rid of the classic sounds of those books.

[–] futatorius@lemm.ee 3 points 5 days ago

like they must have a team somewhere that spends its days going, how can we kill this golden goose?

I wouldn't put it past Bezos to have an actual enshittification department.

[–] Maeve@kbin.earth 3 points 5 days ago (1 children)

Maybe we'll start reading again.

[–] misterdoctor@lemmy.world 13 points 5 days ago (1 children)

There is literally zero shame in someone consuming audiobooks, and it’s deeply weird to act like something is lost to you if others enjoy them. And this is coming from someone who virtually never listens to audiobooks.

[–] Maeve@kbin.earth 1 points 5 days ago (2 children)

I never said there was. I offered an alternative. Outrage is misdirected and it's by design. There are constructive ways to direct it.

[–] Kornblumenratte@feddit.org 7 points 5 days ago

Reading is not an alternative to listening. Both have different use cases. You cannot read while driving, to name just one.

[–] misterdoctor@lemmy.world 8 points 5 days ago

“Maybe we’ll start reading again” obviously implies that something is lacking presently and that with luck, we’ll go back to the way things were

Not sure if you’re saying I’m outraged but I promise you I’m not, just thought it was lame to try and imply audiobook enjoyers were somehow less than because of how they prefer to enjoy stories

[–] brrt@sh.itjust.works 2 points 5 days ago (1 children)

A shitty narrator can get me to hate a book I like.

And that is where I see potential for AI. There are quite a few books which I’d love to listen to but they are all narrated by a guy whose narration I can’t stand. AI would open the possibility to choose a voice and I might actually get to enjoy those books. It’s Amazon though so the ethical implications and quality concerns are something I’m worried about.

[–] Kornblumenratte@feddit.org 8 points 5 days ago (2 children)

Have you ever heard a single piece of AI-narrated content that did not make you run away screaming?

[–] ArchmageAzor@lemmy.world -2 points 5 days ago (2 children)

You think they'll be narrating books with Tiktok TTS?

[–] Kornblumenratte@feddit.org 2 points 4 days ago (1 children)

To rephrase my question: where can I listen to an example of good AI spoken content?

[–] ArchmageAzor@lemmy.world 1 points 4 days ago (1 children)

First thing that comes to my mind would be Dougdoug. He's a streamer who messes around with AI a fair bit for funny content, including using AI-generated voices at times.

https://youtu.be/2pbhnyrpHmY?t=7h19m42s

[–] Kornblumenratte@feddit.org 1 points 3 days ago (1 children)

Hm. I wasn't able to listen to all 9:53:57, but in the samples I watched I heard a voice resembling the classic computer voice of science fiction movies of the 70s. Better than most AI-generated audio content on YouTube, but good enough to narrate audiobooks? Well, we'll get accustomed to anything, I guess.

[–] ArchmageAzor@lemmy.world 1 points 3 days ago (2 children)

I don't know if my timestamp went through, but the part I linked to was at 7h/19m/42s. That's the relevant part, not necessarily the entire video. That's a showcase of good AI voices.

[–] Kornblumenratte@feddit.org 1 points 2 days ago

My browser eats timestamps, TIL. And yes, that is impressive.

[–] Kornblumenratte@feddit.org 1 points 2 days ago

Thanks, I'll listen into it.

[–] futatorius@lemm.ee 4 points 5 days ago (1 children)

Some use even worse, if YouTube content is any indication.

[–] ArchmageAzor@lemmy.world -1 points 5 days ago (1 children)

But you think Audible would use those to narrate books?

[–] Glytch@lemmy.world 1 points 4 days ago

I think an Amazon-owned company would actually default to Alexa, whose voice is equally terrible. Why pay more to develop a better voice?

[–] Kusimulkku@lemm.ee -4 points 4 days ago (1 children)

I'm not sure why AI would automatically mean it's doing a shitty job.

[–] utopiah@lemmy.world 7 points 4 days ago (1 children)

Because... the tool has no understanding of anything? It reads written words, yes, but no intention, no cultural context, no intonation. Unless everything is spelled out like a script, then it will not sound great, would it?

[–] Kusimulkku@lemm.ee 1 points 4 days ago* (last edited 4 days ago) (2 children)

Someone can manually go through it and correct and edit it, as one would a regular, human made recording. It's not rocket science exactly. It wouldn't be a story time for children but it would probably be alright for more plain stuff

[–] SynopsisTantilize@lemm.ee 1 points 4 days ago

These people just want to hate AI. Read through and see how many times they complain about copyrighted material being stolen, but claim piracy is the solution.

[–] utopiah@lemmy.world 1 points 4 days ago (1 children)

If the "fix" for an AI implementation in a use case is, again, to manually correct it and find a less demanding audience then... yes, by definition it's shitty.

The point isn't that it's infeasible, just that it will be low quality.

[–] Kusimulkku@lemm.ee 1 points 4 days ago* (last edited 4 days ago) (1 children)

I mean you have to correct and edit human made stuff too, doesn't mean it's shit lol

If you want the stuff read out and don't care for the radio type stuff, I'd imagine the better voice AIs do a pretty good job. And I personally prefer the more neutral voices to the story time stuff, so works for me.

[–] utopiah@lemmy.world 1 points 4 days ago (1 children)

This is me just speculating here, but if they follow the path of this CEO who fired his human staff to replace them with AI, then rolled it back and admitted it's shit https://gizmodo.com/klarna-hiring-back-human-help-after-going-all-in-on-ai-2000600767 then my bet is that it's not done to improve quality but rather margins.

If AI is done alongside professionals, and done so ethically (no stolen training data, not ignoring the ecological cost by pumping water in dry areas to cool down GPUs, etc.) and economically (i.e. not having it "cheap" now and then, once a monopoly position is obtained, raising prices for a captive set of consumers), then yes, it can be potentially empowering. This though is pretty much never the case.

That being said, if one "just" wants text read aloud, there are plenty of FLOSS alternatives, and I believe Mozilla even has a TTS/STT system based solely on voluntarily contributed voices.

[–] Kusimulkku@lemm.ee 1 points 4 days ago (1 children)

It's a company, of course it's done to increase profits. I'm just saying it being AI doesn't automatically mean it's shit, it could be done just fine. AI is a tool, the end result depends on how that tool is used.

[–] utopiah@lemmy.world 1 points 4 days ago* (last edited 4 days ago)

Like I try to highlight, in most cases it's a shitty tool, doing a bad job, trained on stolen data, requiring a TON of energy and often used to put people out of work (and failing at it, cf news above).

So... sure, it's "just" a tool and in theory, it can be made the right way and used in a good context.

It is rarely the case though. Here specifically we are talking about Amazon, a company that has from its inception been built to be a monopoly, relying on AWS a service that is basically destroying the Internet by removing its decentralized nature.

So... again, even if the tool itself would in theory be used the right way and built the right way, the company using that tool is problematic.

TL;DR: in theory, yes, in practice here, no.

[–] catloaf@lemm.ee -5 points 5 days ago (1 children)

For fiction, yeah, that's true. For nonfiction, this could work pretty well.

I'm still generally opposed to it because it's using the work of existing voice recordings without compensation, though.

[–] Zwuzelmaus@feddit.org 5 points 5 days ago

nonfiction, this could work pretty well.

Only in rare cases.

If you have, for example, an explanation of a complex topic, then a super emotionless voice would still make you hate it and block you from learning it. Even the driest and hardest topics need a good, lively voice for explanations.

If it is just some reference list, where you need to search and hear small parts of it, then it could be OK.