Creating Auto-Narrated Audiobooks on Google Play

Based on Chuck Litka’s post Adventures in Audiobooks, I decided to go with Google’s audiobook option only. So this post refers exclusively to auto-narrated audiobooks on the Google Play platform.

I have finished setting up and editing one audiobook, which is now live. From that experience, I can make the following observations.

Once you have an ebook live on Google Play Books, it’s surprisingly easy to create an audiobook. That said, I fumbled my way through the process, and my first book went live in an unedited state. I will have to figure out how to keep that from happening with any others I publish, because it is absolutely necessary to listen to the entire book before finalizing it. There are step-by-step instructions in something called the Auto-Narrated Audiobooks Learning Center, but when I started the process, I found myself being hustled through a number of steps without really knowing what was happening. No harm done, however, as far as I know.

On the plus side, once the audiobook was live, it was easy to whip it into shape. I called up the Audiobook Text, which allowed me to both listen to and see the text. I could start and stop playback, make changes in the text, and save them. It’s possible to have more than one version of an audiobook (with different narrator voices, for example), but only one can be live.

There is a large number and variety of narrator voices available, including male and female voices in different age ranges (18-30, 31-45, 45-60, and 60+) with “standard” American, British, or Australian accents. Voices for a few languages other than English are available, but it’s recommended to use these only for texts in those languages. There are no options for English spoken with accents other than those I’ve already mentioned, or with regional accents.

It’s possible to use more than one voice in a book! Theoretically, you could have dialogue in as many voices as you have characters, but I think this would complicate the setup process. So far, I’ve used only one main voice, with a second one to read brief quotations that open a few chapters in my book.

Changes made to the audiobook text are not reflected in the ebook version. One hazard here is that I was sorely tempted to improve the text! I don’t know how many instances of the word “that” I was tempted to delete, but I decided to make no changes except those needed to improve or correct the speech. I want the audiobook, ebook, and paperback versions to be essentially the same, stylistic problems and all. However, I did find it helpful to add or delete commas on occasion.

The computer-generated voices sound human, much more so than the rather robotic voice of Word’s text-to-speech feature. Nevertheless, they can’t be expected to represent the full range of emotion that may be found in a work of fiction. Google’s info about auto-narrated audiobooks cautions that they work best for texts that do not require a lot of drama and emotion. Still, I found the voices I selected to be better than adequate. Quite often, the main narrator was spot-on, to the point that he seemed to embody the first-person narrator of the book.

Quirks and Issues

  • Stress and emphasis do not always fall where they should in a sentence. This can’t be changed by adjusting the speed of the narration, but deleting or adding commas helps sometimes. Still, I admit there are occasions when a word or sentence sounds a little “off.”
  • Sometimes there is an awkward pause between a word or sentence spoken by a character and the dialogue tag such as “he said” or “I asked.” The best solution might be to delete selected dialogue tags, but I resisted the temptation to do this, not wanting my spoken and written texts to diverge. But this is another reason to use fewer dialogue tags!
  • Weirdly, a few names are pronounced quite differently when a possessive is added. For one name, I had to provide a correct pronunciation for possessives because the default was unacceptable.
  • Homographs are common enough that you have to listen for instances where the wrong pronunciation pops up. For example, the default pronunciation of the word “read” is the present tense (pronounced “reed”). When the past tense pronunciation (“red”) was needed, I had to intervene. Fortunately, it’s easy to fix these; a right click on the word in the text takes you to both versions, and you can listen to them before selecting the correct one. There is an option to change the pronunciation of all instances of a word, or only one.
  • Abbreviations such as Mr. and Dr. are usually pronounced correctly, but I encountered a few situations where “Dr.” came out as “drive,” for some reason. I fixed these by spelling out the word.
  • Uncommon words, place names, or words in other languages may be mispronounced. In such cases, you can insert a different pronunciation by spelling the word differently, speaking it into your computer’s mic, or by using the International Phonetic Alphabet. I actually did that for a few place names; fortunately Wikipedia sometimes provides IPA spellings in its articles, so I was able to reproduce them with good results. You can listen to the new pronunciation before selecting it. This was about the most challenging part of the editing process.

So what do I think of Google’s Auto-Narrated Audiobooks?

I think it’s an excellent option for authors who would not otherwise consider producing audiobook versions of their books. It doesn’t cost anything and produces acceptable results.

There’s no doubt that a competent human reader or voice actor would produce a superior listening experience, but at a cost that’s likely prohibitive for most indie authors. Some may have the talents and equipment to be their own reader, but I suspect those are a minority. The AI-narrated option is available for free to anyone.

A few more considerations:

  • You have to publish your books as ebooks on Google Play before you can create audiobooks. Google requires book files in ePub, not Word. I used Calibre (a free program) to convert a copy of the Word doc I used for the Amazon Kindle version of my book into an ePub, which I then uploaded to Google Play Books. It helped that the Word doc was properly formatted and had a linked table of contents.
  • You need a square cover image for the audiobook, but it looks like the rectangular ebook cover image is squared up automatically with a block of matching colour, so you can get away with that.
  • You need to commit the time needed to listen to your audiobook from start to finish in order to correct any serious or even mildly annoying problems in the finished product. The book I worked with is a fairly hefty tome, which ended up being more than 15 hours of listening time. It took me a solid week to complete, spending 2 to 3 hours each day. (Actually, this reminded me why I prefer reading fiction rather than listening to it.)
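
The Word-to-ePub step in the first point above can also be scripted. What follows is only a minimal sketch, assuming Calibre is installed and its ebook-convert command-line tool is on the PATH. The helper names and the file name are made-up placeholders, not anything Google or Calibre prescribes.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source: str, target: str) -> list:
    """Return the ebook-convert command line for one conversion."""
    # ebook-convert infers the output format from the target file's extension.
    return ["ebook-convert", source, target]


def convert_to_epub(source: str) -> Path:
    """Convert a manuscript (e.g. a .docx file) to an .epub beside it."""
    target = Path(source).with_suffix(".epub")
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("Calibre's ebook-convert tool was not found on PATH")
    subprocess.run(build_convert_command(source, str(target)), check=True)
    return target


# Example call (placeholder file name; left commented out so nothing runs by accident):
# convert_to_epub("my_novel.docx")
```

It is still worth opening the resulting ePub and checking the table of contents before uploading it, since the audiobook is built from that file.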

I encourage anyone who wants to offer their books in audio format to give this a try. The only cost is your time.

Once I’ve converted one more book to auto-narrated audio format I will write a post on my own blog with more details. That should appear in another week or two.


Adventures in Audiobooks

As promised, here is my report on my experiences with the various free programs to convert ebooks into auto-generated audiobooks.

First off, Google.

Google’s conversion process offers 12 female and 12 male voice options with various accents.

You can use different voices for different characters within the book.

You can listen to the book, modify the pronunciation of words, and edit the text of the book. Improvements to the technology are automatically applied to all audiobooks.

You can set and change your price as you like, including free.

The process is pretty simple, given the many options.

It takes only hours for the audiobook to be available for sale.

Next, Apple via Draft2Digital.

This service offers you essentially no options. Apple/D2D chooses from 2 female and 2 male voices according to the story’s genre.

You cannot listen to the narration before the book is released, nor can you modify pronunciation or text.

You can charge what you like. Changes after release will cost money. You cannot withdraw the audiobook in the first six months.

The process takes a minute, given that you essentially have no options to choose from beyond price.

You can set your price, including free.

It takes months for audiobooks to be available for sale. Five of the twelve ebooks I uploaded on the first of January 2024 remain unconverted on the 29th of April 2024. Conversions appeared at random over the course of five months.

Lastly, Amazon.

You currently have a choice of five female voices, including one with a British accent, and three male voices. More are promised this summer.

Promised upgrades this summer include using different voices for different chapters, and improvements to the voices. It seems that you will need to manually republish the book to receive the upgrades.

You can listen to your audiobook and edit pronunciation and the speed at which a word is spoken prior to release.

You are limited to books under about 240K words, or 27 hours of audiobook narration.
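
Those two figures imply a narration pace, which gives a rough way to check whether a manuscript fits under the cap. This is a back-of-the-envelope sketch, assuming the 240K-word and 27-hour figures describe the same pace (the real pace will vary with voice and text). The function names and the 130,000-word example are made up for illustration.

```python
LIMIT_WORDS = 240_000  # approximate word limit quoted above
LIMIT_HOURS = 27       # approximate narration limit quoted above

# Implied pace: roughly 8,900 words per hour, or about 148 words per minute.
WORDS_PER_HOUR = LIMIT_WORDS / LIMIT_HOURS


def estimated_hours(word_count: int) -> float:
    """Estimate narration time in hours at the implied pace."""
    return word_count / WORDS_PER_HOUR


def fits_limit(word_count: int) -> bool:
    """True if the manuscript is at or under the approximate word limit."""
    return word_count <= LIMIT_WORDS


# Hypothetical 130,000-word manuscript.
print(round(estimated_hours(130_000), 1), fits_limit(130_000))
```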

Books require a table of contents. The Kindle Create app will add tables of contents automatically.

The process is simple, and depending on how much you want to review and modify, fast.

Minimum price is $3.99. Audiobooks are listed on both Audible and Amazon.

Are auto-generated audiobooks worth it?

Note: My audiobooks are free on Google & Apple.

Google – First month sales 431 audiobooks vs 288 ebooks. Second month 1,179 audiobooks vs 506 ebooks, with 5,813 audiobooks sold April 2022 – Dec 2022. This month, April 2024, I’ve sold 461 copies of both audiobooks and ebooks to date.

Apple – Given the erratic release of my books, and the limits of D2D reports, I’ll offer my March and April-to-date numbers. In March I sold 33 audiobooks vs 63 ebooks. In April to date (28th) I’ve sold 51 audiobooks vs 83 ebooks. Five month total: 127 audiobooks sold.

Amazon – I am only including the sales of books at retail price. In March I sold five $3.99 audiobooks vs 40 paid books. Of those 40, 24 were my new releases. In April I sold 2 audiobooks vs 15 retail-priced ebooks.

Major downsides.

Google – the necessity of converting your manuscript into an epub on your own, which may not provide a perfect ebook to convert. The last book I converted missed chapter headings, so they did not appear in the table of contents for the audiobook, though the text was there. I changed the chapter titles to include them.

Apple – The lack of any options or control over the product and their whimsical attitude to actually publishing the audiobook.

Amazon – the limit to the length of the book, the limits to pricing.

My takeaway.

Audiobooks increase total sales significantly, and can boost ebook sales as well – in proportion to ebook sales volume. They extend your reach into a new and growing market. And, well, you’re in the game at no expense to you.

Auto-generated audiobooks provide an acceptable listening experience, especially if priced below human-voiced audiobooks. I’ve had no reviews critical of the narration, and ratings parallel the ebook version. They will only get better over time. And probably fast.

All three programs are free to use vs hundreds to thousands of dollars needed for a human to read your book. This gives you flexibility in pricing.

Apple Audiobooks; Not for the Fainthearted

From Debra Purdy Kong’s blog, via her comments on Audrey Driscoll’s blog, I recently learned that Apple is offering to convert ebooks into auto-narrated audiobooks – for free. Audiobooks are popular and the price was right, so I investigated the prospect. I found that I did not have to get my books into the Apple Store on my own; rather, the conversion is done in partnership with Draft2Digital. I already had my books on D2D, but I was only distributing them to the two European stores, using Smashwords for Apple and all the rest. I would have to switch to D2D to be able to take advantage of the offer. Which I did, adding all the other stores and dropping Smashwords for distribution while I was at it.

So here’s the deal. To create an Apple audiobook, you simply select your book on D2D and click on the audiobook tab next to the description of your book. Here you are offered two options: one to work with Findaway Voices, and the other to have the book auto-narrated by Apple. Clicking on the Apple narration option takes you to a very (i.e. too) simple interface to create your audiobook.

First, you will need a square cover for your book. You can either provide one (3000×3000 pixels) or let D2D make one from your ebook cover. I had made square covers for my Google audiobooks, but I had to upsize them to the 3000×3000 size required by Apple.
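
Squaring up a rectangular cover comes down to a little arithmetic: scale the cover to fit the square, then center it. Here is a sketch of that arithmetic only, with no image library involved. The 1600×2560 ebook cover size is an assumed example, the function name is made up, and how D2D actually builds its square covers is not something I know.

```python
def fit_on_square(width: int, height: int, side: int = 3000):
    """Scale a cover to fit inside a side x side square and center it.

    Returns (new_width, new_height, left_offset, top_offset) in pixels.
    """
    scale = side / max(width, height)  # the longer edge fills the square
    new_w = round(width * scale)
    new_h = round(height * scale)
    left = (side - new_w) // 2         # padding to the left of the cover
    top = (side - new_h) // 2          # padding above the cover
    return new_w, new_h, left, top


# Assumed example: a 1600x2560 portrait ebook cover on a 3000x3000 canvas.
print(fit_on_square(1600, 2560))
```

The left and top offsets are the widths of the blank bands that would need filling with a matching colour.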

Next you have a choice of two voice tones for narrators: a soprano, i.e. a female voice, or a baritone, i.e. a male voice. And that, my friends, is the only choice you have. Apple actually has 6 AI narrators: one of each sex for fiction and romance, a slightly different one for science fiction and fantasy, and a very serious one for non-fiction. However, the narrator comes with the genre of your book; you don’t get to choose more than the genre.

After you select the sex of your narrator and the genre of your book, you get to choose your price, and then agree, among other things, that you will not be able to make changes to the audiobook without paying for them, that you need to keep the book listed for at least 6 months, and that the conversion process may take up to 2 months. Agree and you’re done. If there is anything more to do after the audiobook is generated, nothing was mentioned.

Debra Purdy Kong decided not to go with the auto-narrated books for reasons you can read in her post. I decided to take advantage of the offer, though not without some misgivings. The primary reason for signing on is that my auto-narrated Google books now account for 1/3 of my sales – over 12,000 copies since May of 2022 – and their ratings match their ebook ratings, so it seems my auto-narrated customers are happy with Google’s results. I would think Apple wouldn’t do auto-narration half-assed, despite the total lack of control the author has over the final result, so I expect them to be as good as AI narrators get. Secondly, I sell the audiobooks for free, so I don’t have to weigh value/quality against price. If people want to look a gift horse in the mouth and complain, fine, but I won’t lose any sleep over it. Thirdly, all my books are first-person narratives, so a single narrator is the natural way to read my stories. I’m not an audiobook fan, and from what I’ve sampled, I’ve found that narrators using different voices for different characters sound hokey. No doubt that’s just me. I know some audiobook fans buy books just to hear their favorite narrators. And finally, while my books on Google far outsell Apple, being able to offer my audiobooks right in the Apple Store for an attractive price may prove to be popular. Time will tell.

I will admit that the total lack of control over the final product is a bit worrisome. I outlined my experience with Google’s audiobook procedure in this post. Suffice it to say that with Google you not only have far more options for voices – 12 different English voices for each sex, including American, British, Australian, and Indian accents – but the book is also ready in hours, so that you can listen to it, or to individual words, to hear how the AI pronounced them – and, if necessary, change how they are pronounced – before you publish the audiobook. For a writer of science fiction who makes up a lot of words for names and places, the ability to hear how they sound is kind of important. This lack of oversight and control makes the whole Apple audiobook venture not for the faint of heart. Or for the persnickety and/or control-focused author. But nothing ventured, nothing gained, and trying new ventures is how I promote my books. I’ll report back in a few months to let you know how I fared.