Synopsis

The Bestseller Code (2016) reveals the remarkable story behind a newly developed computer algorithm with the power to unlock the secrets behind the most popular best-selling books. By analyzing over a thousand bestsellers, patterns have emerged that show us which themes, plots, styles and characters contribute to earning a book a spot at the top of the charts.

Who should read

  • Readers wondering what makes a book popular
  • Writers and authors looking for widespread success
  • Computer geeks who love a good algorithm

About the authors

Jodie Archer is an author and former editor at Penguin Publishing in the United Kingdom. She has a PhD from Stanford University and has acted as a consultant for various writers and literary businesses.

Matthew L. Jockers is an English professor at the University of Nebraska as well as the director of the school’s Nebraska Literary Lab. His previous academic writing explores the field of text mining and the digital analysis of writing.

The book content (15 minutes read)

What’s in it for me? Learn what really makes a bestseller.

Take a moment and think about the last really great book you read. Whether it was a riveting whodunit, a tragic love story or a 600-page fantasy saga, you probably didn’t stop to consider exactly why you liked it. You just enjoyed the ride.

For publishers, though, knowing which ingredients make a novel compelling is essential. The only way to keep a footing in the major league of publishing is to make sure your releases have that magic sauce that will send them spiraling up the best-seller lists.

And that’s no cakewalk. Historically, predicting a book’s success has been no easier than predicting next year’s weather. Luckily, at least for publishers, we’re entering a new phase in which computers can help us determine which books will sell and which will not. This book takes you behind the scenes of the best-seller code.

You’ll also learn

  • why sex doesn’t sell when it comes to novels;
  • why female authors trump their male counterparts when it comes to style; and
  • which novel achieved a perfect score according to the “bestseller-ometer.”

Publishers find it difficult to predict a bestseller, as literary quality doesn’t seem to be the defining factor.

These days, the internet is chock-full of lists that rank the best and worst of just about anything you can imagine. Most of these lists tend to be rather arbitrary, but there are still a few reliable popularity rankings that continue to be checked on a regular basis.

Among these more reliable lists are those that record the best-selling books in the United States. And, for as long as such lists have been around, they’ve made clear that what’s popular isn’t the same as what’s critically acclaimed.

The first list of best-selling books was published in 1891 by The Bookman, a London literary magazine.

It didn’t take long for critics to point out that popularity had nothing to do with quality. In fact, good sales and bad writing seemed to go hand in hand.

This hasn’t changed at all. Critics continue to scratch their heads over the success of E. L. James’s Fifty Shades of Grey, Dan Brown’s The Da Vinci Code and Stieg Larsson’s “girl trilogy,” which began with The Girl with the Dragon Tattoo.

Remarkably, Larsson wasn’t even around to help publicize his books, as he’d passed away before their publication. But that hindered their ascent to best-sellerdom as little as the critical assaults which pointed out their jumbled plots, limp characters and boring endings.

So it’s easy to predict that the best-seller list will be filled with poorly written books. But since so many books are published each year, it becomes difficult to predict which ones will end up on that list.

According to Bowker, the US company that issues the ISBN identification numbers for books, around 50 thousand books of fiction are published every year – and this doesn’t include e-books, which don’t receive an ISBN.

Of these, around 200 novels will make the New York Times (NYT) best-seller list each year, which is less than half of one percent of all the fiction published. And the percentage that manages to stay on the list for more than a week is even more minuscule.
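The arithmetic behind that "less than half of one percent" can be checked in a couple of lines. This is only a worked example using the two round figures quoted above; the variable names are illustrative.

```python
# Figures quoted in the text: about 50,000 new fiction titles a year,
# of which roughly 200 reach the NYT best-seller list.
fiction_titles_per_year = 50_000
nyt_bestsellers_per_year = 200

# Fraction of published fiction that makes the list.
share = nyt_bestsellers_per_year / fiction_titles_per_year
print(f"{share:.1%} of titles reach the list")  # prints "0.4% of titles reach the list"
```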

This tiny percentage makes the job of predicting bestsellers a bit like guessing which numbers will win the lottery.

But that doesn’t mean these books don’t share similarities, which is what we’ll take a look at in the book ahead.

An algorithm to determine a novel’s success could help the future of the publishing industry.

For a while, the best guarantee of success was having your book added to Oprah Winfrey’s influential book-club reading list.

But, in 2010, the authors began to uncover a fascinating scientific method for predicting bestsellers.

They spent five years studying the components of best-selling novels, and, during that time, they spotted some remarkable patterns.

In fact, these patterns were so reliable and consistent that a computer algorithm called the “bestseller-ometer” was developed and tested.

Amazingly, the algorithm proved to be a highly accurate predictor, successfully picking 80 to 90 percent of the books that ended up on the New York Times best-seller list.

The authors fed previously published bestsellers into the computer as if they were anonymous manuscripts, without taking into account the author’s name or reputation.

The algorithm gave Dan Brown’s Inferno a 95.7 percent chance of best-seller status and Michael Connelly’s The Lincoln Lawyer a 99.2 percent chance – and, sure enough, both books had been ranked number one.

However, around 15 percent of the time, the algorithm would be slightly off, as in the case of Kathryn Stockett’s bestseller, The Help, which received a 50-percent chance of success.

Nevertheless, this algorithm could be a huge asset to the publishing industry, which is in dire need of some help.

Authors like Stephen King and James Patterson have been reliable producers of bestsellers for years, but they won’t be around forever and a new generation of consistent hit-makers has yet to emerge.

And one of today’s safest bets, J.K. Rowling, of Harry Potter fame, was very nearly not published; the first Harry Potter book was rejected 12 different times. If these skeptical publishers had had access to the algorithm, they would have seen that the first Harry Potter book had a 95-percent chance of becoming a bestseller.

It makes one wonder: How many other writers like Rowling have been rejected, and how many opportunities have publishers missed out on because they lack a reliable predictor?

Topics are of utmost importance for a novel’s success.

So what is this algorithm taking into consideration? Well, at the top of the list is the book’s topic, which shouldn’t be confused with the genre it fits into.

While bookstores separate everything into categories – such as science fiction, mystery and young adult – it’s a novel’s topic, not its genre, that determines its success.

For instance, the topics of love and crime are hugely popular and cross over into many different genres. The prominence of these two topics in any given novel may vary greatly, but their presence, in whatever degree, is much more important than the book’s genre.

Let’s look at how the algorithm breaks down Jodi Picoult’s House Rules, a family drama about a boy with Asperger's who gets accused of murder. The most predominant topic is “kids” (23 percent), followed by “crime” (10 percent), “legal settings” (7 percent), “domestic situations” (6 percent) and “close relationships” (2 percent). Even though the dominant topic is neither crime nor relationships, the combined presence of both contributed significantly to the algorithm’s prediction that the book would become a bestseller.

The algorithm can accurately determine these topics by examining every word and determining its context.

This is important because the word “body,” for example, could fall into multiple contexts. In Fifty Shades of Grey, that context will likely be sexual, while in The Girl with the Dragon Tattoo, it will probably be criminal or violent. The algorithm can easily tell the difference by analyzing the nearby words in a process called topic modeling.

This process is how the algorithm finds a book’s topics and adds up their proportions. It is also how it identifies the patterns that add up to a bestseller.
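To make the idea of reading a word through its neighbors concrete, here is a minimal sketch in Python. It is not the authors' algorithm and it is not a real topic model: it swaps in a much simpler keyword-window heuristic, and the seed word lists, the function names and the window size are all assumptions chosen for illustration. A genuine topic model would learn its word groupings from a large collection of novels rather than from hand-written lists.

```python
import re
from collections import Counter

# Hypothetical seed words for two topics. A real topic model would
# learn groupings like these from data instead of hard-coding them.
TOPIC_SEEDS = {
    "crime": {"police", "murder", "detective", "blood", "victim", "found"},
    "intimacy": {"kiss", "touch", "skin", "lips", "desire", "warm"},
}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def context_topic(text, target, window=4):
    """Guess which topic `target` belongs to from the words around it.

    Looks at up to `window` tokens on each side of every occurrence of
    `target` and counts how many of them are seed words for each topic.
    Returns the best-scoring topic, or None if no seed word is nearby.
    """
    tokens = tokenize(text)
    scores = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        nearby = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        for topic, seeds in TOPIC_SEEDS.items():
            scores[topic] += sum(1 for w in nearby if w in seeds)
    if not scores or max(scores.values()) == 0:
        return None
    return scores.most_common(1)[0][0]

print(context_topic("The detective found the body near the river.", "body"))
print(context_topic("Her lips brushed his skin and his body felt warm.", "body"))
```

With these made-up sentences, the first call lands on "crime" and the second on "intimacy", which mirrors the contrast the text draws between the two novels.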

The most successful topic that the algorithm has identified is crime. Other topics vary, but the more nouns that reference crime in a book, the greater that book’s chance of success.

Sex, on the other hand, doesn’t sell nearly as well. Of all the best-selling books that have been analyzed, the topic of sex only appears 0.0009 percent of the time.

The use of emotion in a plot is key to a book’s success.

If you were to ask a group of people why they choose the books they do, most of them would make no mention of the quality of the prose; rather, what people seem to desire is an exciting emotional roller coaster.

This is why, despite the horrible reviews, Fifty Shades of Grey was a huge success – it delivered the big emotions that readers were looking for.

This book was a peculiar one for the algorithm as well, since a book about kinky sex should have a low probability of success. And, based strictly on the author’s writing style, the book would have only a 50-percent chance of success. But when the other topics, such as emotion, are taken into consideration, the algorithm gives it a 90-percent chance of being a bestseller.

Indeed, according to the algorithm, the main topic of Fifty Shades of Grey is an intimate human relationship with little conflict – not sex. And this is why readers adore the book. It allows them to indulge in a fairly uncomplicated, yet emotionally charged romantic relationship.

Factoring in the emotional arcs of a novel’s story makes the algorithm even better at recognizing a book’s potential popularity.

This is pretty straightforward since a reader’s emotions will often reflect those of the main characters. When the hero of The Da Vinci Code experiences a moment of relief after a dramatic chase scene, the reader will breathe a sigh of relief as well.

The algorithm can chart the emotional beats of a story in a graph that curves up and down along with the book’s emotional highs and lows. The more ups and downs the book has, the more of an emotional roller coaster it will be for the reader, which therefore gives it a higher chance of success.
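One simple way to put a number on "ups and downs" is to count how often the emotional curve changes direction. The sketch below assumes each scene has already been given an emotion score (higher means a happier moment); the scores, the function name and the scoring scale are hypothetical, and the summary does not say how the real algorithm measures this.

```python
def count_turning_points(scores):
    """Count how often a curve switches between rising and falling.

    `scores` is a list of per-scene emotion scores. Flat stretches are
    ignored, so only genuine reversals of direction are counted.
    """
    turns = 0
    prev_direction = 0
    for current, following in zip(scores, scores[1:]):
        # +1 if the curve rises, -1 if it falls, 0 if it stays flat.
        direction = (following > current) - (following < current)
        if direction == 0:
            continue
        if prev_direction != 0 and direction != prev_direction:
            turns += 1
        prev_direction = direction
    return turns

# Hypothetical scene scores for two imaginary novels.
roller_coaster = [0, 2, 1, 3, 0, 2, -1, 3]
steady_climb = [0, 1, 2, 3, 4, 5, 6, 7]

print(count_turning_points(roller_coaster))  # 6
print(count_turning_points(steady_climb))    # 0
```

Under this reading, a regular, techno-like beat would show up as a high turning-point count spread evenly across the book.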

For Fifty Shades of Grey, there are so many ups and downs that the emotional chart of the novel looks like the rhythmic beat of techno music. The only other novel that achieved this same pattern was Dan Brown’s massive bestseller, The Da Vinci Code.

Most best-selling authors avoid fancy phrases and use a simple style.

Having a strong writing style is important for any writer, since this is the mechanism that is used to deliver the story’s plot, themes and characters. When this delivery is executed well, the author could very well have a bestseller on her hands.

Good authors often have such a strong style that their words are not unlike a linguistic fingerprint that can be analyzed and recognized.

In 2013, a new crime novel called The Cuckoo’s Calling by an unknown author named Robert Galbraith came out. There were rumors that Galbraith was a pseudonym, and that the true author was someone famous, trying out a new genre. And, sure enough, when the book was put through a computer, a near certain – and, as it turned out, correct – result came back: the real author was J.K. Rowling.

While these linguistic fingerprints might not be recognizable to the average reader, a computer usually needs just a few sentences to pick up all the necessary clues.

An algorithm can also help a new writer find out whether his style is ideal for a bestseller.

What the best-seller list tells us is that a winning style isn’t about clever turns of phrase or finding new ways of delivering old ideas; it’s actually about constructing common and boring sentences.

The algorithm measures style by looking at a variety of things such as syntax, sentence length and common words like “a,” “the” and “of.” And since it measures all of these characteristics against all other books, it can determine a book’s probability of success as well as identify an author’s unique style.

These measurements have identified a number of best-selling trends. For instance, best-selling books employ the word “do” twice as often as books that aren’t bestsellers, and they employ the word “very” half as often.

Bestsellers also tend to have short and clean sentences, which is why they contain relatively few adjectives and adverbs.

This may not make for the most interesting prose, but keeping it simple is smart if you want your book to be a smooth read for millions of people.
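The style signals described above (sentence length and the frequency of small, common words) are easy to measure. The following Python sketch shows one way to do it; the feature names, the sentence-splitting rule and the default word list are assumptions for illustration, not the authors' actual feature set.

```python
import re

def style_features(text, words=("a", "the", "of", "do", "very")):
    """Compute a few simple style measurements for a text.

    Returns the average sentence length in words, plus the number of
    occurrences of each tracked word per 1,000 words of text.
    """
    # Treat ., ! and ? as sentence boundaries (a rough approximation).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)

    features = {
        "avg_sentence_length": total / len(sentences) if sentences else 0.0
    }
    for word in words:
        features[f"per_1000_{word}"] = (
            1000 * tokens.count(word) / total if total else 0.0
        )
    return features

sample = "I do not know. Do you? It is very late."
for name, value in style_features(sample).items():
    print(f"{name}: {value:.1f}")
```

Comparing these numbers for a manuscript against the same numbers for known bestsellers is the kind of check that would surface the "do" and "very" patterns mentioned above.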

Female authors, or authors the algorithm thinks are female, score higher when it comes to style.

When we take an even closer look at the kinds of style that determine a popular book, another interesting trend emerges: when programmed only to consider a book’s style, the algorithm ranked female authors far above their male peers.

This is particularly interesting in light of the fact that no such gender advantage emerged when other factors like plot and theme were isolated.

And this isn’t factoring in an author’s reputation, either. When the algorithm looked at debut books, nine out of the ten books deemed most likely to succeed were written by women.

This also revealed an oddity in the algorithm’s ability to predict whether a writer was male or female, which it successfully did only 71 percent of the time. This led to closer inspection of exactly what details the program tends to pick up on.

For example, the algorithm was 99 percent sure that James Patterson’s novel Suzanne’s Diary for Nicholas was written by a woman. Sure, the book belongs to the predominantly female genre of romance novels, but there has to be a reason deeper than that for the mistake. The algorithm also thought Patterson’s thriller Four Blind Mice was the work of a woman.

The closer the authors looked, the clearer it became that what was being recognized was a mix of both cultural and gender signals.

For starters, male authors tend to use a more sophisticated literary style, which is why the work of Toni Morrison was mistaken for that of a male writer.

But it seems that female writers were identified as having a stylistic advantage because of their background. Many of the top female authors, such as Terry McMillan, have experience in journalism, which teaches writers the kind of blunt and simple style that helps ensure a bestseller.

In James Patterson’s case, it is believed that his background in advertising helped make his style broadly appealing, much as journalistic experience has helped many women.

A title that references a strongly written character can make all the difference.

Here’s a trend that you might have picked up on: What do these best-selling titles all have in common? Gone Girl, The Girl on the Train, The Girl with the Dragon Tattoo. It would seem that the inclusion of the word “girl” in your book’s title increases its chances of success. But why?

Well, it actually has less to do with the word “girl,” and more to do with referring to the book’s main character, which one-fifth of all best-selling titles do.

However, naming the book after the character is a bit outdated, a trend that had its heyday in the nineteenth century, with novels like Madame Bovary, Oliver Twist and Anna Karenina. This happens only rarely with bestsellers today, though Elizabeth Strout’s Olive Kitteridge is a notable exception.

The modern trend is to have the title describe the character in a few simple words while using the more empowering “the” rather than “a” – which is why The Client sounds stronger than A Client.

Also, when describing your character in the title, don’t be boring.

The algorithm could tell the difference between a stereotypical description and an intriguing one. This is why The Girl with the Dragon Tattoo was recognized as a bestseller, while the less descriptive title, A Girl to Come Home To, did not fare as well.

Since characters are a big reason for our reading books, it’s important to make them enticing and strong.

The algorithm can tell when a book has a strong character by measuring the number of certain words that act as signals or references to the character. For Lisbeth Salander, that popular girl with the dragon tattoo, those words include “Lisbeth,” “Salander,” “her,” “she” and so on.

Now, how strongly this character is written can be recognized by the pronouns and verbs that are associated with these signals.

There’s a reason “need” is the most popular verb in a bestseller; it propels action and moves a story forward. When a character needs something, action follows.

Gillian Flynn's Gone Girl has 163 sentences that use the word “need,” which tells the algorithm that the book has a strongly defined character with a propulsive story, increasing its chance of success.
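Counts like the one quoted for Gone Girl can be reproduced with very little code. The sketch below counts character references and sentences containing a form of "need"; the function names, the list of verb forms and the sample text are hypothetical, and the sentence splitter is deliberately crude.

```python
import re

def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def count_sentences_with(text, word_forms):
    """Count sentences that contain at least one of the given words."""
    forms = {w.lower() for w in word_forms}
    return sum(
        1
        for sentence in split_sentences(text)
        if forms & set(re.findall(r"[a-z']+", sentence.lower()))
    )

def count_references(text, signals):
    """Count every token that refers to the character (names, pronouns)."""
    wanted = {s.lower() for s in signals}
    return sum(1 for tok in re.findall(r"[a-z']+", text.lower()) if tok in wanted)

sample = "She needs a plan. Lisbeth waited. I need answers, and she needed them too."

print(count_sentences_with(sample, ["need", "needs", "needed"]))       # 2
print(count_references(sample, ["Lisbeth", "Salander", "she", "her"]))  # 3
```

Note that a plain word count like this cannot tell which character a pronoun refers to, so it is only a rough stand-in for the signals described above.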

The bestseller-ometer can recommend novels and gives new authors a chance to succeed.

Have you ever had a tough time convincing a friend to read a book you absolutely love?

Well, this is just one of the many ways the algorithm can make our lives easier.

Let’s say you’re trying to get your book club to pick the new crime novel from one of your favorite writers, but no one else in the group thinks these books qualify as real literature. With the algorithm at your disposal, you could back up your case with multiple graphs and pages of data that objectively, and let’s hope favorably, compare the crime novel to a thousand other books.

It might be a bit over the top, but you’ll at least have a better reason for your recommendation than simply suggesting that crime books are more exciting.

First-time authors could also benefit from the algorithm, since it’s a better predictor of success than critics.

Even one of the algorithm’s perfect books, Dave Eggers’s The Circle, which scored a 100-percent chance of success, wasn’t a big hit with the critics. Nevertheless, it scored the maximum points for its appealing title and central topics, and its strong first sentence and main character. And, as such, it should come as no surprise that it appeared on many of 2013’s best-seller lists.

Any author will tell you that getting those first books behind you is a struggle. It can take years for a young author to find a voice and strong writing style. The algorithm can be a great teaching tool, guiding these aspirants in the right direction and helping them through the revision process.

It might not be every writer’s goal to join the company of those on the best-seller list, but if popularity is your aim, there’s hardly a better guide to success than the bestseller-ometer.

Final summary

The key message in this book:

Every best-selling book contains clues to its success, and when you examine more than a thousand bestsellers, common patterns begin to emerge, making it clear why these novels are so popular. Based on these patterns, an algorithm has been created that can accurately predict which books have what it takes to become the next big thing.

Actionable advice

Read more books to develop your own algorithm.

Keep an eye on the New York Times best-selling fiction list and pick the book that sounds most appealing to you. As you read it, make a note of everything – the title, the first sentence, the plot, the themes and characters and style. Write down what it is about the book that appeals to you and continue doing this with the next bestseller you read. Before long, you’ll begin to see a pattern behind what you, and millions of other people, enjoy about these books.

Suggested further reading: Wired for Story by Lisa Cron

Wired for Story (2012) takes findings from modern brain science to explain why exactly certain stories suck us in, while others leave us bored and disengaged. By using some fundamental techniques drawn from understanding what makes us tick, writers can craft more compelling stories.
