August 13, 2014

What Do Computers Think of Hemingway?

Part 2 on Successful Literary Style

Posted by David Gulley on Monday, August 11th 2014    

This entry continues my series on computer science and literary style by highlighting research by Stony Brook’s Professor Yejin Choi. The title of her paper says it all: “Success with Style: Using Writing Style to Predict the Success of Novels.” This is one of the first large, quantitative studies to correlate literary style with publishing success. She and her co-authors refer to their method as “statistical stylometry,” to contrast it with other methods used to ferret out the ‘secret sauce’ of successful novels. Those other methods typically involve analyses of plot and character. Nothing wrong with that, but Choi’s methods are more rigorous, and her database included 800 titles: some classics plus a cross section of others, drawn from many genres and with varying readership. Her academic paper is available on the web, as are several news articles about it.

Some of her findings confirm widely held beliefs, but others are counterintuitive. For example, she finds that complex sentence structure correlates with success while the overuse of verbs and adverbs dooms popularity. As you’d expect, the use of clichés doesn’t help, either. Some of her findings made me scratch my head, and they may be an artefact of the sample. For example, thousands of students are required to read ‘heavy’ tomes by long-dead authors, yet few modern authors or readers eagerly explore these literary thickets. “We made an unexpected observation on the connection between readability and the literary success—that they correlate into the opposite directions,” she says.

Having studied her paper, I’d say it is fascinating but heavy going. Still, the literati should be heartened by her emphasis on voice and style, given the prevailing zeitgeist that “it’s all about the story and not the words.” The trick is in applying the lessons properly, whether you are a writer, a literary agent, or a publisher. The Internet is full of well-intentioned advice from qualified people that promotes sameness as a guide to publishing success, and it is intriguing to see something different.

What if Amazon or Barnes and Noble could tell a book buyer: “You loved So-and-So’s voice; you may be equally fond of X”? What about all those books that were never sold, almost never sold, or quickly remaindered because no one could figure out how to market them? Consider the dark story of A Confederacy of Dunces, turned down countless times because there was “no market” for that story, setting, or characters, yet readers and prize-givers fell in love with the book for its flair. And what about your own appetite, Dear Reader, for an excursion into something entirely different, where you find an authorial voice you love?

In a recent post, the literary agent Bryony Woods remarked: “It’s worth saying that all of my favourite books are ones I did not know I was looking for, but once I had found them I simply could not imagine my life without them.” I couldn’t agree more.

Is Professor Choi following her own advice? In an interview, she answers one question this way: “We conjecture that the conceptual complexity of highly successful literary work might require syntactic complexity that goes against readability.” Now that’s a comment I doubt sprang spontaneously to her lips, and it confirms that in academia, at least, readability and publishing success remain negatively correlated, as always. Some things never change.

Professor Choi conducted her research in a language that is not her mother tongue, which humbles me. Matching readers to books is usually done by genre, premise, and story line. Yet everyone agrees the author’s voice is as much an element as anything else in making a good match. Research along these lines might interest the publishing community, and—who knows?—may ultimately help the fatigued reader avoid something familiar in favor of a love match with a new author.

