Researchers have developed an AI-driven system that they say can predict whether a non-franchise movie is likely to be profitable, based on 11 years of figures covering 2000 to 2010.

The paper Early Predictions of Movie Success: the Who, What, and When of Profitability [PDF], just released in its second edition, outlines the creation of the machine-learning driven Movie Investor Assurance System (MIAS), which departs from a number of similar previous works – including one conducted by Google Research in 2013 – in taking profitability rather than revenue as its index of success.

The authors, Michael T. Lash and Kang Zhao of the University of Iowa, emphasise how radically different big box-office takes look once the production and marketing budgets are deducted, noting that in the 11-year period under study only 36% of Hollywood movies could claim box office revenue that exceeded their production costs.
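The gap between revenue and profit can be illustrated with a minimal sketch. The titles and figures below are invented for illustration and are not taken from the paper, and profit is simplified here to box office revenue minus budget, which may differ from the authors' exact definition.

```python
from dataclasses import dataclass


@dataclass
class Movie:
    title: str
    budget: float   # production cost in $ millions (invented figure)
    revenue: float  # box office revenue in $ millions (invented figure)


def profit(movie: Movie) -> float:
    """Simplified profit: revenue minus budget (an assumption, not the paper's formula)."""
    return movie.revenue - movie.budget


def share_profitable(movies: list[Movie]) -> float:
    """Fraction of movies whose revenue exceeds their budget."""
    if not movies:
        return 0.0
    return sum(1 for m in movies if profit(m) > 0) / len(movies)


movies = [
    Movie("Hypothetical A", budget=200.0, revenue=250.0),
    Movie("Hypothetical B", budget=150.0, revenue=120.0),
    Movie("Hypothetical C", budget=10.0, revenue=40.0),
    Movie("Hypothetical D", budget=80.0, revenue=60.0),
]

# The two rankings disagree: B is second by revenue but last by profit.
by_revenue = sorted(movies, key=lambda m: m.revenue, reverse=True)
by_profit = sorted(movies, key=profit, reverse=True)
```

In this toy set half of the titles lose money even though every one of them took tens of millions at the box office, which is the kind of disparity the authors describe.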

Top 10 actors by total revenues and total profits.

To demonstrate the disparity between box office returns and investment success, the paper provides a ‘top ten’ index of actors ranked first by revenue and then by profit (this round-up does include franchise entries). Only the British-born actress Julie Andrews, who contributed to the Shrek movies in the period under study, appears in both lists.

The authors excluded franchise entries such as Marvel superhero movies and the Harry Potter series from the MIAS dataset, since such movies carry a ‘pre-baked’ profitability inherited from earlier franchise entries. They sought instead to evaluate ‘cold’ properties which producers were hoping to bring to box-office success by combining the appropriate actors and directors.

The MIAS framework

The study obtained data by using the APIs made available by the Internet Movie Database (IMDb), which has the most reliable script summaries, and by parsing scraped data from BoxOfficeMojo, which has the most systematic and accurate production and revenue figures. Four groups of criteria were employed: ‘Who’ (which stars were used in the productions), ‘What’ (keywords encapsulating the script and thrust of the movie), ‘When’ (release dates), and a fourth group called ‘hybrid’, which seeks to establish meaningful relationships between the other three.

MIAS also considers social factors such as dispersion on social networks like Twitter, blog and article posts, number of comments appended to such posts and the tone of the comments.

Returns from the data revealed some non-obvious trends. Whilst previous studies and prediction engines downplayed the significance of the ‘star’ power of major-league directors such as Christopher Nolan, Ridley Scott and J.J. Abrams, MIAS finds this to be a compelling factor in accurately predicting profitability. MIAS also found that successful actor-director associations across diverse projects (pairings such as Leonardo DiCaprio and Martin Scorsese, or George Clooney and Steven Soderbergh, among others) were a strong indicator of profitability, even when the genre was new to both or there were no other bedrock factors that might guarantee the profitability of the project.
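An actor-director collaboration signal of this kind could be computed roughly as follows. This is only a sketch under stated assumptions: the function names, the placeholder people and the profit figures are all hypothetical, and the paper's actual feature construction may differ.

```python
from collections import defaultdict
from statistics import mean

# (actor, director, profit in $ millions) for earlier movies; placeholder values.
past_movies = [
    ("Actor X", "Director Y", 40.0),
    ("Actor X", "Director Y", 20.0),
    ("Actor X", "Director Z", -15.0),
]


def collaboration_profits(history):
    """Mean profit of earlier movies for each (actor, director) pair."""
    by_pair = defaultdict(list)
    for actor, director, movie_profit in history:
        by_pair[(actor, director)].append(movie_profit)
    return {pair: mean(profits) for pair, profits in by_pair.items()}


def average_collaboration_profit(cast, director, history):
    """Average the pair-level means over cast members who have worked with
    this director before; 0.0 when nobody in the cast has (an assumed default)."""
    pair_means = collaboration_profits(history)
    values = [pair_means[(actor, director)] for actor in cast
              if (actor, director) in pair_means]
    return mean(values) if values else 0.0


feature = average_collaboration_profit(["Actor X", "Actor W"], "Director Y", past_movies)
```

Only history that predates the movie being scored should be passed in, otherwise the feature would leak the outcome it is meant to predict.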

More predictable auguries of success were obtained by considering whether an actor was already associated with the genre of the production in question, or was being cast against type.

‘Our new metric to measure how much expertise a cast has in a specific movie’s genre – Average Genre Expertise – turns out to be positively related to profits (with a coefficient of 0.007). Along with the top positive coefficient for average actor-director collaboration profits, they have highlighted the importance of a cast’s expertise and successful collaboration experience in the past.’
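The article does not give the formula for Average Genre Expertise, so the following is one plausible reading rather than the authors' definition. It assumes that an actor's expertise is the number of earlier movies he or she has made in the genre, averaged over the cast. The names and filmographies are placeholders.

```python
def average_genre_expertise(cast, genre, filmography):
    """Mean number of earlier movies in `genre` per cast member.

    `filmography` maps an actor's name to the genres of their earlier movies.
    Counting earlier movies is an assumed measure of expertise.
    """
    if not cast:
        return 0.0
    counts = [
        sum(1 for g in filmography.get(actor, []) if g == genre)
        for actor in cast
    ]
    return sum(counts) / len(cast)


# Placeholder filmographies, one genre label per earlier movie.
filmography = {
    "Actor X": ["comedy", "comedy", "drama"],
    "Actor Y": ["thriller"],
}

score = average_genre_expertise(["Actor X", "Actor Y"], "comedy", filmography)
```

A cast playing against type would score near zero on this measure, while a cast of genre regulars would score high.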

The authors contend that the framework underpinning MIAS could equally be applied to other kinds of proposed project, such as research papers, grant proposals or operas.