Are female characters pivotal to the plot of a movie?

Four doctoral students worked on the movie-related research with a USC Viterbi professor. (Photo/iStock)


USC’s Signal Analysis and Interpretation Lab reveals findings on race, age and gender from thousands of pages of dialogue

August 01, 2017 Amy Blumenthal

Roles for women are often not central to a film's plot, according to a new USC study.

The Signal Analysis and Interpretation Laboratory, which creates automated tools for signal analysis and linguistic assessment at the USC Viterbi School of Engineering, has released findings on how race, age and gender figure in film dialogue.

Four doctoral students working with Professor Shrikanth Narayanan in the Department of Computer Science quantified the tone and sophistication of the language of 7,000 characters and more than 53,000 dialogues in nearly 1,000 scripts. The researchers analyzed the content of a character’s language and their interactions in terms of gender, race and age.

The researchers determined how central female characters were to the plot of a movie by analyzing their relationships to other characters in the film.

In their analysis, the researchers segmented the dialogue, categorizing each character as a “node” or hub. When they removed the female character nodes from most movies, the researchers found that the plot and relationships among the characters did not have to be altered significantly. The exception to the rule was in horror movies when women were portrayed as victims. In other words, leaving women out of most films did not cause a disruption to the story.
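
The node-removal idea can be illustrated with a minimal Python sketch. It is not the researchers' method, which the article does not detail. It assumes a character graph built from who speaks to whom, and it uses the size of the largest connected group of characters as a rough stand-in for how well the story holds together. The function names `build_graph` and `largest_component`, and the toy cast, are hypothetical.

```python
from collections import defaultdict


def build_graph(exchanges):
    """Build an undirected character graph from (speaker, listener) pairs."""
    graph = defaultdict(set)
    for a, b in exchanges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def largest_component(graph, removed=frozenset()):
    """Size of the largest connected group after dropping `removed` characters."""
    seen = set(removed)  # treat removed characters as already visited
    best = 0
    for start in graph:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:  # depth-first walk over the remaining characters
            node = stack.pop()
            size += 1
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best


# Hypothetical toy script: names and genders are invented for illustration.
exchanges = [("Tom", "Ann"), ("Tom", "Raj"), ("Raj", "Lee"),
             ("Lee", "Tom"), ("Ann", "Raj")]
women = {"Ann"}

graph = build_graph(exchanges)
before = largest_component(graph)
after = largest_component(graph, removed=women)
print(before, after)
```

In this toy cast, the remaining characters stay connected to one another after the female character is dropped, which is the pattern the study reports for most films. A character who is the only link between two groups would instead split the graph when removed.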

The numbers game

In the scripts and dialogues reviewed, men had more than 37,000 dialogues; women had just over 15,000. Women portrayed just over 2,000 characters; men portrayed almost 4,900.

The scripts featured seven times more male writers than female writers; almost 12 times more male directors than female directors; and a little over three times more male producers than female producers.

Overall, female characters — regardless of race — tended to be about five years younger than their male counterparts.

If women were in the writers’ room, female character representation on screen was 50 percent higher on average.

The researchers also looked at portrayals across gender, age and race for topics such as emotional arousal (excitement), valence (positive and negative emotion), sex, achievement, religion, death and swearing, as well as for gender-ladenness (dialogue along stereotypical lines).


The authors found that Latino and mixed-race characters had more dialogue related to sexuality. African-American characters had a greater percentage of swear words in their dialogue than characters of other races.


Overall, researchers found that the dialogue of female characters tends to be more positive in valence, meaning it carries more positive emotion, but this tended to correlate with language focusing on family values. Those words were mapped by a tool known as Emotiword.

Beyond the volume of dialogue attributed to men, male dialogue contained more words related to achievement and death as well as more swear words than the dialogue scripted for women.


Older characters on screen appear more sage-like: intelligent, less excited, with less mention of sexuality and more talk of religion.

“Writers consciously or subconsciously agree to established norms about gender that are built into their word choices,” first author Anil Ramakrishna said. “In an ideal world, gender is an auxiliary fact; it has nothing to do with the way actors are presented and what they say.”

Narayanan, the senior author of the study, added: “Computational language analysis and interaction modeling tools allow us to understand not just what someone says, but how they say it, how much they say, to whom they speak and in what context, thereby offering new insights into media content and its potential impact on people.”

The team of researchers included Victor R. Martinez, Nikolaos Malandrakis and Karan Singla. The study will appear in the Proceedings of the Association for Computational Linguistics.