The Signal Analysis and Interpretation Laboratory at USC's Viterbi School of Engineering recently released preliminary results of a study on the representation of gender, race, and age in a collection of almost 1,000 film scripts. Working with Professor Shrikanth Narayanan in the Department of Computer Science, four doctoral students quantified the tone and sophistication of characters' language across those three categories. They examined the language of about 7,000 characters in over 53,000 dialogues, logging the content and sophistication of each character's language as well as how characters interacted with one another. One of the major goals of the study was to determine the extent to which female characters were essential to the plot of a given film.
After combing through the narratives and dialogue of the collection of films, researchers classified each character as a "node" or a "hub," a hub being essential to the story and a node being inconsequential. When they removed the female character nodes, the researchers found that in the majority of the films the plot did not have to be adjusted drastically. The only exception was when a female node was a victim in a horror story.
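The removal test described above can be pictured with a toy character graph. This is only a minimal sketch and not the researchers' method: the character names, the links between them, and the use of graph connectivity as a stand-in for "the plot has to be adjusted" are all hypothetical choices made for illustration.

```python
from collections import deque

# Hypothetical interaction graph: a link means two characters share dialogue.
INTERACTIONS = {
    "Hero": {"Mentor", "Rival", "Sister"},
    "Mentor": {"Hero", "Rival"},
    "Rival": {"Hero", "Mentor", "Henchman"},
    "Henchman": {"Rival"},
    "Sister": {"Hero"},
}


def remove_character(graph, name):
    """Return a copy of the graph without the named character."""
    return {c: links - {name} for c, links in graph.items() if c != name}


def is_connected(graph):
    """Check whether every remaining character can still be reached."""
    if not graph:
        return True
    start = next(iter(graph))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(graph)


def disrupts_story(graph, name):
    """Assumed proxy: removal 'disrupts' the story if the graph splits apart."""
    return not is_connected(remove_character(graph, name))


for character in INTERACTIONS:
    label = "hub" if disrupts_story(INTERACTIONS, character) else "node"
    print(f"{character}: {label}")
```

In this made-up example, removing "Sister" leaves the other characters connected, so she counts as a node, while removing "Rival" cuts "Henchman" off from everyone else, so he counts as a hub.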
Some of the statistical findings relating to gender for the group of almost 1,000 films studied are as follows:
- Men had over 37,000 dialogues, while women were afforded only 15,000
- There were 4,900 male characters featured, compared with 2,000 female characters
- Male writers outnumbered female writers seven to one
- Male directors outnumbered female directors 12 to one
- Male producers outnumbered female producers by more than three to one
- Female characters tended to be five years younger on average than their male counterparts
- When women were in the writers' room, female characters' onscreen presence was on average 50 percent higher than when no women were in the writers' room
- Female characters were more likely to be deemed positive due to their language being associated with family values
- Male dialogue involved more words related to achievement and death as well as more swear words than the dialogue of female characters
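The dialogue and character counts in the list above can be turned into shares with simple arithmetic. The sketch below uses the rounded figures as reported (the male dialogue count is given only as "over 37,000"), so the results are approximations, and the variable and function names are illustrative only.

```python
# Rounded counts as reported in the list above.
male_dialogues, female_dialogues = 37_000, 15_000
male_characters, female_characters = 4_900, 2_000


def share(part, rest):
    """Fraction that `part` makes up of `part + rest`."""
    return part / (part + rest)


female_dialogue_share = share(female_dialogues, male_dialogues)
female_character_share = share(female_characters, male_characters)

# Average number of dialogues per character, by gender.
female_dialogues_per_character = female_dialogues / female_characters
male_dialogues_per_character = male_dialogues / male_characters

print(f"Female share of dialogues:  {female_dialogue_share:.1%}")
print(f"Female share of characters: {female_character_share:.1%}")
print(f"Dialogues per female character: {female_dialogues_per_character:.1f}")
print(f"Dialogues per male character:   {male_dialogues_per_character:.1f}")
```

On these rounded numbers, women account for roughly 29 percent of both the dialogues and the characters.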
Other elements were studied, and those results will be released when the study is officially presented.
Race and age were studied in addition to gender. The dialogue of Latino and mixed-race characters tended to be more frequently related to sexuality. African-American characters were more likely to use swear words in their dialogue than characters of other races. As characters grow older, their language tends to be associated with religion, spirituality, and wisdom, as opposed to more fleeting concepts like sex and excitement.
One of the researchers said: “Writers consciously or subconsciously agree to established norms about gender that are built into their word choices. In an ideal world, gender is an auxiliary fact; it has nothing to do with the way actors are presented and what they say.”
Narayanan, the senior author of the study, commented: “Computational language analysis and interaction modeling tools allow us to understand not just what someone says, but how they say it, how much they say, to whom they speak and in what context, thereby offering new insights into media content and its potential impact on people.”
Victor R. Martinez, Nikolaos Malandrakis and Karan Singla are among the researchers of the study, which will be presented in the Proceedings of the Association for Computational Linguistics.
The original article was posted by USC News.
NYWIFT programs, screenings and events are supported, in part, by grants from New York City Department of Cultural Affairs in partnership with the City Council, and New York State Council on the Arts with the support of Governor Andrew Cuomo and the New York State Legislature.
Last updated: Aug. 15, 2017