Introduction
Women represent more than half the United States population, yet their presence in news coverage may not reflect this reality.

Women constitute nearly half of the global population, yet their voices and experiences are significantly underrepresented in the news. While the global population is almost evenly divided between men and women, representation in leadership roles across politics and business remains disproportionate. As of 2024, only 13 of the 193 countries recognized by the United Nations have a woman leader, and just 5.4% of CEOs worldwide are women. In the United States, women outnumber men, with 97.1 men for every 100 women, yet they occupy only 10.4% of Fortune 500 CEO positions, a record-high figure achieved only in 2023. This leadership imbalance often translates into a media representation imbalance, where men's voices dominate discussions on politics, business, and societal issues. Historically, news coverage has prioritized the perspectives of those in power, which disproportionately excludes women.
Does the lack of women in leadership justify their absence in news coverage?
The underrepresentation of women in news media has broader implications for public perception and policy decisions. Women are directly affected by political and economic policies, yet their voices are often sidelined in the discussions shaping these policies. In 2020, the World Economic Forum reported that only 24% of news sources were women, illustrating the persistent gender gap in journalism. When women’s perspectives are missing from news stories, the omission can translate into gaps in policy and business decisions. For example, Cerulli Associates estimates that 84 trillion dollars will pass between generations in the United States over the next two decades, the largest such transfer in history, a shift that has been termed the Great Wealth Transfer. Women are estimated to gain 30 trillion dollars from that transfer. If economic policies and news media outlets do not reflect this seismic shift in women’s wealth, the result could be financial strategies, investment trends, and policy decisions that fail to address the unique needs and priorities of women. Media narratives shape public understanding and influence decision-making, making it crucial to assess whether women are fairly represented in business and financial news.
Existing work in this space already highlights the consistent gender gap in current news media. The Women’s Media Center is a United States non-profit organization that reports on the gender gaps in US media. In 2017, their study showcased the disparities in media, from who was reporting news to who was highlighted in the news. Interestingly, even topics that historically affect women’s lives more, such as reproductive rights and sexual assault, were reported more often by men than by women. The study further showed that when men reported on topics about women, they quoted women only 27-28% of the time. More broadly, in 2019, the World Association of News Publishers created a “Balance Our News” campaign to advance efforts to close the gender gap in news media.
Efforts to address the gender gap in media representation are critical, but a deeper understanding of how language contributes to these disparities is still needed. Beyond who is quoted or featured, the specific words used in news coverage can reinforce stereotypes or subtly influence public perception. Language choices, such as associating women with caregiving or men with leadership, can shape how readers interpret stories, even when coverage appears balanced on the surface. Analyzing the patterns in gendered word usage offers insight into the underlying narratives that persist in journalism and how these narratives may reinforce existing inequalities.
This study seeks to analyze how gender is portrayed in Business Insider, a major United States digital news outlet that covers business, politics, and technology, and whether there has been progress in closing the gender gap in US news media. Using NewsAPI, the research will examine the frequency of women's representation in news articles published in January 2025 and identify notable patterns in coverage across the articles. In addition, analyzing the sentiment and framing of headlines can help determine whether biases exist in how women’s contributions are perceived.
To explore these issues, this research will address the following key questions:
- Are women underrepresented in Business Insider’s news coverage compared to men?
- What types of words are commonly associated with women in Business Insider articles?
- How does the gender of the author influence the representation of women in news coverage?
- Which news categories (e.g., politics, business, entertainment) feature the least representation of women?
- Which news categories feature the most representation of women?
- How does the sentiment of articles about women compare to those about men?
- Do headlines featuring women differ in tone, framing, or word choice compared to those featuring men?
- What proportion of Business Insider’s articles are written by female journalists, and does this impact coverage patterns?
- Do articles that contain no gendered words still represent a gender?
- Is there more repetitive news about men than about women?
By analyzing gender representation within Business Insider’s reporting, this research will contribute to broader discussions on media bias, equity in journalism, and the role of digital news platforms in shaping narratives about gender. The findings may offer insights into whether Business Insider amplifies or challenges existing disparities in news coverage and how media organizations can work toward more inclusive and representative storytelling.

Exploratory Data Analysis
An exploratory data analysis (EDA) was performed to examine the language used in Business Insider articles. This analysis focused on identifying patterns in word frequency, gendered language usage, and contextual differences in word associations. By visualizing the most commonly used words, the prevalence of gendered terms, and how different words appear together in the content, key trends in Business Insider’s reporting can be assessed. The following graphs provide insights into overall word frequency, the distribution of gendered words, and word co-occurrence patterns.
Top 50 Most Frequent Words in Articles

Fig. 1: This graph displays the top 50 most frequently occurring words in the article content. Terms like "Li," "chars," "ui," "getty," and "images" are likely artifacts from formatting or metadata. Notably, "Trump," "his," and "Donald" emerge as the most common substantive words, which aligns with the political landscape, as a new administration was sworn in during January 2025.
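As a rough illustration of how a ranking like this could be produced, the sketch below counts word tokens across a list of article texts and optionally drops artifact tokens. It is not the code used in this study: the function name `top_words`, the tokenization rule, the `ARTIFACTS` set (built from the examples named in the caption), and the bracketed truncation marker in the sample text are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: artifact tokens to drop, taken from the examples named in Fig. 1.
ARTIFACTS = {"li", "chars", "ui", "getty", "images"}


def top_words(texts, n=50, exclude=frozenset()):
    """Return the n most common lowercase word tokens across all texts."""
    counts = Counter()
    for text in texts:
        # A token is a run of letters, optionally with one internal apostrophe.
        tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
        counts.update(token for token in tokens if token not in exclude)
    return counts.most_common(n)


# Toy article snippets, not taken from the dataset.
sample = [
    "Trump said his plan is ready. Getty Images",
    "Donald Trump and his team met her. [+120 chars]",
]
raw = dict(top_words(sample))                         # artifacts still present
cleaned = dict(top_words(sample, exclude=ARTIFACTS))  # artifacts removed
print(raw["chars"], "chars" in cleaned, cleaned["trump"], cleaned["his"])
```

Running the count with and without the exclusion set makes it easy to see which high-ranking tokens are substantive and which come from formatting.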
Correlation Heatmap of Top 20 Words in Articles

Fig. 2: The metadata was removed, and a heatmap was created to analyze word correlations. As expected, "Trump," "donald," and "president" show a strong correlation, reflecting coverage of his new presidency. Similarly, "business" and "insider" are correlated, as they appear frequently due to the news outlet’s name. Notably, "she" and "her" also exhibit a high correlation, indicating consistent usage of gendered language in the dataset.
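The text does not state which correlation measure underlies the heatmap. One plausible reading, sketched below purely as an assumption, is a Pearson correlation between per-article presence indicators (1 if the word occurs in an article, 0 otherwise). The helper names `presence_vector`, `pearson`, and `correlation_matrix` are hypothetical, and the words passed in are expected to be lowercase.

```python
import math
import re


def presence_vector(texts, word):
    """One entry per article: 1 if the word occurs as a whole token, else 0."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return [1 if pattern.search(text.lower()) else 0 for text in texts]


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined if a word is in every article or none
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(texts, words):
    """Nested dict of pairwise correlations between word-presence vectors."""
    vectors = {w: presence_vector(texts, w) for w in words}
    return {a: {b: pearson(vectors[a], vectors[b]) for b in words} for a in words}


# Toy article snippets, not taken from the dataset.
texts = [
    "She said her company grew.",
    "Trump met the president.",
    "Her team said she agreed.",
    "The president spoke about markets.",
]
matrix = correlation_matrix(texts, ["she", "her", "president"])
print(matrix["she"]["her"], matrix["she"]["president"])
```

In this toy sample "she" and "her" always occur in the same articles, so their correlation is at the maximum, while "she" and "president" never co-occur. Using raw counts instead of presence indicators would be an equally reasonable choice and could give different values.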
Comparison of Articles with and without Gender-related Words

Fig. 3: This graph compares the number of articles that contain gendered words to those that do not. The results show that there are twice as many articles without gendered words as those that include them.
Top 50 Most Frequent Words in Articles that only Include Gendered Words

Fig. 4: This graph displays the top 50 most frequently occurring words in the content of articles that contain gendered words. Among the top five words, "his," "he," "Trump," and "Donald" dominate, with "his" appearing most frequently. Notably, "her" is the only feminine-gendered word to make it into the top five.
Correlation Heatmap of Top 20 Words in Gendered Articles

Fig. 5: The correlation heatmap shows correlations similar to those in Fig. 2. "Trump," "donald," and "president" show a strong correlation, reflecting coverage of his new presidency. "She" and "her" also exhibit a high correlation, indicating consistent usage of gendered language in the dataset. "Courtesy" and "author" show a correlation, most likely due to crediting image sources or quotes with phrases like "Courtesy of [Author]."
Frequency of Gender-related Words in Articles

Fig. 6: The bar graph shows the frequency of gendered words across all the articles. "His" is the most frequent word and "her" is the second most frequent. Male-gendered terms such as "his," "he," and "him" appear significantly more often than female-gendered terms like "she," "her," and "woman," indicating a possible imbalance in gender representation within Business Insider’s coverage.
Percentage of Male-Gendered Words versus Female-Gendered Words

Fig. 7: The pie chart shows that male-gendered words account for 60% of gendered word occurrences and female-gendered words for 40%. However, this can be misleading, since it reflects only raw word frequency.
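A percentage split of this kind comes from dividing each group's total word occurrences by the combined total. The sketch below shows that arithmetic under stated assumptions: the `MALE` and `FEMALE` word lists are illustrative guesses (the study's full lists are not given here), and `gender_word_share` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumption: illustrative word lists; the study's actual lists are not shown.
MALE = {"he", "him", "his", "man", "men"}
FEMALE = {"she", "her", "hers", "woman", "women"}


def gender_word_share(texts):
    """Percentage of gendered word occurrences that are male vs. female."""
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[a-z]+", text.lower()):
            if token in MALE:
                counts["male"] += 1
            elif token in FEMALE:
                counts["female"] += 1
    total = counts["male"] + counts["female"]
    if total == 0:
        return {"male": 0.0, "female": 0.0}  # no gendered words at all
    return {group: 100 * counts[group] / total for group in ("male", "female")}


# Toy sentences, not taken from the dataset: 3 male and 2 female occurrences.
sample = ["He said his deal closed.", "She sold her shares to him."]
print(gender_word_share(sample))
```

Because the share is computed over word occurrences, a single long article that repeats "he" many times can move the percentages as much as many short articles, which is one reason the raw split should be read with caution.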
Articles That Only Mention Men versus Articles That Only Mention Women

Fig. 8: The graph shows that about 100 more articles mention only men than mention only women.
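Counting articles rather than words requires assigning each article to one group. A minimal sketch of one way to do this follows. The word lists and the names `classify_article` and `tally_articles` are assumptions for illustration, not the procedure actually used for the figure.

```python
import re
from collections import Counter

# Assumption: illustrative word lists; the study's actual lists are not shown.
MALE = {"he", "him", "his", "man", "men"}
FEMALE = {"she", "her", "hers", "woman", "women"}


def classify_article(text):
    """Label an article by which gendered word groups it contains."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    has_male = bool(tokens & MALE)
    has_female = bool(tokens & FEMALE)
    if has_male and has_female:
        return "both"
    if has_male:
        return "men_only"
    if has_female:
        return "women_only"
    return "neither"


def tally_articles(texts):
    """Count how many articles fall under each label."""
    return Counter(classify_article(text) for text in texts)


# Toy sentences, not taken from the dataset.
sample = [
    "He raised rates again.",
    "She sold her shares.",
    "He thanked her team.",
    "Markets fell on Monday.",
    "His forecast was wrong.",
]
print(tally_articles(sample))
```

Matching on whole tokens keeps words such as "theme" or "other" from being counted as gendered. The "neither" label corresponds to the articles without gendered words compared in Fig. 3.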
Correlation between Gendered Words

Fig. 9: This heatmap visualizes the correlation between gendered words in the dataset, indicating how often these terms appear together in articles. Notably, "she" and "her" show a strong correlation, as do "he" and "his," which suggests that gendered terms of the same gender are often used for the same subjects.
Male vs. Female Mentions per Article

Fig. 10: This scatter plot compares the number of male and female words in each article. Most articles contain few gendered references overall, which indicates that further analysis of the content words is needed to assess gender representation.
Word Cloud of Gender-related Words in Articles

Fig. 11: This word cloud highlights the most prominent words across the articles. Because "donald," "trump," and "president" are the largest words, many Business Insider articles in January were likely about President Donald Trump. "Author" most likely appears because of articles that contain an author credit.