The Underrepresentation of Female Founders in Tech and an Examination of Employee Morale

An exploration of founder gender representation in the tech industry and internal perceptions in male- and female-founded tech companies.

Emma O'Neil
15 min read · Apr 12, 2021

Background

It is widely known that the technology industry is highly male-dominated. But to what extent are females actually underrepresented in the tech industry, particularly among the highest-performing, largest tech companies? According to a 2019 Silicon Valley Bank report, only 28% of startups have a female founder, and companies with all-women founding teams receive less than 3% of all US venture capital dollars.

Aware of this general gender disparity in pursuing entrepreneurship, I became interested in examining what this disparity looks like in the tech industry and in understanding whether there are inherent differences in employees’ sentiment that could shed light on the work environment foundationally cultivated by male vs. female founders. More personally, as someone passionate about technology and entrepreneurship, I wanted to better understand the reality of this industry to motivate greater female participation and spark discussion about the impact of female founders on a company’s success and employee morale. My original hypothesis with regard to sentiment was that greater female representation would perhaps translate into greater employee satisfaction at work due to a more inclusive culture. However, female-founded companies may perform worse due to lack of investor support and other considerations.

Thus, there were three main areas I wanted to investigate:

  1. How underrepresented are female founders in the tech industry when looking at the top-performing, large companies (which presumably carry a significant influence in the industry)? How does representation differ across the country?
  2. How does the performance of tech companies with female founders compare to that of male founders? How do the descriptions of the companies themselves differ between male and female-founded companies?
  3. How does employee review sentiment differ if at all between female and male-founded tech companies? What does the text of the reviews say about these companies?

About the Data

I exported profiles for 875 companies from Crunchbase, filtering to companies that are in the United States; belong to the industry group “Artificial Intelligence,” “Information Technology,” “Internet Services,” or “Software”; have an operating status of active (as opposed to closed); have revenue of at least $1M (up to $10B+); and have at least 5,000 employees (up to 10,000+ employees). The dataset contained headquarters location, revenue range, and company description, in addition to other features that were not used for this analysis. (While I was hoping to incorporate some of the date variables, including founding date and last funding date, too many of the rows had NAs for these variables to be of use.) Crunchbase has a relatively new feature called Diversity Spotlight that highlights female-founded, female-led, minority-founded, and minority-led companies. However, because the feature is new, quite a few companies in the dataset lacked the label even though they were female-founded. Thus, I went through all 875 entries to identify each founder’s gender. (For some companies, the founder’s name was included as a feature in the dataset; otherwise, I had to look it up.) If a company had multiple founders, I labeled it as female-founded as long as at least one founder was female.
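The labeling rule for multi-founder companies is simple enough to sketch in a few lines. The Python below is only an illustration, not the original workflow (the labeling was done by hand, and the analysis itself used R); the company names and founder lists are made up.

```python
# Hypothetical founder records: company -> list of founder genders ("F" or "M").
# These rows are invented for illustration; they are not from the Crunchbase export.
founders_by_company = {
    "Company A": ["M"],
    "Company B": ["F", "M"],
    "Company C": ["M", "M", "M"],
    "Company D": ["F"],
}

def label_company(founder_genders):
    """Return 'female-founded' if at least one founder is female, else 'male-founded'."""
    return "female-founded" if "F" in founder_genders else "male-founded"

labels = {name: label_company(genders) for name, genders in founders_by_company.items()}

# Share of companies labeled female-founded under this rule.
female_share = sum(label == "female-founded" for label in labels.values()) / len(labels)
print(labels)
print(f"Share female-founded: {female_share:.0%}")
```

Note that under this rule a mixed founding team counts as female-founded, so the resulting share is an upper bound on the share of companies founded solely by women.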

For the sentiment analysis, I scraped employee reviews from Indeed using Chrome’s Web Scraper. I first identified all female-founded companies in the first dataset with $100M+ revenue (a subset of the female-founded data). A few of these companies actually did not have any reviews on Indeed, so I ultimately ended up with 24 female-founded companies for scraping purposes. To identify the male-founded companies I would gather employee reviews for, I took a random sample of companies from the first dataset (since there were a lot more male-founded companies with $100M+ revenue) to also ultimately obtain 24 male-founded companies for scraping purposes. For each company, I scraped the first page of reviews (by default, the most recent reviews). I ended up with 438 female-founded reviews and 442 male-founded reviews.

Female Founder Representation in Tech

First, I wanted to get an overall sense of the big picture. We can see that companies with female founders make up only 5% of the tech companies with revenue of $1M+ and 5,000+ employees. This was not all that surprising considering the lack of female founder representation among startups, but it was still somewhat striking given the focus here on the highest-performing (based on revenue), largest companies, which can perhaps be considered a measure of influence in the tech sector.

How does the female founder representation differ in different parts of the country? To answer this, I used the location of the headquarters of each company and grouped the country into the following regions: Western US, Northeastern US, Midwestern US, Southern US. (This required some data cleaning given that the headquarters region variable had inconsistent entries in that some of the entries included longer, more specific location descriptions separated by commas (e.g., “San Francisco Bay Area, Silicon Valley, West Coast”).) While companies can have multiple offices across the country, their headquarters tend to be the largest and correspond to the area of the country where they have the greatest overall influence.
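One way such cleaning could work is to split each entry on commas and look for a piece that names one of the four regions. The Python sketch below is an assumption about the approach, not the cleaning code actually used; the `REGION_LOOKUP` table and `to_region` helper are hypothetical, and the lookup lists only a handful of labels.

```python
# Hypothetical lookup from location labels to the four regions used in the article.
# A real mapping would need many more entries (for example, individual metro areas).
REGION_LOOKUP = {
    "western us": "Western US",
    "west coast": "Western US",
    "northeastern us": "Northeastern US",
    "midwestern us": "Midwestern US",
    "southern us": "Southern US",
}

def to_region(headquarters_regions):
    """Map a comma-separated location string to one of the four regions, or None."""
    for part in headquarters_regions.split(","):
        region = REGION_LOOKUP.get(part.strip().lower())
        if region is not None:
            return region
    return None  # Unmapped entries would need manual review.

print(to_region("San Francisco Bay Area, Silicon Valley, West Coast"))
```

Returning `None` for unmapped entries, rather than guessing, makes it easy to list the rows that still need a manual decision.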

We can see that the South and the West have the greatest percentage of female founders. It is interesting to see that females make up a greater percentage of founders in the West than they do in the Northeast, considering that NYC is a key location for business in the US. Perhaps this suggests that the West is a more progressive and welcoming environment for female founders to establish their headquarters. Thus, female founders may want to opt to establish their headquarters in the West where they may feel more “at home.” However, across the board, we can see that the representation for female founders is quite low, with no region having more than 7% of tech founders as female.

Performance Breakdown and Company Identity

Next, I wanted to understand how these male-founded vs. female-founded tech companies differ in terms of their performance. I used the revenue variable as an indicator of performance, examining the percentage of tech companies for each gender that fell into a certain revenue range.
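Because there are far fewer female-founded companies, the comparison only makes sense when counts are normalized within each group. A minimal Python sketch of that within-group percentage calculation follows; the rows and revenue-range labels are invented for illustration and do not reproduce the Crunchbase data.

```python
from collections import Counter, defaultdict

# Made-up (founder_group, revenue_range) rows for illustration only.
rows = [
    ("F", "$1M to $10M"), ("F", "$1B to $10B"), ("F", "$10B+"), ("F", "$1B to $10B"),
    ("M", "$1M to $10M"), ("M", "$1M to $10M"), ("M", "$100M to $500M"),
    ("M", "$1B to $10B"), ("M", "$10B+"),
]

# Count companies per revenue range separately for each founder group.
counts = defaultdict(Counter)
for group, revenue_range in rows:
    counts[group][revenue_range] += 1

# Normalize within each group so groups of very different sizes are comparable.
shares = {
    group: {rng: n / sum(by_range.values()) for rng, n in by_range.items()}
    for group, by_range in counts.items()
}

for group, by_range in shares.items():
    for rng, share in by_range.items():
        print(f"{group}  {rng:<16} {share:.0%}")
```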

We can see that the performance of male-founded tech companies lies somewhat more at the extremes with a greater percentage of companies in the $1B+ range and below the $50M range. Of the female-founded tech companies, a greater percentage have revenue of $10B+ compared to male-founded companies and a lower percentage have revenue of $1M-$10M compared to male-founded companies. This result did not clearly align with what I had expected. We see that overall, female-founded tech companies do not appear to perform worse in terms of revenue compared to male-founded tech companies. This serves to reinforce the ability of female-founded companies to have financial success. They can compete and be successful.

Diving into the nature of the companies themselves, I wanted to do some text mining using the tm package in R on the full descriptions for each company to get a sense of what the companies do, whether male and female-founded companies have different or similar focus areas, and how they may characterize themselves differently. I created a corpus for each group and generated a wordcloud for each based on the 50 most frequent words.
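The word frequencies behind a wordcloud can be sketched without any text-mining package. The Python below is a simplified stand-in for the tm workflow, not a port of it: the stop-word list is a tiny assumed sample (tm ships a much longer one), and the descriptions are made up.

```python
import re
from collections import Counter

# A tiny stop-word list for illustration; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "to", "in", "that", "is", "its"}

def top_words(descriptions, n=50):
    """Lowercase, keep alphabetic tokens, drop stop words, return the n most common."""
    counts = Counter()
    for text in descriptions:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(token for token in tokens if token not in STOP_WORDS)
    return counts.most_common(n)

# Made-up descriptions, not taken from the Crunchbase data.
sample_descriptions = [
    "A platform that helps people manage care and services.",
    "Software solutions for people and the management of services.",
]
print(top_words(sample_descriptions, n=3))
```

Running `top_words` once per group and comparing the two lists is what makes observations like "people appears for one group but not the other" possible.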

Left: descriptions for female-founded companies; Right: descriptions for male-founded companies.

It is not surprising that words like management, solutions, technology, products, and services appear frequently for both groups. However, there are frequent occurrences of other words in female-founded descriptions that do not appear as such in the male-founded descriptions. For example, people is among the top 15 most frequent words for female-founded companies but does not appear in the most frequent 50 words at all for male-founded companies. care is another such word that does not appear in the most frequent words for male-founded companies but does for female-founded companies. helps (or help) appears relatively more frequently in female-founded company descriptions than those for male-founded companies. These words perhaps suggest a particularly people-centric aspect of many of these businesses. It is also interesting to see the juxtaposition of words like beauty and hardware in the female wordcloud.

In the wordcloud for male-founded companies, we can see that the word global (as well as world) appears relatively more frequently compared to that for female-founded companies. We also see leading appear among the most frequent words unlike for the female descriptions. Perhaps most interesting, we see the word united appear frequently in descriptions (actually 108 times in all the male-founded descriptions). I also noticed the word research, so taken together, perhaps some of these frequent words suggest the global leadership of these businesses and their scale.

Perhaps some of these differences in company descriptions could help male and female-founded companies counterbalance their marketing efforts. Male-founded companies could consider the benefits, in some circumstances, of presenting themselves as people-centered companies that emphasize care and aid. Female-founded companies may want to consider the importance of expanding their reach globally, as well as research and development for being at the forefront of innovation. Thus, in essence, founders can re-examine how they present their companies and products, considering the words that are associated more with male and female-founded companies, to leverage the value of both.

Diving into Employee Reviews

Having explored the core company data, I next wanted to get a sense of how employees feel working at these companies and how sentiment may differ overall between companies that are male vs. female-founded. I scraped the employee reviews (including the review text and the star rating) for 24 female-founded companies and 24 male-founded companies from Indeed, gathering 438 reviews for female-founded tech companies and 442 reviews for male-founded tech companies (see “About the Data” above).

Before text mining, I obtained a summary of the ratings for male vs. female-founded companies to get a sense of how the overall ratings differed.

+---------+-------+---------+--------+-------+---------+-------+
| Founder | Min.  | 1st Qu. | Median | Mean  | 3rd Qu. | Max.  |
+---------+-------+---------+--------+-------+---------+-------+
| F       | 1.000 | 2.000   | 3.000  | 3.215 | 4.000   | 5.000 |
| M       | 1.000 | 2.000   | 4.000  | 3.457 | 5.000   | 5.000 |
+---------+-------+---------+--------+-------+---------+-------+

We can capture this more visually with this bar plot.

Conducting a one-sided two-sample t-test, I discovered that the mean rating for male-founded companies is statistically significantly greater than the mean rating for female-founded companies (given the p-value of 0.006). That is, the average employee review rating for male-founded companies is significantly higher than the average employee review rating for female-founded companies. This was a bit disheartening.
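For readers who want to see the mechanics of such a test, here is a rough Python sketch. Several things in it are assumptions: the article does not say whether equal variances were assumed, so this uses the Welch (unequal-variance) statistic; the p-value uses a standard normal approximation instead of the t distribution, which is only reasonable for samples of a few hundred reviews per group; and the ratings are made up, so the output does not reproduce the reported p-value.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def one_sided_welch(sample_a, sample_b):
    """Test the alternative hypothesis mean(sample_a) > mean(sample_b).

    Returns the Welch t statistic and an approximate one-sided p-value.
    The p-value uses the standard normal in place of the t distribution,
    which is an approximation suited to large samples only.
    """
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    t_stat = (mean(sample_a) - mean(sample_b)) / standard_error
    p_value = 1 - NormalDist().cdf(t_stat)
    return t_stat, p_value

# Made-up star ratings (1 to 5), not the scraped Indeed data.
male_ratings = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4] * 20
female_ratings = [4, 3, 4, 2, 5, 1, 3, 4, 3, 3] * 20

t_stat, p_value = one_sided_welch(male_ratings, female_ratings)
print(f"t = {t_stat:.2f}, approximate one-sided p = {p_value:.4f}")
```

A one-sided test is only appropriate when the direction of the alternative is fixed before looking at the data; otherwise a two-sided test is the safer default.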

However, this can serve to incentivize female founders to understand why they may be receiving lower reviews, perhaps suggesting the importance of surveying employees to see how the company’s internal work environment and culture could improve to bring about better ratings. After all, employee ratings can be an indicator of employee satisfaction, and low ratings can reflect poorly on the internal quality of the company. Furthermore, employee ratings may certainly correlate with differences in the larger public perception of such companies and of female-founded companies in general, so addressing this would be important in challenging the gap in female founders as well as public perception.

I then went deeper into the data, at the individual word level, to understand employee sentiment.

Employee Sentiment Analysis

Using the tidytext package and the afinn lexicon, I tokenized the review text for each group and computed the sentiment, visualizing the distributions in the violin plot below. We can observe that the distributions for the employee sentiment are very similar for female-founded and male-founded companies.

+---------+---------+---------+--------+-------+---------+--------+
| Founder | Min.    | 1st Qu. | Median | Mean  | 3rd Qu. | Max.   |
+---------+---------+---------+--------+-------+---------+--------+
| F       | -13.000 | 1.000   | 4.000  | 4.477 | 8.000   | 31.000 |
| M       | -21.000 | 1.000   | 4.000  | 4.345 | 8.000   | 32.000 |
+---------+---------+---------+--------+-------+---------+--------+
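The per-review scores summarized in this table can be illustrated with a short Python sketch. It assumes each review's score is the sum of its words' lexicon values, which the range in the table suggests but the text does not state outright. The `TOY_LEXICON` values are illustrative assumptions in the style of AFINN (integers from -5 to 5), not entries copied from the real lexicon, and the reviews are invented.

```python
import re

# Toy word scores in the style of AFINN; values are assumptions for illustration.
TOY_LEXICON = {"great": 3, "good": 3, "helpful": 2, "hard": -1, "lack": -2, "bad": -3}

def review_sentiment(text, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of the words in one review; unknown words score 0."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)

# Made-up reviews, not scraped text.
reviews = [
    "Great team, good pay, but hard hours",
    "Bad management and a lack of communication",
    "Decent place to work",
]
scores = [review_sentiment(review) for review in reviews]
print(scores)
```

Words missing from the lexicon contribute nothing, so a review with strong opinions expressed in unlisted words can still score 0.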

We can see that the average sentiment for the female-founded reviews is a bit higher than that for the male-founded reviews. However, I conducted a one-sided two-sample t-test on the afinn sentiment scores (as before, but this time with the alternative hypothesis that the average female-founded sentiment is higher than the average male-founded sentiment) and found no evidence that the average review sentiment for female-founded tech companies is higher than the average review sentiment for male-founded tech companies (given the p-value of 0.379). It is perhaps interesting that the ratings differed significantly, yet we do not have evidence for a difference in the sentiment contained in the review text.

I next wanted to get a deeper, more qualitative sense of the words used in these employee reviews. I first used the bing lexicon to explore the most common positive and negative words for both groups. As the afinn analysis suggested (given the positive average sentiment for each group), positive words appear more frequently than negative words for both groups. For female-founded reviews, the top negative words are hard, bad, and lack, whereas for male-founded reviews, the third most frequent negative word is issues. Although unclear, perhaps this suggests that lack of opportunity is more of a problem in female-founded companies whereas general challenges of work are more apparent in male-founded companies. However, without the surrounding context of these most common negative words (as well as positive words), such interpretations remain tentative. Female-founded and male-founded reviews share the same top three positive words.

Using the NRC lexicon, I continued my analysis by looking at the distribution of emotions across the two groups. For both groups, we see that the emotional distribution is skewed towards more positive emotions, including trust, anticipation, and joy (as well as positive). The distribution of emotions in reviews for male and female-founded tech companies is nearly identical. Thus, whether a tech company is female-founded does not seem to be associated with any difference in the emotional sentiment expressed by employees in reviews for the company.

Text Exploration

Using the R tm package once again, I created corpuses for the review text associated with companies with female founders and the review text associated with companies with male founders. Unsurprisingly, internal reviews from employees on Indeed are centered on words defining a work environment, with the most common words including work, company, management, and job, which can be seen in the bar plot below as well as in a frequency wordcloud. This is true for both groups. Even a word like culture appears nearly as frequently for both groups (appearing with a frequency of 34 for female-founded reviews and 36 for male-founded reviews). Perhaps the most interesting insight from this exploration of the top 10 most frequent words for each group is that great appears more frequently in male-founded reviews (164) than female-founded reviews (125).

Given that looking at the most frequent words overall for each of the groups did not suggest any differences in words expressed in employee reviews for female-founded and male-founded tech companies, I wanted to explore words that were used relatively frequently in female-founded reviews but especially sparsely, if at all, in male-founded reviews and vice versa. To do this, I created a Document-Term Matrix for each group (female-founded and male-founded reviews) and used removeSparseTerms to drop the sparse terms, leaving a reduced Document-Term Matrix of a bit over 200 words for each group. To capture the words used much more frequently in female-founded reviews than male-founded reviews, I removed the words of the reduced male-founded Document-Term Matrix from the female-founded corpus and examined the 40 most frequent “unique” female-founded words. To capture the words used much more frequently in male-founded reviews than female-founded reviews, I applied the same procedure and examined the 40 most frequent “unique” male-founded words. The corresponding wordclouds are visualized below.
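The procedure can be sketched in plain Python as follows. This is an approximation of the tm workflow rather than a port of it: `common_vocabulary` stands in for the role removeSparseTerms plays (keeping only terms that appear in a large enough share of documents), the `min_doc_share` threshold is an assumed parameter, and the reviews are invented.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a review and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def common_vocabulary(reviews, min_doc_share):
    """Terms that appear in at least min_doc_share of the reviews."""
    doc_freq = Counter()
    for review in reviews:
        doc_freq.update(set(tokenize(review)))  # count each term once per review
    return {term for term, n in doc_freq.items() if n / len(reviews) >= min_doc_share}

def differentiated_words(own_reviews, other_reviews, min_doc_share, n=40):
    """Most frequent words in own_reviews that are not common in other_reviews."""
    other_common = common_vocabulary(other_reviews, min_doc_share)
    counts = Counter(token for review in own_reviews for token in tokenize(review))
    kept = [(term, c) for term, c in counts.most_common() if term not in other_common]
    return kept[:n]

# Made-up reviews for illustration only.
female_reviews = ["good communication and learning", "learning culture with good pay",
                  "communication could improve"]
male_reviews = ["good pay and many meetings", "meetings every day", "good culture"]

print(differentiated_words(female_reviews, male_reviews, min_doc_share=0.5))
print(differentiated_words(male_reviews, female_reviews, min_doc_share=0.5))
```

The threshold matters a great deal: a stricter threshold shrinks the other group's common vocabulary, so fewer words are filtered out and the "unique" list fills up with words both groups actually share.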

Left: differentiated words in reviews for female-founded companies; Right: differentiated words in reviews for male-founded companies.

Even though the frequency of any one of these words in their respective corpuses is less than 21, I still found this to offer insights, especially as inspiration for further examination of such findings in a larger dataset of reviews, perhaps in a dataset of thousands or tens of thousands of employee reviews. It was striking for me to see some of the unique female-founded review words including favoritism, pretty, communication, supervisors, change, difficult, learning, and development. This perhaps sheds light on challenges of communication and opportunities for learning or perhaps lack thereof. This latter uncertainty between lack of opportunities or a wealth of opportunities highlights a limitation of unigram analysis, prompting motivation to explore bigrams in a larger dataset (as discussed in “Further Analysis”). On the other hand, some of the unique male-founded review words include meetings, union, leave, holidays, professional, busy, issues, away, and big. This perhaps suggests a particularly professional, fast-paced, and hierarchical nature among these male-founded companies.

Conclusions

The following conclusions can be summarized from this data exploration of the representation of female founders in tech and internal review sentiment:

  • Female founder representation among the highest-performing, largest tech companies in the United States is as low as 5%.
  • While female founder representation is lacking across the country, the gap is largest in the Midwestern and Northeastern United States. Moreover, the Western US appears to be the most welcoming of female founders, so a female founder seeking a location for her company’s headquarters may do best to choose a location in the West.
  • The performance of the highest-performing, largest female-founded companies in terms of revenue seems to be comparable to that of the highest-performing, largest male-founded companies. This is a powerful message that female-founded companies can be successful like male-founded companies and is perhaps a point that can be emphasized with investors as well.
  • Employee sentiment does not seem to differ between female-founded and male-founded companies, although there is a difference in the overall star ratings employees provide. As for the difference in the star ratings, female founders have an incentive to understand why their review ratings are lower. There may be value in a female founder surveying employees to understand what could bring about better ratings. This is especially important to consider in order to improve the perception of female-founded companies.
  • Given the lack of difference in overall revenue performance and review text sentiment, increasing female founder representation through education, opportunity, and support is justified, even if it is simply for the sake of making the face of tech reflect the diversity of identity, the diversity of perspectives, of the world at large.

Further Analysis

It would be worthwhile to explore a larger dataset of review data and gather unique bigrams in reviews for female-founded companies. This was not really feasible on the dataset of this size, as I noticed that any bigrams occurred with a maximum frequency of 6 in the review text for female-founded companies (and the vast majority of bigrams had frequencies less than or equal to 2).
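Counting bigrams is a small extension of counting single words, as the Python sketch below shows. The reviews are made up to illustrate why bigrams help: "lack of" and "great learning" give the word opportunities opposite meanings that a unigram count cannot separate.

```python
import re
from collections import Counter

def bigram_counts(reviews):
    """Count adjacent word pairs within each review (pairs never span two reviews)."""
    counts = Counter()
    for review in reviews:
        tokens = re.findall(r"[a-z']+", review.lower())
        counts.update(zip(tokens, tokens[1:]))
    return counts

# Made-up reviews for illustration only.
sample_reviews = [
    "Lack of learning opportunities",
    "Great learning opportunities",
    "Lack of communication",
]
counts = bigram_counts(sample_reviews)
print(counts.most_common(3))
```

With only a few hundred reviews per group, most pairs occur once or twice, which matches the low maximum frequency noted above; the counts only become informative on a much larger corpus.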

An additional area of exploration could be examining review sentiment in tech companies with female CEOs. I would hypothesize that there may be a greater difference in sentiment between female-led tech companies and male-led tech companies than we saw here, as there is likely to be more of a powerful influence from the current leadership on employee morale than the legacy of a female or male founder (if no longer the CEO).

About Me

I am Emma O’Neil, a junior pursuing a dual degree at The Wharton School and the College of Arts and Sciences at the University of Pennsylvania, studying statistics, cognitive science, and computer science. I am passionate about the intersection of technology, design, and entrepreneurship. You can find more about me here.
