Mental Health Support on Reddit

A text analysis of the social media mental health community: an exploration of mental health subreddits and comparison with Instagram mental health influencers.

May 4, 2021

Background

According to Mental Health America, the prevalence of mental illness among adults was increasing even before COVID-19, and suicidal ideation among adults has been on the rise. People facing mental health challenges are increasingly turning to social media for support and advice. Aware of the significance of mental health issues and of social media’s growing role as an outlet for perspective, conversation, and support, I became interested in exploring this space. I wanted to understand what kind of community Reddit offers people struggling with mental health issues. What posts drive “support” from others, and what support is most “valued” in the community?

More personally, I have volunteered at crisis hotlines, including Crisis Text Line, for several years and have found that many people seek out online support because being vulnerable feels less threatening behind a screen. As someone passionate about mental health support and advocacy, I wanted to better understand the Reddit community in the context of mental health issues, as well as the language that mental health influencers use on Instagram. Considering the significant unmet need for mental health support, I wanted to shed light on the kind of open social community Reddit can provide for these issues, the guiding words of mental health influencers, and the posts and feedback users resonate with, to inform how we can consider supporting each other.

My original hypothesis for the fundamental analyses was that more negative posts and more positive comments would receive greater karma on Reddit: negative posts may draw more attention in the community and resonate with individuals, while positive comments would be registered as productive and perhaps helpful. Furthermore, I hypothesized that the mental health subreddit comments would have comparatively more negative sentiment than the captions of Instagram influencers, given the empowering nature of these personas.

Moreover, there were six main areas I wanted to explore:

What is the sentiment of posts and comments in Reddit mental health communities and what are the overarching topics that posts and comments fall into?
On a more granular level — how does the sentiment of subreddits particularly aimed at peer help differ from other subreddits (e.g., depression_help vs. depression)? How do the most frequent words for subreddits like depression compare to the most frequent words of texters reaching out to Crisis Text Line?
What explains “social support” (both in terms of net upvotes/downvotes and number of comments) for posts in these subreddits? Is the sentiment of posts a statistically significant predictor of the social support a post receives?
What explains which comments are most socially supported on these subreddits? What is the sentiment? Do they relate to treatment? Are they prescriptive?
What is the sentiment and caption content of Instagram mental health influencers? Which captions are associated with posts with the most likes for each influencer? How does the sentiment and text of these influencers compare to the sentiment of posts on mental health subreddits?
What proportion of posts on the UPenn subreddit are related to mental health concerns?

About the Data

Reddit: I used RedditExtractoR to obtain posts and comments from 14 mental health subreddits: /r/depression, /r/alcoholism, /r/Anxiety, /r/BipolarReddit, /r/mentalhealth, /r/MMFB, /r/socialanxiety, /r/SuicideWatch, /r/EatingDisorders, /r/selfharm, /r/OCD, /r/addiction, /r/Anxietyhelp, /r/depression_help. I gathered these subreddits by taking a large and active subset of those listed here and here. I used the get_reddit function for each subreddit, limiting the search results to that subreddit and searching a term intended not to differentially influence the returned results (e.g., “depression” for the depression subreddit). For each subreddit, I tuned the cn_threshold (the minimum number of comments on returned posts; either 1 or 2 in this analysis) and the page_threshold (the number of pages of returned results) to obtain a roughly balanced number of posts per subreddit (between about 20 and 40), so that no single subreddit would account for a disproportionate share of the posts and comments in the aggregate data. Note that the Reddit API only allows extraction of search results sorted by comments or by new posts. Since I wanted a significant number of posts rather than a handful of posts with very many comments, I sorted the search results by new posts. Thus, all these posts reflect the most recent state of these mental health subreddit communities at the time of collection. In total, I obtained about 410 posts and the corresponding 2,450 comments. I also gathered a sample of posts on the /r/UPenn subreddit (about 155 posts) to explore the proportion of posts related to mental health.

Instagram: I obtained captions of Instagram posts from 35 mental health influencers using Chrome’s Web Scraper, totaling 580 captions. All of these accounts are publicly available, and all of these influencers have 30k+ followers. I avoided scraping from influencers who rarely included captions with their posts or whose captions were extremely short, given that my analyses centered on the text in posts’ captions. In addition to the captions, I scraped each influencer’s Instagram handle and the number of likes corresponding to a given post and caption. I selected these accounts primarily from three lists: Top 125 Mental Health Instagram Influencers most followed in 2021, 14 Mental Health Instagram Accounts You Should Follow ASAP, and Top Mental Health Influencers on Instagram. These mental health influencers range from certified psychologists to mental health advocates to digital creatives.

Google Trends: To provide further motivation for the discussion of mental health support, I used gtrendsR to obtain trends over time for a few relevant Google search queries.

Google Search Trends

To better understand the context of this analysis, I wanted to explore Google search trends to see if particular search queries have been increasing over the last year or so. In exploring these queries, I considered that there are perhaps three different types of search queries related to mental health: self-harm and suicide-specific queries (e.g., self-harm, suicide-related), general mental health queries (e.g., anxiety, depression), and help-seeking queries (e.g., mental health support, querying hotlines). I summarized the trends in three such search queries in the past year.

We can see that while there is significant short-term fluctuation in the search hits for “self-harm”, there was a rise around the time that shutdowns began with the onset of the COVID-19 pandemic. For the “anxiety” search term, there appears to be a very subtle continuous increase over time. As for “mental health support”, perhaps the most relevant of the three terms for this analysis, we can see an increasing trend in search hits, although the trend has stabilized over the past six months or so. Such increases show the relevance of mental health concerns and, ultimately, the need for outlets and resources for mental health support and empowerment.

Diving into Reddit data

In exploring the Reddit data for several mental health subreddits, I conducted text and sentiment analysis to understand the language, topics, and emotional composition of posts and comments. After using the tm package to clean and tokenize the text, I visualized the most frequent words.

Left: posts on mental health subreddits; Right: comments on posts on mental health subreddits.

We can see that feel, just, and like are rather frequent in both posts and comments. It is perhaps interesting that dont is a frequent word in both posts and comments, perhaps reflecting the frank, open nature of the Reddit community at its heart. It is insightful that can is the most frequent word in the comments, as this might suggest a level of empowerment (also considering words like try and better). We can see that ive (I’ve) appears much more frequently in posts than in comments (6th vs. 27th most frequent word, respectively), which may suggest that posts are more self-centric, while mental health subreddit comments are perhaps more concerned with community connection. Words like please in comments also perhaps suggest the more interactive nature of the forum compared to what we might see on Instagram in the context of influencers, where the support is largely one-sided.
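The word-frequency step described above can be sketched in a few lines. The original analysis used R's tm package; the Python below is only a stand-in to show the idea, and the stopword list, the `clean_tokens` and `top_words` helpers, and the two sample posts are all hypothetical. Dropping apostrophes before tokenizing is an assumption that would explain tokens such as dont and ive.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption); a real pipeline would
# use a full stopword list, as tm-style cleaning typically does.
STOPWORDS = {"i", "a", "the", "and", "to", "of", "it", "is", "in", "that", "my", "so"}


def clean_tokens(text):
    """Lowercase, drop apostrophes (so "don't" becomes "dont"), keep letter-only tokens."""
    text = text.lower().replace("'", "").replace("\u2019", "")
    return [tok for tok in re.findall(r"[a-z]+", text) if tok not in STOPWORDS]


def top_words(documents, n=10):
    """Return the n most frequent non-stopword tokens across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(clean_tokens(doc))
    return counts.most_common(n)


# Made-up posts, for illustration only.
posts = [
    "I don't feel like I can do this anymore",
    "I just feel so tired and I don't know why",
]
print(top_words(posts, n=3))
```

Running the same function separately on posts and on comments gives the two rankings that the comparison above relies on.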

Next, I wanted to explore the topics in posts and comments using LDA. Instead of seeing whether the topics of the aggregate data matched the existing subreddit divisions, I wanted to explore something I thought was more meaningful: what are the overarching topics that span across mental health subreddits? That is, what topics can we discern from the content related to diverse mental health issues? Below are the results from training topic models for posts and comments using 4 topics and illustrating 15 words representative of each topic.

Examining the topics for posts, I judged the post topics to be related to 1) time, 2) one’s story and action, 3) perspective and feeling, and 4) overwhelming mental state. As for the topics for comments, the topics seem to correspond to 1) feeling and connection, 2) perspective and direction, 3) confronting the mind, and 4) orientation of time. I found the fourth topic of posts to be particularly insightful, as it seems to suggest the overwhelming nature of mental issues that span across the subreddits and perhaps drive the desire for support and connection in the community.

I also found the second and third topics of comments to be interesting. The second topic (represented by words like help, please, try, better, thank) suggests a unique support community for mental health issues on Reddit given its structure as a forum. And the third topic is represented by words such as anxiety, thought, and brain, which seem to correspond to those concerns in the fourth topic of posts to a certain extent. This may appear to suggest the reflective nature of Reddit comments, which is known to be a beneficial means for people to gain clarity around their own struggles.

Sentiment

Having explored the words and topics of posts and comments, I wanted to get a sense of the emotional composition of posts and comments and compare them as well. I used the NRC lexicon to examine the distribution of emotions for posts and comments.

Apart from the polar co-existence of negative and positive emotion in both posts and comments, we can see that anticipation and trust are rather predominant emotions expressed in these subreddits, which perhaps points to the openness and engagement on the platform. As for the relative frequencies of positive and negative emotion, the dominant one is flipped between posts and comments: positive emotion is most frequent in comments, while negative emotion is most frequent in posts. This difference is what I had expected to see, and it would seem to suggest that people posting in these subreddits are in a distressed mental state and seeking out support and consolation, while commenters are perhaps trying to maintain a level of optimism and view positive feedback as a better support mechanism than reinforcement of negative emotions. Additionally, I found it interesting that fear and sadness are more dominant in posts than in comments. While these emotions have a negative connotation, it actually made me hopeful to see them expressed, as the Reddit community seems to serve as an outlet for people to express their pain and doubts in a way that is often intentionally filtered on other social media sites (which are not almost entirely anonymous).

Breaking Down the Subreddits

Some of the subreddits I extracted threads from, including /r/depression_help and /r/Anxietyhelp, seem to be aimed at receiving help and motivation to a greater extent than others. While /r/depression advertises the community as “peer support for anyone struggling with a depressive disorder”, /r/depression_help “provides a platform for you to get the support, advice, inspiration and motivation you need to make the best of your life with the mental illness.” I therefore wanted to get a taste of whether the depression_help subreddit has more positive sentiment than the depression subreddit, which would perhaps suggest differing uses for the two subreddits and differing means of support. Using the tidytext package and the afinn lexicon, I tokenized the text for posts and comments separately and computed the sentiment, visualizing the distributions for both subreddits in the violin plots below.
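To make the scoring step concrete: an afinn-style lexicon maps each word to an integer valence, and a document's score is the sum over its tokens. The Python sketch below stands in for the tidytext workflow used in the analysis. The `TOY_LEXICON` entries are illustrative placeholders, not values taken from the real lexicon, and `afinn_style_score` is a hypothetical helper name.

```python
import re

# Toy afinn-style lexicon: word -> integer valence. These entries are
# illustrative placeholders, NOT values copied from the real afinn lexicon.
TOY_LEXICON = {
    "hopeless": -2,
    "sad": -2,
    "alone": -2,
    "better": 2,
    "hope": 2,
    "love": 3,
}


def afinn_style_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every token found in the lexicon; unknown words count as 0."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(lexicon.get(tok, 0) for tok in tokens)


# Made-up examples, for illustration only.
print(afinn_style_score("I feel sad and alone"))
print(afinn_style_score("It gets better, there is hope"))
```

Because the score is a sum, longer texts can reach more extreme values than short ones, which is worth keeping in mind when comparing distributions across sources.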

We can see that there is a bit more variation in the sentiment on the depression_help subreddit for both posts and comments. Although the sentiment of posts is very similar to that of the depression subreddit, the comments in the threads on the depression_help subreddit are slightly more positive overall, and the distribution has a larger spread as well. I ran a one-sided two-sample t-test to determine whether there is evidence of statistically significantly more positive sentiment in the comments for the depression_help vs. depression subreddits. Given the very low p-value (certainly far lower than a significance level of 0.05), there is evidence that the comments on the depression_help subreddit are more positive than those on the depression subreddit.
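For readers who want to see the mechanics of such a test, here is a minimal Python sketch of a one-sided two-sample comparison in Welch's unequal-variance form. R's t.test uses that form by default, but whether the analysis kept the default is an assumption. The p-value here uses a normal approximation to the t distribution, which is only reasonable for large samples, and the scores are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_one_sided(sample_a, sample_b):
    """Test H1: mean(sample_a) > mean(sample_b).

    Returns (t, df, approx_p). The p-value uses a normal approximation to
    the t distribution, so it is too optimistic for small samples.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of each sample mean.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    approx_p = 1.0 - NormalDist().cdf(t)
    return t, df, approx_p


# Made-up sentiment scores, for illustration only.
help_comments = [3, 5, 4, 6, 2, 5]
main_comments = [1, 0, 2, -1, 1, 0]
t, df, p = welch_one_sided(help_comments, main_comments)
print(f"t = {t:.2f}, df = {df:.1f}, approximate one-sided p = {p:.4f}")
```

A small p-value supports the one-sided alternative that the first group's mean is higher; it says nothing about how large or practically meaningful the difference is.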

Given this discovery, people seeking more emotionally positive support on Reddit should perhaps turn to subreddits such as /r/depression_help and /r/Anxietyhelp, while those seeking reflective support that resonates with negative emotion should perhaps look to the main subreddits such as /r/depression and /r/Anxiety, where there is also likely to be more active engagement given the larger number of community members.

Comparison with Crisis Text Line Conversations

Given that crisis hotlines are another avenue for mental health support and also given that Crisis Text Line in particular is a texting service such that it is perhaps most comparable (as opposed to e.g., call conversations) to the medium of support Reddit offers, I wanted to compare their textual content. I discovered Crisis Text Line (CTL) makes the top 35 words used in the context of a particular crisis topic publicly available. So how do the most frequent words in the /r/depression subreddit compare to the most frequent words in CTL conversations related to “depression”? How do the most frequent words in the /r/SuicideWatch subreddit compare to the most frequent words in CTL conversations related to “suicidal thoughts”?

Left: based on top 35 words for posts on /r/depression subreddit; Right: top 35 words for Crisis Text Line conversations related to crisis topic “depression”.

Left: based on top 35 words for posts on /r/SuicideWatch subreddit; Right: top 35 words for Crisis Text Line conversations related to crisis topic “suicidal thoughts”.

We can see that posts in both the /r/depression and /r/SuicideWatch subreddits tend to be more focused on the self, while the CTL conversations have more frequent occurrences of words like people and friends and mom. As a Crisis Text Line volunteer myself, I know that this may be in part related to the fact that we as volunteers do not only want to validate texters’ feelings but also want to explore avenues for support in their own lives. Thus, given this context, such differences make sense and perhaps highlight the importance and value of such hotline services that spark these kinds of conversations in the context of such mental health issues.

It is perhaps interesting that in both the subreddit posts and the Crisis Text Line conversations, we see words in the extreme, e.g., anything, never, every. In the subreddit posts, there is a large presence of the word want, while in CTL conversations, there is a focus on need. Additionally, the words in the subreddit posts appear to be more negative in nature, while those in CTL conversations interestingly seem to be more associated with a greater sense of optimism and connection.

Below is a demonstration of an R Shiny app I created to interact with a sample of the subreddits examined in this analysis. Each wordcloud visualizes the most frequent words in the post text data corresponding to that subreddit. Visit the app here to interact with it yourself. Select the subreddit of interest and hover over the words to discover their frequencies in the dataset.

Explaining Post Karma

Having explored the text and sentiment of subreddit posts, I next wanted to understand the power of sentiment for explaining the social support a post receives. I decided to use a post’s score (its upvotes minus its downvotes) as well as the number of comments on a post as measures of the extent of social support a post receives. These defined the two response variables for the two linear regression models I would explore.

In terms of predictor variables, I started with the afinn numerical sentiment value of each post. Seeing that this yielded a model with insignificant predictive power, I explored other variables as well. I considered whether any first-person pronouns (I, me, my, mine, myself) were present in the post and whether any second- and third-person pronouns (you, your, yours, yourself; his, himself, he, hers, herself, her, she) were present, as I thought the presence of more community-oriented pronouns (i.e., not first person) might draw in more social support. I also considered whether any questioning words (what, where, when, which, who, whose, why, how) were present, as I thought that these might elicit more feedback and connection with the Reddit community.

I further considered whether words related to suicidality would be associated with greater “social support.” To detect this, I used both standard words indicative of high risk, such as “die”, “cut”, “suicide”, and “kill”, and a few words Crisis Text Line has found to be particularly predictive of suicidality (including “ibuprofen”, “advil”, “excedrin”, “800mg”, and “acetaminophen”); their machine learning algorithm has learned that when a texter uses such words, the conversation is more likely to lead to an active rescue. I additionally used the length of the post in words and the source subreddit of the post as independent variables. Still finding a lack of significance, I also assigned each post the topic from the post LDA model above with which it is most closely associated (based on topic probabilities) and used this factor variable as a predictor in the regression model. However, we can see that neither sentiment nor any of these other variables has statistically significant explanatory power for a post’s score in mental health subreddits. Below are the regression results.
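The binary indicators described in this section are straightforward to derive from the raw text. The Python sketch below shows one way to do it. The word lists are the ones quoted above (the full risk-word list is not given, so only the quoted words appear), `post_features` is a hypothetical helper, and whole-word matching is an assumption, since the analysis may have used substring detection instead.

```python
import re

FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
OTHER_PERSON = {"you", "your", "yours", "yourself", "his", "himself", "he",
                "hers", "herself", "her", "she"}
QUESTION_WORDS = {"what", "where", "when", "which", "who", "whose", "why", "how"}
# Only the risk words quoted in the article; its complete list is not given.
RISK_WORDS = {"die", "cut", "suicide", "kill", "ibuprofen", "advil",
              "excedrin", "800mg", "acetaminophen"}


def post_features(text):
    """Return presence indicators and word count for one post.

    Whole-word matching is an assumption; contractions such as "I've" split
    into "i" and "ve", so they still count as first person.
    """
    vocab = set(re.findall(r"[a-z0-9]+", text.lower()))
    return {
        "first_person": bool(vocab & FIRST_PERSON),
        "other_person": bool(vocab & OTHER_PERSON),
        "question_word": bool(vocab & QUESTION_WORDS),
        "risk_word": bool(vocab & RISK_WORDS),
        "n_words": len(text.split()),
    }


# Made-up post, for illustration only.
print(post_features("How do I stop? I took 800mg of advil."))
```

Each dictionary becomes one row of the design matrix, alongside the sentiment score, the source subreddit, and the closest LDA topic.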

For the second regression, on the number of comments a post receives (my second measure of social support for posts), we can again see that neither sentiment nor the other variables have explanatory power. Below are the regression output and the correlation matrix I used to check that my independent variables were not highly correlated with one another (i.e., to check for multicollinearity). The correlation with the greatest magnitude was -0.43 (between emotion and post length), which was not large enough to cause concern. Thus, based on these regression results, there is no evidence to suggest that posts with more positive or more negative sentiment are associated with different scores or a different number of comments.

Left: results for regression of number of comments for a post; Right: correlation matrix for independent variables.
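The multicollinearity check amounts to computing pairwise Pearson correlations between the predictors and looking at the largest off-diagonal magnitude. A minimal Python sketch follows. The column names, the data, and the `pearson`, `correlation_matrix`, and `strongest_pair` helpers are all hypothetical and do not reproduce the figures in this analysis.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """columns: dict of name -> list of values. Returns a nested dict of correlations."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def strongest_pair(columns):
    """Return (name_a, name_b, r) for the off-diagonal pair with the largest |r|."""
    pairs = ((a, b, pearson(columns[a], columns[b]))
             for a, b in combinations(columns, 2))
    return max(pairs, key=lambda item: abs(item[2]))


# Made-up predictor columns, for illustration only.
predictors = {
    "sentiment": [-4, 2, -1, 3, -6],
    "n_words": [120, 40, 80, 30, 200],
    "question_word": [1, 0, 1, 1, 0],
}
a, b, r = strongest_pair(predictors)
print(f"largest |r| is between {a} and {b}: {r:.2f}")
```

A common rule of thumb treats magnitudes well below about 0.8 as unproblematic, though pairwise correlations cannot detect collinearity that involves three or more variables at once.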

Ultimately, this lack of significance, and particularly the lack of power of a post’s sentiment to explain the social support it receives, perhaps speaks to the complex nature of mental health issues and the challenging nature of this predictive problem.

Explaining Socially Supported Comments

I next wanted to explore whether sentiment could explain the score a comment gets on these mental health subreddits. Ultimately, I wanted to understand what kinds of comments are most socially supported in the Reddit mental health community (and which are not). In tackling this regression problem, I wanted to try using the bing lexicon for computing the overall sentiment of comments. I computed the sentiment for each word of every comment, computed a total for the number of positive words and the number of negative words for any given comment, and then assigned a new variable (i.e., the overall sentiment for any given comment) taking the value “positive” if there were more positive than negative words in a comment or the value “negative” if there were more negative than positive words in a comment. Beyond sentiment, I also considered whether perhaps comments containing words about treatment (e.g., “therapy,” “medication”, “diagnosis”, “doctor”, “psychiatrist”, “recovered”) receive more support, using string detection. I also wanted to see if more prescriptive comments (comments that perhaps use words like “should”) are less socially supported (or alternatively, more socially supported). Two other independent variables I included were comment length in words as well as the topic from the comment LDA model from above that each comment is most closely associated with (based on topic probabilities) (similarly to the regression for posts). Below are the regression results and the correlation matrix I plotted to check for multicollinearity. Given that the correlation of the largest magnitude is 0.35 (between prescriptive and comment length), this was not an issue.

Left: results for regression of comment score (for comments); Right: correlation matrix for independent variables.
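The comment-level predictors described above can be sketched as follows. This is a Python stand-in for the original R workflow. The `TOY_BING` entries are illustrative placeholders rather than the real bing lexicon, `comment_features` is a hypothetical helper, only “should” is used for the prescriptive flag because it is the only example given, and returning no label on a tie is an assumption, since the article does not say how ties were handled.

```python
import re

# Toy bing-style lexicon (word -> polarity); illustrative placeholders only.
TOY_BING = {
    "better": "positive", "supported": "positive", "proud": "positive",
    "worse": "negative", "pain": "negative", "hopeless": "negative",
}
TREATMENT_TERMS = ("therapy", "medication", "diagnosis", "doctor",
                   "psychiatrist", "recovered")
PRESCRIPTIVE_TERMS = ("should",)  # the article names only "should"


def comment_features(text):
    """Label overall polarity by majority word count and flag treatment/prescriptive terms."""
    lowered = text.lower()
    tokens = re.findall(r"[a-z]+", lowered)
    n_pos = sum(TOY_BING.get(tok) == "positive" for tok in tokens)
    n_neg = sum(TOY_BING.get(tok) == "negative" for tok in tokens)
    if n_pos > n_neg:
        label = "positive"
    elif n_neg > n_pos:
        label = "negative"
    else:
        label = None  # assumption: ties (including no scored words) get no label
    return {
        "sentiment": label,
        # Substring detection, mirroring the string detection mentioned above.
        "treatment": any(term in lowered for term in TREATMENT_TERMS),
        "prescriptive": any(term in lowered for term in PRESCRIPTIVE_TERMS),
        "n_words": len(text.split()),
    }


# Made-up comment, for illustration only.
print(comment_features("You should talk to a doctor, it helped me feel better and supported"))
```

A majority-count label discards intensity: a comment with one mildly positive word and a comment with ten strongly positive words receive the same value, unlike the summed afinn score used for posts.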

Once again, we can observe that there is a lack of evidence that sentiment has any explanatory power for the score a comment receives on these subreddits. However, the comment length coefficient is statistically significant: a longer comment is associated with a higher comment score. Perhaps longer comments are more thoughtful and demonstrate consideration for the person posting and validation for their outreach to the community of Redditors. Perhaps this can shed light on the value of fully engaging when it comes to conversations about mental health as well as the Reddit community’s appreciation/social support for thoughtful contributions.

As for the levels of the lda_closest_topic variable, we can see that comments most closely related to the topic of perspective and direction are associated with lower comment scores relative to comments that are related to the topic of feeling and connection (the baseline). I found this result to be quite interesting and suggestive of the nature of the Reddit community. It seems that mental health subreddits are accepted by the community to be a place for more ranting and community connection/validation rather than a place where perspectives and support in the form of guidance are highly valued.

Additionally, we can see that comments most closely related to the orientation of time are associated with higher comment scores relative to comments that are related to the topic of feeling and connection (the baseline). Perhaps this reflects the universality of time and the way that it can ground us.

Instagram Mental Health Influencers

As social media expands its cultural dominance, influencers are taking over the world, and there is no shortage of them in the sphere of mental health. Having explored the mental health space on Reddit, I wanted to see how the text and sentiment of captions of Instagram posts shared by over thirty major mental health influencers compare to the sentiment of posts on mental health subreddits. Also, are more positive or negative captions associated with the most-liked posts?

I first wanted to explore the most frequent words used and the topics of these captions, once again using the tm package and an LDA topic model.

As for the most frequent words, we can see that can is the most frequent word (with 443 occurrences in the 580 total captions). It is interesting that this was also the most frequent word in the comments on posts in the mental health subreddits, and although context is lacking (a limitation of such analyses), this is perhaps indicative of a sense of encouragement in both spaces/communities. Similarly to the comments of mental health subreddits, we see frequent occurrences of anxiety, dont, feel, people, like, and help. However, we also see words that did not appear at all among the frequent words of the mental health subreddits, including healing, trauma, health, therapy, and love. These words are suggestive of the kind of inspirational community such influencers may be trying to cultivate and the motivational approach to mental health that advocates for treatment, healing, and connection. Thus, the community support that mental health influencers offer may have differential value compared to Reddit, in that it uniquely emphasizes outlets for treatment and recovery and a positive community to stimulate healthy coping.

Examining the topics for captions, I judged the caption topics to be related to 1) human connection, 2) desire and action, 3) grounded perspectives, and 4) coping/recovery. We can observe that these topics are not actually that far off from those identified for mental health subreddit comments. However, they appear to be more focused and capture more relevant language.

Sentiment

I next wanted to look at the sentiment of Instagram mental health influencer captions and compare the distributions to what I found for Reddit mental health subreddit posts and comments. I hypothesized that the captions would have more positive sentiment, reflecting the positivity within this community and focus on promoting growth, recovery, and healing. I thought that the most meaningful comparison would be between Instagram captions and Reddit comments since both, whether it be from influencers or simply members of the mental health communities on Reddit, are intended to serve as support and encouragement for those who are perhaps struggling. However, I wanted to also take a look at any difference in sentiment between Reddit posts in mental health subreddits and these Instagram captions. Is there a difference in the sentiment of these Reddit posts and Instagram captions? And what can the sentiment comparison of Reddit comments and Instagram captions tell us about these different sources of support on social media?

Comparison with Reddit Posts

Computing and plotting the afinn sentiment distributions for captions and Reddit posts and running a one-sided two-sample t-test, I found the sentiment of Instagram captions is statistically significantly more positive than the sentiment of the Reddit posts.

It is certainly not surprising that the average sentiment of Reddit posts is negative, while that of Instagram captions is positive, as people posting in these Reddit communities are expressing their pain and/or reaching out for support, whereas Instagram captions of mental health influencers likely serve more of a role of inspiring, supporting, and engaging with followers. However, this comparison provides greater context to understand how the responses to such posts, i.e., comments, compare to Instagram captions. How are comments responding to this negativity, and is the sentiment different from the support influencers provide?

Comparison with Reddit Comments

I first used the afinn lexicon once again to understand the numeric distribution of sentiment for Reddit comments vs. Instagram captions. From the summary statistics below, we can see that two measures of central tendency (i.e., the median and the mean) suggest different results about whether the sample of Reddit comments is more positive or more negative in sentiment compared to Instagram captions.

+--------+----------+---------+--------+-------+---------+---------+
|  Data  |   Min.   | 1st Qu. | Median | Mean  | 3rd Qu. |  Max.   |
+--------+----------+---------+--------+-------+---------+---------+
| Reddit | -278.000 |  -2.000 |  2.000 | 4.351 |   6.000 | 125.000 |
| Insta  |  -59.000 |  -2.000 |  3.000 | 3.574 |  10.000 |  63.000 |
+--------+----------+---------+--------+-------+---------+---------+

As one would expect given these conflicting summary statistics, when I ran a one-sided two-sample t-test (similarly to Reddit posts vs. Instagram captions above), I found that there is no evidence that the average sentiment of Reddit comments is statistically significantly higher than the average sentiment of Instagram captions.

I actually found this to be quite an encouraging result. A two-sided t-test would similarly suggest that there is no evidence to conclude that there is any difference in the sentiment between Reddit comments and Instagram captions. This is not in line with my original hypothesis, as I actually expected the Instagram captions to be significantly more positive than the Reddit comments, but the lack of evidence for a difference perhaps can suggest, at least based on this sample of Reddit and Instagram data, the potential value of the Reddit community for support that has a comparable level of positivity to mental health influencers. Given the difference in the medium of support, as one is an open forum for anyone’s contributions and the other is the more distant connection with an influencer, these results can perhaps highlight that positive mental health support is not limited to one platform.

To further explore this comparison in sentiment, I used the NRC lexicon, computing and visualizing the distribution of emotions for both texts.

Overall, we can see that the composition of emotions is quite similar for Instagram captions and Reddit comments. However, we can see for example that there is slightly more disgust in Reddit comments and a bit more joy in Instagram captions.

Exploring Captions of Most Liked Instagram Posts

Having also scraped the number of likes that each post received on Instagram, I wanted to see what language was used in the captions that were the most liked for each influencer. I also wanted to see if there was any significant difference in the sentiment of these captions compared to captions for posts that were not the most liked for any given influencer. I grouped the data by influencer and determined the caption corresponding to the most-liked post of that influencer. I approached the question this way because while all influencers in the data have 30k+ followers, their follower counts vary widely, so raw like counts are not comparable across influencers and it would not be meaningful to select the most-liked posts at an aggregate level.
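The grouping step described above can be sketched as follows. This is a Python stand-in for the original R workflow. The field names (`handle`, `likes`, `caption`), the `split_by_most_liked` helper, and the sample rows are hypothetical, and keeping the first post when likes are tied is an assumption.

```python
def split_by_most_liked(rows):
    """Split scraped posts into each influencer's most-liked post and the rest.

    rows: list of dicts with 'handle', 'likes', and 'caption' keys.
    Ties are resolved by keeping the first post seen (an assumption).
    """
    best_index = {}  # handle -> index of that influencer's most-liked post so far
    for i, row in enumerate(rows):
        handle = row["handle"]
        if handle not in best_index or row["likes"] > rows[best_index[handle]]["likes"]:
            best_index[handle] = i
    top = set(best_index.values())
    most_liked = [row for i, row in enumerate(rows) if i in top]
    others = [row for i, row in enumerate(rows) if i not in top]
    return most_liked, others


# Made-up rows, for illustration only.
rows = [
    {"handle": "@influencer_a", "likes": 120, "caption": "You can take your time."},
    {"handle": "@influencer_a", "likes": 300, "caption": "Healing is not linear."},
    {"handle": "@influencer_b", "likes": 50, "caption": "Share what you need."},
    {"handle": "@influencer_b", "likes": 20, "caption": "A reminder to rest."},
]
most_liked, others = split_by_most_liked(rows)
print([row["caption"] for row in most_liked])
```

Selecting within each influencer yields exactly one caption per account, so every influencer contributes equally to the most-liked group regardless of audience size.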

Wordcloud for most-liked posts for the 35 influencers.

We can see that can is still one of the most frequent words, but I found it quite interesting to see a greater presence of time. After all, the regression for Reddit comments suggested that comments most closely related to the topic of the orientation of time were associated with a higher comment score. Thus, this may suggest that the consideration of time is important in effectively connecting with people when it comes to mental health issues.

We can also observe a greater presence of words like good, share, need, love, selflove, self, and body. It appears that the messages that most resonate with people in regard to mental health have a deep sense of empathy and encourage connection with and care for one’s self, body, and mind. Perhaps the recognition of need validates people’s desire to feel okay and suggests more direction. It is interesting that need is a word that fell into the second topic from the LDA output for Reddit comments, which the regression model for comments showed to be negatively associated with comment score. Thus, perhaps Instagram is a better platform for mental health when seeking perspective and direction, while Reddit is valued more as a means of obtaining community validation.

Beyond exploring the words at face value, I also wanted to explore the sentiment of captions for the most-liked influencer posts compared to those that were not the most liked.

As highlighted by the red dashed lines in the visualized distributions above, the mean AFINN sentiment for the captions of most-liked posts is 8.37, while the mean sentiment for the captions of posts that are not most liked is 3.25.

A one-sided two-sample t-test shows that the sentiment of captions for the most-liked posts is statistically significantly more positive than the sentiment of captions for posts that are not most liked. So while the Reddit community does not preferentially value comments that are more positive, the Instagram mental health community does value positivity. Even though sentiment does not differ significantly between Reddit comments and Instagram captions, greater value seems to be placed on positivity on Instagram in the mental health space than on Reddit. This may suggest that Instagram is a better place to be if you value a community that embraces positivity when discussing mental health issues. This greater support for positivity on Instagram than on Reddit may also relate to the credibility of these influencers and the greater weight their positive words carry compared to those of members of mental health subreddits.
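For readers who want to see the mechanics of such a test, here is a minimal Python sketch of the test statistic. It assumes the unequal-variance (Welch) form of the two-sample t-test, which may not be the variant used in the analysis, and the sentiment scores below are invented.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom.

    A large positive t supports the one-sided alternative
    "mean of sample_a is greater than mean of sample_b".
    """
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Invented sentiment scores, for illustration only.
most_liked = [8, 10, 12, 6, 9]
not_most_liked = [3, 2, 4, 5, 1]
t_stat, dof = welch_t(most_liked, not_most_liked)
```

The one-sided p-value is the upper-tail probability of a t distribution with `dof` degrees of freedom evaluated at `t_stat`. With SciPy installed, `scipy.stats.ttest_ind(most_liked, not_most_liked, equal_var=False, alternative="greater")` runs the whole test in one call.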

The Use of the UPenn Subreddit for Mental Health Concerns

While this was not central to my questions of interest in exploring mental health subreddits, as a student at Penn (and a member of the UPenn subreddit community), I wanted to explore what proportion of posts on the subreddit relate to mental health. Given the restricted IP access to dictionaries of psychological terms, I used a somewhat crude form of string detection, flagging a post as related to mental health if it contained any of a series of strings (including but not limited to “depress”, “suicid”, “anxiety”, “anxious”, “CAPS”, “therapy”). I was somewhat surprised to find that only a small percentage of posts (about 7.14%) contained content related to general mental health concerns. While this is in part due to the rough means of detection, it could also suggest that Penn students tend either to seek support from people close to them, such as friends and family, or to keep silent about mental health concerns rather than seeking support on platforms like Reddit.
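As a rough illustration of this kind of string detection, here is a minimal Python sketch. The pattern list contains only the strings named above (the full list was longer), and matching “CAPS” case-sensitively while matching the other strings case-insensitively is my assumption for this sketch. The example post titles are invented.

```python
# Only the strings named in the text; the full list used was longer.
LOWERCASE_PATTERNS = ["depress", "suicid", "anxiety", "anxious", "therapy"]
# Assumption: "CAPS" is matched case-sensitively so that the ordinary
# word "caps" does not trigger a match.
CASE_SENSITIVE_PATTERNS = ["CAPS"]


def is_mental_health_post(text: str) -> bool:
    """Crude substring check for mental-health-related content."""
    lowered = text.lower()
    if any(pattern in lowered for pattern in LOWERCASE_PATTERNS):
        return True
    return any(pattern in text for pattern in CASE_SENSITIVE_PATTERNS)


# Invented post titles, for illustration only.
posts = [
    "Feeling depressed after midterms",
    "Best dining hall?",
    "How long is the CAPS wait?",
    "Anyone selling a bike?",
]
flags = [is_mental_health_post(post) for post in posts]
proportion = sum(flags) / len(posts)
```

Substring matching is what makes the approach crude: “depress” catches “depressed” and “depression”, but a post written in all capitals that happens to contain “CAPS” would also be flagged.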

I also briefly explored whether posts related to mental health tend to receive more social support (using the post score as a measure) than posts that are not related to mental health in the UPenn subreddit.

+--------+------+---------+--------+-------+---------+--------+
|  Post  | Min. | 1st Qu. | Median | Mean  | 3rd Qu. |  Max.  |
+--------+------+---------+--------+-------+---------+--------+
| MH     | 2.00 |   16.50 |  30.00 | 47.64 |   84.50 | 129.00 |
| NOT MH | 0.00 |    4.00 |   7.00 | 16.58 |   14.50 | 204.00 |
+--------+------+---------+--------+-------+---------+--------+

We can see that both the mean (47.64 vs. 16.58) and the median (30 vs. 7) post scores are higher for posts related to mental health. Thus, it seems that people in the Penn community on Reddit are particularly supportive of posts about mental health concerns. This suggests that this Reddit community is a space that does not minimize mental health issues and instead considers them important and worthy of social support.
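A grouped summary like the table above can be reproduced with a short Python sketch. The quartile method (`method="inclusive"`, which interpolates linearly between data points) is an assumption, since I am not certain which quantile definition produced the table, and the scores below are invented.

```python
from statistics import mean, quantiles


def six_number_summary(scores):
    """Min, quartiles, mean, and max for a list of post scores."""
    # Assumption: inclusive linear interpolation for the quartiles.
    q1, median, q3 = quantiles(scores, n=4, method="inclusive")
    return {
        "min": min(scores),
        "q1": q1,
        "median": median,
        "mean": mean(scores),
        "q3": q3,
        "max": max(scores),
    }


def summarize_by_group(flagged_scores):
    """Summarize scores separately for MH and NOT MH posts.

    flagged_scores: iterable of (is_mental_health, score) pairs.
    """
    groups = {"MH": [], "NOT MH": []}
    for is_mh, score in flagged_scores:
        groups["MH" if is_mh else "NOT MH"].append(score)
    # Quartiles need at least two observations per group.
    return {
        label: six_number_summary(scores)
        for label, scores in groups.items()
        if len(scores) >= 2
    }


# Invented (flag, score) pairs, for illustration only.
data = [(True, s) for s in (2, 4, 6, 8, 10)] + [(False, s) for s in (0, 1, 2, 3, 4)]
summary = summarize_by_group(data)
```

Reporting the median alongside the mean is useful here because post scores are skewed: a handful of very popular posts can pull the mean well above what a typical post receives.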

Conclusions

The following conclusions can be summarized from this data exploration of the mental health communities on Reddit and the textual support of Instagram mental health influencers:

Comments in mental health subreddits are more positive in sentiment than posts, reflecting the positive feedback Redditors provide in regard to mental health issues.
People who are seeking more emotionally positive support on Reddit should perhaps look to subreddits such as /r/depression_help, /r/Anxietyhelp, etc., while those seeking support that is more neutral in nature should look to the main subreddits like /r/depression, /r/anxiety, etc.
Crisis hotline conversations are associated with more words related to outlets for connection and sharing than Reddit mental health posts. Perhaps this reinforces the value of crisis hotline services in bringing resources to light in the words of the individuals who text in.
There is no evidence that sentiment is predictive of the social support a post or a comment receives.
Longer comments in mental health subreddits are associated with higher comment scores, suggesting the value inherent in thoughtful contributions to conversations about personal mental health issues.
Perspective and direction are not as valued in comments on Reddit, suggesting that Reddit is a place to seek mental health support when validation and emotional connection are the priority.
The context of time is an important consideration to be cognizant of in conversations about mental health, as it affects everyone and shapes lived emotional experience.
Instagram mental health influencer captions are more positive than Reddit posts, but there is no evidence that they are more positive than Reddit comments. Thus, there is positivity to be found in the words of social support on both platforms. However, positivity in the words of influencers is valued more (in terms of likes) than in the words of commenters on Reddit (in terms of comment score).
About 1 in 14 posts in the /r/UPenn subreddit is related to mental health concerns. Posts related to mental health receive more social support (higher post scores), reflecting the consideration and care this community has for these personal challenges.

Further Analysis

Given the many statistically insignificant predictors of social support in the regression models, particularly in the models for the social support posts receive, it may be worthwhile to explore Reddit posts and comments separately (as I did here) but with a richer psychology-based lexicon that can capture the nuances of language in a psychological context. Unsupervised machine learning could also help reveal patterns in the post and comment text, and interpreting those patterns could yield novel predictor variables to explore in a model.

About Me

I am Emma O’Neil, a junior pursuing a dual degree at The Wharton School and the College of Arts and Sciences at the University of Pennsylvania, studying statistics, computer science, and cognitive science. I am passionate about the intersection of technology, design, and entrepreneurship (as well as mental health!). You can find more about me here.