Contraceptive Reddit Tracker

Mentions by Type

How mentions are counted: Posts are scraped from 18 subreddits: r/birthcontrol (primary, 200 posts), plus r/TwoXChromosomes, r/AskWomen, r/AskDocs, r/PCOS, r/TryingForABaby, r/BabyBumps, r/childfree, r/sex (200 each), and r/WomensHealth, r/obgyn, r/Healthyhooha, r/Periods, r/Endo, r/abortion, r/prochoice, r/prolife, r/FAMnNFP (100 each). Both the /new and /hot sort orders are scraped for each subreddit to broaden coverage. Each post's title and body text are scanned against 45 regex patterns, one per contraceptive type, which include common misspellings and slang (e.g., "merina" → Mirena, "paraguard" → Paragard, "copper T" → Paragard, "bc shot" → Depo-Provera). Mentions are counted at the post level only: a post counts toward a contraceptive type if its own title or body text matches that type's regex, so counts reflect the number of unique posts mentioning each type, not total word occurrences. A post that discusses multiple methods (e.g., "switching from Mirena to Nexplanon") is counted once for each type it mentions. Comment text is analyzed separately for side effects and sentiment but does not add to the mentions table. The "IUD (general)" category uses a negative lookahead so that a post naming a specific IUD brand is not also counted as a general IUD mention. Cross-posted content is detected and deduplicated, so a cross-post's mentions are not counted separately from its parent's.
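The post-level matching described above can be sketched roughly as follows. This is a minimal illustration, not the tracker's actual code: the four patterns, the brand list inside IUD_BRANDS, and the function names are all assumptions made for the example, since the real 45 patterns are not shown on this page.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration only; the real pattern set is not shown here.
PATTERNS = {
    "Mirena": re.compile(r"\bm[ie]r[ie]na\b", re.IGNORECASE),
    "Paragard": re.compile(r"\bpara?gu?ard\b|\bcopper\s+t\b", re.IGNORECASE),
    "Depo-Provera": re.compile(r"\bdepo(?:[\s-]?provera)?\b|\bbc\s+shot\b", re.IGNORECASE),
    "Nexplanon": re.compile(r"\bnexplanon\b", re.IGNORECASE),
}

# "IUD (general)" matches only when no specific IUD brand appears anywhere in the text.
IUD_BRANDS = r"(?:m[ie]r[ie]na|kyleena|liletta|skyla|para?gu?ard|copper\s+t)"
IUD_GENERAL = re.compile(
    rf"\A(?!.*\b{IUD_BRANDS}\b).*\biuds?\b", re.IGNORECASE | re.DOTALL
)


def mentioned_types(title, body):
    """Return the set of contraceptive types matched in a post's own text."""
    text = f"{title}\n{body}"
    found = {name for name, pattern in PATTERNS.items() if pattern.search(text)}
    if IUD_GENERAL.search(text):
        found.add("IUD (general)")
    return found


def count_mentions(posts):
    """Count unique posts per type; repeated words in one post count once."""
    counts = Counter()
    for post in posts:
        counts.update(mentioned_types(post["title"], post["body"]))
    return counts
```

Because each post contributes a set rather than a list of matches, a post that repeats "Mirena" three times still adds one to the Mirena count, while "switching from Mirena to Nexplanon" adds one to each.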

Sentiment by Type

Keyword-based sentiment scoring: Each post and comment (across all 18 subreddits) is scored from -1.0 (very negative) to +1.0 (very positive) using curated word lists:

~40 positive words (love, recommend, effective, relief, helped, works, etc.)
~50 negative words (hate, pain, nightmare, frustrated, scared, bleeding, etc.)
Negators (not, never, don't, etc.) flip the next word's polarity, so "not painful" counts as positive
Intensifiers (very, extremely, so) multiply the next word's weight by 1.5

Score formula: (positive - negative) / total_sentiment_words, clamped to [-1, 1]. The chart shows the average score across all posts mentioning each type. Posts with no sentiment words score NULL and are excluded — this means the post count here may be lower than in the Mentions chart (e.g., a title-only post like "Copper IUD and Slynd at the same time" contains no positive/negative words and is omitted). Use the subreddit filter to compare sentiment across communities. This is best for comparing trends between types, not interpreting individual posts — it cannot detect sarcasm, context, or complex phrasing.
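A minimal sketch of a keyword scorer along these lines is shown below. The word lists are small hypothetical samples (including "painful", which is added so the negation example works), and two details are assumptions rather than documented behavior: intensifier weights are applied to the positive and negative sums while the denominator counts words unweighted, and a negator or intensifier stays in effect until the next non-modifier word.

```python
import re

# Small hypothetical samples; the real lists hold roughly 40 and 50 words.
POSITIVE = {"love", "recommend", "effective", "relief", "helped", "works"}
NEGATIVE = {"hate", "pain", "painful", "nightmare", "frustrated", "scared", "bleeding"}
NEGATORS = {"not", "never", "don't"}
INTENSIFIERS = {"very", "extremely", "so"}


def sentiment_score(text):
    """Return a score in [-1, 1], or None when no sentiment word is found."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positive = negative = 0.0
    sentiment_words = 0
    flip = False
    weight = 1.0
    for token in tokens:
        if token in NEGATORS:
            flip = True
            continue
        if token in INTENSIFIERS:
            weight = 1.5
            continue
        polarity = 1 if token in POSITIVE else -1 if token in NEGATIVE else 0
        if polarity:
            if flip:
                polarity = -polarity
            if polarity > 0:
                positive += weight
            else:
                negative += weight
            sentiment_words += 1
        # Modifiers apply only to the next non-modifier word (assumption).
        flip, weight = False, 1.0
    if sentiment_words == 0:
        return None
    raw = (positive - negative) / sentiment_words
    return max(-1.0, min(1.0, raw))
```

Under these assumptions "very painful" has a raw score of -1.5, which is where the clamp to [-1, 1] comes into play, and a title such as "Copper IUD and Slynd at the same time" returns None and would be excluded from the average.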

Mentions Over Time

Daily mention counts: Shows the top 5 most-mentioned contraceptive types over time, aggregated across all 18 subreddits (or filtered by one). Each data point is the number of unique posts from that day containing a regex match for that type. Days with zero posts are not shown. The time and subreddit filters control the data displayed.
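The daily aggregation might look something like the sketch below. The post fields ("created_utc", "types"), the function name, the use of UTC calendar days, and ranking the top types by total mentions over the selected data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def daily_top_mentions(posts, top_n=5):
    """Return {day: {type: unique post count}} for the top_n types overall."""
    totals = Counter()
    per_day = defaultdict(Counter)
    for post in posts:
        # Assumption: posts are bucketed by UTC calendar day.
        day = datetime.fromtimestamp(post["created_utc"], tz=timezone.utc).date()
        for kind in set(post["types"]):  # one count per post per type
            totals[kind] += 1
            per_day[day][kind] += 1
    top = [kind for kind, _ in totals.most_common(top_n)]
    series = {}
    for day in sorted(per_day):
        row = {kind: per_day[day][kind] for kind in top if per_day[day][kind]}
        if row:  # days with nothing to plot are left out
            series[day] = row
    return series
```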

Side Effect Heatmap

Contraceptive × side-effect matrix: Data is sourced from 18 subreddits. Each cell shows the number of posts and comments that mention both a contraceptive type and a side effect. Side effects are detected using 24 regex patterns matching symptoms like "bleeding," "cramping," "weight gain," "anxiety," etc. Color intensity scales linearly from transparent (0) to red (max value in the table). Cross-posted content is deduplicated. Limitation: A post saying "I'm worried about weight gain" and one saying "I had no weight gain" both count — the regex detects the mention, not the context. Top 12 contraceptives and top 15 side effects are shown.
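As a rough illustration of how the cells and their color intensity could be computed, the sketch below counts co-occurrences and maps each count linearly onto an opacity between 0 and 1. The document structure (dicts with "methods" and "side_effects" lists) and both function names are hypothetical.

```python
from collections import Counter


def cooccurrence_counts(docs):
    """Count posts/comments that mention both a method and a side effect."""
    cells = Counter()
    for doc in docs:
        for method in set(doc["methods"]):
            for effect in set(doc["side_effects"]):
                cells[(method, effect)] += 1
    return cells


def cell_opacity(count, max_count):
    """Linear scale: 0 is fully transparent, max_count is fully opaque red."""
    return count / max_count if max_count > 0 else 0.0
```

Note that the count says nothing about polarity: "worried about weight gain" and "no weight gain" would both increment the same cell.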

Top Side Effects / Worries

Ranked side-effect mentions: Counts the number of unique posts and comments (across all 18 subreddits) that match each of the 24 side-effect regex patterns. A single post mentioning "cramps" and "bleeding" counts once for each category. Patterns include variations (e.g., "headache" and "migraine" both map to "Headaches"). This tracks what people are talking about, not necessarily what they experienced — questions, fears, and reports all count equally.
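The ranking could be sketched as below, with variant spellings folded into one label per pattern. Only three of the 24 categories are shown, and the regexes themselves are guesses for illustration.

```python
import re
from collections import Counter

# Hypothetical patterns; each label absorbs its variants.
SIDE_EFFECTS = {
    "Headaches": re.compile(r"\b(?:headaches?|migraines?)\b", re.IGNORECASE),
    "Cramping": re.compile(r"\bcramp(?:s|ing)?\b", re.IGNORECASE),
    "Bleeding": re.compile(r"\bbleeding\b", re.IGNORECASE),
}


def rank_side_effects(texts):
    """Return (label, count) pairs, counting each text once per label."""
    counts = Counter()
    for text in texts:
        for label, pattern in SIDE_EFFECTS.items():
            if pattern.search(text):
                counts[label] += 1
    return counts.most_common()
```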

Health Literacy Gaps

Health literacy gap detection: Posts and comments are scanned against 20 proximity-bounded regex patterns sourced from WHO, ACOG, CDC, and Planned Parenthood myth lists, in 3 categories:

Safety: IUDs cause abortions, BC causes infertility, Plan B is abortion, BC causes cancer, hormonal BC permanently changes you, IUD gets lost in body, need kids before IUD, copper IUD is toxic
Efficacy: pulling out equally effective, condoms don't work, can't get pregnant on period/breastfeeding/first time, natural methods equally effective, douching prevents pregnancy, two condoms are safer, antibiotics make all BC fail, spermicide alone is reliable
Mechanism: need breaks from pill, must take pill at exact same time

Each pattern requires co-occurrence of two concepts within a bounded character window (typically 30–80 characters) to prevent false positives from distant, unrelated mentions. Patterns are tuned for specificity: e.g., “Plan B is abortion” requires equating language (not just co-mention), and “IUD gets lost in body” requires migration verbs (not the word “lost” alone).

Stance detection: Each match is classified by analyzing a 150-character context window and sentence boundaries around the regex match for linguistic cues:

asserting (definitive language: “trust me,” “causes,” “obviously”)
questioning (uncertainty: “is it true,” “I heard,” question marks)
debunking (corrective: “myth,” “not true,” “misconception,” “no evidence”)
unclear (insufficient context)

Bar chart segments are colored by stance when available. Sources: WHO Family Planning Handbook, ACOG Practice Bulletins, CDC MMWR, Planned Parenthood.
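A simplified sketch of one proximity-bounded pattern plus cue-based stance classification follows. Everything specific here is an assumption for illustration: the single "BC causes infertility" regex with its 60-character gap, the reading of the 150-character window as 150 characters on each side of the match, and the priority order debunking, then questioning, then asserting. Sentence-boundary handling is left out.

```python
import re

# Hypothetical pattern: two concepts within at most 60 characters of each other.
MYTH = re.compile(
    r"\b(?:birth control|bc|the pill)\b.{0,60}?\b(?:infertil\w*|sterile)\b",
    re.IGNORECASE | re.DOTALL,
)
DEBUNK = re.compile(r"\b(?:myth|not true|misconception|no evidence)\b", re.IGNORECASE)
QUESTION = re.compile(r"\?|\b(?:is it true|i heard)\b", re.IGNORECASE)
ASSERT = re.compile(r"\b(?:trust me|causes|obviously)\b", re.IGNORECASE)


def classify_stance(text, window=150):
    """Return a stance label for the first myth match, or None if no match."""
    match = MYTH.search(text)
    if match is None:
        return None
    context = text[max(0, match.start() - window): match.end() + window]
    # Assumed priority: corrective cues win over questions, which win over assertions.
    if DEBUNK.search(context):
        return "debunking"
    if QUESTION.search(context):
        return "questioning"
    if ASSERT.search(context):
        return "asserting"
    return "unclear"
```

The bounded gap is what keeps a post that mentions birth control in one sentence and an unrelated infertility story several sentences later from being flagged.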

Questions Being Asked

Question detection: Post titles are scanned for question indicators (contains "?" or starts with question words like "has anyone", "is it", "does", "how", "what", etc.). Detected questions are categorized into 6 types by matching the title + body against keyword patterns: Side effects/Safety (side effects, safety, specific symptoms), Effectiveness (effectiveness, pregnancy risk, failure rates), Usage (how to use, missed doses, insertion/removal), Access/Cost (cost, insurance, prescriptions, clinics), Experience (personal experiences, recommendations, advice), Switching (changing methods, alternatives). A question can belong to multiple categories. Limitation: Only post titles are checked for question detection; body text is used for categorization only.
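The two-stage logic (detect on the title, categorize on title plus body) can be sketched as follows. The question-word list is limited to the examples given above, and the category keyword patterns are hypothetical stand-ins for the real ones.

```python
import re

QUESTION_START = re.compile(r"^\s*(?:has anyone|is it|does|how|what)\b", re.IGNORECASE)

# Hypothetical keyword patterns, one per category.
CATEGORIES = {
    "Side effects/Safety": re.compile(r"\b(?:side effects?|safe(?:ty)?)\b", re.IGNORECASE),
    "Effectiveness": re.compile(r"\b(?:effective(?:ness)?|failure rate|pregnan\w*)\b", re.IGNORECASE),
    "Usage": re.compile(r"\b(?:how to use|missed|insertion|removal)\b", re.IGNORECASE),
    "Access/Cost": re.compile(r"\b(?:cost|insurance|prescription|clinic)\b", re.IGNORECASE),
    "Experience": re.compile(r"\b(?:experiences?|recommend\w*|advice)\b", re.IGNORECASE),
    "Switching": re.compile(r"\b(?:switch\w*|alternatives?)\b", re.IGNORECASE),
}


def is_question(title):
    """Only the title decides whether a post is treated as a question."""
    return "?" in title or bool(QUESTION_START.match(title))


def question_categories(title, body):
    """Categorize a detected question using title and body; may return several."""
    if not is_question(title):
        return set()
    text = f"{title}\n{body}"
    return {name for name, pattern in CATEGORIES.items() if pattern.search(text)}
```

The last line of the limitation is visible here: a post titled "My story" with "Is it safe?" in the body is never categorized, because detection looks at the title alone.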

Questions × Contraceptive Heatmap

Contraceptive × question category matrix: Each cell shows the number of posts that are questions mentioning both a contraceptive type and a question category. Color intensity scales linearly from transparent (0) to teal (max value). Top contraceptives and all question categories are shown.

Reproductive Health Topics

Broader health context: Posts are scanned for 12 reproductive health themes discussed alongside contraception: Pregnancy/Fertility, Menstrual Health, STI/STD, Mental Health, PCOS, Endometriosis, Sexual Health, Weight/Body, Hormonal Balance, Postpartum, Perimenopause, Abortion. A post can match multiple topics. Detection uses keyword regex matching on post title + body text.

Topics × Contraceptive Heatmap

Contraceptive × health topic matrix: Each cell shows the number of posts that mention both a contraceptive type and a reproductive health topic. Color intensity scales linearly from transparent (0) to green (max value). Top contraceptives and all topics with data are shown.

User Demographics

Self-reported age, gender, and location extraction: Reddit users commonly self-report demographics in posts (e.g., "I'm 23F", "(25F)", "26 year old woman"). The system scans the first 500 characters of each post through 7 ordered regex patterns (most specific first):

1. "I'm 23F" / "I am 23F"
2. "(23F)" / "[23F]"
3. "23F here"
4. "23 year old female/woman/man"
5. "I'm a 23 year old" (age only)
6. "I'm female" / "I'm a woman" (gender only)
7. "I'm 23 years old" (age only)

A third-person filter skips matches preceded by "my girlfriend", "my partner", etc. within 30 characters. Accepted age range: 13–65. Gender maps to female/male/null.

Location: Separate regex patterns detect US (state names, "the US/USA", state abbreviations) vs. non-US (country names, nationalities, "my GP/NHS") from first-person statements. Coverage is lower than for age/gender since most posts don't mention location. Only posts are scanned, not comments. Limitation: Non-binary genders map to null.
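A sketch of the ordered age/gender patterns and the third-person filter is given below. The regexes are reconstructions from the examples above rather than the tracker's own, the third-person list beyond "my girlfriend" and "my partner" is made up, and location detection is omitted.

```python
import re

I_AM = r"\b(?:i'?m|i am)\s+"
# Seven hypothetical patterns, tried in order (most specific first).
PATTERNS = [re.compile(p, re.IGNORECASE) for p in (
    I_AM + r"(?P<age>\d{1,2})\s*(?P<gender>[fm])\b",
    r"[(\[](?P<age>\d{1,2})\s*(?P<gender>[fm])[)\]]",
    r"\b(?P<age>\d{1,2})\s*(?P<gender>[fm])\s+here\b",
    r"\b(?P<age>\d{1,2})[\s-]years?[\s-]old\s+(?P<gender>female|woman|male|man)\b",
    I_AM + r"an?\s+(?P<age>\d{1,2})[\s-]years?[\s-]old\b",
    I_AM + r"(?:a\s+)?(?P<gender>female|woman|male|man)\b",
    I_AM + r"(?P<age>\d{1,2})\s+years?\s+old\b",
)]
THIRD_PERSON = re.compile(r"\bmy\s+(?:girlfriend|partner|wife|sister|friend)\b", re.IGNORECASE)
GENDER = {"f": "female", "female": "female", "woman": "female",
          "m": "male", "male": "male", "man": "male"}


def extract_demographics(post_text):
    """Return {'age': int or None, 'gender': 'female'/'male'/None}."""
    head = post_text[:500]  # only the start of the post is scanned
    for pattern in PATTERNS:
        for match in pattern.finditer(head):
            before = head[max(0, match.start() - 30):match.start()]
            if THIRD_PERSON.search(before):
                continue  # likely describes someone else
            groups = match.groupdict()
            age = int(groups["age"]) if groups.get("age") else None
            if age is not None and not 13 <= age <= 65:
                continue
            gender = GENDER.get((groups.get("gender") or "").lower())
            return {"age": age, "gender": gender}
    return {"age": None, "gender": None}
```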

Age × Contraceptive Heatmap

Age band × contraceptive matrix: Each cell shows the number of posts that mention a contraceptive type and where the poster self-reported an age in that band. Color intensity scales linearly from transparent (0) to purple (max value). 7 age bands: 13–17, 18–22, 23–27, 28–32, 33–37, 38–42, 43+.
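Since the first six bands are each five years wide starting at 13, the band for a given age can be computed arithmetically. A small sketch, with a hypothetical function name:

```python
def age_band(age):
    """Map an age to one of the 7 bands, or None if below 13 or missing."""
    if age is None or age < 13:
        return None
    if age >= 43:
        return "43+"
    low = 13 + 5 * ((age - 13) // 5)
    return f"{low}–{low + 4}"
```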

Category Breakdown

Grouped contraceptive categories: Individual types are grouped into 8 categories: IUDs (Mirena, Kyleena, Liletta, Skyla, Paragard, IUD general), Pills (Combined, Mini, general, Slynd, Opill, Yaz, Lo Loestrin, Ortho Tri-Cyclen, Junel, Seasonique, Sprintec, Alesse/Aviane, Nordette/Portia, Apri/Desogen, Nextstellis, Natazia), Long-acting (Nexplanon, Depo-Provera), Barrier/Other (Condoms, NuvaRing, Xulane patch, Spermicide, Diaphragm, Phexxi), Emergency (Plan B), FAM/NFP (Billings, BBT, Symptothermal, Standard Days, Calendar/Rhythm, LAM, FAM/NFP general), FAM/NFP (app-based) (Natural Cycles, Kindara, Read Your Body, Tempdrop, Daysy, Ava bracelet, Oura, Femometer), Withdrawal. The chart sums mention counts within each group.
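Summing per-type counts into groups could be done as in the sketch below. The mapping lists only a few of the types above, and the "Uncategorized" fallback is a made-up label for the example.

```python
from collections import Counter

# Partial, illustrative mapping from type to category.
CATEGORY_OF = {
    "Mirena": "IUDs", "Kyleena": "IUDs", "Paragard": "IUDs",
    "Nexplanon": "Long-acting", "Depo-Provera": "Long-acting",
    "Condoms": "Barrier/Other", "Plan B": "Emergency",
    "Natural Cycles": "FAM/NFP (app-based)", "Withdrawal": "Withdrawal",
}


def category_totals(mention_counts):
    """Sum per-type mention counts within each category."""
    totals = Counter()
    for kind, count in mention_counts.items():
        totals[CATEGORY_OF.get(kind, "Uncategorized")] += count
    return totals
```

If the sums are taken literally, a single post that mentions both Mirena and Kyleena adds 2 to the IUDs total, so category totals are mention counts rather than unique-post counts.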

Post Engagement

Engagement score formula: log2(upvotes) + log2(comments) × 1.5

Comments (discussion) are weighted 1.5× as heavily as upvotes, so a post with 10 upvotes and 20 comments (score ≈ 9.8) ranks above one with 50 upvotes and 2 comments (score ≈ 7.1). The chart shows the 30 most-engaged posts with their contraceptive types, upvote counts, and comment counts.
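The formula can be checked with a few lines of code. How posts with zero upvotes or zero comments are handled is not stated, so the floor of 1 below is an assumption that simply keeps log2 defined.

```python
import math


def engagement_score(upvotes, comments):
    """log2(upvotes) + 1.5 * log2(comments), with counts floored at 1."""
    # Assumption: counts below 1 are treated as 1, contributing 0 to the score.
    return math.log2(max(upvotes, 1)) + 1.5 * math.log2(max(comments, 1))


discussed = engagement_score(10, 20)  # about 9.80
upvoted = engagement_score(50, 2)     # about 7.14
```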

Engagement × Contraceptive Type

Post Explorer

Browse raw posts and comments: Select a contraceptive type to see the top posts (sorted by engagement score) whose own title or body text mentions it. Posts are sourced from all 18 subreddits. Each post shows:

Subreddit badge — cyan pill showing which subreddit the post came from
Sentiment badge — green (+), red (-), or gray (neutral/none), based on the keyword scorer
Engagement score — composite metric: log2(upvotes) + log2(comments) × 1.5, weighting discussion higher than votes
Side-effect pills — yellow tags for each side effect detected in that post's text
View comments — expands to show scraped Reddit comments with their own sentiment scores

Post text is the original Reddit selftext (body). Up to 200 comments are fetched per post by walking the full reply tree, and comments are scraped only for posts whose own text mentions a contraceptive. Cross-posts are stored, but their mentions are not double-counted. Comments are analyzed for side effects and sentiment but do not affect the post's mention attribution.
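Walking a reply tree with a cap on the number of comments might look like the sketch below. The nested-dict shape ("body", "replies") and the breadth-first order are assumptions; the actual traversal order is not described.

```python
from collections import deque


def collect_comments(top_level, limit=200):
    """Collect up to `limit` comment bodies, breadth-first over the reply tree."""
    collected = []
    queue = deque(top_level)
    while queue and len(collected) < limit:
        comment = queue.popleft()
        collected.append(comment["body"])
        queue.extend(comment.get("replies", []))
    return collected
```

With a breadth-first walk, hitting the cap drops the deepest replies first; a depth-first walk would instead favor the earliest threads.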

Literacy Gaps × Contraceptive Heatmap

Scrape Log

Human Validation of Health Literacy Gap Detection

Literacy Gap Validation Statistics

Human Validation of Side Effect Detection

Side Effect Validation Statistics

Human Validation of Sentiment Analysis

Sentiment Validation Statistics

Suggestion Box