How mentions are counted: Posts are scraped from 19 subreddits: r/birthcontrol (primary, 200 posts), plus r/contraception, r/TwoXChromosomes, r/AskWomen, r/AskDocs, r/PCOS, r/TryingForABaby, r/BabyBumps, r/childfree, r/sex (200 each), and r/WomensHealth, r/obgyn, r/Healthyhooha, r/Periods, r/Endo, r/abortion, r/prochoice, r/prolife, r/FAMnNFP (100 each). Both /new and /hot sort orders are scraped per sub for broader coverage. Each post title and body text is scanned against 45 regex patterns — one per contraceptive type, including common misspellings and slang (e.g., "merina" → Mirena, "paraguard" → Paragard, "copper T" → Paragard, "bc shot" → Depo-Provera). Mentions are counted at the post level only — a post is counted for a contraceptive type if the post's own title or body text matches the regex. Comment text is analyzed separately for side effects and sentiment but does not add to the mentions table. The "IUD (general)" category uses a negative lookahead to avoid double-counting when a specific brand is also named. Cross-posted content is detected and deduplicated — a cross-post's mentions are not counted separately from its parent. Counts reflect the number of unique posts mentioning each type, not total word occurrences. A post that discusses multiple methods (e.g., "switching from Mirena to Nexplanon") is counted once for each type it mentions.
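The post-level counting described above can be sketched as follows. This is a minimal illustration, not the dashboard's actual code: the regex fragments, the `crosspost_parent` field, and the function names are assumptions, and only a handful of the 45 types are shown. The "IUD (general)" pattern uses a negative lookahead anchored at the start of the text so that it fails whenever a specific IUD brand appears anywhere in the post.

```python
import re
from collections import Counter

# Illustrative fragments only; the real pattern list is not shown in the text.
BRAND_IUDS = {
    "Mirena": r"m[ie]r[ie]na",              # also catches "merina"
    "Paragard": r"paragu?ard|copper\s+t",   # also catches "paraguard", "copper T"
    "Kyleena": r"kyleena",
}
OTHER_TYPES = {
    "Depo-Provera": r"depo(?:[\s-]?provera)?|bc\s+shot",
    "Nexplanon": r"nexplanon",
}

PATTERNS = {
    name: re.compile(rf"\b(?:{frag})\b", re.IGNORECASE)
    for name, frag in {**BRAND_IUDS, **OTHER_TYPES}.items()
}
any_brand = "|".join(BRAND_IUDS.values())
# Negative lookahead: match "IUD" only if no brand name occurs in the whole text.
PATTERNS["IUD (general)"] = re.compile(
    rf"^(?![\s\S]*\b(?:{any_brand})\b)[\s\S]*\biuds?\b", re.IGNORECASE
)


def types_in_post(title, body):
    """Return the set of types matched by the post's own title or body."""
    text = f"{title}\n{body}"
    return {name for name, pat in PATTERNS.items() if pat.search(text)}


def count_mentions(posts):
    """Count unique posts per type, skipping duplicates and cross-posts."""
    counts = Counter()
    seen = set()
    for post in posts:
        # "crosspost_parent" is a hypothetical marker for cross-posted content.
        if post["id"] in seen or post.get("crosspost_parent"):
            continue
        seen.add(post["id"])
        counts.update(types_in_post(post["title"], post.get("body", "")))
    return counts
```

Because `types_in_post` returns a set, a post that repeats "Mirena" five times still adds one to the Mirena count, while a post naming both Mirena and Nexplanon adds one to each.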
Sentiment by Type
Keyword-based sentiment scoring: Each post and comment (across all 19 subreddits) is scored from -1.0 (very negative) to +1.0 (very positive) using curated word lists:
~40 positive words (love, recommend, effective, relief, helped, works, etc.)
~50 negative words (hate, pain, nightmare, frustrated, scared, bleeding, etc.)
Negators (not, never, don't, etc.) flip the next word's polarity — "not painful" counts as positive
Intensifiers (very, extremely, so) multiply the next word's weight by 1.5x
Score formula: (positive - negative) / total_sentiment_words, clamped to [-1, 1]. The chart shows the average score across all posts mentioning each type. Posts with no sentiment words score NULL and are excluded, so the post count here may be lower than in the Mentions chart (e.g., a title-only post like "Copper IUD and Slynd at the same time" contains no positive/negative words and is omitted). Use the subreddit filter to compare sentiment across communities. The score is best suited to comparing trends between types, not to interpreting individual posts: it cannot detect sarcasm, context, or complex phrasing.
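The scoring rules above can be sketched roughly as below. The word lists are small illustrative subsets ("painful" is assumed to be on the negative list because of the "not painful" example), and the tokenizer and the way a modifier carries over to the next non-modifier word are assumptions rather than the dashboard's confirmed behavior.

```python
import re

# Illustrative subsets of the curated lists (assumed, not the full lists).
POSITIVE = {"love", "recommend", "effective", "relief", "helped", "works"}
NEGATIVE = {"hate", "pain", "painful", "nightmare", "frustrated", "scared", "bleeding"}
NEGATORS = {"not", "never", "don't"}
INTENSIFIERS = {"very", "extremely", "so"}


def sentiment_score(text):
    """Return a score in [-1, 1], or None when no sentiment words are found."""
    positive = negative = 0.0
    sentiment_words = 0
    flip, weight = False, 1.0
    for token in re.findall(r"[a-z']+", text.lower()):
        if token in NEGATORS:
            flip = True          # flips the polarity of the next word
            continue
        if token in INTENSIFIERS:
            weight = 1.5         # boosts the weight of the next word
            continue
        polarity = 1 if token in POSITIVE else -1 if token in NEGATIVE else 0
        if polarity:
            if flip:
                polarity = -polarity
            if polarity > 0:
                positive += weight
            else:
                negative += weight
            sentiment_words += 1
        flip, weight = False, 1.0  # modifiers apply to one following word only
    if sentiment_words == 0:
        return None                # treated as NULL and excluded from averages
    raw = (positive - negative) / sentiment_words
    return max(-1.0, min(1.0, raw))
```

The clamp matters because an intensified word contributes 1.5 to the numerator but only 1 to the denominator, so "very effective" would otherwise score 1.5.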
Mentions Over Time
Daily mention counts: Shows the top 5 most-mentioned contraceptive types over time, aggregated across all 19 subreddits (or filtered by one). Each data point is the number of unique posts from that day containing a regex match for that type. Days with zero posts are not shown. The time and subreddit filters control the data displayed.
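A minimal sketch of the daily aggregation, assuming each post record carries a Unix `created_utc` timestamp and a precomputed `types` list (both field names are hypothetical). Days on which a type has no posts are simply left out of its series.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def daily_series(posts, top_n=5):
    """Map each of the top_n types to {ISO date: unique post count}."""
    totals = Counter()
    per_day = defaultdict(Counter)
    for post in posts:
        day = datetime.fromtimestamp(post["created_utc"], tz=timezone.utc).date().isoformat()
        for contraceptive in set(post["types"]):  # one count per post per type
            totals[contraceptive] += 1
            per_day[day][contraceptive] += 1
    top = [name for name, _ in totals.most_common(top_n)]
    return {
        name: {day: counts[name] for day, counts in sorted(per_day.items()) if counts[name] > 0}
        for name in top
    }
```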
Side Effect Heatmap
Contraceptive × side-effect matrix: Data is sourced from 19 subreddits. Each cell shows the number of posts and comments that mention both a contraceptive type and a side effect. Side effects are detected using 24 regex patterns matching symptoms like "bleeding," "cramping," "weight gain," "anxiety," etc. Color intensity scales linearly from transparent (0) to red (max value in the table). Cross-posted content is deduplicated. Limitation: A post saying "I'm worried about weight gain" and one saying "I had no weight gain" both count — the regex detects the mention, not the context. Top 12 contraceptives and top 15 side effects are shown.
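The cell counts and the linear color scale can be illustrated with the sketch below. It assumes each post or comment has already been tagged with lists of detected `types` and `side_effects` (hypothetical field names), that the "top" rows and columns are ranked by their totals, and that red is rendered as `rgba(255, 0, 0, alpha)`; none of these details are confirmed by the text.

```python
from collections import Counter


def cooccurrence(items):
    """Count items that mention both a contraceptive type and a side effect."""
    cells = Counter()
    for item in items:
        for method in set(item["types"]):
            for effect in set(item["side_effects"]):
                cells[(method, effect)] += 1
    return cells


def top_labels(cells, axis, n):
    """Rank row labels (axis=0) or column labels (axis=1) by their totals."""
    totals = Counter()
    for key, count in cells.items():
        totals[key[axis]] += count
    return [label for label, _ in totals.most_common(n)]


def cell_color(value, max_value):
    """Linear scale from transparent (0) to fully opaque red (max_value)."""
    alpha = value / max_value if max_value else 0.0
    return f"rgba(255, 0, 0, {alpha:.2f})"
```

For the chart described above, the rows would be `top_labels(cells, 0, 12)` and the columns `top_labels(cells, 1, 15)`.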
Top Side Effects / Worries
Ranked side-effect mentions: Counts the number of unique posts and comments (across all 19 subreddits) that match each of the 24 side-effect regex patterns. A single post mentioning "cramps" and "bleeding" counts once for each category. Patterns include variations (e.g., "headache" and "migraine" both map to "Headaches"). This tracks what people are talking about, not necessarily what they experienced — questions, fears, and reports all count equally.
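As a rough sketch of how variant spellings might map to one category, the example below uses four hypothetical patterns in place of the 24 real ones. Each text adds at most one to a category regardless of how many times the symptom is repeated, and a negated mention still counts.

```python
import re
from collections import Counter

# Hypothetical patterns; the dashboard's actual 24 patterns are not shown.
SIDE_EFFECTS = {
    "Headaches": re.compile(r"\b(?:headaches?|migraines?)\b", re.IGNORECASE),
    "Cramping": re.compile(r"\bcramp(?:s|ing)?\b", re.IGNORECASE),
    "Bleeding": re.compile(r"\bbleed(?:s|ing)?\b", re.IGNORECASE),
    "Weight gain": re.compile(r"\bweight\s+gain\b", re.IGNORECASE),
}


def rank_side_effects(texts):
    """Return (category, unique text count) pairs, most mentioned first."""
    counts = Counter()
    for text in texts:
        counts.update(name for name, pat in SIDE_EFFECTS.items() if pat.search(text))
    return counts.most_common()
```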
Health Literacy Gaps
Health literacy gap detection: Posts and comments are scanned against 20 proximity-bounded regex patterns sourced from WHO, ACOG, CDC, and Planned Parenthood myth lists, in 3 categories:
Safety (IUDs cause abortions, BC causes infertility, Plan B is abortion, BC causes cancer, hormonal BC permanently changes you, IUD gets lost in body, need kids before IUD, copper IUD is toxic)
Efficacy (pulling out equally effective, condoms don't work, can't get pregnant on period/breastfeeding/first time, natural methods equally effective, douching prevents pregnancy, two condoms are safer, antibiotics make all BC fail, spermicide alone is reliable)
Mechanism (need breaks from pill, must take pill at exact same time)
Each pattern requires co-occurrence of two concepts within a bounded character window (typically 30–80 characters) to prevent false positives from distant, unrelated mentions. Patterns are tuned for specificity: e.g., “Plan B is abortion” requires equating language (not just co-mention), “IUD gets lost in body” requires migration verbs (not the word “lost” alone).
Stance detection: Each match is classified by analyzing a 150-character context window and sentence boundaries around the regex match for linguistic cues:
asserting (definitive language: “trust me,” “causes,” “obviously”)
questioning (uncertainty: “is it true,” “I heard,” question marks)
debunking (corrective: “myth,” “not true,” “misconception,” “no evidence”)
unclear (insufficient context)
Bar chart segments are colored by stance when available.
Sources: WHO Family Planning Handbook, ACOG Practice Bulletins, CDC MMWR, Planned Parenthood.
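The sketch below illustrates one proximity-bounded pattern ("BC causes infertility") and a simple cue-based stance classifier. The regex, the 60-character gap, the 150-characters-on-each-side reading of the context window, and the cue priority (debunking, then questioning, then asserting) are all assumptions made for illustration; sentence-boundary handling is omitted.

```python
import re

# Two concepts must co-occur within 60 characters (assumed window size).
CLAIM = re.compile(
    r"\b(?:birth control|the pill|bc)\b.{0,60}?\b(?:infertil\w*|sterile)\b",
    re.IGNORECASE,
)

# Cue lists taken from the description above; matching is plain substring search.
DEBUNKING = ("myth", "not true", "misconception", "no evidence")
QUESTIONING = ("is it true", "i heard", "?")
ASSERTING = ("trust me", "causes", "obviously")


def detect_stance(text, window=150):
    """Return a stance label for the first claim match, or None if no match."""
    match = CLAIM.search(text)
    if match is None:
        return None
    context = text[max(0, match.start() - window): match.end() + window].lower()
    if any(cue in context for cue in DEBUNKING):
        return "debunking"
    if any(cue in context for cue in QUESTIONING):
        return "questioning"
    if any(cue in context for cue in ASSERTING):
        return "asserting"
    return "unclear"
```

Checking debunking cues first keeps a sentence like "It's a myth that birth control causes infertility" from being labeled as asserting just because it contains "causes".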
Literacy Gaps × Contraceptive Heatmap
Contraceptive × literacy gaps matrix: Each cell shows the number of posts and comments that mention both a contraceptive type and a common misconception or knowledge gap. Color intensity scales linearly from transparent (0) to orange (max value). Top contraceptives and all claims with data are shown.
Questions Being Asked
Question detection: Post titles are scanned for question indicators (contains "?" or starts with question words like "has anyone", "is it", "does", "how", "what", etc.). Detected questions are categorized into 6 types by matching the title + body against keyword patterns: Side effects/Safety (side effects, safety, specific symptoms), Effectiveness (effectiveness, pregnancy risk, failure rates), Usage (how to use, missed doses, insertion/removal), Access/Cost (cost, insurance, prescriptions, clinics), Experience (personal experiences, recommendations, advice), Switching (changing methods, alternatives). A question can belong to multiple categories. Limitation: Only post titles are checked for question detection; body text is used for categorization only.
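A minimal sketch of the two-stage process, assuming simplified keyword patterns for the six categories (the real keyword lists are not given). It reproduces the stated limitation: a post whose question appears only in the body is not detected.

```python
import re

QUESTION_START = re.compile(r"^\s*(?:has anyone|is it|does|how|what)\b", re.IGNORECASE)

# Hypothetical keyword patterns for the six categories.
CATEGORIES = {
    "Side effects/Safety": re.compile(r"\b(?:side effects?|safe(?:ty)?|cramp\w*|bleeding)\b", re.I),
    "Effectiveness": re.compile(r"\b(?:effective(?:ness)?|pregnan\w*|failure rates?)\b", re.I),
    "Usage": re.compile(r"\b(?:how to|missed|insertion|removal)\b", re.I),
    "Access/Cost": re.compile(r"\b(?:cost|insurance|prescriptions?|clinics?)\b", re.I),
    "Experience": re.compile(r"\b(?:experiences?|recommend\w*|advice)\b", re.I),
    "Switching": re.compile(r"\b(?:switch\w*|alternatives?)\b", re.I),
}


def is_question(title):
    """Only the title is checked for question indicators."""
    return "?" in title or bool(QUESTION_START.search(title))


def categorize(title, body):
    """Return every category whose keywords appear in the title or body."""
    if not is_question(title):
        return set()
    text = f"{title}\n{body}"
    return {name for name, pat in CATEGORIES.items() if pat.search(text)}
```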
Questions × Contraceptive Heatmap
Contraceptive × question category matrix: Each cell shows the number of posts that are questions mentioning both a contraceptive type and a question category. Color intensity scales linearly from transparent (0) to teal (max value). Top contraceptives and all question categories are shown.
Reproductive Health Topics
Broader health context: Posts are scanned for 12 reproductive health themes discussed alongside contraception: Pregnancy/Fertility, Menstrual Health, STI/STD, Mental Health, PCOS, Endometriosis, Sexual Health, Weight/Body, Hormonal Balance, Postpartum, Perimenopause, Abortion. A post can match multiple topics. Detection uses keyword regex matching on post title + body text.
Topics × Contraceptive Heatmap
Contraceptive × health topic matrix: Each cell shows the number of posts that mention both a contraceptive type and a reproductive health topic. Color intensity scales linearly from transparent (0) to green (max value). Top contraceptives and all topics with data are shown.
User Demographics
Self-reported age, gender, and location extraction: Reddit users commonly self-report demographics in posts (e.g., "I'm 23F", "(25F)", "26 year old woman"). The system scans the first 500 characters of each post through 7 ordered regex patterns (most specific first): 1. "I'm 23F" / "I am 23F", 2. "(23F)" / "[23F]", 3. "23F here", 4. "23 year old female/woman/man", 5. "I'm a 23 year old" (age only), 6. "I'm female" / "I'm a woman" (gender only), 7. "I'm 23 years old" (age only). A third-person filter skips matches preceded by "my girlfriend", "my partner", etc. within 30 characters. Age range: 13–65. Gender maps to female/male/null. Location: Separate regex patterns detect US (state names, "the US/USA", state abbreviations) vs. non-US (country names, nationalities, "my GP/NHS") from first-person statements. Coverage is lower than age/gender since most posts don't mention location. Posts only (not comments). Limitation: Non-binary genders map to null.
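The ordered age and gender extraction can be sketched as below. The exact regexes are assumptions written to match the seven example phrasings, the third-person list contains only the two phrases named above, and location detection is left out.

```python
import re

I_AM = r"\bI(?:['’]m| am)"
# Ordered most specific first, mirroring the seven pattern descriptions.
DEMOGRAPHIC_PATTERNS = [
    re.compile(I_AM + r"\s+(?P<age>\d{2})\s?(?P<g>[FM])\b", re.I),            # I'm 23F
    re.compile(r"[(\[](?P<age>\d{2})\s?(?P<g>[FM])[)\]]", re.I),              # (23F), [23F]
    re.compile(r"\b(?P<age>\d{2})\s?(?P<g>[FM])\s+here\b", re.I),             # 23F here
    re.compile(r"\b(?P<age>\d{2})[\s-]year[\s-]old\s+(?P<g>female|woman|male|man)\b", re.I),
    re.compile(I_AM + r"\s+a\s+(?P<age>\d{2})[\s-]year[\s-]old\b", re.I),     # age only
    re.compile(I_AM + r"\s+(?:a\s+)?(?P<g>female|woman|male|man)\b", re.I),   # gender only
    re.compile(I_AM + r"\s+(?P<age>\d{2})\s+years?\s+old\b", re.I),           # age only
]
THIRD_PERSON = re.compile(r"\bmy\s+(?:girlfriend|partner)\b", re.I)
GENDER = {"f": "female", "female": "female", "woman": "female",
          "m": "male", "male": "male", "man": "male"}


def extract_demographics(text):
    """Return {'age': int|None, 'gender': str|None} from the first 500 chars."""
    head = text[:500]
    for pattern in DEMOGRAPHIC_PATTERNS:
        for match in pattern.finditer(head):
            # Skip matches that follow a third-person reference within 30 chars.
            if THIRD_PERSON.search(head[max(0, match.start() - 30): match.start()]):
                continue
            groups = match.groupdict()
            age = int(groups["age"]) if groups.get("age") else None
            if age is not None and not 13 <= age <= 65:
                continue
            gender = GENDER.get((groups.get("g") or "").lower())
            return {"age": age, "gender": gender}
    return {"age": None, "gender": None}
```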
Age × Contraceptive Heatmap
Age band × contraceptive matrix: Each cell shows the number of posts that mention a contraceptive type and where the poster self-reported an age in that band. Color intensity scales linearly from transparent (0) to purple (max value). 7 age bands: 13–17, 18–22, 23–27, 28–32, 33–37, 38–42, 43+.
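Since six of the seven bands are five years wide and start at 13, the band for a given age can be computed arithmetically. The helper below is a hypothetical illustration of that mapping; the function name is not taken from the dashboard.

```python
def age_band(age):
    """Map a self-reported age to one of the seven bands, or None."""
    if age is None or age < 13:
        return None
    if age >= 43:
        return "43+"
    low = 13 + 5 * ((age - 13) // 5)  # 13, 18, 23, 28, 33, or 38
    return f"{low}–{low + 4}"
```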
Category Breakdown
Grouped contraceptive categories: Individual types are grouped into 8 categories: IUDs (Mirena, Kyleena, Liletta, Skyla, Paragard, IUD general), Pills (Combined, Mini, general, Slynd, Opill, Yaz, Lo Loestrin, Ortho Tri-Cyclen, Junel, Seasonique, Sprintec, Alesse/Aviane, Nordette/Portia, Apri/Desogen, Nextstellis, Natazia), Long-acting (Nexplanon, Depo-Provera), Barrier/Other (Condoms, NuvaRing, Xulane patch, Spermicide, Diaphragm, Phexxi), Emergency (Plan B), FAM/NFP (Billings, BBT, Symptothermal, Standard Days, Calendar/Rhythm, LAM, FAM/NFP general), FAM/NFP (app-based) (Natural Cycles, Kindara, Read Your Body, Tempdrop, Daysy, Ava bracelet, Oura, Femometer), Withdrawal. The chart sums mention counts within each group.
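Summing within groups amounts to a lookup from type to category. The sketch below shows a partial mapping drawn from the list above; the dictionary and function names are illustrative, and types missing from the mapping are skipped here for brevity.

```python
from collections import Counter

# Partial mapping taken from the category list; the full mapping has more types.
CATEGORY_OF = {
    "Mirena": "IUDs", "Kyleena": "IUDs", "Paragard": "IUDs", "IUD (general)": "IUDs",
    "Nexplanon": "Long-acting", "Depo-Provera": "Long-acting",
    "Condoms": "Barrier/Other", "NuvaRing": "Barrier/Other",
    "Plan B": "Emergency",
    "Natural Cycles": "FAM/NFP (app-based)",
    "Withdrawal": "Withdrawal",
}


def category_totals(type_counts):
    """Sum per-type mention counts into their parent categories."""
    totals = Counter()
    for name, count in type_counts.items():
        category = CATEGORY_OF.get(name)
        if category is not None:  # unmapped types are ignored in this sketch
            totals[category] += count
    return dict(totals)
```

Because the chart sums per-type counts, a single post that mentions both Mirena and Kyleena contributes two to the IUDs total.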
Engagement score: Discussion (comments) is weighted 1.5× higher than upvotes. A post with 10 upvotes and 20 comments scores higher than one with 50 upvotes and 2 comments. The list shows the top 30 most-engaged posts with their contraceptive types, upvote counts, and comment counts.
Engagement × Contraceptive Type
Post Explorer
Browse raw posts and comments: Select a contraceptive type to see the top posts (sorted by engagement score) whose own title or body text mentions it. Posts are sourced from all 19 subreddits. Each post shows:
Subreddit badge — cyan pill showing which subreddit the post came from
Sentiment badge — green (+), red (-), or gray (neutral/none), based on the keyword scorer
Side-effect pills — yellow tags for each side effect detected in that post's text
View comments — expands to show scraped Reddit comments with their own sentiment scores
Post text is the original Reddit selftext (body). Comments are fetched up to 200 per post, walking the full reply tree. Only posts with contraceptive mentions in their own text have their comments scraped. Cross-posts are stored but their mentions are not double-counted. Comments are analyzed for side effects and sentiment but do not affect the post's mention attribution.
Human Validation of Health Literacy Gap Detection
Rate posts for common health misconceptions and knowledge gaps. Your votes are compared against the system's detection to compute precision, recall, and agreement metrics.
Use "Unsure" when a post is ambiguous or lacks context.
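One way the comparison between votes and system detections could be computed is sketched below. It assumes each vote is "yes", "no", or "unsure", that "unsure" votes are left out of the metrics, and that agreement means the share of rated items where the human and the system match; the text does not confirm any of these choices.

```python
def validation_metrics(pairs):
    """pairs: iterable of (human_vote, system_flagged) tuples."""
    tp = fp = fn = tn = 0
    for vote, flagged in pairs:
        if vote == "unsure":
            continue  # assumed to be excluded from all metrics
        present = vote == "yes"
        if present and flagged:
            tp += 1
        elif flagged:
            fp += 1
        elif present:
            fn += 1
        else:
            tn += 1
    rated = tp + fp + fn + tn
    return {
        "precision": tp / (tp + fp) if tp + fp else None,
        "recall": tp / (tp + fn) if tp + fn else None,
        "agreement": (tp + tn) / rated if rated else None,
    }
```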
Human Validation of Side Effect Detection
Review comments and select which side effects are mentioned. Use "Not Relevant" if the post isn't about contraceptive side effects, or "Unsure" if you need more context.
Human Validation of Sentiment Analysis
Read each comment and rate its overall sentiment on a 5-point scale. Use "Not Relevant" for posts that aren't about contraception or lack context. Trigger words are highlighted to show what the system picked up on.
Suggestion Box
Suggest features or improvements. Upvote ideas you like — the most popular rise to the top.