Raw Reddit posts and comments saved in SQLite.
Data Status
Current generated dataset.
GitHub Actions crawls Reddit candidates, parses pending items, rescreens flagged reports, rebuilds the static JSON bundles, and publishes the Pages site. Counts below reflect the SQLite database at build time, including the backlog still waiting for OpenAI extraction.
Parsed: Candidates already read by the extraction model.
Pending: Candidates waiting for the next parse job.
Errors: Rows that need retry or manual inspection.
Reports found: Canonical drug reports pulled from parsed items.
Plottable: Reports that pass duration and attribution filters.
Posts: Original Reddit posts.
Comments: Reddit comments matching the search terms.
The queue is intentionally visible. A candidate enters the database when the crawler finds a Reddit post or comment matching a drug name, brand name, or shorthand term. It becomes parsed only after the one-item LLM pass has read it and written structured reports. A single candidate can mention more than one drug family, so family-level counts do not have to add up to the downloaded total.
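To make the overlap concrete, here is a minimal sketch of term matching. The term lists and the `matched_families` helper are assumptions for illustration; the crawler's actual search terms and matching rules are not shown on this page.

```python
import re
from collections import Counter

# Hypothetical term lists: a drug name, brand names, and a shorthand term.
FAMILY_TERMS = {
    "Retatrutide": ["retatrutide", "reta"],
    "Tirzepatide": ["tirzepatide", "mounjaro", "zepbound"],
    "Semaglutide": ["semaglutide", "ozempic", "wegovy"],
}


def matched_families(text: str) -> set:
    """Return every family with at least one term present as a whole word."""
    lowered = text.lower()
    return {
        family
        for family, terms in FAMILY_TERMS.items()
        if any(re.search(rf"\b{re.escape(term)}\b", lowered) for term in terms)
    }


# Made-up candidate texts, not taken from the dataset.
candidates = [
    "Switched from Ozempic to Mounjaro last month",
    "Week 8 on reta, down a lot",
    "Wegovy shortage again",
]

# Each candidate is counted once per family it matches.
family_counts = Counter(
    family for text in candidates for family in matched_families(text)
)
print(family_counts)
print(sum(family_counts.values()), "family matches from", len(candidates), "candidates")
```

The first candidate counts toward two families, so the per-family counts sum to more than the number of candidates.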
At build time, 8,056 candidates are still waiting for parsing. No parsed posts are waiting for a stronger rescreen, and 59 reports with side-effect phrases are waiting for the side-effect severity pass.
Drug-family parse queue
These counts come from matched Reddit search terms. One post or comment can match more than one family.
| Matched family | Downloaded | Parsed | Pending | Errors | Reports found | Plottable |
|---|---|---|---|---|---|---|
| Retatrutide | 5,557 | 4,307 | 1,250 | 0 | 3,878 | 71 |
| Tirzepatide | 14,065 | 6,685 | 7,380 | 0 | 5,698 | 214 |
| Semaglutide | 9,203 | 5,326 | 3,877 | 0 | 4,166 | 152 |
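
The rows above can be sanity-checked with a little arithmetic. This sketch assumes that Parsed, Pending, and Errors are mutually exclusive states that together cover every downloaded candidate; the page does not state that explicitly, but the figures are consistent with it.

```python
# Figures copied from the table above: (downloaded, parsed, pending, errors).
rows = {
    "Retatrutide": (5557, 4307, 1250, 0),
    "Tirzepatide": (14065, 6685, 7380, 0),
    "Semaglutide": (9203, 5326, 3877, 0),
}


def is_balanced(downloaded: int, parsed: int, pending: int, errors: int) -> bool:
    """Assumed invariant: every downloaded candidate is in exactly one state."""
    return downloaded == parsed + pending + errors


for family, row in rows.items():
    print(family, "balanced" if is_balanced(*row) else "MISMATCH")

# Summing Pending across families double-counts candidates that match
# more than one family.
family_pending_total = sum(row[2] for row in rows.values())
print("Sum of family-level pending counts:", family_pending_total)
```

The family-level pending counts sum to 12,507, more than the 8,056 candidates reported as waiting for parsing, because a candidate that matches two families appears in both rows.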
Downloaded Reddit sources
Raw candidates by subreddit, split into submissions and comments, with parse queue state.
| Subreddit | Downloaded | Posts | Comments | Parsed | Pending | Errors | Oldest | Newest |
|---|---|---|---|---|---|---|---|---|
| Semaglutide | 9,865 | 3,522 | 6,343 | 4,956 | 4,909 | 0 | 2022-04-17 | 2025-05-19 |
| Mounjaro | 6,640 | 1,952 | 4,688 | 3,493 | 3,147 | 0 | 2022-05-18 | 2025-05-19 |
| Retatrutide | 4,184 | 2,174 | 2,010 | 4,184 | 0 | 0 | 2023-06-18 | 2025-05-19 |
| MounjaroMaintenance | 273 | 78 | 195 | 273 | 0 | 0 | 2023-05-16 | 2025-05-16 |
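
A per-subreddit summary like this one can be produced with a single grouped query. The sketch below is not the site's build code: the `candidates` table, its column names, and the status values are hypothetical, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema; the real database layout is not documented here.
conn.execute(
    """
    CREATE TABLE candidates (
        id INTEGER PRIMARY KEY,
        subreddit TEXT NOT NULL,
        kind TEXT NOT NULL,         -- 'submission' or 'comment'
        status TEXT NOT NULL,       -- 'parsed', 'pending', or 'error'
        created_date TEXT NOT NULL  -- ISO date, so MIN/MAX sort correctly
    )
    """
)
conn.executemany(
    "INSERT INTO candidates (subreddit, kind, status, created_date) "
    "VALUES (?, ?, ?, ?)",
    [
        ("Semaglutide", "submission", "parsed", "2022-04-17"),
        ("Semaglutide", "comment", "pending", "2025-05-19"),
        ("Mounjaro", "comment", "parsed", "2022-05-18"),
    ],
)

# SQLite evaluates comparisons to 0 or 1, so SUM(condition) counts matches.
SUMMARY_SQL = """
SELECT subreddit,
       COUNT(*)                   AS downloaded,
       SUM(kind = 'submission')   AS posts,
       SUM(kind = 'comment')      AS comments,
       SUM(status = 'parsed')     AS parsed,
       SUM(status = 'pending')    AS pending,
       SUM(status = 'error')      AS errors,
       MIN(created_date)          AS oldest,
       MAX(created_date)          AS newest
FROM candidates
GROUP BY subreddit
ORDER BY downloaded DESC
"""

summary = conn.execute(SUMMARY_SQL).fetchall()
for row in summary:
    print(row)
```

Storing dates as ISO strings keeps `MIN` and `MAX` correct without any date parsing, which is why the sketch uses that format.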
Weight-change plot summary
| Drug family | Reports found | Plottable reports | Median duration | Median change |
|---|---|---|---|---|
| Retatrutide | 3,878 | 71 | 8.7 weeks | -12.0 kg |
| Tirzepatide | 5,698 | 214 | 26.1 weeks | -19.1 kg |
| Semaglutide | 4,166 | 152 | 26.1 weeks | -14.8 kg |
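
As a rough illustration of how the medians could be derived, the sketch below filters reports and then takes per-family medians. The `Report` fields, the `MIN_WEEKS` threshold, and the `attributed` flag are hypothetical stand-ins for the duration and attribution filters, whose real rules are not described here. The sample values are invented.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass
class Report:
    family: str
    duration_weeks: Optional[float]
    change_kg: Optional[float]
    attributed: bool  # hypothetical: the change is attributed to this drug


MIN_WEEKS = 1.0  # hypothetical duration threshold


def is_plottable(report: Report) -> bool:
    """Keep reports with a usable duration, a weight change, and attribution."""
    return (
        report.duration_weeks is not None
        and report.change_kg is not None
        and report.duration_weeks >= MIN_WEEKS
        and report.attributed
    )


def summarize(reports: list) -> dict:
    """Group plottable reports by family and compute the medians."""
    by_family = {}
    for report in reports:
        if is_plottable(report):
            by_family.setdefault(report.family, []).append(report)
    return {
        family: {
            "plottable": len(items),
            "median_weeks": median([r.duration_weeks for r in items]),
            "median_kg": median([r.change_kg for r in items]),
        }
        for family, items in by_family.items()
    }


# Invented sample reports, for illustration only.
sample = [
    Report("Tirzepatide", 12.0, -8.0, True),
    Report("Tirzepatide", 30.0, -20.0, True),
    Report("Tirzepatide", 20.0, -15.0, True),
    Report("Tirzepatide", None, -5.0, True),    # dropped: no duration
    Report("Semaglutide", 10.0, -6.0, False),   # dropped: not attributed
]
print(summarize(sample))
```

Filtering before taking the median explains why the plottable count is so much smaller than the number of reports found: most reports lack one of the fields the plot needs.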
Crawler page progress
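Each search combination tracks how many source pages have been fetched, whether it is exhausted, and whether recent errors were recorded. Each row aggregates the combinations for one subreddit, type, and window, so the Exhausted and Recent errors columns are counts of combinations.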
| Subreddit | Type | Window | Searches | Pages fetched | Exhausted | Recent errors | Updated |
|---|---|---|---|---|---|---|---|
| Mounjaro | comment | historical | 7 | 56 | 0 | 1 | 2026-07-04T14:17:01+00:00 |
| Mounjaro | submission | historical | 7 | 35 | 0 | 1 | 2026-07-04T14:24:59+00:00 |
| MounjaroMaintenance | comment | historical | 2 | 3 | 0 | 1 | 2026-07-01T22:54:09+00:00 |
| MounjaroMaintenance | submission | historical | 2 | 4 | 0 | 0 | 2026-07-01T22:46:07+00:00 |
| Retatrutide | comment | historical | 3 | 21 | 0 | 1 | 2026-07-04T04:14:44+00:00 |
| Retatrutide | submission | historical | 3 | 26 | 0 | 0 | 2026-07-04T04:05:39+00:00 |
| Retatrutide | comment | recent | 48 | 48 | 48 | 0 | 2026-07-05T07:02:06+00:00 |
| Retatrutide | submission | recent | 48 | 48 | 48 | 0 | 2026-07-05T07:01:58+00:00 |
| RetatrutideTrial | submission | historical | 1 | 0 | 0 | 1 | 2026-07-02T14:53:21+00:00 |
| RetatrutideTrial | comment | recent | 17 | 16 | 16 | 1 | 2026-07-05T07:08:02+00:00 |
| RetatrutideTrial | submission | recent | 20 | 17 | 17 | 3 | 2026-07-05T07:16:03+00:00 |
| Semaglutide | comment | historical | 9 | 79 | 1 | 1 | 2026-07-05T04:55:48+00:00 |
| Semaglutide | submission | historical | 10 | 58 | 1 | 1 | 2026-07-05T04:47:32+00:00 |
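
One way to read this table is as a roll-up of per-search progress records. The sketch below assumes that reading: `SearchProgress`, its field names, and the `rollup` helper are hypothetical, and the sample records are invented rather than taken from the crawler's state.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SearchProgress:
    subreddit: str
    kind: str           # 'submission' or 'comment'
    window: str         # 'historical' or 'recent'
    term: str           # hypothetical: the search term for this combination
    pages_fetched: int
    exhausted: bool
    recent_error: bool
    updated: str        # ISO 8601 timestamp in UTC


def rollup(searches: list) -> list:
    """Aggregate search combinations into one row per subreddit/type/window."""
    groups = defaultdict(list)
    for s in searches:
        groups[(s.subreddit, s.kind, s.window)].append(s)

    rows = []
    for (subreddit, kind, window), items in sorted(groups.items()):
        rows.append({
            "subreddit": subreddit,
            "type": kind,
            "window": window,
            "searches": len(items),
            "pages_fetched": sum(s.pages_fetched for s in items),
            "exhausted": sum(s.exhausted for s in items),
            "recent_errors": sum(s.recent_error for s in items),
            # Same-offset ISO timestamps compare correctly as strings.
            "updated": max(s.updated for s in items),
        })
    return rows


# Invented progress records, for illustration only.
searches = [
    SearchProgress("ExampleSub", "comment", "recent", "term-a", 1, True, False,
                   "2026-01-01T00:00:00+00:00"),
    SearchProgress("ExampleSub", "comment", "recent", "term-b", 1, True, False,
                   "2026-01-01T00:05:00+00:00"),
    SearchProgress("ExampleSub", "submission", "historical", "term-a", 9, False, True,
                   "2026-01-01T00:10:00+00:00"),
]
for row in rollup(searches):
    print(row)
```

Under this reading, a row where Searches, Pages fetched, and Exhausted are all equal means every combination in the group fetched one page and then ran out of results.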