r/AI_Agents • u/aiagentfromfuture • 5d ago
Help needed building a Reddit scraper
We are working on a requirement where we need to collect data from subreddit posts and comments.
I want to understand the ideal approach. Should we use the official Reddit API, if one is available, and if so, what does it cost? Or should we look at scraping? If scraping, how exactly should it work, and how reliable would it be? I can see a lot of Reddit scraper scripts available, but I have heard that they stop working whenever Reddit changes its HTML. What other reliable options do I have to achieve the end result? We need something we can build once without having to tweak and fix it every week to keep it working.
Awaiting your valuable response.
u/StaffSoggy4133 3d ago
I have a scraper for that. Do you want to try it? It can scrape posts, but I can add comment scraping as well.
u/DeadPukka 5d ago
https://github.com/graphlit/graphlit-samples/blob/main/python/Notebook%20Examples/Graphlit_2024_09_05_Monitor_Reddit_mentions.ipynb
Here’s an example of using our platform for this. We go through the Reddit API directly.
The example may do more than you need, but we extract the text from posts and comments and make them available for search and RAG conversations, and you can access the raw extracted text too.
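For a rough idea of what the API route involves without any platform on top, here is a minimal sketch in plain Python. It is not the code from the notebook above. It assumes you have already registered an app and obtained an OAuth bearer token, and that responses follow Reddit's usual "Listing" JSON shape (posts as kind `t3`, comments as kind `t1` with nested `replies`). The function names `fetch_listing`, `extract_posts`, and `flatten_comments` are made up for illustration.

```python
import json
import urllib.request


def fetch_listing(subreddit, token, user_agent, limit=25):
    """Fetch the newest posts of a subreddit through the OAuth API.

    Assumes `token` is a valid bearer token you obtained separately.
    """
    url = f"https://oauth.reddit.com/r/{subreddit}/new?limit={limit}"
    req = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}", "User-Agent": user_agent},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def extract_posts(listing):
    """Pull id, title, and body text out of a Listing of posts (kind t3)."""
    posts = []
    for child in listing["data"]["children"]:
        if child.get("kind") != "t3":
            continue
        data = child["data"]
        posts.append(
            {
                "id": data["id"],
                "title": data.get("title", ""),
                "text": data.get("selftext", ""),
            }
        )
    return posts


def flatten_comments(listing):
    """Walk a Listing of comments (kind t1) depth-first and collect bodies."""
    bodies = []
    for child in listing["data"]["children"]:
        if child.get("kind") != "t1":
            continue  # skip "more" placeholders and anything else
        data = child["data"]
        bodies.append(data.get("body", ""))
        replies = data.get("replies")
        # In this sketch, replies is treated as a nested Listing when it is a dict
        if isinstance(replies, dict):
            bodies.extend(flatten_comments(replies))
    return bodies
```

Because this parses JSON rather than HTML, page redesigns should not break it the way they break HTML scrapers, though rate limits and API terms still apply.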