Reddit Scraper Tool
Learn about the Reddit Scraper Tool used by the Newsletter AI Agent
The Reddit Scraper Tool is a custom tool that lets the Newsletter AI Agent gather Reddit discussions through an Apify actor, so the newsletter can draw on community insights about the specified topic.
Overview
The Reddit Scraper Tool is primarily used by the Researcher Agent. It can search Reddit for posts, comments, communities, and users, or scrape specific URLs directly, and it returns structured data extracted from the matching posts and comments.
Implementation
The Reddit Scraper Tool is implemented as a CrewAI `BaseTool` that interacts with an Apify Reddit scraper actor.
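A minimal sketch of how such a tool might be structured, not the project's actual code. To keep it runnable with only the standard library, the class below is a plain Python class rather than a real CrewAI `BaseTool` subclass, and the Apify call is replaced by an injected callable. The names `RedditScraperSketch`, `ActorRunner`, and `fake_runner` are hypothetical; only the `_run` method name follows the CrewAI tool convention.

```python
from typing import Any, Callable, Dict, List

# Hypothetical type: takes the actor input and returns the scraped items.
# In the real tool this role is played by the Apify client.
ActorRunner = Callable[[Dict[str, Any]], List[Dict[str, Any]]]


class RedditScraperSketch:
    """Stand-in for the tool; the real class subclasses CrewAI's BaseTool."""

    name = "Reddit Scraper"
    description = "Gathers Reddit discussions about a topic via an Apify actor."

    def __init__(self, run_actor: ActorRunner) -> None:
        self._run_actor = run_actor

    def _run(self, searches: List[str], **options: Any) -> List[Dict[str, Any]]:
        if not searches:
            raise ValueError("searches must contain at least one query")
        # Merge the required queries with any optional actor parameters.
        run_input = {"searches": list(searches), **options}
        return self._run_actor(run_input)


def fake_runner(run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Offline stand-in that returns one canned post per search query."""
    return [
        {"title": f"Post about {query}", "url": "https://example.com"}
        for query in run_input["searches"]
    ]


tool = RedditScraperSketch(run_actor=fake_runner)
items = tool._run(searches=["open source LLMs"], maxPostCount=5)
print(items[0]["title"])
```

Injecting the runner keeps the scraping logic testable offline; in a real deployment the callable would start the actor run and read back its results.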
Parameters
The Reddit Scraper Tool accepts the following parameters:
| Parameter | Type | Description | Default |
|---|---|---|---|
| searches | List[str] | Search queries for Reddit topics | Required |
| startUrls | List[str] | Direct URLs to Reddit pages to scrape | None |
| skipComments | bool | Skip scraping comments when processing posts | False |
| skipUserPosts | bool | Skip scraping user posts when processing user activity | False |
| skipCommunity | bool | Skip scraping community info but still get community posts | False |
| searchPosts | bool | Search for posts matching the search queries | True |
| searchComments | bool | Search for comments matching the search queries | False |
| searchCommunities | bool | Search for communities matching the search queries | False |
| searchUsers | bool | Search for users matching the search queries | False |
| sort | str | How to sort the results (e.g., "new", "top", "hot") | "new" |
| time | str | Time filter for results | None |
| includeNSFW | bool | Include NSFW content in results | True |
| maxPostCount | int | Maximum number of posts to retrieve | 20 |
| maxComments | int | Maximum number of comments to retrieve per post | 20 |
| maxCommunitiesCount | int | Maximum number of communities to retrieve | 2 |
| maxUserCount | int | Maximum number of users to retrieve | 2 |
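The parameters and defaults in the table can be captured in a small data class. This is an illustrative sketch only: the class name `RedditScraperInput` and the `to_run_input` helper are hypothetical, and dropping `None` values before passing the input along is an assumption about how optional parameters would be handled.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional


@dataclass
class RedditScraperInput:
    """Hypothetical container mirroring the parameter table and its defaults."""

    searches: List[str]
    startUrls: Optional[List[str]] = None
    skipComments: bool = False
    skipUserPosts: bool = False
    skipCommunity: bool = False
    searchPosts: bool = True
    searchComments: bool = False
    searchCommunities: bool = False
    searchUsers: bool = False
    sort: str = "new"
    time: Optional[str] = None
    includeNSFW: bool = True
    maxPostCount: int = 20
    maxComments: int = 20
    maxCommunitiesCount: int = 2
    maxUserCount: int = 2

    def to_run_input(self) -> Dict[str, Any]:
        """Return the parameters as a dict, omitting those left as None."""
        return {key: value for key, value in asdict(self).items() if value is not None}


params = RedditScraperInput(searches=["vector databases"], sort="top", maxPostCount=10)
run_input = params.to_run_input()
print(run_input["sort"], run_input["maxPostCount"])
```

Only `searches` has to be supplied; every other field falls back to the default listed in the table.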
Usage
The Reddit Scraper Tool is used by the Researcher Agent to gather community discussions about the specified topic:
Return Value
The tool returns a list of Reddit posts and comments, where each item is a dictionary containing information about a post or comment, including:
- `title`: The title of the post (for posts only)
- `text`: The text content of the post or comment
- `url`: The URL of the post or comment
- `author`: The username of the author
- `score`: The score (upvotes minus downvotes) of the post or comment
- `created`: The creation date of the post or comment
- Additional metadata about the post or comment
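As a rough illustration of working with the returned items, the sketch below ranks them by `score` and prints one summary line each. The helper `top_discussions` is hypothetical, and the sample data is made up to match the fields listed above; the actual format of values such as `created` is an assumption.

```python
from typing import Any, Dict, List


def top_discussions(items: List[Dict[str, Any]], limit: int = 3) -> List[str]:
    """Return one summary line per item, highest score first."""
    ranked = sorted(items, key=lambda item: item.get("score", 0), reverse=True)
    lines = []
    for item in ranked[:limit]:
        # Comments have no title, so fall back to the start of their text.
        label = item.get("title") or item.get("text", "")[:60]
        lines.append(f"[{item.get('score', 0)}] {label} ({item.get('url', '')})")
    return lines


# Made-up sample items shaped like the fields listed above.
sample = [
    {"text": "We switched last year and never looked back.",
     "url": "https://example.com/comment1", "author": "user_b",
     "score": 7, "created": "2024-01-02"},
    {"title": "Which vector DB do you use?",
     "text": "Curious what people run in production.",
     "url": "https://example.com/post1", "author": "user_a",
     "score": 42, "created": "2024-01-01"},
]

for line in top_discussions(sample):
    print(line)
```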
Apify Integration
The tool uses an Apify Reddit scraper actor, which provides several advantages:
- Scalability: The actor can handle large numbers of Reddit searches efficiently
- Reliability: The actor is designed to handle rate limiting and other issues that can arise when scraping Reddit
- Structured Data: The actor returns Reddit posts and comments in a structured format that is easy to process
Configuration
To use the Reddit Scraper Tool, you need to set up the following environment variables:
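A quick way to confirm the environment is set up before running the agent is to check for the required variables at startup. This is a sketch under assumptions: the variable name `APIFY_API_TOKEN` is a placeholder chosen for illustration, not confirmed by this page, and `missing_env_vars` is a hypothetical helper.

```python
import os
from typing import List, Mapping

# Placeholder name: check the project's own configuration for the real variable(s).
REQUIRED_ENV_VARS = ["APIFY_API_TOKEN"]


def missing_env_vars(required: List[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


missing = missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Failing early with a clear message is usually easier to debug than an authentication error raised in the middle of a scraping run.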
Next Steps
- Learn about the Twitter Scraper Tool
- Explore the Researcher Agent that uses this tool
- See how this tool contributes to the newsletter generation process