Twitter Scraper Tool

The Twitter Scraper Tool is a custom tool that allows the Newsletter AI Agent to collect tweets related to a specified topic using the Apify Twitter Scraper Lite actor.

Overview

The Twitter Scraper Tool is primarily used by the Researcher Agent to gather social media insights about the specified topic. It provides a flexible interface for searching Twitter and extracting structured data from tweets.

Implementation

The Twitter Scraper Tool is implemented as a CrewAI BaseTool that interacts with the Apify Twitter Scraper Lite actor. Here’s the implementation:

Copy
from crewai.tools import BaseTool
from pydantic import BaseModel, Field, ConfigDict
from typing import List, Optional
from apify import Actor
from src.tools.base import RunApifyActor

class TwitterScraperInput(BaseModel):
    """Input schema for TwitterScraper tool."""
    searchTerms: Optional[List[str]] = Field(
        description="Search terms to find tweets containing these terms. Alternative to using Twitter URLs.",
        default=None
    )

    sort: Optional[str] = Field(
        description="How to sort the returned tweets. Setting to 'Latest' yields more results.",
        default="Top",
        enum=["Top", "Latest"]
    )

    start: Optional[str] = Field(
        description="Scrape tweets starting from this date (format: YYYY-MM-DD)",
        default=None
    )

    end: Optional[str] = Field(
        description="Scrape tweets until this date (format: YYYY-MM-DD)", 
        default=None
    )

class TwitterScraperTool(BaseTool):
    name: str = "Twitter Scraper"
    description: str = "Tool for scraping Twitter content with configurable parameters"
    args_schema: type[BaseModel] = TwitterScraperInput
    actor: Actor = Field(description="Apify Actor instance")
    model_config = ConfigDict(arbitrary_types_allowed=True)

    def _run(
        self,
        searchTerms: Optional[List[str]] = None,
        sort: Optional[str] = "latest",
        start: Optional[str] = None,
        end: Optional[str] = None
    ) -> str:
        run_inputs = {}
        
        if searchTerms:
            run_inputs["searchTerms"] = searchTerms
        if sort:
            run_inputs["sort"] = sort
        run_inputs["maxItems"] = 5
        if start:
            run_inputs["start"] = start
        if end:
            run_inputs["end"] = end

        run_actor = RunApifyActor(self.actor)
        dataset = run_actor._run("apidojo/twitter-scraper-lite", run_inputs)
        return dataset

Parameters

The Twitter Scraper Tool accepts the following parameters:

Parameter	Type	Description	Default
`searchTerms`	List[str]	Search terms to find tweets containing these terms	None
`sort`	str	How to sort the returned tweets (“Top” or “Latest”)	“Top”
`start`	str	Scrape tweets starting from this date (YYYY-MM-DD)	None
`end`	str	Scrape tweets until this date (YYYY-MM-DD)	None

Usage

The Twitter Scraper Tool is used by the Researcher Agent to gather social media insights about the specified topic:

Copy
# Initialize the tool
twitter_tool = TwitterScraperTool(actor=actor)

# Use the tool
twitter_results = twitter_tool._run(
    searchTerms=[topic],
    sort="latest",
    start="2023-01-01",  # Optional: start date
    end="2023-12-31"     # Optional: end date
)

Return Value

The tool returns a list of tweets, where each tweet is a dictionary containing information about the tweet, including:

text: The text content of the tweet
url: The URL of the tweet
username: The username of the tweet author
timestamp: The timestamp when the tweet was posted
likes: The number of likes the tweet received
retweets: The number of retweets the tweet received
Additional metadata about the tweet

Apify Integration

The tool uses the Apify Twitter Scraper Lite actor, which provides several advantages:

Scalability: The actor can handle large numbers of Twitter searches efficiently
Reliability: The actor is designed to handle rate limiting and other issues that can arise when scraping Twitter
Structured Data: The actor returns tweets in a structured format that is easy to process
No API Key Required: Unlike the official Twitter API, the actor doesn’t require API keys for basic functionality

Configuration

To use the Twitter Scraper Tool, you need to set up the following environment variables:

APIFY_API_KEY=your_apify_api_key_here

Next Steps

Learn about the YouTube Scraper Tool
Explore the Researcher Agent that uses this tool
See how this tool contributes to the newsletter generation process

Tools

​Twitter Scraper Tool

​Overview

​Implementation

​Parameters

​Usage

​Return Value

​Apify Integration

​Configuration

​Next Steps