# YouTube Scraper Tool

> Learn about the YouTube Scraper Tool used by the Newsletter AI Agent

The YouTube Scraper Tool is a custom tool that lets the Newsletter AI Agent find relevant YouTube videos through an Apify actor, so the agent can draw on video-based information and insights about the specified topic.

## Overview

The YouTube Scraper Tool is primarily used by the [Researcher Agent](/agents/researcher) to gather video content about the specified topic. It provides a flexible interface for searching YouTube and extracting structured data from videos, channels, and playlists.

## Implementation

The YouTube Scraper Tool is implemented as a CrewAI `BaseTool` that interacts with an Apify YouTube scraper actor. Here's the implementation:

```python
from crewai.tools import BaseTool
from pydantic import BaseModel, Field, ConfigDict
from typing import List, Optional
from apify import Actor
from src.tools.base import RunApifyActor

class YouTubeScraperInput(BaseModel):
    """Input schema for YouTubeScraper tool."""
    searchQueries: Optional[List[str]] = Field(
        description="Search terms just like you would enter in YouTube's search bar"
    )
    
    maxResultsShorts: Optional[int] = Field(
        default=0,
        description="Limit the number of Shorts videos to crawl"
    )
    
    maxResultStreams: Optional[int] = Field(
        default=0,
        description="Limit the number of Stream videos to crawl"
    )
    
    startUrls: Optional[List[str]] = Field(
        default=[],
        description="Direct URLs to YouTube videos, channels, playlists, hashtags or search results"
    )
    
    # Additional parameters...

class YouTubeScraperTool(BaseTool):
    name: str = "YouTube Scraper"
    description: str = "Tool for scraping YouTube videos, channels, playlists with configurable parameters"
    args_schema: type[BaseModel] = YouTubeScraperInput
    actor: Actor = Field(description="Apify Actor instance")
    model_config = ConfigDict(arbitrary_types_allowed=True)
    
    def _run(
        self,
        searchQueries: Optional[List[str]] = None,
        maxResultsShorts: Optional[int] = 0,
        maxResultStreams: Optional[int] = 0,
        startUrls: Optional[List[str]] = None,  # None instead of a shared mutable default list
        # Additional parameters...
    ) -> str:
        run_inputs = {}
        
        if searchQueries:
            run_inputs["searchQueries"] = searchQueries
        if maxResultsShorts:
            run_inputs["maxResultsShorts"] = maxResultsShorts
        if maxResultStreams:
            run_inputs["maxResultStreams"] = maxResultStreams
        if startUrls:
            run_inputs["startUrls"] = startUrls
        # Set additional parameters...
        
        run_actor = RunApifyActor(self.actor)
        dataset = run_actor._run("youtube-scraper-actor-name", run_inputs)
        return dataset
```
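
The way `_run` assembles the actor input can be tried in isolation. The following is a minimal sketch, not part of the project: `build_run_inputs` is a hypothetical helper that mirrors the `if value:` checks above for the four parameters shown, using only the standard library.

```python
from typing import Any, Dict, List, Optional


def build_run_inputs(
    searchQueries: Optional[List[str]] = None,
    maxResultsShorts: Optional[int] = 0,
    maxResultStreams: Optional[int] = 0,
    startUrls: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper mirroring the truthiness checks in `_run`."""
    candidates = {
        "searchQueries": searchQueries,
        "maxResultsShorts": maxResultsShorts,
        "maxResultStreams": maxResultStreams,
        "startUrls": startUrls,
    }
    # Keep only truthy values: None, 0, and empty lists are left out,
    # so those fields are simply absent from the input sent to the actor.
    return {key: value for key, value in candidates.items() if value}


# Example: only the query and the Shorts limit end up in the input.
example_inputs = build_run_inputs(searchQueries=["ai newsletters"], maxResultsShorts=5)
```

One consequence of this pattern is that an explicit `0` is indistinguishable from "not provided": both are omitted from the input, and the actor decides what to do with the missing field.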

## Parameters

The YouTube Scraper Tool accepts the following parameters:

| Parameter                      | Type       | Description                                        | Default  |
| ------------------------------ | ---------- | -------------------------------------------------- | -------- |
| `searchQueries`                | List\[str] | Search terms for YouTube's search bar              | Required |
| `maxResultsShorts`             | int        | Limit the number of Shorts videos to crawl         | 0        |
| `maxResultStreams`             | int        | Limit the number of Stream videos to crawl         | 0        |
| `startUrls`                    | List\[str] | Direct URLs to YouTube videos, channels, playlists | \[]      |
| `downloadSubtitles`            | bool       | Download subtitles for videos                      | False    |
| `saveSubsToKVS`                | bool       | Save downloaded subtitles to key-value store       | False    |
| `subtitlesLanguage`            | str        | Language for subtitles download                    | "any"    |
| `preferAutoGeneratedSubtitles` | bool       | Prefer auto-generated subtitles                    | False    |
| `subtitlesFormat`              | str        | Format for subtitle downloads                      | "srt"    |
| `sortingOrder`                 | str        | How to sort the results                            | None     |
| `dateFilter`                   | str        | Filter results by date                             | None     |
| `videoType`                    | str        | Filter by video type                               | None     |
| `lengthFilter`                 | str        | Filter by video length                             | None     |
| `isHD`                         | bool       | Filter for HD videos                               | None     |
| `hasSubtitles`                 | bool       | Filter for videos with subtitles                   | None     |

## Usage

The Researcher Agent initializes the tool with an Apify actor instance and calls it with the topic as the search query:

```python
# Initialize the tool
youtube_tool = YouTubeScraperTool(actor=actor)

# Use the tool
youtube_results = youtube_tool._run(
    searchQueries=[topic],
    maxResultsShorts=0,
    maxResultStreams=0,
    sortingOrder="relevance",
    dateFilter="last_month"
)
```

## Return Value

The tool returns a list of YouTube videos. Each video is a dictionary with the following fields:

* `title`: The title of the video
* `url`: The URL of the video
* `description`: The description of the video
* `channelName`: The name of the channel that uploaded the video
* `channelUrl`: The URL of the channel
* `viewCount`: The number of views the video has
* `publishedAt`: The date the video was published
* `duration`: The duration of the video
* Additional metadata about the video
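
As an illustration of working with this structure, the sketch below ranks results by `viewCount` and formats them as Markdown links. It assumes `viewCount` is numeric and may be missing on some records; `top_videos` and the sample records are hypothetical and only stand in for real tool output.

```python
from typing import Any, Dict, List


def top_videos(videos: List[Dict[str, Any]], limit: int = 3) -> List[str]:
    """Return Markdown links for the most-viewed videos."""
    # Treat a missing or null viewCount as 0 so incomplete records sort last.
    ranked = sorted(videos, key=lambda v: v.get("viewCount") or 0, reverse=True)
    return [
        f"[{v.get('title', 'Untitled')}]({v.get('url', '')}) by {v.get('channelName', 'unknown channel')}"
        for v in ranked[:limit]
    ]


# Made-up records using the field names listed above.
sample_results = [
    {"title": "Intro to Agents", "url": "https://www.youtube.com/watch?v=aaa",
     "channelName": "Channel A", "viewCount": 1200},
    {"title": "Agent Deep Dive", "url": "https://www.youtube.com/watch?v=bbb",
     "channelName": "Channel B", "viewCount": 5400},
    {"title": "No Stats Yet", "url": "https://www.youtube.com/watch?v=ccc",
     "channelName": "Channel C"},
]

summary_lines = top_videos(sample_results, limit=2)
```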

## Apify Integration

The tool uses an Apify YouTube scraper actor, which provides several advantages:

1. **Scalability**: The actor can handle large numbers of YouTube searches efficiently
2. **Reliability**: The actor is designed to handle rate limiting and other issues that can arise when scraping YouTube
3. **Structured Data**: The actor returns YouTube videos in a structured format that is easy to process
4. **Advanced Filtering**: The actor supports advanced filtering options to narrow down search results

## Configuration

To use the YouTube Scraper Tool, set the following environment variable:

```
APIFY_API_KEY=your_apify_api_key_here
```
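
A missing key is easier to diagnose if it is checked at startup. The sketch below shows one possible way to do that with the standard library. `get_apify_api_key` is a hypothetical helper, and how the key is then handed to the Apify client or actor is not covered here.

```python
import os
from typing import Mapping, Optional


def get_apify_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return APIFY_API_KEY from the environment, or raise a clear error."""
    # Accept an explicit mapping to make the function easy to test;
    # fall back to the real process environment otherwise.
    source = os.environ if env is None else env
    api_key = source.get("APIFY_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("APIFY_API_KEY is not set or is empty.")
    return api_key
```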

## Next Steps

* Learn about the [Google News Scraper Tool](/tools/google-news)
* Explore the [Researcher Agent](/agents/researcher) that uses this tool
* See how this tool contributes to the [newsletter generation process](/features/newsletter-generation)
