> ## Documentation Index
> Fetch the complete documentation index at: https://newsletter-ai-agent.pratikdani.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Google News Scraper Tool

> Learn about the Google News Scraper Tool used by the Newsletter AI Agent

# Google News Scraper Tool

The Google News Scraper Tool is a custom tool that allows the Newsletter AI Agent to gather the latest news articles related to a specified topic using the [Apify Super Fast Google News Scraper](https://apify.com/aymorato/super-fast-google-news-scraper-pay-per-result) actor.

## Overview

The Google News Scraper Tool is primarily used by the [Researcher Agent](/agents/researcher) to gather the latest news and developments about the specified topic. It provides a flexible interface for searching Google News and extracting structured data from news articles.

## Implementation

The Google News Scraper Tool is implemented as a CrewAI `BaseTool` that interacts with the Apify Super Fast Google News Scraper actor. Here's the implementation:

```python theme={null}
from crewai.tools import BaseTool
from pydantic import BaseModel, Field, ConfigDict
from typing import List, Optional, Literal
from apify import Actor
from src.tools.base import RunApifyActor

class GoogleNewsScraperInput(BaseModel):
    """Input schema for GoogleNewsScraper tool."""
    keywords: List[str] = Field(
        description="The keywords used to search for news articles"
    )

    maxItems: Optional[int] = Field(
        description="Set the maximum number of items you want to scrape for each keyword. If left unset, the actor will extract all available news.",
        default=20
    )

class GoogleNewsScraperTool(BaseTool):
    name: str = "Google News Scraper"
    description: str = "Tool for scraping Google News articles with configurable parameters"
    args_schema: type[BaseModel] = GoogleNewsScraperInput
    actor: Actor = Field(description="Apify Actor instance")
    model_config = ConfigDict(arbitrary_types_allowed=True)

    def _run(
        self,
        keywords: List[str],
        language: Optional[str] = "US:en",
        maxItems: Optional[int] = None
    ) -> str:
        run_inputs = {
            "keywords": keywords,
            "language": language
        }
        
        if maxItems:
            run_inputs["maxItems"] = maxItems

        proxy = {
            "useApifyProxy": True,
            "apifyProxyGroups": [
                "RESIDENTIAL"
            ]
        }
        run_inputs["proxy"] = proxy
        
        run_actor = RunApifyActor(self.actor)
        dataset = run_actor._run("aymorato/super-fast-google-news-scraper-pay-per-result", run_inputs)
        return dataset
```

## Parameters

The Google News Scraper Tool accepts the following parameters:

| Parameter  | Type       | Description                                              | Default  |
| ---------- | ---------- | -------------------------------------------------------- | -------- |
| `keywords` | List\[str] | The keywords used to search for news articles            | Required |
| `language` | str        | Language and country code for the search (e.g., "US:en") | "US:en"  |
| `maxItems` | int        | Maximum number of items to scrape for each keyword       | 20       |

## Usage

The Google News Scraper Tool is used by the Researcher Agent to gather the latest news about the specified topic:

```python theme={null}
# Initialize the tool
news_tool = GoogleNewsScraperTool(actor=actor)

# Use the tool
news_results = news_tool._run(
    keywords=[topic],
    language="US:en",
    maxItems=20
)
```

## Return Value

The tool returns a list of news articles, where each article is a dictionary containing information about the article, including:

* `title`: The title of the news article
* `link`: The URL of the news article
* `source`: The source of the news article (e.g., "CNN", "BBC")
* `publishedAt`: The date the article was published
* `snippet`: A brief snippet or summary of the article
* Additional metadata about the article

## Apify Integration

The tool uses the Apify Super Fast Google News Scraper actor, which provides several advantages:

1. **Scalability**: The actor can handle large numbers of news searches efficiently
2. **Reliability**: The actor is designed to handle rate limiting and other issues that can arise when scraping Google News
3. **Structured Data**: The actor returns news articles in a structured format that is easy to process
4. **Freshness**: The actor focuses on retrieving the latest news articles, ensuring that the information is up-to-date

## Configuration

To use the Google News Scraper Tool, you need to set up the following environment variables:

```
APIFY_API_KEY=your_apify_api_key_here
```

## Next Steps

* Explore the [Researcher Agent](/agents/researcher) that uses this tool
* Learn about the other tools used by the Newsletter AI Agent in the [Tools Overview](/tools/overview)
* See how this tool contributes to the [newsletter generation process](/features/newsletter-generation)
