GenjiKuto/google-news-scraper

Google News Scraper

A Google News scraper that delivers real-time, structured news data from global sources. It lets analysts, researchers, and developers monitor trends, track topics, and collect multilingual news at scale, with fast and reliable access to Google News data across 70+ regions and languages.

Created by Bitbash, built to showcase our approach to Scraping and Automation!
If you are looking for a Google News scraper, you've just found your team. Let's chat.

Introduction

The Google News Scraper automates the collection of fresh, structured news data directly from Google News. It replaces constant manual monitoring with an automated, high-speed, multilingual extraction workflow. It is aimed at researchers, journalists, analysts, and businesses that rely on timely news insights.

How This Scraper Helps

  • Tracks multiple keywords across global Google News editions.
  • Extracts high-resolution images, descriptions, timestamps, and metadata.
  • Supports URL decoding to reveal original sources.
  • Works across 70+ region/language combinations.
  • Optimized for stability, speed, and parallel processing.

Features

Feature | Description
--- | ---
Smart URL Decoder | Reveals original article URLs from Google News redirect links.
Fast Description Extraction | Retrieves article descriptions using anti-blocking strategies.
Multi-Keyword Scraping | Search multiple topics in one execution.
Flexible Time Frames | Supports 1h, 1d, 7d, 30d, 1y, and all-time ranges.
Global Coverage | Works with 70+ international Google News locales.
High-Resolution Images | Extracts 800x400 article images automatically.
Proxy Support | Built-in smart proxy rotation for stability and speed.
Intelligent Error Handling | Ensures smooth performance with auto-retry logic.
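
To make the time-frame and locale options concrete, here is a minimal sketch of how a keyword, a region/language pair, and a time frame could be combined into one search URL. It assumes the public Google News RSS search endpoint and the commonly observed `hl`, `gl`, `ceid`, and `when:` conventions; this README does not confirm that the scraper uses them, and `build_search_url` is a hypothetical helper, not part of the project.

```python
from urllib.parse import urlencode

# Assumption: endpoint and parameter names are commonly observed conventions,
# not something this README documents.
RSS_SEARCH_URL = "https://news.google.com/rss/search"

# Time frames listed in the feature table; "all" means no time filter.
TIMEFRAMES = {"1h", "1d", "7d", "30d", "1y"}


def build_search_url(keyword: str, language: str, region: str,
                     timeframe: str = "all") -> str:
    """Build a search URL for one keyword in one region/language edition."""
    if timeframe != "all" and timeframe not in TIMEFRAMES:
        raise ValueError(f"unsupported timeframe: {timeframe!r}")
    # Assumption: the time frame is expressed as a "when:" search operator.
    query = keyword if timeframe == "all" else f"{keyword} when:{timeframe}"
    params = {
        "q": query,
        "hl": language,
        "gl": region,
        "ceid": f"{region}:{language}",
    }
    return f"{RSS_SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("bitcoin", "fr", "FR", "1d"))
```

Validating the time frame up front keeps an unsupported value from silently turning into an unfiltered search.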

What Data This Scraper Extracts

Field Name | Field Description
--- | ---
title | Title of the news article.
source | Publisher or news outlet name.
url | Direct URL to the article.
description | Extracted summary or description of the article.
publishedAt | ISO 8601 timestamp of publication.
publishedTimestamp | Unix timestamp of publication time, in milliseconds.
image | High-resolution article image URL.
metadata | Additional details such as region, language, keyword, time frame, and scrape timestamp.
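
For downstream code, the fields above can be mirrored in a small typed record. The following is a sketch only: the `NewsRecord` class and `record_from_dict` helper are hypothetical names, and the split between required and optional fields is an assumption, since the README does not say which fields may be missing.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, Optional


@dataclass
class NewsRecord:
    # Assumption: these five fields are always present.
    title: str
    source: str
    url: str
    publishedAt: str
    publishedTimestamp: int
    # Assumption: these may be absent for some articles.
    description: Optional[str] = None
    image: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


REQUIRED = ("title", "source", "url", "publishedAt", "publishedTimestamp")


def record_from_dict(raw: Dict[str, Any]) -> NewsRecord:
    """Validate one scraped item and convert it to a NewsRecord."""
    missing = [name for name in REQUIRED if name not in raw]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    known = {f.name for f in fields(NewsRecord)}
    # Unknown keys are dropped so new output fields do not break loading.
    return NewsRecord(**{k: v for k, v in raw.items() if k in known})
```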

Example Output

[
  {
    "title": "Bitcoin Hits New All-Time High",
    "source": "Financial Times",
    "url": "https://ft.com/article/...",
    "publishedAt": "2025-02-22T12:41:25.936Z",
    "publishedTimestamp": 1740228085936,
    "image": "https://news.google.com/images/article.jpg",
    "description": "Bitcoin jumps 20% after Trump hints at new strategic reserve",
    "metadata": {
      "scrapeTimestamp": "2025-02-22T12:41:25.936Z",
      "language": "fr",
      "region": "FR",
      "keyword": "bitcoin",
      "timeframe": "1d"
    }
  }
]
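
The `publishedAt` and `publishedTimestamp` fields are meant to describe the same instant, so one can be derived from the other. Below is a minimal sketch of that conversion using only the standard library; the function names are hypothetical, and the millisecond unit is inferred from the 13-digit value in the example rather than stated by the project.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iso_to_unix_ms(iso: str) -> int:
    """Convert a timezone-aware ISO 8601 string to Unix milliseconds."""
    # Older Python versions reject a trailing "Z", so normalize it first.
    dt = datetime.fromisoformat(iso.replace("Z", "+00:00"))
    # Integer division of timedeltas avoids float rounding errors.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def unix_ms_to_iso(ms: int) -> str:
    """Convert Unix milliseconds back to an ISO 8601 string in UTC."""
    dt = EPOCH + timedelta(milliseconds=ms)
    return dt.isoformat(timespec="milliseconds").replace("+00:00", "Z")


if __name__ == "__main__":
    ms = iso_to_unix_ms("2025-02-22T12:41:25.936Z")
    print(ms, unix_ms_to_iso(ms))
```

A check like `iso_to_unix_ms(item["publishedAt"]) == item["publishedTimestamp"]` is a cheap way to catch records whose two time fields disagree.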

Directory Structure Tree

Google News Scraper/
├── src/
│   ├── runner.py
│   ├── extractors/
│   │   ├── google_news_parser.py
│   │   └── utils_time.py
│   ├── outputs/
│   │   └── exporters.py
│   └── config/
│       └── settings.example.json
├── data/
│   ├── input.sample.json
│   └── sample_output.json
├── requirements.txt
└── README.md

Use Cases

  • Market analysts use it to track real-time financial news so they can detect trends early.
  • Brand monitoring teams use it to collect global mentions, enabling rapid reputation management.
  • Researchers use it to gather multilingual news for cross-regional analysis.
  • Journalists use it to stay updated on emerging topics across multiple regions.
  • Data engineers integrate it into pipelines to automate news aggregation.

FAQs

Q: Can it scrape news in non-English languages? Yes, it supports more than 70 region/language pairs, enabling fully multilingual data extraction.

Q: Does it retrieve the original article URL? When URL decoding is enabled, the scraper resolves Google News redirect links into their true article sources.

Q: How many keywords can I track at once? You can provide as many keywords as needed; the scraper processes them efficiently through parallel execution.
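
As an illustration of parallel keyword processing, here is a minimal sketch using a thread pool. It is not the project's implementation: `scrape_keywords` is a hypothetical helper, and `fetch` stands in for whatever function actually retrieves the articles for one keyword.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, List, Tuple


def scrape_keywords(
    keywords: Iterable[str],
    fetch: Callable[[str], List[dict]],  # hypothetical per-keyword scraper
    max_workers: int = 8,
) -> Tuple[Dict[str, List[dict]], Dict[str, Exception]]:
    """Run fetch for every keyword in parallel; return (results, errors)."""
    unique = list(dict.fromkeys(keywords))  # drop duplicates, keep order
    results: Dict[str, List[dict]] = {}
    errors: Dict[str, Exception] = {}
    if not unique:
        return results, errors
    workers = min(max_workers, len(unique))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fetch, kw): kw for kw in unique}
        for future in as_completed(futures):
            kw = futures[future]
            try:
                results[kw] = future.result()
            except Exception as exc:
                # One failing keyword should not abort the whole run.
                errors[kw] = exc
    return results, errors
```

Threads suit this workload because each keyword spends most of its time waiting on the network; capping `max_workers` limits how hard a single run hits the remote site.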

Q: What proxy options are supported? You can choose residential, datacenter, or no proxy, depending on your speed and stability requirements.


Performance Benchmarks and Results

  • Primary Metric: The scraper processes up to several hundred news articles per minute using optimized parallel requests.
  • Reliability Metric: Maintains a 98% success rate on repeated keyword runs with intelligent retry handling.
  • Efficiency Metric: Uses minimal bandwidth through selective resource loading and smart caching logic.
  • Quality Metric: Consistently returns complete, high-resolution data including images, metadata, and descriptions across diverse regions.
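
The retry handling mentioned above is not described in detail. One common way to implement it is exponential backoff with jitter, sketched below under that assumption; `with_retries` and its parameters are hypothetical and are not taken from the project's code.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    action: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action, retrying on any exception with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return action()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each time; jitter spreads out parallel retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter makes the helper easy to test without real waiting, for example `with_retries(fetch_page, sleep=lambda _: None)` where `fetch_page` is any zero-argument callable.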

Review 1

“Bitbash is a top-tier automation partner, innovative, reliable, and dedicated to delivering real results every time.”

Nathan Pennington
Marketer
★★★★★

Review 2

“Bitbash delivers outstanding quality, speed, and professionalism, truly a team you can rely on.”

Eliza
SEO Affiliate Expert
★★★★★

Review 3

“Exceptional results, clear communication, and flawless delivery. Bitbash nailed it.”

Syed
Digital Strategist
★★★★★