
A Node.js tool for extracting and analyzing author information from PubMed articles. Features include publication filtering, author affiliation tracking, and CSV report generation. Perfect for researchers and bibliometric analysis.


m91michel/pubmed-authors-fetcher


PubMed Author Extraction Tool 📚

A Node.js tool that extracts author information and their affiliations from PubMed articles based on search criteria. The tool supports filtering by publication types and date ranges, making it easy to analyze research contributions in specific fields.

Features 🌟

  • Search PubMed articles with custom queries
  • Filter by publication types (e.g., Clinical Trials, Meta-Analyses)
  • Filter by publication date range
  • Extract author names and their affiliations
  • Generate CSV reports with author details
  • Handle large result sets with pagination
  • Respect PubMed API rate limits
  • Configurable search parameters

Prerequisites 📋

  • Node.js (v12 or higher)
  • npm or yarn package manager
  • PubMed API key (get one from NCBI)

Installation 🔧

  1. Clone the repository:
git clone <repository-url>
cd pubmed-authors-fetcher
  2. Install dependencies:
npm install
# or
yarn install
  3. Configure your settings in src/config.js:
module.exports = {
    API_KEY: 'your-api-key-here',
    SEARCH_TERMS: {
        QUERY: "your search term",
        PUBLICATION_TYPES: [
            "Clinical Trial",
            "Meta-Analysis"
        ]
    },
    PUBLICATION_YEARS: {
        START: '2024',
        END: '2024'
    }
    // ... other settings
};
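
As a rough illustration of how these settings might translate into a PubMed query, the sketch below combines the query, publication types, and year range into a single search term using PubMed's field tags (`[pt]` for publication type, `[dp]` for publication date). The `buildSearchTerm` helper and `exampleConfig` object are hypothetical and not taken from this repository; the tool's actual query construction may differ.

```javascript
// Hypothetical helper (not part of this repository): turns a config object
// shaped like src/config.js into a single PubMed search term.
function buildSearchTerm(config) {
    const parts = [`(${config.SEARCH_TERMS.QUERY})`];

    // OR together the publication types, each tagged with [pt].
    const types = config.SEARCH_TERMS.PUBLICATION_TYPES || [];
    if (types.length > 0) {
        const typeClause = types.map((type) => `"${type}"[pt]`).join(' OR ');
        parts.push(`(${typeClause})`);
    }

    // Restrict by publication date using a START:END range tagged with [dp].
    const years = config.PUBLICATION_YEARS;
    if (years && years.START && years.END) {
        parts.push(`(${years.START}:${years.END}[dp])`);
    }

    return parts.join(' AND ');
}

// Example values mirroring the placeholders shown above.
const exampleConfig = {
    SEARCH_TERMS: {
        QUERY: 'your search term',
        PUBLICATION_TYPES: ['Clinical Trial', 'Meta-Analysis'],
    },
    PUBLICATION_YEARS: { START: '2024', END: '2024' },
};

console.log(buildSearchTerm(exampleConfig));
// (your search term) AND ("Clinical Trial"[pt] OR "Meta-Analysis"[pt]) AND (2024:2024[dp])
```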

Usage 🚀

Run the script:

npm start
# or
yarn start

The tool will:

  1. Search PubMed for articles matching your criteria
  2. Extract author information and affiliations
  3. Generate a CSV file with the results
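
The first two steps can be sketched as a pair of request URLs, assuming the tool talks to the NCBI E-utilities endpoints (`esearch.fcgi` to find matching PubMed IDs, `efetch.fcgi` to download the article records). The function names below are hypothetical; only the endpoint and parameter names come from E-utilities, and the repository's own request code may look different.

```javascript
// Base URL of the NCBI E-utilities service.
const EUTILS_BASE = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils';

// Hypothetical helper: URL for one page of search results.
// retstart is the offset of the first result, retmax the page size.
function buildESearchUrl(term, apiKey, retstart, retmax) {
    const params = new URLSearchParams({
        db: 'pubmed',
        term: term,
        retmode: 'json',
        retstart: String(retstart),
        retmax: String(retmax),
        api_key: apiKey,
    });
    return `${EUTILS_BASE}/esearch.fcgi?${params.toString()}`;
}

// Hypothetical helper: URL that fetches full records (including authors
// and affiliations) for a list of PubMed IDs as XML.
function buildEFetchUrl(pmids, apiKey) {
    const params = new URLSearchParams({
        db: 'pubmed',
        id: pmids.join(','),
        retmode: 'xml',
        api_key: apiKey,
    });
    return `${EUTILS_BASE}/efetch.fcgi?${params.toString()}`;
}
```

Paging through a large result set then amounts to calling `buildESearchUrl` repeatedly with an increasing `retstart` until all matching IDs have been collected.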

Configuration Options ⚙️

Edit src/config.js to customize:

  • API_KEY: Your PubMed API key
  • SEARCH_TERMS:
    • QUERY: Your search term
    • PUBLICATION_TYPES: Array of publication types to filter
  • PUBLICATION_YEARS: Date range for publications
  • MAX_RESULTS: Maximum number of results to fetch (default: 10000)
  • BATCH_SIZE: Number of articles to process per batch (default: 20)
  • DELAY_MS: Delay between API requests in milliseconds (default: 1000)
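
To show how MAX_RESULTS, BATCH_SIZE, and DELAY_MS could interact, here is a minimal sketch of a batching loop that pauses between requests. The helpers `chunk`, `sleep`, and `processInBatches` are hypothetical names used for illustration, and `handleBatch` stands in for whatever function actually fetches a batch of articles; the repository's implementation may differ.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
    const batches = [];
    for (let i = 0; i < items.length; i += size) {
        batches.push(items.slice(i, i + size));
    }
    return batches;
}

// Resolve after `ms` milliseconds.
function sleep(ms) {
    return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical driver: caps the ID list at MAX_RESULTS, processes it in
// batches of BATCH_SIZE, and waits DELAY_MS between consecutive batches.
async function processInBatches(ids, options, handleBatch) {
    const limited = ids.slice(0, options.MAX_RESULTS);
    const batches = chunk(limited, options.BATCH_SIZE);
    const results = [];
    for (let i = 0; i < batches.length; i++) {
        results.push(await handleBatch(batches[i]));
        // No need to wait after the final batch.
        if (i < batches.length - 1) {
            await sleep(options.DELAY_MS);
        }
    }
    return results;
}
```

With the defaults listed above, 100 article IDs would be split into five batches of 20, with a one-second pause between consecutive requests.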

Output Format 📄

The tool generates a CSV file with the following columns:

  • Author Name
  • Affiliations (semicolon-separated)
  • Titles (semicolon-separated with publication years)
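
A minimal sketch of how rows in this layout could be produced is shown below. The input shape (`name`, `affiliations`, `titles`), the exact header strings, and the `Title (year)` formatting are assumptions made for illustration; the fields are quoted following the usual CSV convention so that commas and quotes in names or titles do not break the columns.

```javascript
// Quote a field if it contains a comma, double quote, or line break,
// doubling any embedded double quotes.
function escapeCsv(value) {
    const text = String(value);
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Hypothetical helper: converts aggregated author records to CSV text.
// Each record is assumed to look like:
//   { name: string, affiliations: string[], titles: [{ title, year }] }
function toCsv(authors) {
    const header = ['Author Name', 'Affiliations', 'Titles'];
    const rows = authors.map((author) => [
        author.name,
        author.affiliations.join('; '),
        author.titles.map((t) => `${t.title} (${t.year})`).join('; '),
    ]);
    return [header]
        .concat(rows)
        .map((row) => row.map(escapeCsv).join(','))
        .join('\n');
}
```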

Example output file name:

Environment Setup 🔑

  1. Copy the example environment file:
cp .env.example .env
  2. Edit .env and add your PubMed API key:
PUBMED_API_KEY=your_api_key_here

You can get your API key from NCBI.
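
For reference, a script might read this variable along the following lines. This is only a sketch: `getApiKey` is a hypothetical helper, and it assumes the variables from .env have already been loaded into `process.env` (for example by a loader such as the `dotenv` package, which this README does not confirm is used).

```javascript
// Hypothetical helper: returns the PubMed API key from the environment,
// failing early with a clear message if it is missing or still set to
// the placeholder value from the example file.
function getApiKey(env = process.env) {
    const key = (env.PUBMED_API_KEY || '').trim();
    if (!key || key === 'your_api_key_here') {
        throw new Error('PUBMED_API_KEY is not set; add it to your .env file.');
    }
    return key;
}
```

Failing at startup avoids sending requests without a key, which NCBI limits to a lower request rate.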
