PubChem Data Retriever

This is a simple tool for retrieving tabular data from the PubChem database. It is a PHP-based web application that must be deployed on a web server (e.g., Apache HTTP Server or XAMPP). The user interface is provided in HTML. Users supply a list of compound names. The tool first finds the CIDs (PubChem compound identifiers) matching those names, then retrieves the properties of the compounds and displays them in a table that can be downloaded in comma-separated values (CSV) format. The tool uses the PUG REST API of PubChem.
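The two-stage lookup described above (name to CID, then CID to properties) can be tried by hand against PUG REST. The sketch below is not the tool's own PHP code; the helper names cid_lookup_url and property_url are hypothetical, and the property names used in the usage comments are only examples.

```shell
#!/bin/sh
# Base URL of the PubChem PUG REST service.
PUG_BASE="https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# Hypothetical helper: build the URL that resolves a compound name to its
# CIDs, returned as plain text (one CID per line).
cid_lookup_url() {
    # Only spaces are encoded here; other special characters would need
    # full URL encoding.
    name=$(printf '%s' "$1" | sed 's/ /%20/g')
    printf '%s/compound/name/%s/cids/TXT\n' "$PUG_BASE" "$name"
}

# Hypothetical helper: build the URL that returns the requested properties
# (comma-separated list) for one or more CIDs in CSV format.
property_url() {
    printf '%s/compound/cid/%s/property/%s/CSV\n' "$PUG_BASE" "$1" "$2"
}

# Usage (requires an Internet connection):
#   curl -s "$(cid_lookup_url aspirin)"
#   curl -s "$(property_url 2244 MolecularFormula,MolecularWeight)"
```

In the usage comments, 2244 is the CID of aspirin. Keeping URL construction separate from the curl call makes it easy to inspect a request before sending it.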

Pre-requisites

  • Apache HTTP server
  • cURL
  • PHP
  • Web browser
  • Internet connection

Installation

Install Apache HTTP server, cURL, PHP

(For Ubuntu)

sudo apt-get install apache2 curl phpX.Y phpX.Y-curl 

X.Y denotes the version of PHP (e.g., php8.3). Check the current version available in the package manager of your distribution. You may be required to install other dependencies for these packages.

(For other distributions)

Please follow the installation steps in the documentation of the respective packages.

Start the Apache HTTP server

sudo service apache2 start

To check the status, run sudo service apache2 status. To restart the server, run sudo service apache2 restart.

Check PHP installation

php -v
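Because the tool depends on the cURL extension of PHP, it may also be worth confirming that the extension is loaded. The sketch below filters the module list printed by php -m; the helper name module_listed is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the module name given as $1 appears as a
# whole line (case-insensitive) in the module list read from standard input.
module_listed() {
    grep -qix "$1"
}

# Usage:
#   php -m | module_listed curl && echo "cURL extension is loaded"
```

If the check fails, installing the phpX.Y-curl package from the step above and restarting Apache is the usual remedy.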

Install PubChem Data Retriever

Step 1: Clone the repository

git clone https://github.com/ttsudipto/pubchem_data_retriever

Step 2: Configure the list of compound properties to be retrieved.

Choose the compound properties to be retrieved by commenting or uncommenting lines (adding or removing the # at the start of the line) in the compound_properties_config.txt file. The details of the different properties are described in the PubChem PUG REST documentation.
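To preview which properties are currently enabled, the uncommented lines of the file can be listed from the shell. This is a sketch that assumes the file holds one property name per line; the tool's own parser is written in PHP and may treat the file differently, and the helper name active_properties is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the active (uncommented, non-blank) lines of
# the config file given as $1 as a single comma-separated list.
# Assumes one property name per line.
active_properties() {
    grep -v '^[[:space:]]*#' "$1" \
        | grep -v '^[[:space:]]*$' \
        | tr -d '\r' \
        | paste -sd, -
}

# Usage:
#   active_properties pubchem_data_retriever/compound_properties_config.txt
```

The comma-separated form matches the way PUG REST expects a property list in a request URL.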

Step 3: Copy the repository to the Apache HTTP document root directory. In Ubuntu, the default location is /var/www/html.

sudo cp -r pubchem_data_retriever/ /var/www/html

Step 4: Launch the web application

Open your favorite web browser and go to the address localhost/pubchem_data_retriever.

N.B. - To change the list of compound properties after installation, edit the copy of compound_properties_config.txt in the Apache HTTP server document root directory; this requires superuser privileges. The nano text editor is a good option for this.

sudo nano /var/www/html/pubchem_data_retriever/compound_properties_config.txt

Edit the file, then press Ctrl+X, Y, and Enter to save the changes and exit.

Acknowledgment

I acknowledge Dr. Sudipto Saha and the Department of Biological Sciences, Bose Institute, Kolkata, India for providing the resources to develop this tool.