This project provides a Python script to extract and consolidate lab test results from Excel files stored in SharePoint Online. It automates downloading the files, reading their data, and combining the results into a single dataset.
- Install Dependencies: Install the required Python packages using pip:
    pip install -r requirements.txt
- Configure SharePoint Authentication: Set the following environment variables for SharePoint authentication:
    SHAREPOINT_SITE_URL=YOUR_SHAREPOINT_SITE_URL
    SHAREPOINT_CLIENT_ID=YOUR_CLIENT_ID
    SHAREPOINT_CLIENT_SECRET=YOUR_CLIENT_SECRET
  You can use a .env file and a library like python-dotenv to manage these variables, or set them directly in your environment.
- Configure Script Variables: Update the following variables within the sharepoint_etl.py script to match your specific SharePoint document library, folder path, and data format:
    - SHAREPOINT_DOC_LIBRARY
    - SHAREPOINT_FOLDER_PATH
    - key_columns
    - qc_patterns
- Run the Script: Execute the script using the following command:
    python sharepoint_etl.py
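
If the script fails at startup, a common cause is a missing authentication variable. The sketch below shows one way the three variables from the authentication step could be read and checked before any SharePoint call is made. The helper name load_sharepoint_settings is hypothetical and is not taken from sharepoint_etl.py; only the three variable names come from this README.

```python
import os

# Variable names as listed in the authentication step of this README.
REQUIRED_VARS = (
    "SHAREPOINT_SITE_URL",
    "SHAREPOINT_CLIENT_ID",
    "SHAREPOINT_CLIENT_SECRET",
)


def load_sharepoint_settings(environ=None):
    """Return the required settings as a dict, or raise if any are missing.

    Hypothetical helper for illustration; not part of sharepoint_etl.py.
    """
    # Default to the real process environment; a plain dict can be
    # passed instead, which makes the check easy to try out.
    env = os.environ if environ is None else environ

    # Treat unset and empty values the same way: both count as missing.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in REQUIRED_VARS}
```

If the values live in a .env file, calling load_dotenv() from python-dotenv before this function copies them into the process environment first, so the same check covers both ways of setting the variables.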