A demo tool to remove sensitive information from a supplied file using the Skyflow Detect (NER-based) API.
This script uses the Skyflow Detect API to de-identify sensitive data in a range of file formats, demonstrating Skyflow’s API-first approach.
Before running the script, ensure that you have the necessary dependencies installed.
Ensure Python is installed on your system. You can check by running:
python --version
or
python3 --version
Install the required libraries using pip or pip3:
pip install requests PyJWT python-docx
The script supports de-identification of the following file types:
- Documents: pdf, doc, docx, txt, json, xml, csv, xls, xlsx, ppt, pptx
- Images: bmp, jpeg, jpg, png, tif, tiff
- Audio: mp3, wav
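As a rough illustration of the list above, the following sketch checks a file's extension before it is submitted. The function name `classify_file` and the category labels are hypothetical and are not taken from the filesDetect2.0 script; only the extensions come from this README.

```python
from pathlib import Path

# Extensions copied from the supported-types list in this README.
# The category labels are illustrative, not names used by the script.
SUPPORTED_TYPES = {
    "document": {"pdf", "doc", "docx", "txt", "json", "xml", "csv",
                 "xls", "xlsx", "ppt", "pptx"},
    "image": {"bmp", "jpeg", "jpg", "png", "tif", "tiff"},
    "audio": {"mp3", "wav"},
}


def classify_file(path):
    """Return the category for a supported extension, or None otherwise."""
    # Compare case-insensitively and without the leading dot.
    ext = Path(path).suffix.lower().lstrip(".")
    for category, extensions in SUPPORTED_TYPES.items():
        if ext in extensions:
            return category
    return None
```

A caller could skip any file for which `classify_file` returns `None` instead of sending it to the API.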
The tool interacts with the following Skyflow Detect API endpoint:
{{url}}/v1/detect/deidentify/file
- Ensure you have a Skyflow Try Environment account. If not, contact Skyflow to create one for you.
- Log in as a vault owner or administrator.
- Navigate to your account and create a new Service Account.
- The service account must have the following assignments and roles:
  - Assignment: Account-Level Role → "Account Admin"
  - Assignment: Workspace-Level Role → "Vault Creator" and "Workspace Admin"
- Save the settings and generate a credentials.json file. This file is required when running the script.
- In the filesDetect repository, locate the file detect_params.json.
- Update all relevant parameter values to match your Skyflow account and environment settings.
- Specify the location of the Common_Files_Directory in the parameters file. The script relies on common functions from this directory.
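With the environment settings in place, the `{{url}}` placeholder in the endpoint shown earlier is replaced by your environment's base URL. The sketch below shows one way to assemble the full address. The helper name `build_deidentify_url`, the HTTPS check, and the example host are assumptions made for illustration; how the script itself reads the base URL is not documented here.

```python
# Path taken from the endpoint listed in this README.
DEIDENTIFY_FILE_PATH = "/v1/detect/deidentify/file"


def build_deidentify_url(base_url):
    """Join a base URL with the Detect file de-identification path."""
    # Strip whitespace and any trailing slash so the join yields one slash.
    base = base_url.strip().rstrip("/")
    # Assumption: the environment is only reachable over HTTPS.
    if not base.startswith("https://"):
        raise ValueError("base URL must start with https://")
    return base + DEIDENTIFY_FILE_PATH


# Placeholder host for illustration only; substitute your own base URL.
print(build_deidentify_url("https://example.invalid/"))
```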
Once setup is complete, you can run the filesDetect2.0 de-identifier script. Sample files are available in the detectSampleFiles directory for testing.
- Open your terminal.
- Navigate to the filesDetect directory.
- Run the following command:
python3 filesDetect2.0.py
- Follow the on-screen prompts to process your files.
This repository includes a collection of sample files located in the detectSampleFiles directory. Use these for testing and to see the Skyflow Detect tool in action.
- Ensure your credentials.json file is correctly configured before running the script.
- Make sure the detect_params.json file is updated with accurate values for your Skyflow environment.
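The two checks above can be partly automated before launching the script. The following is a minimal sketch, assuming both files are JSON objects with top-level keys. The helper `check_json_file` is hypothetical and does not validate any Skyflow-specific field, since the exact keys in either file are not listed in this README.

```python
import json
from pathlib import Path


def check_json_file(path):
    """Return a list of problems found; an empty list means none were detected."""
    file_path = Path(path)
    if not file_path.is_file():
        return [f"{path}: file not found"]
    try:
        data = json.loads(file_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path}: invalid JSON ({exc.msg}, line {exc.lineno})"]
    # Assumption: the configuration is a JSON object at the top level.
    if not isinstance(data, dict):
        return [f"{path}: expected a JSON object at the top level"]
    # Flag top-level values that were left blank or null.
    return [f"{path}: '{key}' is empty"
            for key, value in data.items() if value in ("", None)]
```

Calling `check_json_file("credentials.json")` and `check_json_file("detect_params.json")` from the filesDetect directory and printing any returned messages gives a quick sanity check. It cannot confirm that the values are correct for your account.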