Hello World!
My scientific papers are available on Google Scholar
You can find more information about my projects and publications at ResearchGate, ORCID and Lattes
Reach me at madsondeluna@gmail.com, madsondeluna@ufmg.br or madsondeluna@usp.br
You can also connect with me on LinkedIn or X
Visit my website: https://madsondeluna.github.io/
PhD Student in Bioinformatics @ UFMG
MBA Student in Software Engineering @ USP
Specialization Student in Data Science & Analytics @ PUC-Rio
MSc in Genetics & Molecular Biology (2024) @ UFPE
BSc in Biomedical Sciences (2022) @ UFPE
Tech in Software Development (2013) @ ETEPE
(i) AMPidentifier - Antimicrobial Peptide Prediction Toolkit @ GitHub
Open-source Python toolkit for predicting antimicrobial peptides with an ensemble of machine learning models (Random Forest, SVM, Gradient Boosting), reaching 88.45% accuracy. Features automated physicochemical descriptor extraction, a modular architecture for integrating external models, and CLI-based reproducible workflows. Registered at INPI (BR 51 2025 005859-4).
Tech Stack: Python, Scikit-learn, StandardScaler, Machine Learning, CLI
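As a rough illustration of what physicochemical descriptor extraction can look like, here is a minimal standard-library sketch. It is not taken from AMPidentifier: the function name `simple_descriptors`, the residue groupings, and the three descriptors are assumptions chosen for brevity, and the real toolkit may compute a different and larger feature set.

```python
from collections import Counter

# Assumed residue groupings for illustration only; not AMPidentifier's definitions.
HYDROPHOBIC = set("AVILMFWC")
POSITIVE = set("KR")
NEGATIVE = set("DE")


def simple_descriptors(sequence: str) -> dict:
    """Return a few basic physicochemical descriptors for a peptide sequence."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    length = len(seq)
    positive = sum(counts[aa] for aa in POSITIVE)
    negative = sum(counts[aa] for aa in NEGATIVE)
    hydrophobic = sum(counts[aa] for aa in HYDROPHOBIC)
    return {
        "length": length,
        # Crude integer estimate: basic residues minus acidic residues.
        "net_charge": positive - negative,
        "hydrophobic_fraction": hydrophobic / length,
    }


# Example peptide string used only to exercise the function.
features = simple_descriptors("GIGKFLHSAKKFGKAFVGEIMNS")
print(features)
```

A dictionary like this could be turned into a feature vector, scaled, and passed to the classifiers; that step is omitted here.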
(ii) GETVar - Genetic Variant Annotation Tool @ GitHub
Web-based application for clinical genetic variant analysis and annotation. Processes VCF files through automated Snakemake workflows, integrating data from dbSNP, ClinVar, and Ensembl to provide comprehensive variant annotations including clinical significance, population frequencies, and functional consequences for diagnostic support.
Tech Stack: Python, Flask, Snakemake, Bootstrap, REST APIs (dbSNP, ClinVar, Ensembl)
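To give a sense of the first step in a VCF-based workflow, the sketch below reads the fixed leading columns of VCF data lines using only the standard library. It is an illustrative assumption, not GETVar's code: the `Variant` type, the `parse_vcf_lines` helper, and the example record (including its identifier) are made up for this example.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    var_id: str
    ref: str
    alts: Tuple[str, ...]


def parse_vcf_lines(lines: Iterable[str]) -> Iterator[Variant]:
    """Yield one Variant per data line, skipping headers and malformed rows."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # meta-information and column header lines
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not enough columns for CHROM, POS, ID, REF, ALT
        chrom, pos, var_id, ref, alt = fields[:5]
        # ALT may list several comma-separated alleles.
        yield Variant(chrom, int(pos), var_id, ref, tuple(alt.split(",")))


# Made-up record for demonstration; the identifier is a placeholder.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t12345\trs0000001\tA\tG,T\t.\tPASS\t.",
]
variants = list(parse_vcf_lines(example))
print(variants)
```

Each parsed variant could then be looked up in external annotation sources; those requests are left out because their details are not described here.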
(iii) Data Science & Analytics Specialization - Three-Module MVP Suite (@ Module I + @ Module II + @ Module III)
Project series developed to fulfill the mandatory evaluation requirements of the Data Science and Analytics Specialization at the Pontifical Catholic University of Rio de Janeiro (PUC-Rio).
Module I: Exploratory data analysis and preprocessing of the Wisconsin Breast Cancer Dataset with K-NN classification achieving 95.8% accuracy.
Module II: Comparative evaluation of machine learning algorithms (K-NN, Random Forest, SVM, XGBoost) for tumor classification, with Random Forest and SVM reaching 98.6% accuracy and zero false negatives.
Module III: Data warehouse engineering for plant eIF4E proteins, featuring an automated ETL pipeline, a star schema design covering 1,247 protein sequences across 450+ species, and an interactive web interface for taxonomic, functional, and viral resistance analysis.
Tech Stack: Python (Pandas, NumPy, Scikit-learn), SQL & SQLite, ETL Pipeline Development, REST API Integration, Data Modeling (Star Schema, Dimensional Design, Relational), Git/GitHub (Version Control, Branching Strategies, CI/CD), Databricks, Jupyter Notebooks & Google Colab, Data Quality Management & Validation, Data Cataloging & Lineage Tracking, Web Development (JavaScript, D3.js, Chart.js, HTML, CSS), Cloud Deployment (GitHub Pages), Reproducible Workflows
Core Competencies: Data Engineering & Governance, Data Warehousing & Architecture, Machine Learning, Data Analysis & Visualization, Version Control & DevOps
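The star schema idea from Module III can be sketched with SQLite from the Python standard library. This is a minimal sketch under stated assumptions: the table names (`dim_species`, `fact_protein`), the columns, and the rows are hypothetical placeholders, not the actual schema or data of the project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one dimension table and one fact table that references it.
conn.executescript(
    """
    CREATE TABLE dim_species (
        species_id INTEGER PRIMARY KEY,
        scientific_name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE fact_protein (
        protein_id INTEGER PRIMARY KEY,
        species_id INTEGER NOT NULL REFERENCES dim_species(species_id),
        accession TEXT NOT NULL,
        sequence_length INTEGER NOT NULL
    );
    """
)

# Placeholder rows for demonstration only.
conn.executemany(
    "INSERT INTO dim_species (species_id, scientific_name) VALUES (?, ?)",
    [(1, "Species A"), (2, "Species B")],
)
conn.executemany(
    "INSERT INTO fact_protein (species_id, accession, sequence_length) VALUES (?, ?, ?)",
    [(1, "ACC0001", 200), (1, "ACC0002", 220), (2, "ACC0003", 210)],
)

# Typical analytical query: aggregate facts grouped by a dimension attribute.
rows = conn.execute(
    """
    SELECT s.scientific_name, COUNT(*), AVG(f.sequence_length)
    FROM fact_protein AS f
    JOIN dim_species AS s ON s.species_id = f.species_id
    GROUP BY s.scientific_name
    ORDER BY s.scientific_name
    """
).fetchall()
print(rows)
conn.close()
```

Keeping descriptive attributes in dimension tables and measurements in the fact table means most analytical queries reduce to a join plus a `GROUP BY`, as in the query above.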
Agile Project Management Professional Certificate - Atlassian - 2025
Career Essentials in Project Management - Microsoft - 2025
Data-Driven Product Management - NASBA - 2025
Microsoft Azure AI Essentials: Workloads and Machine Learning - Microsoft - 2025
Requirements Engineering and Agile Product Management - PUC-Rio - 2025
The Data Science of Healthcare, Medicine, and Public Health - LinkedIn Learning - 2025
Advanced Gemini for Developers - DeepMind / Google - 2024
Career Essentials in GitHub Professional Certificate - GitHub - 2024
Project Management - Project Management Institute - 2024
Python Programming from Basic to Advanced - Udemy - 2022
Bioinformatics with Python - Udemy - 2022
(+) More Achievements Here



