Focusing on LLMs for Healthcare, Genomics, and Medical Imaging
My work sits at the intersection of artificial intelligence and biomedicine. I specialize in deploying large language models (LLMs) for clinical workflows and applying foundation models to genomic sequence analysis.
- Genomic AI: Zero-shot inference with Evo2-40B (StripedHyena 2 architecture).
- Clinical NLP: Automating patient recruitment with LLM agents.
- Medical Imaging: 3D Point Cloud reconstruction & Bayesian Optimization.
| Domain | Toolkit |
|---|---|
| LLMs | Transformers, LLMs, Prompt Engineering, Agents |
| Deep Learning | DNNs, Gaussian Processes, Bayesian Optimization |
| Data Eng & HPC | SQL, ETL Pipelines |
| Languages | Python, SQL |
- Zero-shot Inference: Deployed NVIDIA's Evo2-40B (StripedHyena 2 architecture) for unsupervised recognition of bacterial sequence features.
- HPC Optimization: Built a large-scale parallel inference pipeline on High-Performance Computing (HPC) clusters.
- Innovation: Implemented windowed inference and temperature calibration to stabilize model outputs on long sequences, improving robustness.
- Patient Recruitment Agent: Led the development of an LLM-based system to extract structured data from unstructured clinical texts.
- Performance: Designed advanced Prompt Engineering strategies and semantic matching algorithms, boosting patient-trial matching efficiency by 30%.
- Algorithm Design: Developed a 3D point cloud reconstruction framework using Gaussian Processes to replace manual parameter tuning.
- Results: Reduced geometric reconstruction error to 2.8% and improved tuning efficiency by 50% through a custom multi-objective loss function.
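
The windowed inference and temperature calibration mentioned in the genomics bullets above can be illustrated with a minimal sketch. It is not the project code: `toy_logits` is a hypothetical stand-in for a real model call, and the window size, stride, and temperature values are assumptions chosen only for illustration. The idea shown is to score a long sequence in overlapping windows, soften each window's logits with a temperature, and average the results wherever windows overlap.

```python
import math
from typing import Callable, List, Sequence

ALPHABET = "ACGT"


def softmax_with_temperature(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Turn logits into probabilities; temperature > 1 flattens, < 1 sharpens."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def make_windows(length: int, window: int, stride: int) -> List[range]:
    """Return overlapping index ranges that together cover [0, length)."""
    if window <= 0 or stride <= 0 or stride > window:
        raise ValueError("need 0 < stride <= window")
    if length <= window:
        return [range(0, length)]
    starts = list(range(0, length - window + 1, stride))
    if starts[-1] + window < length:
        starts.append(length - window)  # final window sits flush with the end
    return [range(s, s + window) for s in starts]


def toy_logits(chunk: str) -> List[List[float]]:
    """Hypothetical model stand-in: per-position logits favouring the observed base."""
    return [[2.0 if base == ref else 0.0 for ref in ALPHABET] for base in chunk]


def windowed_probabilities(
    sequence: str,
    logits_fn: Callable[[str], List[List[float]]],
    window: int,
    stride: int,
    temperature: float,
) -> List[float]:
    """Average the calibrated probability of each observed base across windows."""
    totals = [0.0] * len(sequence)
    counts = [0] * len(sequence)
    for idx in make_windows(len(sequence), window, stride):
        chunk = sequence[idx.start:idx.stop]
        for offset, logits in enumerate(logits_fn(chunk)):
            probs = softmax_with_temperature(logits, temperature)
            totals[idx.start + offset] += probs[ALPHABET.index(chunk[offset])]
            counts[idx.start + offset] += 1
    return [t / c for t, c in zip(totals, counts)]


if __name__ == "__main__":
    scores = windowed_probabilities("ACGTACGTACG", toy_logits, window=4, stride=3, temperature=2.0)
    print([round(s, 3) for s in scores])
```

Averaging over overlapping windows means no position depends on a single window boundary, and a temperature above 1 damps overconfident scores; in practice the temperature would be tuned on held-out data rather than fixed by hand.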
Parexel (Clinical Research Organization) | AI/LLM Application Intern
- Focused on automating clinical trial operations using Generative AI.
- Optimized rule-based and semantic matching algorithms for patient screening.
SAS Institute | Data Analytics Intern
- Built deep neural network (DNN) models on high-dimensional data (AUC 0.86).
- Developed automated ETL pipelines processing 100k+ records using Python/SQL.
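
As a rough illustration of the extract-transform-load pattern behind a Python/SQL pipeline like the one above, here is a small sketch using only the standard library. The table name `sales`, the columns `customer` and `amount`, and the cleaning rules are all hypothetical; the actual internship data and schema are not described here.

```python
import csv
import io
import sqlite3
from typing import Iterable, Iterator, Tuple


def extract(csv_text: str) -> Iterator[dict]:
    """Read raw rows from CSV text (a file handle would work the same way)."""
    return csv.DictReader(io.StringIO(csv_text))


def transform(rows: Iterable[dict]) -> Iterator[Tuple[str, float]]:
    """Normalise names and drop rows with a missing name or non-numeric amount."""
    for row in rows:
        name = (row.get("customer") or "").strip().title()
        try:
            amount = float(row.get("amount"))
        except (TypeError, ValueError):
            continue  # skip malformed records instead of failing the whole batch
        if name:
            yield name, amount


def load(conn: sqlite3.Connection, records: Iterable[Tuple[str, float]]) -> None:
    """Bulk-insert cleaned records inside a single transaction."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO sales VALUES (?, ?)", records)


def run_pipeline(csv_text: str, conn: sqlite3.Connection) -> int:
    """Run extract -> transform -> load and return the number of stored rows."""
    load(conn, transform(extract(csv_text)))
    return conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]


if __name__ == "__main__":
    raw = "customer,amount\n alice ,10.5\nbob,not_a_number\n,3\ncarol,4\n"
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(raw, connection))
```

Because each stage is a generator, rows stream through one at a time, so the same structure scales to 100k+ records without holding the full dataset in memory.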