MedGEM

Built with Kotlin, Android, Jetpack Compose, ExecuTorch, LiteRT, ObjectBox, and ONNX.

MedGEM is an offline inference engine for MedGemma-1.5-4B, a medical-domain multimodal large language model (MLLM). It enables secure, high-performance inference on local devices without requiring internet connectivity.

Device Requirements

  • Android Version: Android 14 (API 34) or higher.
  • RAM: 12GB or higher (required for 4B parameter models). (Note: The app can load the text model on devices with 8GB RAM if the vision encoder is disabled, but the RAG module requires at least 12GB for optimal performance.)
  • Storage: ~4GB free space for model checkpoints.

Features

  • Offline Inference: Run models locally for maximum data privacy and security.
  • Multimodal Support: Process both text and visual medical data (X-Rays, scans). Vision Encoder can be disabled to save memory on lower-end devices.
  • RAG (Retrieval-Augmented Generation): Integrate external knowledge bases to improve accuracy and reduce hallucinations. The app comes pre-loaded with a medical knowledge base.
  • Patient Management & SOAP Notes: Create patient profiles and generate structured SOAP (Subjective, Objective, Assessment, Plan) notes from visit data using AI.
  • Thinking Mode: Enable Chain-of-Thought reasoning (Gemini-style thinking) for complex medical queries to get more reasoned responses.
  • Knowledge Search: Perform direct semantic searches on the medical database to find relevant information without starting a chat.
  • Protocol Viewer: Quick offline access to essential medical PDF protocols.
  • Customizable Inference: Fine-tune generation parameters (Temperature, Top-P, Max Tokens), set custom System Prompts, and adjust Chunk Sizes for performance optimization.
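To illustrate how the Temperature and Top-P generation parameters interact, here is a generic temperature + nucleus (top-p) sampling filter sketch in Python. This is a conceptual illustration, not the app's ExecuTorch implementation; the function name and defaults are assumptions.

```python
import math

def sample_filter(logits, temperature=0.7, top_p=0.9):
    """Apply temperature scaling and top-p (nucleus) filtering to raw
    logits; return renormalized probabilities of the kept token ids."""
    # Temperature scaling: values below 1.0 sharpen the distribution,
    # values above 1.0 flatten it.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of highest-probability tokens whose
    # cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    norm = sum(probs[i] for i in kept)
    return {i: probs[i] / norm for i in kept}
```

Lower Temperature and lower Top-P both make output more deterministic, which is usually preferable for clinical summarization; higher values increase diversity.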

📊 Model Evaluation

We evaluated our on-device quantized models against their original HuggingFace counterparts to ensure minimal quality loss during edge deployment.

| Model | On-Device Format | HF Reference | Result |
|---|---|---|---|
| MedGemma 1.5 4B | ExecuTorch (8da4w) | google/medgemma-1.5-4b-it | ✅ Clinically Equivalent |
| MedAsr | ONNX int8 (sherpa-onnx) | google/medasr | ✅ 0.00% WER |
| EmbeddingGemma 300M | LiteRT int8 (TFLite) | google/embeddinggemma-300m | ✅ 0.9987 Cosine Sim |

For detailed methodology, see the Evaluation Guide and Full Evaluation Report.
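For reference, the two numeric metrics in the table are simple to compute. Below is a minimal sketch (not the project's actual evaluation harness) of word error rate (WER) over ASR transcripts and cosine similarity over embedding vectors:

```python
import math

def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance (substitutions,
    insertions, deletions) divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)] / len(ref)

def cosine_sim(a, b) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)
```

A 0.00% WER means the quantized ASR model's transcripts matched the reference model word-for-word on the test set, and a cosine similarity of 0.9987 means the quantized embeddings are nearly parallel to the full-precision ones.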

📚 Documentation

Detailed documentation is available in the docs/ directory.

🚀 Quick Start

  1. Install Prerequisites: Android Studio, SDK/NDK, uv, git.
  2. Download Checkpoints:
    uv tool install hf
    hf auth login
    
    # Download models (LLM, ASR, and Embedding)
    hf download kamalkraj/medgemma-1.5-4b-it-executorch --local-dir models/llm
    hf download kamalkraj/medasr-onnx --local-dir models/asr
    hf download kamalkraj/embeddinggemma-300m-litert --local-dir models/embedding
  3. Push to Device: See SETUP.md for detailed adb push and internal directory move commands.
  4. Build & Run: Open in Android Studio and run on your device.

For detailed setup instructions, including manual model conversion and building AARs from source, please refer to SETUP.md.

RAG Data Ingestion

The application already includes a pre-built database (app/src/main/assets/initial_data.mdb). If you want to add additional PDFs to the knowledge base, refer to the RAG Ingestion Module.
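If you build your own ingestion step, documents are typically split into overlapping chunks before embedding, so sentences cut at a boundary still appear intact in at least one chunk. A minimal chunker sketch (the sizes are illustrative; the app exposes chunk size as a setting, and the real ingestion module may differ):

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64):
    """Split text into fixed-size character chunks, each overlapping the
    previous one by `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    step = chunk_size - overlap  # how far the window advances each step
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

Smaller chunks give more precise retrieval at the cost of more embeddings to store and search; larger chunks preserve more context per hit.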
