sayantan007pal/readme.md

Hi 👋, I'm Sayantan Pal

Associate Software Engineer | AI/ML Enthusiast | Digital VLSI Developer


🚀 About Me

I'm a passionate technologist currently working as an Associate Software Engineer at Prismforce, where I architect scalable recruitment platforms and AI-powered solutions. My journey spans from digital VLSI design to full-stack development, machine learning, and production-grade AI systems.

  • 🔭 Currently building SelectPrism - a high-scale recruitment platform handling 1,000+ concurrent AI interview sessions
  • 🎓 B.Tech in Electronics and Communication Engineering from Jalpaiguri Government Engineering College (CGPA: 7.647/10.0)
  • 💼 2+ years of experience in software development, AI/ML, and systems architecture
  • 🏆 Won $100 at Quine Quest 22 for an AI-powered document summarizer
  • 📝 Published research at ICDEC-2025 and NCRTST-2025 on ML-based cardiac risk prediction and FPGA timing systems
  • 🌱 Diving deep into Deep Learning, Computer Vision, and VLSI Design

💻 Current Work @ Prismforce

Building SelectPrism from Scratch (Feb 2025 - Present)

  • Architected a production-grade recruitment platform using Node.js, TypeScript, Python (FastAPI), MongoDB, and AWS
  • Engineered voice AI interview agents with WebRTC and LiveKit, achieving P95 latency under 200ms for STT→LLM→TTS pipeline
  • Developed a resume parser using Ollama with local LLMs, achieving 92% accuracy on 500+ manually labeled resumes
  • Implemented security hardening: Redis-based rate limiting (100 req/min), bcrypt authentication, Google reCAPTCHA
  • Built automated IVR campaigns via Ozonetel and Bull queue email system, scaling from 100 to 5,000+ daily notifications
  • Optimized API performance with MongoDB schema validation and RTK Query caching, reducing response time from 340ms to 220ms
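
The 100 req/min limit mentioned above could be enforced with a fixed-window counter. The sketch below is an in-memory stand-in, not the SelectPrism implementation: the class name, the `clock` parameter, and the choice of a fixed window (rather than a sliding window or token bucket) are all assumptions made for illustration.

```python
import time
from collections import defaultdict


class FixedWindowRateLimiter:
    """In-memory stand-in for a Redis-backed fixed-window limiter (hypothetical)."""

    def __init__(self, limit=100, window_seconds=60, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the example can be tested without sleeping
        self._counts = defaultdict(int)  # (client_id, window index) -> hits so far

    def allow(self, client_id):
        """Count this request and return True if it fits in the current window."""
        window = int(self.clock() // self.window_seconds)
        key = (client_id, window)
        # With Redis, this step would typically be an INCR on the key plus an
        # EXPIRE so that old windows clean themselves up. Here old keys are
        # simply left in the dict, which is fine only for a short-lived demo.
        self._counts[key] += 1
        return self._counts[key] <= self.limit


limiter = FixedWindowRateLimiter(limit=100, window_seconds=60)
print(limiter.allow("client-42"))  # first request in the window is accepted
```

A fixed window is the simplest variant; it can admit short bursts at window boundaries, which a sliding window would smooth out.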

🏆 Featured Projects

Tech Stack: Next.js 14, Node.js, TypeScript, Redux Toolkit, MongoDB, AWS SQS, Claude AI

The most sophisticated project in my portfolio: a production-grade multi-agent system for AI-powered assessment question generation with advanced quality assurance.

Key Achievements:

  • Multi-Agent Architecture (V2): Implemented asynchronous message-passing system with Research Agent → Question Generation Agent → Judge Agent pipeline using AWS SQS for enterprise-scale reliability
  • Advanced Quality Control: Every question evaluated on 6 criteria (Requirements Alignment 25%, Research Accuracy 15%, Difficulty Match 15%, Uniqueness 15%, Clarity 15%, Industry Standards 15%) with configurable 85+ threshold
  • Intelligent Clarification System: Built a conversational AI with multi-turn context management that reaches full context after 2+ rounds, or as soon as the AI confirms it has sufficient information
  • Comprehensive Feature Engineering: 494 total features including non-linear transformations, statistical moments, network analysis, pathway scores, and complexity measures
  • Full-Stack Excellence: Next.js 14 App Router with Redux state management, real-time progress tracking, and responsive UI components
  • Production-Ready Infrastructure: LocalStack for development, containerized with Docker, comprehensive error handling and dead-letter queues
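
The six-criterion evaluation described above amounts to a weighted sum compared against a threshold. The following is a minimal sketch of that arithmetic only. The weights and the 85 threshold come from the list above; the function names, the dictionary keys, and the assumption that each criterion is scored on a 0-100 scale are illustrative.

```python
# Weights taken from the criteria listed above; they sum to 1.0.
WEIGHTS = {
    "requirements_alignment": 0.25,
    "research_accuracy": 0.15,
    "difficulty_match": 0.15,
    "uniqueness": 0.15,
    "clarity": 0.15,
    "industry_standards": 0.15,
}


def quality_score(scores):
    """Weighted average of per-criterion scores (assumed to be on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def passes(scores, threshold=85.0):
    """True when the weighted score meets the (configurable) threshold."""
    return quality_score(scores) >= threshold


example = {name: 85 for name in WEIGHTS}
example["requirements_alignment"] = 60  # weak on the most heavily weighted criterion
print(passes(example))  # False: 0.25 * 60 + 0.75 * 85 = 78.75
```

Because Requirements Alignment carries the largest weight, a question that misses the brief fails even when every other criterion sits exactly at the threshold.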

Technical Highlights:

  • Multiple LLM provider support (Claude, OpenAI, Gemini, Ollama) with factory pattern
  • Cloud-agnostic design (AWS/GCP/Azure) with provider abstraction
  • Microservices-ready architecture - each agent can be independently scaled
  • Real-time status polling with WebSocket-style updates
  • Advanced file generation (PDF, DOCX, JSON, CSV) with custom formatting
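
As an illustration of the factory pattern mentioned for LLM providers, here is a small registry-based sketch. Every name in it (`LLMProvider`, `EchoProvider`, `register_provider`, `create_provider`) is hypothetical, and the providers are stand-ins that echo the prompt rather than calling any real API.

```python
from abc import ABC, abstractmethod


class LLMProvider(ABC):
    """Common interface that every provider adapter would implement."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        ...


class EchoProvider(LLMProvider):
    """Stand-in adapter: tags the prompt with the provider name, no network calls."""

    def __init__(self, name: str):
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


_REGISTRY = {}


def register_provider(name, factory):
    """Map a provider name to a zero-argument callable that builds it."""
    _REGISTRY[name.lower()] = factory


def create_provider(name) -> LLMProvider:
    """Factory entry point: look the name up and build the provider."""
    try:
        factory = _REGISTRY[name.lower()]
    except KeyError:
        raise ValueError(f"unknown provider: {name}") from None
    return factory()


# Register the four providers named above, all backed by the stand-in adapter.
for _name in ("claude", "openai", "gemini", "ollama"):
    register_provider(_name, lambda n=_name: EchoProvider(n))

print(create_provider("Claude").complete("hello"))  # [claude] hello
```

Callers depend only on `LLMProvider`, so swapping providers becomes a configuration change rather than a code change.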

Impact: Turns manual question creation (hours) into automated, high-quality generation (minutes), with iterative refinement until quality standards are met.


Tech Stack: PyTorch, Python, UMAP, HDBSCAN, Optuna, Scikit-learn

Competition-winning deep learning solution achieving 0.8125+ silhouette score (baseline: 0.747, +8.7% improvement) for health state embedding discovery.

Key Innovations:

  • Multi-Modal Deep Learning Architecture:

    • Cytokine Encoder: Multi-head attention transformer (8 heads, 2 layers) for capturing complex cytokine relationships
    • Clinical Encoder: Specialized MLP for metabolic features
    • Temporal Encoder: Bidirectional GRU for longitudinal patterns
    • Cross-Modal Attention Fusion: Allows modalities to dynamically attend to each other
  • Advanced Contrastive Learning:

    • Combined loss function: NT-Xent (SimCLR-style) + Triplet Loss + Supervised Contrastive + Temporal Contrastive
    • Optimized weights: Supervised (50%) + Temporal (49%) dominant after 100 Optuna trials
    • Temperature scaling and hard negative mining for improved embedding quality
  • Systematic Hyperparameter Optimization:

    • 100 trials using Optuna TPE sampler
    • Discovered shallow architecture (2 layers) outperforms deep (3-4 layers)
    • Comprehensive search across 15+ hyperparameters
  • UMAP-64 Preprocessing Pipeline:

    • Dimensionality reduction from 256D → 64D before clustering
    • Scientifically sound approach validated across multiple runs
    • Reproducible evaluation with fixed random seeds
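
To make the NT-Xent term of the combined loss concrete, here is a dependency-free sketch of that one component. It is not the project's PyTorch code: the function names and the temperature value are assumptions, and the triplet, supervised, and temporal terms are omitted. Embeddings must be non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nt_xent(view_a, view_b, temperature=0.5):
    """Mean NT-Xent loss; view_a[i] and view_b[i] are two views of sample i."""
    n = len(view_a)
    z = list(view_a) + list(view_b)  # 2N embeddings in total
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same sample
        # Similarities to every other embedding (the anchor itself is excluded).
        logits = [cosine(z[i], z[k]) / temperature for k in range(2 * n) if k != i]
        pos_logit = cosine(z[i], z[pos]) / temperature
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - pos_logit
    return total / (2 * n)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = nt_xent(anchors, [[1.0, 0.0], [0.0, 1.0]])
mismatched = nt_xent(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < mismatched)  # True: matching views are rewarded with a lower loss
```

Lowering the temperature sharpens the softmax, which makes hard negatives count more heavily in the denominator.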

Performance Metrics:

  • Validation Silhouette: 0.8125
  • Discovered 10+ distinct health state clusters
  • Noise: <15% with optimized HDBSCAN (min_cluster_size=20, min_samples=15, metric='manhattan')

Deliverables: Complete submission package with embeddings.csv, visualizations (UMAP 2D, t-SNE, cluster distributions), performance metrics, and comprehensive documentation.


Tech Stack: Observable Framework, D3.js, JavaScript, Python, Google Earth Engine

Climate adaptation research platform for infectious disease analysis with interactive data visualizations.

Features:

  • Observable notebooks with custom styling and IBM Plex Sans typography
  • Integration with Adaptation Atlas datasets (GAUL 2024 administrative boundaries, WMO watershed data)
  • Python + Google Earth Engine pipeline for soil and climate data processing
  • Responsive visualizations optimized for web and mobile
  • Export to standalone HTML for distribution

Impact: Enables data-driven insights for climate-health nexus research with accessible, shareable visualizations.


Tech Stack: Python, YOLOv8, PyTorch, OpenCV, CNNs

  • Fine-tuned YOLOv8 on custom tennis dataset for multi-object tracking across 1,200+ frames without ID loss
  • Trained PyTorch CNN for court keypoint detection achieving 92% accuracy on 17 keypoints per frame
  • Built end-to-end pipeline integrating detection, tracking, and keypoint models for real-time match analysis
  • Extracted player positions, court geometry, and movement analytics from match footage

Tech Stack: Python, FastAPI, Ollama, MongoDB, Docker, LocalStack

  • Developed complete recruitment pipeline: PDF parsing → embedding generation → vector search → ranked candidate output
  • Validated with 200 recruiter-labeled job-resume pairs, achieving 88% top-5 recommendation accuracy
  • Containerized entire stack with Docker and simulated AWS (S3, SQS) locally using LocalStack for cost-efficient development
  • Implemented semantic search using Ollama embeddings for intelligent candidate-job matching
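
The vector-search and ranking steps, and a top-k accuracy check like the one reported above, can be sketched with plain cosine similarity. This is an illustration under stated assumptions: the embeddings are toy 2-D vectors rather than Ollama output, the function names are hypothetical, and the README does not say exactly how the 88% figure was computed.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def rank_candidates(job_vec, resume_vecs, k=5):
    """Return the ids of the k resumes most similar to the job embedding."""
    scored = sorted(
        resume_vecs.items(),
        key=lambda item: cosine(job_vec, item[1]),
        reverse=True,
    )
    return [resume_id for resume_id, _ in scored[:k]]


def top_k_accuracy(labeled_pairs, job_vecs, resume_vecs, k=5):
    """Share of labeled (job_id, resume_id) pairs whose resume lands in the top k."""
    hits = sum(
        1
        for job_id, resume_id in labeled_pairs
        if resume_id in rank_candidates(job_vecs[job_id], resume_vecs, k)
    )
    return hits / len(labeled_pairs)


jobs = {"j1": [1.0, 0.0], "j2": [0.0, 1.0]}
resumes = {"r1": [0.9, 0.1], "r2": [0.1, 0.9], "r3": [0.5, 0.5]}
print(rank_candidates(jobs["j1"], resumes, k=2))  # ['r1', 'r3']
```

A brute-force scan like this is enough for a few hundred resumes; larger collections would normally use a vector index instead.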

Tech Stack: React.js, Node.js, Express.js, CopilotKit, LangGraph, Material-UI

  • Built AI-powered resume interaction platform with multiple specialized CoAgents
  • Features: Resume evaluation, job description tailoring, and interview preparation simulation
  • Integrated LangGraph for multi-agent orchestration and conversation flow management
  • Clean, responsive Material-UI interface for seamless user experience

Tech Stack: Python, Flask, Machine Learning, Medical AI

  • ML-based diagnostic system for symptom analysis and disease prediction
  • User-friendly web interface for patient information input and diagnosis output
  • Integrated explainable AI for transparent prediction reasoning

Tech Stack: Python, Google Calendar API, CoAgent, Natural Language Processing

  • AI-powered calendar management system with natural language query understanding
  • OAuth2 integration with Google Calendar for seamless event management
  • CoAgent NLP capabilities for diverse user query interpretation
  • Automated scheduling, event creation, and calendar conflict resolution

Tech Stack: Python, Flask/FastAPI, React, MindsDB, REST APIs

  • Full-stack customer support application with AI-driven response generation
  • MindsDB integration for real-time ML-powered query handling
  • Live chat interface and ticket management system
  • Scalable architecture suitable for production deployment

Tech Stack: Python, Flask, Pydantic, OpenAI API, Daytona, Tailwind CSS

  • Built for Daytona Challenge 023 demonstrating streamlined dev environment management
  • AI-powered prompt responses using OpenAI integration
  • Pydantic for robust data validation and type safety
  • Responsive design with Tailwind CSS

Tech Stack: Node.js, Express.js, JWT, bcrypt, MongoDB

  • Secure authentication and authorization system with JWT implementation
  • Password hashing with bcrypt, email-based password reset functionality
  • Role-based access control (admin/user) and protected route management
  • Comprehensive error handling and security best practices
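
For readers unfamiliar with how JWT-based authorization fits together, the sketch below signs and verifies an HS256 token with the Python standard library and then applies a role check. It is an illustration only, not the project's Node.js code: all function names are hypothetical, and a real service would normally use a maintained JWT library rather than hand-rolled signing.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used by JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(payload: dict, secret: bytes) -> str:
    """Build header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now=None) -> dict:
    """Check the signature and expiry, then return the payload."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in payload and current >= payload["exp"]:
        raise ValueError("token expired")
    return payload


def require_role(payload: dict, role: str) -> None:
    """Role-based access check for a protected route."""
    if payload.get("role") != role:
        raise PermissionError(f"role {role!r} required")


token = sign_token({"sub": "u1", "role": "admin", "exp": time.time() + 3600}, b"demo-secret")
require_role(verify_token(token, b"demo-secret"), "admin")  # passes silently
```

The verifier always recomputes the signature with HS256 and never trusts the algorithm named in the token header.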

🎓 Research & Publications

📄 Published Papers

  1. "An Advanced Framework For Cardiac Risk Prediction And Real-Time Monitoring Using Machine Learning And IoT"

    • Presented at ICDEC-2025 (International Conference on Digital Electronics and Communications)
    • Developed ML-based system for real-time cardiac risk flagging using IoT sensor data
    • Integrated edge computing with cloud-based ML models for continuous health monitoring
  2. "FPGA-Based Precision Timing Generator for Cold Collision Experiments"

    • Published at NCRTST-2025 (National Conference on Recent Trends in Science and Technology)
    • Achieved timing precision of ±1ns for quantum physics experimental setups
    • Implemented on FPGA for high-reliability, deterministic timing control

🛠️ Technical Skills

Languages

Python JavaScript TypeScript C C++ Verilog

Backend Development

  • Frameworks: Node.js, Express.js, FastAPI, Flask
  • APIs: REST APIs, WebRTC, LiveKit
  • Real-time: WebSockets, Server-Sent Events, Bull Queue

Frontend Development

  • Frameworks: React.js, Next.js
  • State Management: Redux Toolkit, RTK Query
  • Styling: Bootstrap, Tailwind CSS, Material-UI

Databases & Caching

  • NoSQL: MongoDB, Redis
  • SQL: PostgreSQL, MySQL
  • Vector DBs: Experience with embedding-based search

Machine Learning & AI

  • Frameworks: PyTorch, TensorFlow, Scikit-learn
  • Computer Vision: YOLOv8, OpenCV, CNN architectures
  • NLP: Ollama, LangChain, CopilotKit, LangGraph
  • Libraries: Pandas, NumPy, Seaborn, Matplotlib

DevOps & Cloud

  • Cloud Platforms: AWS (EC2, S3, SQS, Lambda)
  • Containerization: Docker, LocalStack
  • Workflow Orchestration: Apache Airflow
  • Version Control: Git, GitHub
  • CI/CD: GitHub Actions, automated deployment pipelines

Hardware & VLSI

  • Tools: MATLAB, Arduino
  • Design: Digital circuit design, FPGA programming
  • Testing: Logic analyzer, oscilloscope

📊 Previous Experience

Data Science Intern @ Celebal Technologies

May 2024 - July 2024 | Remote

  • Analyzed employee turnover patterns using K-means clustering
  • Discovered 40% lower turnover in mid-tenure employees (3-5 years) earning 27L-40L
  • Mapped salary-vs-experience retention curves and pitched restructuring to HR leadership
  • Delivered data-driven insights for talent retention strategy optimization

🌟 Open Source Contributions

🏆 stdlib.js - Production DevOps Fix (12.5k ⭐)

PR #8600: Fix DevContainer Build Failures in GitHub Codespaces - MERGED ✅

Impact: Fixed critical infrastructure issue affecting all 538+ contributors trying to use GitHub Codespaces for stdlib development.

The Problem:

  • DevContainer builds consistently failing with write error: no space left on device
  • 32GB Codespaces exhausted by massive 10GB+ universal base image
  • Broken ShellCheck dependency blocking container initialization
  • Python support missing despite being required for development

My Solution:

{
  "image": "mcr.microsoft.com/devcontainers/javascript-node:1-22-bookworm", // ⚡ 70% smaller
  "features": {
    "ghcr.io/devcontainers/features/python:1": {},                          // ✅ Restored
    "ghcr.io/devcontainers-extra/features/shellcheck:1": {},                // ✅ Fixed dependency
    "ghcr.io/rocker-org/devcontainer-features/r-apt:0": {},
    "ghcr.io/julialang/devcontainer-features/julia:1": {},
    "ghcr.io/rocker-org/devcontainer-features/pandoc:1": {}
  }
}

Technical Achievements:

  1. Optimized Base Image: Migrated from universal:2 (10GB+) to javascript-node:1-22-bookworm - reducing disk footprint by ~70%
  2. Fixed Broken Dependencies: Updated unmaintained ShellCheck feature (marcozac/) to actively maintained fork (devcontainers-extra/)
  3. Restored Python Support: Explicitly added Python feature that was missing from smaller base image
  4. Verified Multi-Language Support: Ensured Node.js, Python, R, Julia, ShellCheck, and Pandoc all working post-migration

Results:

  • ✅ Container builds successfully on standard 32GB Codespaces
  • ⚡ 3x faster rebuild times due to smaller image
  • 🔧 All required development tools functional
  • 📊 Approved by 2 maintainers (@batpigandme, @Planeshifter)
  • 🎯 137/137 CI checks passed

Community Response:

"LGTM. I got a high CPU usage warning at one point, but build succeeds without a write error. Thanks for this fix!"
- @batpigandme (stdlib maintainer)

"Thank you @sayantan007pal for this PR; much appreciated!"
- @Planeshifter (stdlib core maintainer)

Skills Demonstrated:

  • DevOps troubleshooting in complex multi-language environments
  • Docker optimization and container image selection
  • Dependency management and upstream feature tracking
  • Cross-platform development environment setup (Node.js + Python + R + Julia)
  • GitHub Codespaces infrastructure understanding

Daytona (13.8k ⭐)

  • Pull Request #1545: Updated samples index for Daytona development environment manager
  • Contributed to open-source dev environment standardization project ($7M funded startup)

Active Community Member

  • Regular contributor to developer communities on DEV.to
  • Published tutorials on AI/ML, DevOps, and full-stack development
  • Mentored developers on Daytona, Fluvio, MindsDB, and CoAgent implementations

🏅 Achievements & Certifications

  • 🏆 $100 Winner - Quine Quest 22 for AI-powered document summarizer
  • 📜 ICDEC-2025 Presenter - Cardiac Risk Prediction using ML and IoT
  • 📜 NCRTST-2025 Publisher - FPGA-based Timing Generator
  • 🎯 Hackathon Participant - Daytona Challenge 023
  • 💻 Active Open Source Contributor - Multiple repositories across AI/ML domain

📫 Connect With Me

LinkedIn Kaggle Instagram DEV.to Zindi


💡 What I'm Learning

  • 🧠 Advanced Deep Learning architectures (Transformers, GANs, Diffusion Models)
  • 🔬 Quantum Computing and Quantum ML
  • ⚡ Advanced VLSI Design and Verification
  • 🎯 MLOps and Production ML Systems
  • 🌐 Distributed Systems and Microservices Architecture

⚡ "Building the future, one commit at a time" ⚡


📌 Note: Currently exploring opportunities in AI/ML Engineering, Full-Stack Development, and VLSI Design roles. Open to collaborations on innovative projects!
