PhD student at The Ohio State University working on understanding and controlling large language models. I also vibe code research prototypes, developer tools, and AI systems.
-
I am a PhD student in Computer Science & Engineering at The Ohio State University.
-
My research focuses on mechanistic interpretability and sparse representations in LLMs.
-
I study how internal representations encode concepts and how we can steer model behavior with interpretable directions.
-
My goal is to make LLMs more transparent, controllable, and reliable.
-
Understanding Linear Steering (ongoing)
Investigating the geometry, linearity, and causal structure of steering directions in LLM representation space. -
AbsTopK: Rethinking Sparse Autoencoders for Bidirectional Features arXiv, OpenReview
Developed a principled proximal-gradient framework that unifies SAE variants (ReLU, JumpReLU, TopK) and reveals that non-negativity constraints prevent bidirectional feature representation. Proposed AbsTopK, a magnitude-based sparse operator that recovers complete semantic axes and improves interpretability and steering in LLMs. -
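The core contrast can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's implementation: a standard TopK SAE keeps the k largest (non-negative) pre-activations, while an AbsTopK-style operator selects by magnitude and preserves sign, so a feature can point in either direction along a semantic axis.

```python
import numpy as np

def topk(z, k):
    """Standard TopK sketch: keep the k largest pre-activations, clamp to >= 0."""
    out = np.zeros_like(z)
    idx = np.argsort(z)[-k:]          # indices of the k largest signed values
    out[idx] = np.maximum(z[idx], 0)  # non-negativity constraint of standard SAEs
    return out

def abs_topk(z, k):
    """AbsTopK sketch: keep the k largest-magnitude pre-activations, sign intact."""
    out = np.zeros_like(z)
    idx = np.argsort(np.abs(z))[-k:]  # select by |z| rather than z
    out[idx] = z[idx]                 # negative values survive -> bidirectional features
    return out

z = np.array([0.9, -1.5, 0.2, 0.4, -0.1])
print(topk(z, 2))      # [0.9 0.  0.  0.4 0. ]  -- the strong negative feature is lost
print(abs_topk(z, 2))  # [ 0.9 -1.5  0.   0.   0. ]  -- both ends of the axis survive
```

In this toy, the strongest feature (-1.5) is discarded by the non-negative TopK but retained, with its sign, by the magnitude-based selection.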
From Emergence to Control: Probing and Modulating Self-Reflection in Language Models arXiv
Showed that linear directions in representation space can enable and control self-reflection behavior in pretrained LLMs without finetuning.
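The intervention behind this line of work can be sketched as simple vector arithmetic on a hidden state. This is a generic hedged sketch, not the paper's exact procedure; `direction` and `alpha` are hypothetical stand-ins for a learned steering direction and its strength.

```python
import numpy as np

def steer(h, direction, alpha):
    """Shift a hidden state h along a unit-norm steering direction.

    h:         hidden-state vector (e.g. one residual-stream activation)
    direction: hypothetical learned direction encoding a behavior
    alpha:     steering strength; sign chooses which way to push
    """
    d = direction / np.linalg.norm(direction)  # normalize so alpha sets the scale
    return h + alpha * d

h = np.ones(4)
d = np.array([1.0, 0.0, 0.0, 0.0])
print(steer(h, d, 2.0))  # [3. 1. 1. 1.]
```

Because the edit is a single added vector, it requires no finetuning and can be turned up, down, or reversed at inference time.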
For a full list of publications:
If you are interested in collaboration, feel free to open an issue or connect with me.