Ph.D. candidate specializing in Computer Vision and Multimodal AI. Research expertise in vision-language models, generative models for high-resolution image synthesis, and semantic segmentation under domain shift. Proficient in LLM/VLM development, with hands-on experience in Model Context Protocol (MCP) integration, vLLM deployment, and LoRA-based supervised fine-tuning. Strong track record of translating research into industrial solutions for OPPO, Bosch, and NTT.
Generative Models (Diffusion/Flow Matching), Multimodal Learning (Qwen3-VL, DeepSeek-OCR), LLM Agents & Tool Use, Semantic Segmentation, Semi-Supervised Learning, Multi-Object Tracking (MOT)
For a complete list of publications, please visit the Publications page.