About Me

Hi, I am Ankur Singh.

I’m an AI Solutions Engineer at Intel, where I optimize cutting-edge AI workloads for Intel hardware. This deep dive into system-level performance tuning has sparked a passion for performance engineering in GenAI and distributed training, leading me to explore tools like Triton, torch.compile, Torch Distributed, and FSDP. I actively document my learnings in my Today I Learned (TIL) knowledge base.

I earned my Master’s in Software Engineering from San Jose State University in May 2024, where I worked as a research assistant under Dr. Wu and Dr. Liu at the SJSU Research Foundation. My research spanned Multi-Modal Learning, Federated Learning, Traffic Flow Prediction, and real-time edge deployment—giving me hands-on exposure to applied ML in highly constrained environments.

My journey into AI started earlier, right after my undergrad. I founded AI Adventures, a startup delivering tailored AI/ML solutions. Later, at Zoop.One, I built a lean ML team that developed and deployed four production ML services, housing 20+ deep learning models and processing over 2 million requests/month, all within just 10 months. I focused on building robust ML infrastructure, optimizing developer workflows, and shortening the path from idea to production. The principles behind that work (first-principles thinking, system design, and rapid iteration) remain at the core of what I do.

I thrive at the intersection of research and production, constantly shipping, breaking, optimizing, and scaling. Whether it’s reducing inference latency, debugging training bottlenecks, or designing experimentation pipelines, I enjoy the grind.

Outside of work, I’m an avid reader, a Kaggle and hackathon winner, and a firm believer in open source. I’ve contributed to various projects and created two Python packages, colab-everything and torchserve-python, both designed to simplify workflows and empower the ML community. Someday, I hope to share my journey and tools at a Python conference.

Achievements

GenAI Skills

Kernel Development

85%

Triton, CUDA, Unsloth, Liger Kernel, torch.compile

LLM Training

90%

PyTorch, Transformers, Accelerate, Torchtune

LLM Inference

85%

vLLM, SGLang, production-stack, llm-d

LLM Application

85%

LiteLLM, Llama-stack, Smolagents, MCP

Software Engineering

Programming

90%

Python, JavaScript, OOP, Design Patterns

DevOps & Cloud

85%

Git, GitHub Actions, Docker, K8s, AWS, GCP

Data Engineering

80%

Postgres, MongoDB, Redis, PySpark, Snowflake

API Development

85%

FastAPI, Flask, Supabase

AI Solutions Engineer, Intel

May, 2023 — Present

Develop, benchmark, profile, and optimize various AI workloads on Intel hardware using Intel's optimization stack. This includes workloads such as distributed training, QLoRA, quantization, custom PyTorch kernels, and LLM deployment. Additionally, responsible for migrating CUDA samples to SYCL and helping maintain over 30 AI code samples.

Research Assistant, SJSU Research Foundation

September, 2022 — May, 2024

Optimized the YOLOv8 model for object detection and segmentation to enable real-time inference on NVIDIA Jetson devices connected through ROS. Contributed to developing innovative strategies for Knowledge Distillation and Federated Learning, with a focus on regression problems. Additionally, worked on multi-modal modeling, including image and audio tokenization to prepare these modalities for input to LLMs, and using LLMs to generate images and audio during inference.

Machine Learning (ML) Lead, Zoop.One, Pune

September, 2021 — July, 2022

Led the strategic development of machine learning initiatives by launching four core ML services, which include over 20 deep learning models and handle more than 2 million requests per month—all within just ten months. Established best practices for MLOps, developed micro-frameworks, and implemented automation processes, with emphasis on agility and an exceptional developer experience.

Co-founder / CEO, AI Adventures, Pune

August, 2018 — September, 2021

Led client project development, overseeing the entire lifecycle and collaborating closely with clients. Created five comprehensive courses covering Python, Data Science, Machine Learning, Computer Vision, and Deep Learning. Established AI clubs in several colleges across the city, fostering a vibrant community through workshops and interactive sessions.