Not what a model outputs - how the system decides, executes, and holds under load.

Ashwin Gupta

Role: AI Systems Engineer
Company: Coforge • Jun 2024 – Present

Optimising: Residuals • Not: Roles
MLOps & GenAI - IIIT Bangalore
PyTorch • LLMs • RAG • GCP
About

Inference is easy. Everything around it isn't.

Honest where it matters. Available when it's hard.

Mechanical engineering by training - which meant learning to ask why a system fails before asking how to build it. AI hit in second year like a realisation, not a subject: software that understood language was a new class of thing, and I knew it would matter before anyone around me thought it would. The IISc ML lead role confirmed the direction - physics-constrained optimisation on eVTOL design under Dr. Harursampath, five projects in eight months, at the edge of what was understood.

Production changed the picture fast. At Gida and then Coforge, the same pattern emerged: the GenAI core - prompts, basic RAG, API calls - is learnable in three months. Everyone builds it. The real gap is what surrounds the model: the routing logic, the concurrency architecture, the observability that tells you what actually broke and when. That's the part nobody wants to own. That's where I went.

At Coforge on the HSBC Conversational Analytics project, I was the youngest on the team and three months in when I became the de facto integration lead, having solved in one week a problem another technology partner hadn't resolved in eight months at the same client. GIL-bound threading was replaced with CPU-pinned parallel processes, asyncio + uvloop ran across the full pipeline, and cross-stack observability was built before the second incident happened. The result: 7× session capacity, $1.3M annualised savings, and MTTR down from ~1 hr to ~10 min.
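The shape of that GIL workaround can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: `handle_session`, `serve`, `worker_main` and `launch_workers` are hypothetical names, the sleep stands in for real I/O, CPU pinning uses the Linux-only `os.sched_setaffinity`, and uvloop is treated as optional with a fallback to the standard asyncio loop.

```python
import asyncio
import multiprocessing as mp
import os


def partition(items, n_parts):
    """Deal items round-robin into n_parts lists (one list per worker)."""
    parts = [[] for _ in range(n_parts)]
    for i, item in enumerate(items):
        parts[i % n_parts].append(item)
    return parts


async def handle_session(session_id):
    """Hypothetical stand-in for one session's I/O-bound pipeline."""
    await asyncio.sleep(0.01)  # placeholder for awaiting network or model I/O
    return f"session-{session_id}: done"


async def serve(session_ids):
    """Run every session owned by this process concurrently on one event loop."""
    return await asyncio.gather(*(handle_session(s) for s in session_ids))


def worker_main(cpu, session_ids, results):
    """Entry point of one worker process: pin to a core, then run the loop."""
    if hasattr(os, "sched_setaffinity"):  # Linux-only API
        os.sched_setaffinity(0, {cpu})
    try:
        import uvloop  # optional third-party event loop
        run = uvloop.run
    except (ImportError, AttributeError):
        run = asyncio.run  # stdlib fallback
    results.put((cpu, run(serve(session_ids))))


def launch_workers(session_ids, cpus=None):
    """Start one pinned process per CPU and collect results keyed by CPU id."""
    if cpus is None:
        if hasattr(os, "sched_getaffinity"):
            cpus = sorted(os.sched_getaffinity(0))
        else:
            cpus = list(range(os.cpu_count() or 1))
    results = mp.Queue()
    procs = [
        mp.Process(target=worker_main, args=(cpu, part, results))
        for cpu, part in zip(cpus, partition(list(session_ids), len(cpus)))
    ]
    for p in procs:
        p.start()
    collected = dict(results.get() for _ in procs)  # drain before joining
    for p in procs:
        p.join()
    return collected
```

Each process has its own interpreter and therefore its own GIL, so CPU-bound work in one worker no longer stalls sessions in another, while asyncio keeps the I/O-bound sessions inside each worker concurrent. `launch_workers(range(16))` would be called from under an `if __name__ == "__main__":` guard; a production version would also need to handle a worker that dies before reporting.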

Inference as a System

Most teams ship inference as a function call. The real questions - p95 latency, 10× load, what happens when a backend goes down - are architecture questions. I answer them before the first model goes live.

Execution Under Constraints

Systems that perform in demos often don't survive production. Real constraints - latency budgets, VRAM ceilings, cost per token - are known at design time, not discovered at launch. I build the constraint model first.

Physics-Informed Scientific ML

Data-driven physics models aren't data problems - they're structure problems. Ignoring governing equations forces the model to rediscover physics from data it may never have enough of. Embedding PDEs into the objective is what makes sparse data sufficient.
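One common way to write that objective down is the composite physics-informed loss below. It is the generic textbook form, not necessarily the exact objective used in the projects described here; the symbols (network $u_\theta$, differential operator $\mathcal{N}$, weight $\lambda$, $N_d$ labelled points, $N_r$ collocation points) are notation assumed for illustration.

```latex
\mathcal{L}(\theta)
  = \underbrace{\frac{1}{N_d}\sum_{i=1}^{N_d}
      \bigl\lVert u_\theta(x_i) - u_i \bigr\rVert^2}_{\text{data misfit}}
  \;+\; \lambda\,
    \underbrace{\frac{1}{N_r}\sum_{j=1}^{N_r}
      \bigl\lVert \mathcal{N}[u_\theta](x_j) \bigr\rVert^2}_{\text{PDE residual}}
```

The residual term needs no labels: the collocation points $x_j$ can be sampled anywhere in the domain, which is why a small $N_d$ can still be enough when the governing equation $\mathcal{N}[u] = 0$ is known.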

What I don't do

- I don't ship AI wrappers dressed as products. Core API calls with a nice UI aren't systems.

- I don't build for its own sake. The system has to earn what it costs to run.

- I don't take off-the-shelf work. If the implementation is a Google search away, I'm not the right person.

• The Gold and the Glory •
Guinness World Record (Jul 2025): Command Centre Ops • Most Participants - Agentic AI Day 2025, Google Cloud | Hack2Skill
$1.3M+ Annualised Savings (Jan 2026): HSBC • Coforge
Best Team Award (Nov 2025): HSBC Account • Coforge
Pat on Back - Think Customer Award (Dec 2024): Individual Delivery Excellence • Coforge
Keep It Up Award (Jun 2026): Ownership of Professional Growth & Skill Visibility • Coforge
Java Spring AI Trainer (Dec '25 – May '26): 130+ Participants • 81% voted preferred trainer • NPS +50
Best Outgoing Project (Aug 2024): Mechanical Engineering • BMSCE 2023
Augment.AI, Mentor and Founder (Jan 2022): BMSCE's AI Club
Human Faces Kaggle Dataset (Present): 42.8K Downloads • 202K Views
Experience & Education

The trajectory.

Graphic Designer • Jan 2020 – Oct 2022
OutLawed
Visual identity, event collateral & social content for a teaching NGO
First audience feedback loop - iteration under zero-budget constraints
AI Product Developer • Feb 2021 – Dec 2021
CellStrat
Early GPT-era enterprise ML - among first Indian teams shipping
NLP pipelines: document classification & processing for enterprise clients
Full lifecycle: curation → training → eval → deploy → client handoff
Head of Machine Learning • Jan 2022 – Sep 2022
IISc - NMCAD Lab
eVTOL aerodynamic & structural optimisation under Prof. Harursampath
Physics-constrained surrogate ML to reduce FEM simulation cost • 5 projects across fluid, structural & thermal domains
Stable convergence with a fraction of the labelled data required by classical simulation
Data Scientist • Jan 2023 – May 2024
Gida Technologies
Here.app (HDFC ERGO) - 163-lang multilingual RAG • 97% factual accuracy
Prismforce Skill Graph - +30% relevance • sub-50ms on NVIDIA T4
Laminar / Metamorph / Polymorph - AI CMS • no-code chatbots • cURL→20+ lang API
AI Engineer • Jun 2024 – Present
Coforge
Conversational Analytics (HSBC) - SBC→STT→LLM on GCP/RHEL • authored LLD + orchestration architecture
GIL fix: CPU-pinned procs + asyncio/uvloop • 20→140–160 sessions/VM • 1,600+ concurrent
Packer GCE automation • GCP log correlator: 250K lines <5s • MTTR 1–2hr→~10min
Compute: $118K→$8K/month (~$1.3M/yr) • Azure infra intelligence • Amex GBT RAG
Best Team Award - HSBC Account
Pat on Back - Think Customer • individual delivery innovation & excellence
Keep It Up Award • ownership of professional growth & skill visibility
Java Spring AI trainer • 130+ participants • 81% voted-preferred • NPS +50
B.E. Mechanical Engineering • Aug 2019 – May 2023
BMS College of Engineering
Founder & mentor - Augment.AI, BMSCE's AI club
Sponsorship Head, UTSAV '22 - signed MoUs • raised >50% of total budget in 14 days
IEEE Joint Secretary • 75+ events • chapter ranked #2 globally • co-founded CS chapter
Best Outgoing Project '23 - PINNs across fluid, structural & thermal simulation
Published: MCQ generation via graph + LLMs - NCISCT 2022
Executive Diploma, AI & ML • Oct 2025 – Mar 2027
IIIT Bangalore
Dual specialisation - MLOps; GenAI & Agentic AI
Concurrent with Coforge - formalising the theory behind production systems
Structural ML • probabilistic reasoning • optimisation • MLOps at scale
Impact

Proof, not promises.

Scale & Performance • Cost & Efficiency • Reliability & Ops • Systems Breadth • Reach & Languages • Research & Publications • Accuracy & Quality
Every number is delivered, not projected.
Research & Systems

Systems that had to hold.

ashwingupta.dev - Design Handoff to Production

Shipped
Personal
Problem: The original portfolio claimed performance engineering while shipping 400 animated DOM nodes and a 2 MB JPEG hero - self-defeating on load.
System: Rebuilt as a three-layer spatial interface - all visual effects collapsed into a single Canvas RAF loop, offscreen pre-rendering, lazy loading, WebP preloads. Structural optimization, not cosmetic.
Design: Extended with an ambient HUD system - geolocation-to-nearest-airport clocks, scroll-depth exploration tracker with color-staged progress arc, normalized mouse XY - all persistent across Astro View Transitions.
Outcome: 90% image reduction • 72% JS cut • 400 DOM nodes eliminated • frame time 18–25ms → 4–6ms • 17 pages tracked at scroll-depth resolution.

PageIndexOllama - Local-First Fork of PageIndex

Shipped
Open Source
Problem: Tree-RAG was hardwired to one provider contract - completion differences silently corrupted recursive traversal, and failures surfaced only at collapse.
System: Added a provider-routing layer with finish-reason normalization, so traversal depends on stable internal contracts, not whichever runtime answered.
Design: Prompt externalization, bounded concurrency, and hierarchical fallback stabilize long-document runs on local models with uneven outputs and limited memory.
Outcome: Fully offline tree-RAG across Ollama, llama.cpp, and vLLM - provider switching is transparent, with no external API keys required.
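The finish-reason normalization idea can be illustrated with a small sketch. Everything here is assumed for illustration and is not taken from the PageIndexOllama codebase: the `Finish` enum, the raw strings in `RAW_TO_FINISH`, and the simple ordered fallback in `complete_with_fallback`, where each provider is a hypothetical callable returning `(text, raw_finish_reason)`.

```python
from enum import Enum


class Finish(Enum):
    """Internal contract the traversal code depends on."""
    COMPLETE = "complete"
    TRUNCATED = "truncated"
    UNKNOWN = "unknown"


# Illustrative raw values only; real runtimes may report different strings.
RAW_TO_FINISH = {
    "stop": Finish.COMPLETE,
    "eos": Finish.COMPLETE,
    "length": Finish.TRUNCATED,
    "max_tokens": Finish.TRUNCATED,
}


def normalize_finish(raw):
    """Map a provider-specific finish reason onto the internal enum."""
    if raw is None:
        return Finish.UNKNOWN
    return RAW_TO_FINISH.get(str(raw).strip().lower(), Finish.UNKNOWN)


def complete_with_fallback(prompt, providers):
    """Try providers in order and accept the first COMPLETE answer.

    providers is a list of (name, call) pairs; call(prompt) returns
    (text, raw_finish_reason) and may raise on transport errors.
    """
    for name, call in providers:
        try:
            text, raw = call(prompt)
        except Exception:
            continue  # provider unavailable: move to the next one
        if normalize_finish(raw) is Finish.COMPLETE:
            return name, text
    raise RuntimeError("no provider returned a complete response")
```

Treating anything unrecognised as `UNKNOWN` rather than `COMPLETE` is the conservative choice: a truncated node summary is rejected at the boundary instead of propagating into the recursive traversal.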

Azure Infrastructure Documentation Engine

Client Delivery
Coforge
Problem: Azure docs relied on manual exports and hand-drawn diagrams - every project took 2–3 days and drifted from live state.
System: Built a live-state extraction pipeline - subscription scan, topology mapping, and security config analysis auto-generate SDDs and PlantUML from live resource evidence.
Design: Few-shot prompting grounds generation in extracted inventory; guardrails reject any component without a matching live resource - fabrication blocked from governance docs.
Outcome: 2–3 days → ~2–3 hours • 104 resource groups per engagement • zero fabricated components • manual PlantUML authoring removed.
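The guardrail described above reduces, at its simplest, to a membership check against the extracted inventory. The sketch below assumes components and live resources are both identified by plain string IDs compared case-insensitively; the function name and the example IDs are hypothetical, not the delivered implementation.

```python
def split_by_live_evidence(generated_components, live_resource_ids):
    """Separate generated components into those backed by a live resource
    and those with no matching evidence (candidates for rejection)."""
    live = {rid.lower() for rid in live_resource_ids}
    accepted, rejected = [], []
    for component in generated_components:
        if component.lower() in live:
            accepted.append(component)
        else:
            rejected.append(component)
    return accepted, rejected


# Hypothetical inventory and model output, for illustration only.
live_inventory = ["/rg-app/vnet-hub", "/rg-app/kv-secrets"]
model_output = ["/rg-app/VNET-hub", "/rg-app/redis-cache"]
accepted, rejected = split_by_live_evidence(model_output, live_inventory)
```

Here `accepted` keeps the component that matches a scanned resource and `rejected` holds the one the model introduced without evidence, so only the former would reach a generated diagram or design document.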

Graph-Based Skill Recommendation Engine

Client Delivery
Prismforce
Problem: Skill recommendations ignored hierarchical relationships, taxonomy changes forced full batch retraining, and live inference missed the sub-50ms SLA.
System: Built a weighted directed graph over multilevel skill hierarchies with typed edges and lightweight scoring - structure, not retraining, drives relevance.
Design: Dynamic node insertion and deterministic traversal keep the graph current; latency was profiled at the 99th percentile under production load.
Outcome: +30% relevance • sub-50ms inference on one NVIDIA T4 • taxonomy expansion no longer required batch retraining • live updates stayed current.
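A weighted, typed skill graph with deterministic ranking might look roughly like the following. The class, the edge types, the `TYPE_BOOST` values and the multiplicative path score are all assumptions made for illustration; the production scoring is not described in enough detail here to reproduce.

```python
import heapq
from collections import defaultdict

# Hypothetical per-edge-type multipliers (all <= 1 so path scores only decay).
TYPE_BOOST = {"child": 1.0, "related": 0.5}


class SkillGraph:
    def __init__(self):
        # src -> list of (dst, edge_type, weight) with weight in (0, 1]
        self.edges = defaultdict(list)

    def add_edge(self, src, dst, edge_type, weight):
        """Insert a node or edge at any time; no retraining step is involved."""
        self.edges[src].append((dst, edge_type, weight))

    def recommend(self, start, k=3, min_score=0.05):
        """Best-first traversal: a node's score is the best product of
        weight * type boost along any path from start."""
        best = {}
        frontier = [(-1.0, start)]  # max-heap via negated scores
        while frontier:
            neg_score, node = heapq.heappop(frontier)
            if node in best:
                continue  # already settled with an equal or better score
            best[node] = -neg_score
            for dst, edge_type, weight in self.edges.get(node, []):
                score = best[node] * weight * TYPE_BOOST.get(edge_type, 1.0)
                if dst not in best and score >= min_score:
                    heapq.heappush(frontier, (-score, dst))
        best.pop(start, None)
        # Ties are broken by name so the ranking is fully deterministic.
        return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


g = SkillGraph()
g.add_edge("python", "pandas", "child", 0.9)
g.add_edge("python", "java", "related", 0.5)
g.add_edge("pandas", "numpy", "related", 0.8)
before = [name for name, _ in g.recommend("python", k=2)]
g.add_edge("python", "polars", "child", 0.95)  # live taxonomy update
after = [name for name, _ in g.recommend("python", k=2)]
```

Because relevance comes from traversal over the current edges, the newly inserted skill is ranked on the very next query, which is the property that removes the batch-retraining step.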
Recommendations

In their words.

Arun Kumar Vastrakar

Senior Delivery Director • Coforge

"Ashwin showed a great flexibility and stretched to complete a challenging task which resulted in client's delight. He was able to code a logic which client's other partner could not do it. Well done Ashwin."

Delivery Head • HSBC AI • Pat on Back award • Nov 2024

Raja Sekhar Amirapu

Senior Technical Architect • Coforge

"Ashwin's work on the telephony ingestion layer - PJSIP-based, highly stable, low-latency SIP call-handling at scale - was technically precise. Technically strong, dependable, and proactive. He consistently delivered production-ready code with strong technical ownership, and enhanced the development workflow through automation. Highly recommended for roles in VoIP engineering, real-time media systems, or conversational-AI infrastructure."

Direct colleague • HSBC project • Nov 2025

Snehasish Chakraborty

GCP Infrastructure Engineer • HSBC (client)

"I had the pleasure of working with Ashwin on a highly complex GCP infrastructure setup, where his expertise in scalability, testing, and debugging proved invaluable. He played a crucial role in designing and implementing the scalability logic, ensuring our infrastructure could handle increasing workloads efficiently. His structured approach to testing helped identify potential bottlenecks early, saving us from critical failures down the line."

Client • HSBC • Feb 2025

Kartik Mehta

Fraud VS Technology Lead • HSBC (client)

"He would keep an open mind, welcome challenges, and think to deliver an end-to-end solution. I rate Ashwin highly - not just for his knowledge and skills, but his attitude to continue trying under pressure and deliver. I'm sure he will be adding great value wherever he works."

Client • HSBC • Feb 2025

Tulsi Patro

AI Engineer • Gida Technologies

"Ashwin is a risk-taker, never shying away from trying innovative approaches - and what sets him apart is his ability to convert those risks into successful implementations. His commitment to meeting deadlines while upholding the quality of work is a testament to his professionalism. Fearlessness in taking on challenges inspires the entire team to push boundaries and strive for excellence."

Direct colleague • Nov 2023

Stack

What I run in production.

Profiled under load. Not just imported.

Languages: Python (async · concurrency) • TypeScript • C / C++ • SQL (PostgreSQL, MySQL) • Bash • Linux (RHEL, Arch, Ubuntu)
Backend & Systems: Distributed systems • Microservices • REST / OpenAPI • Async / event-driven (Kafka, Pub/Sub) • Concurrency & perf • Caching (Redis) • Observability (Grafana, Prometheus) • Fault tolerance • Browser automation (Selenium) • Load testing (Locust)
Profiling & Perf: Scalene • line_profiler • Memray
Real-Time & Voice: PJSIP / PJSUA2 • Kamailio • Real-time systems • SIPp (SIP load testing) • Transport security (TLS · DTLS/SRTP · SDES)
Data & Infra: DB design (SQL / NoSQL) • ETL / streaming (Kafka, Spark) • FastAPI • NetworkX (graph analysis) • FAISS / ANN • Vector search (Chroma, HNSW) • Doc extraction (Camelot, Ghostscript, OpenDataLoader)
AI / ML Systems: LLM deployment • On-prem AI • RAG • Agentic (LangChain, LangGraph) • Fine-tuning (LoRA/QLoRA, Unsloth) • Eval & monitoring (W&B) • Ollama • Hugging Face
Cloud & DevOps: GCP (GCE, GKE, networking) • Azure (Resource Graph) • AWS (working) • Docker • Kubernetes • Autoscaling • Packer • Terraform • CI/CD
Contact

Hard problems welcome.

Heads-down building right now - not looking for roles. But if you've got a hard problem, a wild idea, or just want to talk shop about LLMs, distributed systems, scientific ML, or why this site is unreasonably over-engineered for a portfolio, I'm always up for that.

🎨 Vision & design by Ashwin Gupta • ⚡ Engineered with Claude Code • 🚀 Deployed on Vercel