VINEET SISTA
Low-latency systems · Quantitative development · AI products
I build systems that have to be fast — order books, ML pipelines, and AI products — and I care about the nanoseconds.
I'm Vineet — a CS Honors student at Ohio State who got a little obsessed with one question: how fast can a thing actually go?
Most days I'm somewhere between a C++ order book that argues about nanoseconds, a research lab teaching language models to explain themselves, and an AI product that has to ship intelligence before the market opens at 7:30.
I like problems that are equal parts fast, correct, and real. I build things that have to hold up when it counts — and I don't love waiting.
Let's build something that has to be fast.
- interning: JPMorganChase · Software Engineering
- building: A nanosecond C++ order book + an LLM inference engine
- leading: AWS Cloud Club @ Ohio State
- researching: Explainable medicine & clinical ML
- reading the tape: NASDAQ ITCH 5.0
- based in: Columbus, OH · from Naperville, IL
Vineet sits at the intersection of low-latency systems, quantitative finance, and AI products. The same instinct runs through everything — from a C++ order-book matching engine that cares about nanoseconds, to ML research probing how language models make clinical decisions, to shipping AI products that deliver intelligence on a deadline. He builds things that have to be fast, correct, and real.
Engineering Scholar — selected as 1 of 96 students for a competitive program focused on innovation and hands-on engineering projects.
Experience
Eight positions, newest first: a trade history of where the work has been.
- Leads cloud architecture across the club’s AWS projects — compute, storage, data, security.
- Mentors peers on system design: when to reach for which service, and why.
Instruments
Each project as a tradable instrument — ticker, thesis, live spec, and stack. Two flagship systems lead: a C++ order book and a from-scratch LLM inference engine.
The Order Book Engine
A low-latency NASDAQ ITCH 5.0 limit order book engine in C++20 — plus a queue-position-aware market-replay backtester and a market maker with PnL / adverse-selection analytics. The flagship, in active development.
Fire a market order and watch it walk the book — consuming resting size at price-time priority, just like the C++ core.
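For readers who have not seen a matching engine, here is a minimal Python sketch of what "walking the book" at price-time priority means. It is an illustration only, not the engine's C++ implementation: the function name `walk_book`, the dict-of-queues layout, and the sample orders are all hypothetical, and prices are assumed to be integer ticks.

```python
from collections import deque


def walk_book(asks, qty):
    """Fill a market buy of `qty` against resting asks.

    `asks` maps price (integer ticks) -> deque of [order_id, size],
    oldest order first. This layout is an assumption for illustration.
    Returns (fills, unfilled_qty); `asks` is updated in place.
    """
    fills = []
    for price in sorted(asks):            # price priority: lowest ask first
        queue = asks[price]
        while queue and qty > 0:
            order = queue[0]              # time priority: oldest order first
            take = min(qty, order[1])
            fills.append((order[0], price, take))
            order[1] -= take
            qty -= take
            if order[1] == 0:
                queue.popleft()           # resting order fully consumed
        if not queue:
            del asks[price]               # price level is now empty
        if qty == 0:
            break
    return fills, qty


# Hypothetical book: two orders at 10010, one at 10020.
book = {
    10010: deque([["a", 100], ["b", 50]]),
    10020: deque([["c", 200]]),
}
fills, unfilled = walk_book(book, 180)
for order_id, price, size in fills:
    print(order_id, price, size)
```

A production engine would typically avoid a sort per order and use preallocated, cache-friendly price levels; the sketch only shows the matching rule itself.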
How the hot path got to 85 nanoseconds
miniVLLM
The second flagship — a from-scratch, high-performance LLM inference engine. Systems engineering all the way down to the GPU: paged KV cache, continuous batching, speculative decoding, a custom Triton kernel, and an OpenAI-compatible streaming server.
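To make "paged KV cache" concrete, here is a small Python sketch of the general idea: each sequence gets a block table that maps its logical token positions onto fixed-size physical blocks, which are allocated on demand and returned to a free list when the sequence finishes. This is not miniVLLM's code. The class name `PagedKVCache`, its methods, and the block sizes are hypothetical, and the sketch tracks only block bookkeeping, not the key/value tensors themselves.

```python
class PagedKVCache:
    """Toy block allocator for a paged KV cache (bookkeeping only)."""

    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        self.free = list(range(num_blocks))  # physical block ids not in use
        self.tables = {}                     # seq_id -> list of physical block ids
        self.lengths = {}                    # seq_id -> number of tokens stored

    def append_token(self, seq_id):
        """Reserve a slot for one more token; return (physical_block, slot)."""
        n = self.lengths.get(seq_id, 0)
        table = self.tables.setdefault(seq_id, [])
        if n % self.block_size == 0:         # no block yet, or last block is full
            if not self.free:
                raise MemoryError("no free KV blocks")
            table.append(self.free.pop())
        self.lengths[seq_id] = n + 1
        return table[-1], n % self.block_size

    def release(self, seq_id):
        """Return all of a finished sequence's blocks to the free list."""
        self.free.extend(self.tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)


cache = PagedKVCache(num_blocks=4, block_size=2)
slots = [cache.append_token("req-1") for _ in range(3)]
print(slots, cache.tables["req-1"])
cache.release("req-1")
```

Because blocks are fixed-size and allocated lazily, a sequence wastes at most one partially filled block, which is what lets a batching scheduler pack many requests of different lengths into the same memory pool.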
Explainable Medicine & Clinical ML
At OSU's BMBL and AIMed labs — probing how language models make clinical decisions, and mapping where urgent-care conditions cluster in latent space.
Built an explainable-medicine workflow generating heatmaps by probing LLMs with targeted token removals — surfacing the features most predictive of clinical decision-making and reducing hallucinations. Trained a sparse autoencoder and used UMAP to visualize how urgent-care conditions cluster in latent space.
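The token-removal probing described above can be sketched as a leave-one-out loop: score the full input, remove one token at a time, and record how much the score drops. The Python below is a toy illustration under stated assumptions, not the lab's workflow. `toy_score` is a hypothetical stand-in for a model's confidence in a decision (a real setup would query the LLM), and its weights are made up for the example and carry no clinical meaning.

```python
def token_removal_scores(tokens, score_fn):
    """Leave-one-out attribution: score drop when each token is removed."""
    baseline = score_fn(tokens)
    scores = []
    for i, tok in enumerate(tokens):
        ablated = tokens[:i] + tokens[i + 1:]   # input with token i removed
        scores.append((tok, baseline - score_fn(ablated)))
    return scores


# Hypothetical stand-in for a model score; the weights are invented.
TOY_WEIGHTS = {"chest": 3, "pain": 4, "radiating": 2}


def toy_score(tokens):
    return sum(TOY_WEIGHTS.get(t, 0) for t in tokens)


tokens = ["patient", "reports", "chest", "pain", "radiating", "to", "arm"]
for tok, drop in token_removal_scores(tokens, toy_score):
    print(f"{tok:10s} {drop}")
```

The per-token drops are the values a heatmap would color: larger drops mark tokens the score depends on most.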
Technical Arsenal
The datasheet — grouped by subsystem, the way a device spec or risk sheet reads.
Terminal
A real shell. Type a command — `help` to start. Arrow keys for history, Tab to complete.
The terminal is open.
Building something fast, correct, and real — or hiring someone who cares about the nanoseconds? Let’s talk.