S Akash

I am S Akash. I studied Electrical and Electronics Engineering at the Indian Institute of Technology Patna, and I am interested in research on GPU computing and faster ML inference techniques.

Much of my recent work has been through Google Summer of Code (CERN-HSF), where I contributed GPU-accelerated inference to TMVA SOFIE within ROOT. I collaborated closely with Sanjiban Sengupta and Lorenzo Moneta on GPU backends (CUDA, ROCm, Alpaka) and fast, minimal-dependency C++ code generation for ML inference.

During my visit to ShanHaiWoo (Singapore), co-hosted with Ethereum Singapore Week 2025, I built FlowLink, “Crypto Payments You Can Trust”. Our team was selected as a top‑5 winner at Ethereum Singapore, and we were invited to present during TOKEN2049 Week at the ShanHaiWoo Winners’ Showcase. FlowLink went on to win the HashKey Chain Hackathon, earning a fully sponsored invitation to present at the Hong Kong Web3 Festival.

Scalable AI deployment at Unit of Measure

I worked with Martin Kjeldsen at Unit of Measure on multimodal embeddings and large‑scale product retrieval and deduplication, using sharded vector stores to serve millions of SKUs with low latency. I also explored RAG orchestration while balancing performance, latency, and cost.

Learning‑to‑Rerank & Efficient LLM Inference at AIML Lab IIT Patna

With Dr Sriparna Saha, I explored insight re‑ranking with LLMs, using Proximal Policy Optimization (PPO) to improve retrieval and quality control. Our work in collaboration with Microsoft, SUMMIR: A Hallucination Aware Framework for Ranking Sports Insights from LLMs (pre-print), has been accepted to the main track of ECIR 2026.

I’m now investigating KV‑cache methods for long‑context inference and throughput, focusing on flash attention mechanisms.

Automating repetitive work at Autostep (YC S25)

I am a founding engineer at Autostep (YC S25), working with Aidan Pratt to find the repetitive, high-cost work hidden inside organizations and automate it. I cluster multimodal workflow data to surface what is genuinely worth automating, and I work directly with customers to ship it, from the desktop app down to the AWS infrastructure provisioned with CloudFormation. I also manage the token economics of running every model in our stack.

