Projects

[Image: SUMMIR, ECIR 2026 Main Track. NER, TF-IDF, PPO, hallucination score. 7,900 articles | 280K insights | 4 sports]

SUMMIR (ECIR 2026)

Hallucination-aware framework for ranking LLM-generated sports insights. A 6-feature ScoreNet with PPO-trained LLaMA reward models. Accepted at the ECIR 2026 main track. Collaboration with Microsoft.

PPO, LLaMA, NLP, Microsoft, Published
[Image: pytest, GPU, assert_close, @dtypes, @shapes]

gpucheck

pytest for GPU kernels. Dtype-aware assertions, shape fuzzing, and CUDA benchmarking. Found 8 real bugs in Triton, including an 83% error in layer norm.

PyTorch, CUDA, Triton, Published on PyPI
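To illustrate what a dtype-aware assertion might look like, here is a minimal pure-Python sketch. It is not gpucheck's API: the `TOLERANCES` table, its values, and the `assert_close_for_dtype` helper are hypothetical, chosen only to show how the tolerance can depend on the dtype being tested.

```python
# Hypothetical (rtol, atol) pairs per dtype. Illustrative values only,
# not taken from gpucheck.
TOLERANCES = {
    "float16": (1e-3, 1e-5),
    "bfloat16": (1.6e-2, 1e-5),
    "float32": (1.3e-6, 1e-5),
    "float64": (1e-7, 1e-7),
}


def assert_close_for_dtype(actual, expected, dtype):
    """Compare two flat sequences using a tolerance looked up by dtype name."""
    if len(actual) != len(expected):
        raise AssertionError(f"length mismatch: {len(actual)} vs {len(expected)}")
    rtol, atol = TOLERANCES[dtype]
    for i, (a, e) in enumerate(zip(actual, expected)):
        # Same shape of check as the usual |a - e| <= atol + rtol * |e| rule.
        if abs(a - e) > atol + rtol * abs(e):
            raise AssertionError(
                f"index {i}: {a} vs {e} exceeds {dtype} tolerance "
                f"(rtol={rtol}, atol={atol})"
            )
```

The same pair of values can pass under a half-precision tolerance and fail under a single-precision one, which is the point of keying the check on dtype.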
[Image: adversarial image, SIMD sanitize (<15ms), clean image]

pixmask

Sub-15ms adversarial image sanitization for multimodal LLMs. C++17 SIMD core with AVX2/NEON dispatch and zero heap allocations in the hot path. Installable with pip.

C++17, SIMD, Security, Python Bindings
[Image: Sparse Attention Benchmark. HASTE v2, Quest+SVD, BLASST, Interval, SparQ, Top-k, Dense]

Efficient ML Inference

Benchmarking suite comparing three sparse attention methods as drop-in Llama replacements. Evaluated on MATH500, AIME, GPQA accuracy and latency from 1K to 128K tokens.

Llama, Sparse Attention, Benchmarking
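As a rough illustration of the general idea behind sparse attention, the sketch below computes attention for a single query over only the `k` highest-scoring keys. It is a generic top-k example in plain Python, not the implementation of any method in the benchmark; the function name and arguments are assumptions made for illustration.

```python
import math


def topk_attention(q, keys, values, k):
    """Single-query attention restricted to the k highest-scoring keys."""
    d = len(q)
    # Scaled dot-product score of the query against every key.
    scores = [
        sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d) for key in keys
    ]
    # Keep only the indices of the k largest scores.
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Numerically stable softmax over the selected scores only.
    m = max(scores[i] for i in top)
    weights = [math.exp(scores[i] - m) for i in top]
    total = sum(weights)
    # Weighted sum of the selected value vectors.
    out = [0.0] * len(values[0])
    for w, i in zip(weights, top):
        for j, v in enumerate(values[i]):
            out[j] += (w / total) * v
    return out
```

With `k` equal to the number of keys this reduces to ordinary dense attention, which makes a dense run a natural baseline for comparing accuracy and latency.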
[Image: Kepler/TESS AI. CNN + RF, 99.8% acc]

Project Rosetta

Explainable AI for exoplanet detection from Kepler/TESS light curves. CNN achieves 99.8% accuracy with per-prediction feature attribution. NASA Space Apps 2025.

TensorFlow, React, NASA Space Apps
[Image: veridex autonomous fact-checker. Search, Extract, Score, Report. Zero API keys required]

veridex

Autonomous OSINT agent that fact-checks the internet in real time. NLP credibility scoring, multi-source synthesis, full Streamlit dashboard. Zero API keys.

spaCy, NLP, Streamlit, Docker
[Image: PDF; instruments, chemicals, conditions, parameters]

lablens

Turn any scientific paper into structured, searchable experiment metadata in under a second. NER + 300 domain-specific regex patterns across 8 entity categories.

spaCy, NER, BioSchemas, Docker
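The regex side of this kind of extraction can be sketched with the standard library alone. The three patterns and the `extract_entities` helper below are hypothetical stand-ins, not lablens's actual patterns or API; they only show how category-keyed regexes turn free text into structured metadata.

```python
import re

# Hypothetical patterns for three categories; a real system would carry
# many more patterns per category.
PATTERNS = {
    "instruments": re.compile(r"\b(?:HPLC|NMR|XRD|SEM)\b"),
    "chemicals": re.compile(r"\b(?:ethanol|methanol|NaCl)\b"),
    "conditions": re.compile(r"\d+(?:\.\d+)?\s?(?:°C|h\b|min\b)"),
}


def extract_entities(text):
    """Return a dict mapping each category to the matches found in text."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


sentence = "The sample was heated to 80 °C for 2 h in ethanol and analyzed by HPLC."
entities = extract_entities(sentence)
```

Because every group in the patterns is non-capturing, `findall` returns the full matched strings, so each category maps directly to a list of surface forms.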

Terrain Recognition

Terrain classifier with 92.4% accuracy, compressed from 150MB to 15.6MB. Compared 7 CNN architectures and 4 transfer learning approaches across 5 terrain classes. 21 stars.

EfficientNet, CNNs, SIH 2023, Open Source

ThreatX: Malicious URL Detector

BERT + MLP dual-model phishing detection with a Chrome extension. Real-time URL scanning via Flask API. Three training iterations with progressive feature engineering.

BERT, Chrome Extension, 1st Runner Up @ CRISIL
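For a sense of what URL feature engineering can involve, here is a small standard-library sketch. The specific features and the `url_features` function are assumptions for illustration and are not necessarily the ones ThreatX uses.

```python
import ipaddress
from urllib.parse import urlparse


def url_features(url):
    """Compute a few simple lexical features from a URL string."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # A raw IP address in place of a domain name is a common phishing signal.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {
        "length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens": url.count("-"),
        "uses_https": parsed.scheme == "https",
        "host_is_ip": host_is_ip,
    }
```

Hand-built features like these could feed a small MLP alongside a text model's view of the raw URL, which is one plausible reading of a dual-model setup.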
[Image: healthpulse. 11 vital metrics, 100% offline]

HealthPulse AI

100% offline health risk intelligence. Ensemble ML (RF+LR) over 11 vital metrics with 7-day rolling analysis, PDF reports, and 6 Plotly visualization tabs. Zero cloud.

scikit-learn, Streamlit, Plotly, Docker
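A 7-day rolling analysis can be as simple as a moving average over daily readings. The sketch below assumes one reading per day for a single metric; `rolling_mean` is a hypothetical helper, not HealthPulse's code.

```python
from collections import deque


def rolling_mean(values, window=7):
    """Return the mean of each full trailing window over a sequence."""
    buf = deque(maxlen=window)  # drops the oldest reading automatically
    means = []
    for v in values:
        buf.append(v)
        # Emit a value only once a full window of readings is available.
        if len(buf) == window:
            means.append(sum(buf) / window)
    return means
```

Emitting values only for full windows avoids misleading averages during the first few days of data.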