
Nikhil Bindal
Full-Stack & AI Infrastructure Engineer
Operations Log
Career history rendered as classified mission files.
AI Consultant / AI Infrastructure Engineer
Neurologica · Stealth AI Startup
- Architected a production multi-agent AI coaching platform on GCP Cloud Run combining biometric/HCI signal processing with a persistent personal-memory system, owning system design end to end.
- Cut end-to-end AI latency 73% (36s → 9.5s) by re-architecting a 6-agent pipeline into a coordinator-orchestrated DAG with prompt consolidation, model tiering, and async parallel execution.
- Designed "Mnemosyne," a context-scoped memory architecture with semantic retrieval and a temporal-decay lifecycle, bucketed per user and domain so context can never cross boundaries.
- Engineered Redis-backed concurrency control and session orchestration that prevented state corruption across concurrent real-time voice sessions, plus a parallel voice sidecar (Gemini Live) running sentiment/query/analytics off the hot path.
- Shipped Python and JavaScript SDKs (32 operations) spanning orchestration, memory, analytics, sessions, and BYOD multi-tenant infrastructure, with CI/CD, audit logging, and webhook delivery.
- Built "Donna," an AI meeting-intelligence platform, end to end (0→1), automating pre-meeting research, live in-meeting assistance, and post-meeting synthesis, and released it to real users.
- Designed a 3-phase architecture orchestrating 15 specialized agents behind a uniform agent contract that made parallel orchestration and fan-out trivial to extend.
- Built hybrid contextual RAG over Qdrant combining vector and keyword retrieval with reciprocal-rank fusion, improving retrieval relevance ~40% over a naive dense baseline (measured, not estimated).
- Implemented a real-time voice pipeline (LiveKit WebRTC + Deepgram STT + Cartesia TTS) targeting sub-200ms perceived latency, with a replica-independent WebSocket fan-out over Redis pub/sub.
- Built "RecoMe," a personal interest-graph & agentic recommendation engine — a capture → signal → graph → agent → surface pipeline turning cross-platform activity (15+ sources) into a typed Neo4j interest graph plus Qdrant per-user vectors, with recommendations streamed to the client over SSE.
- Orchestrated 5 background agents behind a single Guardian gate enforcing a hard $0.10/user/day LLM cost cap, throttling, and quiet hours, with a prompt-injection-resistant scorer (the LLM writes prose only, never the verdict); ran on BullMQ workers with idempotent Stripe metering and a dual consumer + multi-tenant surface on one backend (TypeScript, Hono, Prisma).
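For readers unfamiliar with the reciprocal-rank fusion mentioned in the hybrid RAG bullet above, here is a minimal sketch of the general technique. It is illustrative only and not the production code: the function name, the constant k=60 (a commonly used default), and the document IDs are all assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc_id so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result orders from a vector search and a keyword search.
dense_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why fusion tends to beat either retriever alone when their errors differ.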
Full-Stack & AI Engineer
Northeastern University — Minkara Computational Lab
- Built a semantic-search & visualization platform for biomedical research (FastAPI, PostgreSQL + pgvector, React, D3.js) letting researchers search 10K+ papers and molecular datasets by meaning via embeddings and vector search.
- Built AWS Batch distributed simulation pipelines orchestrating thousands of parallel molecular computations and cutting compute cost ~40% via spot instances, right-sizing, and scale-to-zero with checkpoint/restart.
- Owned the backend for scientific data ingestion, metadata APIs, and ML-inference workflows, serving models as isolated services so heavy inference never blocked the API.
- Designed accessibility tooling for visually impaired researchers — tactile-graphics generation from molecular coordinates and screen-reader/sonification integrations, validated with blind users including the lab PI.
Software Engineer
Times Internet — TOI+ subscription platform
- Built backend for TOI+ serving ~8.4M daily requests — Node/Express subscription & paywall services with JWT auth and Redis-cached entitlements (sub-ms checks), keeping origin load low by offloading cacheable content to the Akamai edge.
- Designed a Verdaccio-based micro-frontend system packaging shared UI as versioned internal npm widgets, so 70+ city portals consumed updates by version bump instead of duplicated code — publish cadence decoupled from consume cadence.
- Migrated 70+ legacy XML/XSLT portals to React micro-frontends via a strangler-fig rollout (old + new in parallel, per-portal feature flags, stable API contracts), reaching ~92/100 Lighthouse with instant rollback.
- Built the real-time Kafka event pipeline feeding the Signals personalization engine (behavior events → user feature profiles, ~90s refresh), contributing to a measured ~9% CTR lift on recommendations.
Founding Software Engineer
Progcap — collateral-free SMB lending fintech
- Owned the underwriting & transaction backbone of a collateral-free lending platform — Node/Express microservices on an event-driven Kafka backbone, with PostgreSQL as the transactional system of record (+ append-only ledger) and MongoDB for high-write capture.
- Cut decision latency 90% (8.7s → 890ms) by parallelizing serial KYC/fraud/credit-score checks, trimming the hot path, and adding compound indexes plus Redis caching with connection pooling.
- Integrated XGBoost credit scoring on alternative data as an isolated service with timeouts and rule-based fallback, reducing false negatives ~19% (model + rules, measured on repayment cohorts).
- Engineered effectively-once disbursement via idempotency — guarded atomic state transitions, idempotency-keyed bank/NPCI calls, and partitioned Kafka consumers; load-tested to ~22K events/sec at the event tier.
- Built a Python/Celery service for feature assembly, batch jobs, and reconciliation against the append-only ledger as the source of truth.
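The idempotency-keyed disbursement idea from the bullets above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the production design: the class, method names, and in-memory dictionary are hypothetical stand-ins, and a real system would need the check-and-record step to be atomic in a durable store (for example via a unique constraint on the key).

```python
from dataclasses import dataclass, field


@dataclass
class DisbursementLedger:
    """In-memory stand-in for a durable store keyed by idempotency key."""

    results: dict = field(default_factory=dict)
    bank_calls: int = 0

    def _call_bank(self, loan_id, amount):
        # Placeholder for the external bank call (hypothetical; no real API used).
        self.bank_calls += 1
        return {"loan_id": loan_id, "amount": amount, "status": "DISBURSED"}

    def disburse(self, idempotency_key, loan_id, amount):
        # A replayed event with the same key returns the stored result
        # instead of moving money a second time.
        if idempotency_key in self.results:
            return self.results[idempotency_key]
        result = self._call_bank(loan_id, amount)
        self.results[idempotency_key] = result
        return result


ledger = DisbursementLedger()
first = ledger.disburse("loan-42:attempt-1", "loan-42", 50_000)
replay = ledger.disburse("loan-42:attempt-1", "loan-42", 50_000)  # duplicate delivery
print(ledger.bank_calls, first == replay)
```

With at-least-once delivery from the event tier, deduplicating on the key is what turns repeated deliveries into an effectively-once side effect.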
Software Engineer
LiveMedia / LiveChek — insurance telematics
- Built backend services and APIs for real-time telematics and behavioral analytics, processing driving-behavior and user-interaction streams for insurance workflows.
- Contributed to early-stage product and architecture decisions in a fast-moving startup environment.
Doctrine
Six operating principles drawn from staff-level engineering and founding-team experience — the rules I default to when there's no playbook.
Production-First
I build with the assumption a system will be paged at 3am. Observability, idempotency, and graceful degradation are designed in — not bolted on after the first incident.
AI as Leverage
AI is not a replacement for engineering judgment — it's leverage. I build agentic systems that compound human decisions and keep a human in the loop on every escape hatch.
Ship, Then Polish
I optimize for measurable outcomes over premature abstractions. Get a working version into production, instrument it, then iterate on what the data actually shows — not what the design doc predicted.
Ownership Over Tasks
I own outcomes, not assignments. If something blocks the goal — a broken test in another service, an unclear product spec, a vendor outage — it becomes my problem until it's resolved or properly escalated.
Strong Opinions, Loose Grip
I form sharp views early and argue for them with evidence. The moment better evidence shows up, I drop the position cleanly. Disagree hard, commit fast — politics costs more than humility ever does.
Force Multiplier
My code is only half the job. Clear APIs, sharp docs, sturdy tests, and unblocking teammates compound my output beyond what any individual contributor can ship — that's how staff-level impact actually shows up.
