AI · Video
Multimodal Video Intelligence Platform
Backend & distributed systems
Backend services and processing workflows for multimodal video search, embeddings, and low-latency inference — built to scale as product scope grew.
Overview
Built backend services and distributed processing workflows for a multimodal video intelligence platform that enabled video search, embedding generation, and low-latency inference at scale.
Problem
Video understanding systems are operationally complex: they require ingestion, transcoding, distributed processing, model inference, indexing, and APIs that serve results to end users with low latency. The challenge was to support these workflows in a way that stayed scalable, observable, and reliable as product capabilities expanded.
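The stages above form a sequential pipeline where each step's output feeds the next. A minimal Go sketch of that shape, with illustrative stage names and a string asset ID standing in for real media objects (none of these names come from the platform itself):

```go
package main

import "fmt"

// Stage is one step in a hypothetical video pipeline.
type Stage func(assetID string) (string, error)

// Illustrative stages; real ones would call transcoders, models, and indexes.
func ingest(id string) (string, error)    { return id + ":ingested", nil }
func transcode(id string) (string, error) { return id + ":transcoded", nil }
func index(id string) (string, error)     { return id + ":indexed", nil }

// runPipeline threads an asset through each stage in order, stopping at
// the first failure so later stages never see a bad input.
func runPipeline(assetID string, stages ...Stage) (string, error) {
	cur := assetID
	for _, s := range stages {
		next, err := s(cur)
		if err != nil {
			return "", err
		}
		cur = next
	}
	return cur, nil
}

func main() {
	out, _ := runPipeline("video-123", ingest, transcode, index)
	fmt.Println(out) // video-123:ingested:transcoded:indexed
}
```

Modeling stages as values makes it easy to reorder, test, or skip steps per workload, which matters once the same pipeline serves several product features.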
Architecture
The platform combined API services, event-driven workers, cloud-based media processing, inference workflows, and user-facing interfaces. I worked on backend services in Go and Node.js, asynchronous workflows using queues and orchestration tools, and integrations that connected ML outputs to frontend product experiences.
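The event-driven worker pattern can be sketched in Go using a channel as a stand-in for the real queue (in production this would be an SQS or Kafka consumer; the worker count and job shape here are assumptions for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// Job is a unit of media work pulled from a queue.
type Job struct{ ID string }

// runWorkers fans n workers out over the jobs channel and collects
// results; each worker exits when the channel is drained and closed.
func runWorkers(n int, jobs <-chan Job, handle func(Job) string) []string {
	var mu sync.Mutex
	var out []string
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				r := handle(j)
				mu.Lock()
				out = append(out, r)
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return out
}

func main() {
	jobs := make(chan Job, 3)
	for _, id := range []string{"a", "b", "c"} {
		jobs <- Job{ID: id}
	}
	close(jobs)
	results := runWorkers(2, jobs, func(j Job) string { return "done:" + j.ID })
	fmt.Println(len(results)) // 3
}
```

Decoupling producers from workers this way lets ingestion spikes queue up instead of overloading inference, and worker counts can scale independently of the API tier.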
My role
I designed and implemented APIs for video search, embeddings, and multimodal model features. I also contributed to backend architecture, distributed processing workflows, observability, and modernization of legacy systems into containerized services with more reliable deployment pipelines.
Technologies
Work spanned strongly typed API surfaces, containerized compute on AWS, event-driven choreography for media and ML workloads, caches and queues for backpressure, FFmpeg-backed media processing, and React/Next.js experiences consuming the same API contracts.
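One common way queues provide backpressure is a bounded buffer that sheds load instead of blocking producers. A minimal Go sketch of that idea, with an assumed capacity and string items standing in for real work units:

```go
package main

import "fmt"

// BoundedQueue refuses new work when full rather than letting
// producers pile up unbounded memory.
type BoundedQueue struct{ ch chan string }

func NewBoundedQueue(capacity int) *BoundedQueue {
	return &BoundedQueue{ch: make(chan string, capacity)}
}

// TryEnqueue returns false instead of blocking when the queue is full,
// letting callers return a 429 or retry with backoff.
func (q *BoundedQueue) TryEnqueue(item string) bool {
	select {
	case q.ch <- item:
		return true
	default:
		return false
	}
}

func main() {
	q := NewBoundedQueue(2)
	fmt.Println(q.TryEnqueue("frame-1")) // true
	fmt.Println(q.TryEnqueue("frame-2")) // true
	fmt.Println(q.TryEnqueue("frame-3")) // false: queue full, load shed
}
```

Failing fast at the queue boundary keeps latency bounded for requests the system does accept, which is the property that matters for low-latency inference paths.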
Impact
- Improved system throughput by 45%
- Achieved sub-second inference latency in key workflows
- Reduced MTTR by more than 30%
- Helped evolve prototype systems into more scalable production services