Platform engineer and builder. I shipped an open-source education platform to 490K+ monthly users, deployed LLMs into production at scale, and led SRE for crypto trading systems at Bakkt. Currently building AI-powered developer tools and open-source infrastructure.
Projects
Tech Stack
Languages
Platform & Infra
Observability
AI & Data
Work Experience
Blog
How PeakofEloquence.org Scaled to 490K Monthly Users
The technical story behind scaling an open-source education platform to 490K+ monthly active users across 15+ countries — edge computing, Kubernetes, and lessons from unexpected viral growth.
What I Learned Monitoring LLMs in Production for a Year
Practical lessons from deploying and monitoring production LLMs — why traditional APM fails, what metrics actually matter, and how to build observability for non-deterministic systems.
Building a Shopify MCP Server: AI Agents Meet E-Commerce
How I built a Model Context Protocol server that lets AI agents manage Shopify stores — GraphQL, tool design, and getting listed on 5+ MCP registries.
From AWS Support to AI Infrastructure: 5 Years of Building Systems
Career reflections on going from AWS Cloud Support to SRE at Bakkt to deploying production LLMs — what each stage taught me and why I went independent.
OpenRouter + Datadog Observability
A reliability engineer's guide to production-grade LLM monitoring. Set up OpenRouter broadcast integration with Datadog in 5 minutes.
About Me
Platform engineer specializing in cloud infrastructure, AI/LLM systems, and developer tooling. Former AWS and Bakkt SRE, now building open-source tools.
Connect
Open to full-time platform engineering, AI infrastructure, and SRE roles (remote preferred). Also interested in technical advisory work and open-source collaborations.
Reach me at admin@rezajafar.com


