Overview
At a stealth LLM venture, I was a core AI engineer on a global B2C product that scaled to 1M+ users, shaping and implementing real-time, LLM-powered companion features.
Developed scalable infrastructure to support 2,000+ concurrent users and 20 LLM calls/sec, building custom workarounds for the limitations of open-source LLMs during the performance-constrained Llama 2 era (2023).
Key Responsibilities & Achievements
- Product Collaboration: Shaped product features, identified opportunities, and drove their implementation.
- LLM Pipelines: Developed complex LLM pipelines for interactive, real-time chat features and companion applications. This included creating innovative workarounds for the limitations of available open-source LLMs to meet product requirements.
- LLM Deployment: Evolved the serving stack from serverless, to self-hosting with custom optimizations, to an external inference provider, reducing cost by 10x.
- Scalability: Designed systems to support 2,000+ concurrent users.
Impact
Contributed to the successful launch and scaling of a global B2C product with innovative LLM-powered features, serving more than 1M users and 2,000+ concurrent users, while reducing LLM serving costs by 10x.