Building Production AI Systems
LLM Applications, RAG Pipelines & Document Intelligence — Hong Kong · Remote-Friendly
Most businesses I work with are not short of data. They are short of a system that turns that data into something actionable without requiring someone to manually read through it every day. That is the gap I build for.
Background
I am an independent AI engineer based in Hong Kong and mainland China, available for remote engagements globally. My background spans 7+ years in data and analytics roles, 2+ years building and deploying production AI systems, and a Master's in Business Analytics and AI from Vlerick Business School (Belgium).
The systems I build are end-to-end: data ingestion, LLM processing, retrieval architecture, delivery layer, and infrastructure. Not prototypes. Production systems with real users and real operational requirements.
What I Build
RAG Pipelines and Document Q&A
Organisations that handle large volumes of documents — policies, reports, contracts, filings — typically face the same bottleneck: the information exists but is not queryable. Finding a specific detail means opening files manually.
A production RAG pipeline changes this. Documents are ingested, chunked, embedded, and stored in a vector database. At query time the question is embedded the same way, the most relevant chunks are retrieved, and the LLM answers from those chunks. The result is a system where a natural-language question returns a precise answer drawn from the full document archive, regardless of the volume or age of the files.
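To make the ingest, chunk, embed, store, and retrieve steps concrete, here is a minimal sketch in Python. It is illustrative only: the bag-of-words `embed` function stands in for a real embedding model, `InMemoryVectorStore` stands in for a real vector database, and the file names and policy text are invented for the example. In a real pipeline the retrieved chunks would then be passed to the LLM as context.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40, overlap=10):
    """Split text into overlapping windows of words."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: term counts. An assumption standing in for a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self):
        self.rows = []  # each row is (source, chunk, vector)

    def add_document(self, source, text):
        for chunk in chunk_text(text):
            self.rows.append((source, chunk, embed(chunk)))

    def search(self, question, top_k=2):
        """Return the top_k (score, source, chunk) rows for a question."""
        query_vec = embed(question)
        scored = [(cosine(query_vec, vec), source, chunk)
                  for source, chunk, vec in self.rows]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:top_k]


store = InMemoryVectorStore()
store.add_document(
    "leave_policy.txt",
    "Employees accrue annual leave at 1.5 days per month of service.",
)
store.add_document(
    "expense_policy.txt",
    "Travel expenses must be submitted within 30 days with receipts attached.",
)
hits = store.search("How many days of annual leave do employees accrue?", top_k=1)
```

Here `hits` holds the best-matching chunk together with its source file, which is what lets an answer cite where it came from.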
Scheduled Data Pipelines and Automated Summaries
The same underlying architecture (scheduled ingestion, LLM processing, structured output) applies to monitoring and summarising information sources on a recurring basis. Inputs are configurable, items are filtered for relevance, and results are delivered via Telegram or email. The pipeline runs on a schedule; the output arrives without manual effort.
This applies wherever a team currently has someone reading sources and summarising them by hand.
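The shape of such a pipeline can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: `Item`, `build_digest`, `next_run`, and `run_once` are hypothetical names, keyword matching stands in for whatever relevance filter a real deployment uses, `first_sentence` stands in for the LLM summarisation call, and `deliver` is any callable that sends text (a Telegram or email sender would be plugged in there).

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass
class Item:
    title: str
    body: str


def is_relevant(item, keywords):
    """Keep an item if any configured keyword appears in its title or body."""
    text = f"{item.title} {item.body}".lower()
    return any(k.lower() in text for k in keywords)


def build_digest(items, keywords, summarise):
    """Filter items for relevance and format one summary line per item."""
    relevant = [i for i in items if is_relevant(i, keywords)]
    if not relevant:
        return "No relevant items today."
    return "\n".join(f"- {i.title}: {summarise(i.body)}" for i in relevant)


def next_run(now, run_at):
    """Next daily run time strictly after `now`."""
    candidate = datetime.combine(now.date(), run_at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def first_sentence(text):
    """Placeholder summariser; a real pipeline would call an LLM here."""
    return text.split(". ")[0].rstrip(".") + "."


def run_once(fetch, keywords, summarise, deliver):
    """One scheduled run: fetch items, build the digest, hand it to delivery."""
    deliver(build_digest(fetch(), keywords, summarise))


sample_items = [
    Item("Rates", "HKMA held its base rate. Markets were calm."),
    Item("Sport", "The local team won."),
]
sent = []  # stands in for a Telegram or email channel
run_once(lambda: sample_items, ["rate"], first_sentence, sent.append)
```

Passing `fetch`, `summarise`, and `deliver` in as callables keeps the sources, the model, and the delivery channel swappable without touching the pipeline logic.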
Memory-Enabled AI Systems
I also build AI applications where context persistence matters — systems that remember user history, learn from interactions, and retrieve relevant prior context at inference time.
HKSoka (hksoka.com/en) is a Claude-powered chat platform I designed and built end-to-end. Its core differentiator is a multi-layer RAG memory architecture.
This architecture is directly applicable to any business context where continuity across sessions matters — client relationship management, ongoing advisory workflows, or knowledge base assistants.
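As a generic illustration of context persistence (not a description of how HKSoka is implemented), the sketch below keeps two layers per user: a short window of recent turns and a longer archive searched at inference time. The `UserMemory` class, its method names, and the word-overlap scoring are all assumptions made for the example; a production system would typically use embeddings and a persistent store.

```python
import re
from collections import defaultdict, deque


def tokens(text):
    """Lowercased word set used for a crude relevance score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class UserMemory:
    """Hypothetical two-layer memory: recent turns plus a searchable archive."""

    def __init__(self, recent_size=3):
        self.recent = defaultdict(lambda: deque(maxlen=recent_size))
        self.archive = defaultdict(list)

    def remember(self, user_id, message):
        self.recent[user_id].append(message)
        self.archive[user_id].append(message)

    def recall(self, user_id, query, top_k=2):
        """Return older messages that share the most words with the query."""
        query_tokens = tokens(query)
        recent = set(self.recent[user_id])
        scored = [
            (len(query_tokens & tokens(m)), m)
            for m in self.archive[user_id]
            if m not in recent  # recent turns are already in the prompt
        ]
        scored = [s for s in scored if s[0] > 0]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [m for _, m in scored[:top_k]]

    def build_context(self, user_id, query):
        """Assemble the text that would be sent to the model with the query."""
        parts = ["Relevant history:"] + self.recall(user_id, query)
        parts += ["Recent turns:"] + list(self.recent[user_id])
        parts += ["Current question:", query]
        return "\n".join(parts)


memory = UserMemory(recent_size=2)
for msg in [
    "My portfolio is mostly Hong Kong equities.",
    "I prefer weekly summaries.",
    "What is the weather today?",
    "Please use a formal tone.",
]:
    memory.remember("u1", msg)

question = "Any news affecting my portfolio of equities?"
recalled = memory.recall("u1", question)
context = memory.build_context("u1", question)
```

Keying both layers by `user_id` is what keeps one client's history from leaking into another's context.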
Infrastructure and Deployment
Production systems require more than good model calls. The infrastructure I build includes:
I have also built and maintained a production ML trading pipeline with automated signal generation, multi-seed validation, and live deployment on AWS — including full infrastructure ownership from data ingestion to model deployment.
Engagement Model
I work on a project basis. Deliverables are scoped clearly before commencement. I deliver working systems, not slide decks.
I am available for scoping calls with businesses looking to automate document workflows, build internal knowledge bases, or deploy LLM-powered tools into existing operations.
Contact: smartai.hk+ai.consulting@proton.me
LinkedIn: linkedin.com/in/levi-innovation
Levi is an independent AI engineer based in Hong Kong. He builds production LLM applications, RAG pipelines, and document intelligence systems for businesses across financial services and professional services.