Junior AI Engineer  ·  London, UK

Raian Khan

Junior AI Engineer at Odgers

Building production AI systems — fine-tuned LLMs, RAG pipelines, and cloud-native inference platforms — with the infrastructure depth to ship them.

About

Infrastructure depth.
AI ambition.

I'm a Junior AI Engineer at Odgers in London, building production AI systems including fine-tuned LLMs, RAG pipelines, full-stack AI applications, and cloud-native ML inference platforms. My background in enterprise infrastructure gives me a deployment edge most AI candidates lack.

Day-to-day I support a hybrid AWS/Azure/M365 environment for 800+ users across EMEA, APAC, and AMER — and I've shipped an LLM-based triage system at work that drove a ~35% efficiency improvement.

Currently pursuing an MSc in Artificial Intelligence at the University of Liverpool (2026–2028) and working towards AWS Machine Learning Associate certification.

800+ users across EMEA, APAC & AMER
4 production AI projects
~35% efficiency gains via automation
MSc AI, University of Liverpool

Projects

Things I've built.

Production AI systems, ML infrastructure, and full-stack applications — built end-to-end.

📊

FinLit AI

Full-stack personal finance app ingesting Monzo/Starling bank data via CSV with auto-categorisation across 20+ spending categories. Privacy-first architecture: raw transactions processed locally, only aggregated summaries sent to Claude Haiku for insight generation (~£0.11/user/month). Includes a compound growth simulator modelling opportunity cost of spending over 1–30 year horizons.
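A minimal sketch of the opportunity-cost calculation such a compound growth simulator might use, assuming monthly compounding of a redirected recurring spend (the rate and figures below are illustrative, not values from the app):

```python
def opportunity_cost(monthly_spend: float, annual_rate: float, years: int) -> float:
    """Future value of investing a recurring monthly spend instead of spending it,
    compounded monthly, minus the total amount that would have been spent."""
    r = annual_rate / 12            # monthly growth rate
    n = years * 12                  # number of monthly contributions
    # Future value of an ordinary annuity: P * ((1 + r)^n - 1) / r
    future_value = monthly_spend * (((1 + r) ** n - 1) / r)
    return future_value - monthly_spend * n

# e.g. £50/month at a hypothetical 5% annual return over a 10-year horizon
foregone_growth = opportunity_cost(50, 0.05, 10)
```

Longer horizons compound the gap, which is what makes the 1–30 year slider in the simulator meaningful.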

Python · FastAPI · Next.js · Claude API · PostgreSQL · Docker
🧠

Ticket Triage LLM

Fine-tuned Qwen2.5-0.5B-Instruct using LoRA adapters on ~800 synthetic samples for structured IT ticket classification. Deployed as a FastAPI microservice achieving ~95% schema-valid JSON output compliance. Born from a real operational bottleneck at Odgers — the production version drove a ~35% improvement in ticket handling efficiency.
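The schema-compliance figure implies a validation gate on the model's raw text output; a minimal sketch of such a check, with hypothetical field names standing in for the project's actual ticket schema:

```python
import json

# Hypothetical schema: field name -> required Python type (not the project's real schema)
REQUIRED_FIELDS = {"category": str, "priority": str, "summary": str}

def is_schema_valid(raw_output: str) -> bool:
    """True iff the model's raw output parses as a JSON object
    containing every required field with the expected type."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    return all(isinstance(data.get(field), t) for field, t in REQUIRED_FIELDS.items())
```

Running every generation through a gate like this is how a "% schema-valid" metric is typically measured, with invalid outputs retried or routed to a fallback.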

PyTorch · LoRA / PEFT · Qwen2.5 · FastAPI · HuggingFace
🔍

RAG API Platform

A retrieval-augmented generation backend built on 384-dim SentenceTransformer embeddings, IVFFlat-indexed pgvector store, and cosine similarity top-k retrieval hitting sub-100ms latency. Fully documented REST API with typed request/response contracts, ready to be dropped behind any LLM front-end.
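In production the similarity search runs inside Postgres via pgvector's IVFFlat index; the underlying top-k cosine retrieval it approximates can be sketched in plain Python (toy 2-dim vectors here in place of the 384-dim embeddings):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, store, k=3):
    """store: list of (doc_id, embedding) pairs.
    Returns the ids of the k documents most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

An IVFFlat index trades this exhaustive scan for a clustered approximate search, which is what keeps retrieval under the 100ms budget at scale.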

FastAPI · PostgreSQL · pgvector · SentenceTransformers · Python
☁️

Cloud-Native ML Inference Platform (in progress)

Production ML inference API on AWS ECS Fargate behind an ALB, with full infrastructure-as-code via Terraform and a CI/CD pipeline through GitHub Actions using AWS OIDC keyless auth. Designed for zero-downtime deployments with automatic container image promotion through ECR.

AWS ECS Fargate · Terraform · GitHub Actions · ECR · ALB

Skills

Tech stack.

The tools and technologies I work with.

AI / ML

PyTorch · LoRA / PEFT · HuggingFace Transformers · SentenceTransformers · RAG Pipelines · LLM Fine-tuning · scikit-learn · pgvector

Backend & APIs

Python · FastAPI · Next.js · PostgreSQL · Docker · Terraform · GitHub Actions · PowerShell

Cloud & Identity

AWS (ECS Fargate, ECR, ALB, IAM, EC2) · Azure / Entra ID · M365 · Intune · Okta

Currently studying

MSc Artificial Intelligence — University of Liverpool · AWS Machine Learning Associate

Contact

Let's build something.

Open to new opportunities, collaborations, and interesting conversations. Drop me a line.

contact@raian.uk