AI Engineer | Generative AI | LLM Systems

I build reliable, production-grade AI systems using LLMs — focused on accuracy, evaluation, and real-world impact.

Akanksha Raghav - AI Engineer & Software Developer

Hi, I’m Akanksha. Great to meet you.

I’m an AI Engineer with 3+ years of experience building real-world Generative AI systems and scalable backend architectures. My work goes beyond basic LLM integrations — I focus on making AI systems reliable, measurable, and production-ready. From prompt engineering and RAG pipelines to fine-tuning and evaluation frameworks, I design systems that handle real-world complexity, edge cases, and scale.

AI Engineer

I design and build intelligent systems powered by LLMs, focusing on performance, reliability, and scalability.

Things I enjoy:

Prompt Engineering

What I work on:

  • Chain-of-Thought (CoT) prompting
  • ReAct
  • Structured prompting
  • RAGAS evaluation
  • Hallucination-reduction strategies
  • Model benchmarking
  • GPT
  • Gemini
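
As an illustration of structured prompting, here is a minimal sketch: a chain-of-thought template paired with a parser that extracts a machine-readable answer line. The template wording and the `ANSWER:` convention are illustrative assumptions, and the completion is stubbed rather than produced by a real model.

```python
# Illustrative CoT prompt template: ask for step-by-step reasoning,
# but require the final answer on a fixed, parseable "ANSWER:" line.
COT_TEMPLATE = """You are a careful assistant.

Question: {question}

Think step by step, then give your final answer on a single line
starting with "ANSWER:".
"""

def build_cot_prompt(question: str) -> str:
    """Fill the CoT template with a user question."""
    return COT_TEMPLATE.format(question=question)

def parse_final_answer(completion: str):
    """Extract the structured answer line; None if the model broke format."""
    for line in completion.splitlines():
        if line.startswith("ANSWER:"):
            return line[len("ANSWER:"):].strip()
    return None

# Stubbed completion standing in for a real GPT/Gemini response:
fake_completion = "17 * 3 = 51.\nANSWER: 51"
print(parse_final_answer(fake_completion))  # → 51
```

The fixed answer marker is what makes the prompt "structured": downstream code can parse the output deterministically and retry when the model ignores the format.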

RAG & LLM Systems

I build retrieval-augmented systems that combine retrieved knowledge with LLM reasoning.

What I build:

RAG pipelines and vector search systems

What I work on:

  • RAG
  • Azure Search
  • Fine-tuning
  • Context-aware AI assistants
  • OpenAI
  • LangChain
  • Chunking
  • Semantic Analysis
  • LLM Integration
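
To make the retrieval step concrete, here is a toy end-to-end sketch: chunk documents, "embed" them, and retrieve the chunk most similar to a query by cosine similarity. The bag-of-words embedding is a deliberate stand-in for a real embedding model (OpenAI, Azure Search, etc.), and the fixed-size character chunking is a simplification of token- or sentence-aware chunking with overlap.

```python
# Toy RAG retrieval: chunk → embed → cosine-similarity search.
import math
from collections import Counter

def chunk(text: str, size: int = 40) -> list:
    """Split text into fixed-size character chunks (real systems use
    token- or sentence-aware chunking, usually with overlap)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> Counter:
    """Bag-of-words 'embedding' — purely illustrative, not a real model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list) -> str:
    """Return the chunk most similar to the query."""
    q = embed(query)
    return max(chunks, key=lambda c: cosine(q, embed(c)))

docs = ["Redis is an in-memory data store used for caching.",
        "Spring Boot is a Java framework for building services."]
chunks = [c for d in docs for c in chunk(d)]
print(retrieve("what is redis used for", chunks))
```

In a production pipeline the retrieved chunk would be injected into the LLM prompt as context; everything else (vector index, reranking, context-window packing) layers on top of this same shape.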

Backend + AI Systems

I bring AI into production using scalable backend systems.

Experiences I draw from:

Gen-AI, LLM, Backend

Tech Stack:

  • Java Spring Boot
  • Grafana
  • Microservices architecture
  • Python
  • AWS (Lambda, EC2)
  • MySQL
  • Redis
  • Ably
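
One small piece of what bringing AI into production means in practice: wrapping flaky upstream LLM calls in retries with exponential backoff. This is a generic, language-agnostic pattern sketched in Python; the "model" here is a stub that fails twice before succeeding, standing in for a real client.

```python
# Retry-with-backoff wrapper, a common reliability layer in front of
# rate-limited or intermittently failing LLM APIs.
import time

def with_retries(fn, attempts=3, base_delay=0.01):
    """Call fn(), retrying on exception with exponential backoff."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts: surface the error
            time.sleep(base_delay * (2 ** i))

calls = {"n": 0}
def flaky_model():
    """Stub upstream call: fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("upstream timeout")
    return "ok"

result = with_retries(flaky_model)
print(result)  # succeeds on the third attempt
```

Real deployments add jitter, per-error-type handling, and circuit breaking, but the control flow is the same.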

Selected AI Case Studies

Real-world problems I’ve worked on — focused on LLM reliability, evaluation, and performance. Want to discuss more? Email me.

Reducing LLM Hallucination

Designed structured prompts + evaluation datasets to reduce hallucination in edge-case scenarios.

Impact: ~40% improvement in response reliability
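
One common hallucination-reduction pattern this kind of work uses is grounding plus abstention: instruct the model to answer only from supplied context and to say "I don't know" otherwise, then score reliability over a labeled set. This sketch is illustrative, not the actual project prompts, and the model is a stub that obeys the instruction; a real evaluation would call GPT/Gemini.

```python
# Grounded prompt with an explicit abstention rule, evaluated on a
# tiny labeled set. expect_known marks whether the context actually
# contains the answer.
GROUNDED_TEMPLATE = (
    "Answer using ONLY the context below. If it does not "
    "contain the answer, reply exactly: I don't know.\n\n"
    "Context: {context}\nQuestion: {question}\nAnswer:"
)

def stub_model(prompt: str) -> str:
    """Stand-in model that follows the abstention instruction."""
    context = prompt.split("Context: ")[1].split("\n")[0]
    question = prompt.split("Question: ")[1].split("\n")[0]
    key = question.split()[-1].rstrip("?").lower()
    return context if key in context.lower() else "I don't know."

cases = [
    {"context": "Paris is the capital of France.",
     "question": "What is the capital of France?", "expect_known": True},
    {"context": "Paris is the capital of France.",
     "question": "What is the capital of Peru?", "expect_known": False},
]

correct = 0
for c in cases:
    out = stub_model(GROUNDED_TEMPLATE.format(**c))
    abstained = out == "I don't know."
    # Correct = answered when answerable, abstained when not.
    correct += abstained != c["expect_known"]
print(f"reliability: {correct}/{len(cases)}")
```

Counting a correct abstention as a success is the key move: it rewards the model for refusing to guess, which is exactly the behavior hallucination reduction targets.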

RAG vs Fine-Tuning

Built and compared RAG, fine-tuned, and hybrid systems using evaluation frameworks.

Insight: Hybrid systems gave the best accuracy-to-cost trade-off

LLM Evaluation Framework

Developed evaluation pipelines using RAGAS & DeepEval to benchmark model performance.

Focus: faithfulness, accuracy, edge cases
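
To show what a faithfulness metric measures, here is a deliberately simplified version: score an answer by the fraction of its content words supported by the retrieved context. Frameworks like RAGAS and DeepEval use LLM judges rather than token overlap, so treat this purely as a shape-of-the-metric sketch.

```python
# Simplified faithfulness score: how much of the answer is grounded
# in the context? (Real metrics use LLM-based claim verification.)
STOPWORDS = {"the", "a", "an", "is", "of", "in", "and", "to"}

def faithfulness(answer: str, context: str) -> float:
    """Fraction of answer content words that appear in the context."""
    ctx = set(context.lower().split())
    words = [w for w in answer.lower().split() if w not in STOPWORDS]
    if not words:
        return 0.0
    return sum(w in ctx for w in words) / len(words)

ctx = "redis is an in-memory data store often used for caching"
print(faithfulness("redis is used for caching", ctx))   # fully grounded
print(faithfulness("redis was invented on mars", ctx))  # mostly ungrounded
```

Even this crude version illustrates the evaluation loop: run it over a benchmark set, track the score per release, and flag regressions before they ship.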

Multimodal AI Decision System

Combined audio + speech + LLM reasoning for real-time decision making.

Result: Improved contextual understanding

Prompt Engineering System

Designed reusable prompt templates for consistent outputs across scenarios.

Outcome: Reduced variability in responses
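
A reusable-template system can be as simple as a named, versioned registry that validates variables before rendering, so every scenario uses the same vetted wording. This sketch is a generic illustration of the idea, not the project's actual implementation; the class and template names are made up.

```python
# Minimal prompt-template registry: named, versioned templates with
# variable validation at render time.
import string

class PromptRegistry:
    def __init__(self):
        self._templates = {}

    def register(self, name: str, version: int, template: str):
        self._templates[(name, version)] = template

    def render(self, name: str, version: int, **values) -> str:
        template = self._templates[(name, version)]
        # Collect the {placeholders} the template expects.
        required = {f for _, f, _, _ in string.Formatter().parse(template) if f}
        missing = required - values.keys()
        if missing:
            raise ValueError(f"missing variables: {missing}")
        return template.format(**values)

reg = PromptRegistry()
reg.register("summarize", 1, "Summarize in {n} bullets:\n{text}")
print(reg.render("summarize", 1, n=3, text="..."))
```

Versioning matters more than it looks: it lets you A/B a new prompt wording against the old one and roll back without touching application code.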

LLM Experiments & Benchmarking

Compared GPT, Gemini, and other models across real-world scenarios.

Goal: optimize cost vs accuracy trade-offs
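
A cost-vs-accuracy comparison boils down to running every model behind the same interface over the same dataset. This harness is a minimal sketch with stubbed callables and made-up per-call prices; real runs would wrap GPT, Gemini, and other clients behind the same signature.

```python
# Minimal benchmarking harness: each model is (callable, price_per_call).
def benchmark(models, dataset):
    """Return {name: {'accuracy': ..., 'cost': ...}} over the dataset."""
    results = {}
    for name, (fn, price_per_call) in models.items():
        correct = sum(fn(q) == a for q, a in dataset)
        results[name] = {
            "accuracy": correct / len(dataset),
            "cost": price_per_call * len(dataset),
        }
    return results

dataset = [("2+2", "4"), ("3+3", "6"), ("5+5", "10")]
models = {
    # Stub models: one always right but pricey, one cheap but naive.
    "big-model":   (lambda q: str(eval(q)), 0.03),
    "small-model": (lambda q: "4", 0.002),
}
for name, r in benchmark(models, dataset).items():
    print(name, r)
```

With results in this shape, the routing decision (send easy queries to the cheap model, hard ones to the expensive one) becomes a data question rather than a guess.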