About Me

Senior Data Scientist specializing in GenAI & Traditional ML at Citi

My Journey

Hi, I'm Haoyang Han, and this website is the culmination of my five-year journey as a Senior Data Scientist at Citi. My expertise ranges from traditional NLP, where I fine-tuned BERT-like models for specific tasks, to GenAI, where I now focus on prompt engineering, model selection, and evaluation.

This knowledge hub contains essential insights from my RAG implementation experience and key mathematical foundations that I believe every data scientist should master in the GenAI era.

My methodology is to build a comprehensive content skeleton for each topic and then use detailed prompts to generate the technical documentation from it. You can explore my approach and other projects on GitHub.

📄 Professional Background:

For a detailed overview of my experience, skills, and achievements, check out my complete resume.

What You'll Discover

Deep Technical Insights

Mathematical foundations, architecture deep-dives, and implementation details that bridge theory with practice in modern AI systems.

Production-Ready Code

Real-world implementations, best practices, and hands-on guides for building scalable RAG systems and AI applications.

Strategic Perspectives

Business applications, evaluation frameworks, and insights from deploying AI systems in enterprise environments.

Curated Resources

Essential papers, tools, and methodologies for continuous learning in the rapidly evolving AI landscape.

Knowledge Areas

RAG Implementation Hub

End-to-end journey from business objectives to production deployment. Real engineering decisions, performance optimizations, and lessons learned.

Explore RAG Knowledge →

Data Science Foundations

Mathematical foundations, statistical theory, and core ML concepts. From attention mechanisms to traditional ML algorithms.

Explore Foundations →

Technical Implementation

This knowledge hub is built with Next.js 14 and renders Markdown with LaTeX equations, syntax highlighting, and interactive elements. All content is version-controlled and continuously updated.