
Key Takeaway:

The session underscored that effective model customization is not about choosing between RAG and fine-tuning, but about leveraging both in a hybrid approach.


RAG grounds models in authoritative external data, while fine-tuning adapts their behavior to domain-specific needs. Together, they create systems that are both contextually accurate and stylistically aligned.

 

The demo also showed how Azure integrates responsible AI guardrails (content safety filters, groundedness validation, and evaluation harnesses) to ensure these hybrid solutions remain safe, reliable, and compliant at enterprise scale.


Customize AI Models with RAG & Fine-Tuning Techniques

This training gave me a blueprint for solution engineering on Azure: grounding models with RAG, adapting them with fine-tuning, and embedding responsible AI guardrails to ensure safety, transparency, and compliance.


1. Introduction: Concepts & Frameworks

  • RAG (Retrieval-Augmented Generation): Grounds models in external data (databases, documents, APIs) so they can answer with facts instead of hallucinations.

  • Fine-Tuning: Adapts a base model’s behavior (style, tone, or specialized tasks) using curated domain-specific datasets.

  • Hybrid Approach: The most powerful solutions combine RAG + fine-tuning — grounding answers in real data while shaping how those answers are delivered (see the sketch after this list).

  • Dataset Design:

    • Train Set → Teaches the model.

    • Validation Set → Guides training step-by-step to avoid overfitting/underfitting.

    • Test Set → Evaluates generalization on unseen data.

  • Semantic Chunking: Breaks documents into meaning-rich sections, reducing hallucinations by improving retrieval quality.

  • Distractors: Adding irrelevant chunks during training teaches the model to filter noise and focus on the relevant context.

  • Responsible AI in Azure: Azure enforces safety by automatically checking fine-tuning jobs for harmful or biased data, rejecting unsafe runs before deployment.
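
To make the hybrid approach above concrete, here is a minimal sketch (not the session's code): retrieve grounding chunks, then pass them as context to a fine-tuned chat deployment. The endpoint variables, the "contoso-ft-gpt4o" deployment name, and the toy keyword retriever are illustrative placeholders; a real system would use Azure AI Search or another vector store for retrieval.

```python
# Hybrid sketch: retrieval for grounding + a fine-tuned deployment for delivery.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Toy keyword retriever standing in for a vector/search index."""
    scored = sorted(corpus, key=lambda c: -sum(w in c.lower() for w in query.lower().split()))
    return scored[:k]

def answer(query: str, corpus: list[str]) -> str:
    context = "\n---\n".join(retrieve(query, corpus))
    response = client.chat.completions.create(
        model="contoso-ft-gpt4o",  # hypothetical fine-tuned deployment name
        messages=[
            {"role": "system", "content": "Answer only from the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
    )
    return response.choices[0].message.content
```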


2. Demo: Azure Workflow, Guardrails & Evaluation Metrics

Step 1: Dataset Prep & Upload

Structure datasets into train / validation / test JSONL files.
Upload into Azure AI Foundry for fine-tuning.
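
A minimal sketch of what the dataset prep might look like, assuming the chat-style JSONL format Azure OpenAI fine-tuning expects; the example content, file names, and split sizes are placeholders.

```python
# Write train / validation / test JSONL files in chat format.
import json

examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a concise support assistant."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Open Settings > Security and choose Reset Password."},
        ]
    },
    # ... more curated, domain-specific examples
]

def write_jsonl(path: str, rows: list[dict]) -> None:
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")

# A typical split: most data for training, a slice each for validation and test.
write_jsonl("train.jsonl", examples)
write_jsonl("validation.jsonl", examples[:1])
write_jsonl("test.jsonl", examples[:1])
```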


Step 2: Fine-Tuning Jobs

Submit fine-tuning jobs via the Azure SDK or Python scripts.
Azure enforces pre-check guardrails → training data is scanned for harmful or disallowed content, and unsafe jobs are automatically rejected.
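
A hedged sketch of job submission using the openai Python SDK against Azure OpenAI; the api_version, base model name, and file paths are assumptions to adjust for your region and quota.

```python
# Upload the Step 1 files and submit a fine-tuning job.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

# Upload the JSONL files prepared in Step 1.
train_file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
val_file = client.files.create(file=open("validation.jsonl", "rb"), purpose="fine-tune")

# Submit the job; Azure runs its pre-check guardrails before training starts.
job = client.fine_tuning.jobs.create(
    model="gpt-4o-mini-2024-07-18",  # assumed base model; check regional availability
    training_file=train_file.id,
    validation_file=val_file.id,
)

# Check the status later (it may show as rejected if the pre-checks fail).
status = client.fine_tuning.jobs.retrieve(job.id).status
print(job.id, status)
```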

 

Step 3: Guardrails & Responsible AI

  • Guardrails demonstrated included:

    • Content Safety Filters → detect harmful, biased, or protected material (see the sketch after this list).

    • Groundedness Checks → ensure responses stay tied to retrieved data.

    • Custom Harm Categories → organizations can define what to block.

    • Evaluation Harnesses → small controlled datasets run repeatedly to test safety & reliability.

These guardrails are built into the Azure workflow to enforce compliance and responsible AI practices.
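
As one example of a programmable guardrail, here is a minimal sketch using the azure-ai-contentsafety package to screen text before it reaches training data or a deployed endpoint; the endpoint/key variables and severity threshold are placeholders, and result field names can differ slightly between SDK versions.

```python
# Screen a piece of text with Azure AI Content Safety.
import os
from azure.core.credentials import AzureKeyCredential
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions

client = ContentSafetyClient(
    endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["CONTENT_SAFETY_KEY"]),
)

def is_safe(text: str, max_severity: int = 2) -> bool:
    """Reject text whose harm severity exceeds the threshold in any category."""
    result = client.analyze_text(AnalyzeTextOptions(text=text))
    return all(item.severity <= max_severity for item in result.categories_analysis)

print(is_safe("Our product launches next quarter."))
```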


Step 4: Automate & Deploy

Deployment is automated with the Python SDK + REST API.
Models deploy to secure managed endpoints with RBAC-based governance.
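
A rough sketch of what the automated deployment call can look like via the Azure management REST API; the api-version, payload shape, and every ID and name below are assumptions to adapt to your subscription.

```python
# Create (or update) a deployment for a fine-tuned model on an Azure OpenAI account.
import os
import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

sub, rg, account = os.environ["SUB_ID"], os.environ["RESOURCE_GROUP"], os.environ["AOAI_ACCOUNT"]
url = (
    f"https://management.azure.com/subscriptions/{sub}/resourceGroups/{rg}"
    f"/providers/Microsoft.CognitiveServices/accounts/{account}"
    f"/deployments/contoso-ft-gpt4o?api-version=2023-05-01"
)

body = {
    "sku": {"name": "Standard", "capacity": 1},
    "properties": {
        "model": {
            "format": "OpenAI",
            # Placeholder: the fine-tuned model name returned by the completed job.
            "name": "gpt-4o-mini-2024-07-18.ft-abc123",
            "version": "1",
        }
    },
}

resp = requests.put(url, json=body, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
print(resp.json().get("properties", {}).get("provisioningState"))
```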


Step 5: Evaluate Results

  • Evaluation metrics measure quality, while guardrails ensure safety and compliance:

    • Coherence: logical flow of the answer.

    • Fluency: natural grammar/readability.

    • Groundedness: tied to context, not hallucinated.

    • Precision & Recall: accuracy vs. completeness.

    • Similarity to Ground Truth: alignment with gold-standard answers.
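
To ground the precision vs. recall distinction, here is a toy example with made-up document IDs, comparing what a retrieval step returned against the ground-truth relevant set.

```python
# Toy precision/recall calculation for a retrieval step.
relevant = {"doc1", "doc3", "doc7"}        # ground-truth relevant chunks
retrieved = {"doc1", "doc2", "doc3"}       # what the system actually returned

true_positives = relevant & retrieved
precision = len(true_positives) / len(retrieved)   # accuracy of what was returned
recall = len(true_positives) / len(relevant)       # completeness vs. ground truth

print(f"precision={precision:.2f}, recall={recall:.2f}")  # 0.67 each in this example
```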


3. RAFT & Vector Search

  • Vector Similarity Search: Uses embeddings to fetch semantically related information (e.g., “bat” → distinguish vision from mythology references); see the sketch after this list.

  • RAFT (Retrieval-Augmented Fine-Tuning): A technique that combines retrieval and fine-tuning so models learn to use external data correctly while adapting to your domain.

  • GitHub Resource: aka.ms/raft-recipe (includes code, examples, and evaluation scripts).
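
A minimal sketch of vector similarity search with Azure OpenAI embeddings and cosine similarity, echoing the “bat” example above; the "text-embedding-3-small" deployment name and the two-sentence corpus are illustrative placeholders.

```python
# Rank corpus chunks by embedding similarity to a query.
import os
import math
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

def embed(texts: list[str]) -> list[list[float]]:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [d.embedding for d in resp.data]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

corpus = [
    "Bats navigate in the dark using echolocation rather than eyesight.",
    "In folklore, bats are often linked to vampires and the night.",
]
corpus_vecs = embed(corpus)
query_vec = embed(["How well can a bat see?"])[0]

# The semantically closest chunk should be the one about vision/echolocation.
ranked = sorted(zip(corpus, corpus_vecs), key=lambda p: -cosine(query_vec, p[1]))
print(ranked[0][0])
```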
