Hands On Tutorials to Implement Concepts

The best engineers have a strong appreciation for the pros and cons of a solution.
With each exercise below we train the reflection muscle to further connect the dots.
Data Flow (completed)
-
DF001: ETL vs. ELTL: Compared single-step ETL vs. multi-stage ELTL for tradeoffs in simplicity, scalability, and raw data retention.
-
DF002: Scheduling & Pipelines: Built orchestrated Dataflows w/advanced scheduling & dependency control, optimizing refresh strategies.
-
DF003: Data Type Conversion: Standardized inconsistent CSV datatypes & generated optimized Warehouse tables, assess schema tradeoffs.
-
DF004: Data Cleansing: Transformed messy audiobook dataset into an analyst-ready model, balancing complexity with performance.
Data Pipelines (in progress)
-
DP001: Simple End-to-End Pipeline: Orchestrated HTTP → Warehouse → Semantic Model with success-gated refresh.
-
DP002: IoT Project: Built Bronze/Silver tables; applied Dataflow & Notebook transforms; added lineage + Fahrenheit conversion.
-
DP003: Metadata-Driven Pipelines: Scaled to 4 datasets using For Each loops and dynamic content.
-
DP004: Data Structures (Arrays): Learned to parse JSON arrays for dynamic pipeline control, including iteration over multiple sources.
-
DP005: Parameter Play: Implemented end-to-end parameterization across parent/child pipelines, notebooks, and stored procedures.
-
DP006: Data Quality Testing (Task 1): Enforced column-contract validation with auto-quarantine on schema drift.
-
DP007: Data Structures (Objects): Focused on JSON objects-accessing key/value pairs and passing them into parameters/activities.
-
DP008: Data Structures (Advanced): Tackled nested arrays & objects; combined parsing patterns to handle real-world JSON payloads.
-
DP009: Data Quality Testing (Task 2): Centralized DQ rules in Warehouse metadata and processed multiple files dynamically.
Data Warehouse & T-SQL (in progress)
-
DW001: Create a Data Warehouse & Tables -Provision a Fabric DW, created schemas, and define base tables with T-SQL.
-
DW002: Insert & Query Data - Practice inserting rows and building queries with filters, sorting, and aggregation.
-
DW003: Joins & Relationships -Apply INNER, LEFT, RIGHT joins to connect datasets; reinforced relational modeling.
-
DW004: Functions & Expressions - Use string, date, and numeric functions to clean and transform data.
-
DW005: Views & Stored Procedures - Build reusable views and parameterized stored procs for consistent querying.
-
DW006: Indexing & Performance - Implemented clustered/non-clustered indexes; optimized queries for efficiency.
-
DW007: Security & Permissions - Configured roles, grants, and revokes to secure Warehouse access.
-
DW008: Advanced T-SQL -Used subqueries, CTEs, and window functions for analytical queries.
-
DW009: Transactions & Error Handling -Practiced BEGIN TRAN, commit/rollback, and TRY…CATCH for resilient code.
-
DW010: Data Import/Export - Load CSVs into DW and export results back for reporting.
-
DW011: Constraints & Keys - Enforce primary, foreign, and check constraints to ensure data integrity.
-
DW012: Normalization vs. Denormalization -Model normalized schemas for integrity and denormalized for performance.
-
DW013: Star & Snowflake Schemas - Design fact/dimension models for analytics workloads.
-
DW014: Slowly Changing Dimensions (Intro) - Compare SCD1 vs. SCD2; planned for tracking customer changes.
-
DW015: Watermarks Crash Course - Configure incremental load checkpoints to handle late-arriving data.
-
DW016: Partitioning - Split large tables for query efficiency; practice sliding-window strategies.
-
DW017: Surrogate Keys - Replace business keys with surrogate keys; reinforced SCD design.
-
DW018: CONFX Intro & Setup - Stage 12 raw CSVs in Bronze Lakehouse; defined Green (conformed) target.
-
DW019: CONFX Sessions Table - Consolidate Vegas/Stockholm session data; build reusable Sessions dimension.
-
DW020: CONFX Speakers - Create Green Speakers table; handle duplicates and multi-session attribution.
-
DW021: CONFX Venues - Merge venue data into conformed Green model; standardized location attributes.
-
DW022: CONFX Attendees & Attendance - Model attendees + session links; apply surrogate keys and integrity checks.
-
DW023: Fact Tables - Build fact table to tie sessions, attendees, and attendance measures.
-
DW024: Date Dimension - Generate reusable calendar/date dimension with hierarchies.
-
DW025: Time Dimension - Build time of day dimension for hourly/minute-level reporting.
-
DW026: CONFX SCD Type 1 (Detect Changes) - Compare Bronze vs. Green customers to identify inserts, updates, deletes.
-
DW027: CONFX SCD Type 1 (Apply Changes) - Write idempotent procs for merge logic with bulk insert/update/delete.
-
DW028: CONFX SCD Type 2 — Implement SCD2 table with valid_from/valid_to/is_current and validated history tracking with staged change sets.

💡 Takeaway: x