A star schema analytics datamart processing 1.18M streaming transactions with ETL pipelines, partitioned fact tables, and business intelligence reporting for SportsTV Germany.
What makes this project stand out
End-to-end data flow from sources to analytics
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β DATA SOURCES (EXTRACT) β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€ β ββββββββββββββββββββββββ ββββββββββββββββββββββββ β β β SQLite Database β β CSV Export β β β β subscribersDB β β new-streaming-txns β β β β 1,083,131 records β β 98,732 records β β β ββββββββββββ¬ββββββββββββ ββββββββββββ¬ββββββββββββ β βββββββββββββββΌβββββββββββββββββββββββββββββββββΌβββββββββββββββββββββββββββ β β βΌ βΌ βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β ETL PIPELINE (TRANSFORM) β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€ β β’ Batch processing (50K records/batch) β β β’ Sport inference for orphaned assets (161,588 recovered) β β β’ User β Country mapping via postal codes β β β’ Vectorized operations + hashmap lookups β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β βΌ βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β STAR SCHEMA DATAMART (LOAD) β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€ β β β βββββββββββββββ βββββββββββββββββββββββββββ βββββββββββββββ β β β dim_date βββββΆβ fact_streaming_summary ββββββ dim_country β β β β 1,752 days β β (PARTITIONED) β β 4 countries β β β βββββββββββββββ βββββββββββββββββββββββββββ βββββββββββββββ β β β β MySQL 8.0 on Aiven Cloud β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Three-stage ETL process with data recovery
Pull streaming transactions from SQLite operational database and CSV exports from web/app clients.
Map users to countries, infer sports from asset prefixes, aggregate daily metrics, handle orphaned records.
Bulk insert into partitioned fact tables with ON DUPLICATE KEY UPDATE for idempotent loads.
Streaming transaction volume by year (2021-2025)
Key insights from the data warehouse
| Sport | Streams | Share |
|---|---|---|
| Ice Hockey | 687,234 | 59.9% |
| Ski Jumping | 298,451 | 26.0% |
| Inline Hockey | 161,994 | 14.1% |
| Country | Streams | Share |
|---|---|---|
| Deutschland | 687,234 | 59.9% |
| Osterreich | 245,891 | 21.4% |
| Schweiz | 156,432 | 13.6% |
| Liechtenstein | 58,122 | 5.1% |
Dimensional model optimized for analytics queries
Data flow from operational sources to analytics datamart
Architectural choices and their rationale