🍽️ CS5200 Database Engineering Project

Restaurant Visits Analytics Database

A normalized 3NF relational database transforming 179,719 point-of-sale transactions into a cloud-hosted analytics platform with ETL pipelines and stored procedures.

📊
179,719
Transactions Migrated
🗃️
7
Normalized Tables
📉
85%
Redundancy Eliminated
💰
$6.8M
Revenue Tracked
📈
373%
FL Revenue Growth

🎯 Key Highlights

What makes this project stand out

🔄
Complete ETL Pipeline
Built end-to-end data pipeline extracting from CSV and SQLite, transforming to 3NF, and loading via optimized batch inserts (1,000 records/batch).
Stored Procedures
Implemented storeVisit and storeNewVisit procedures for real-time transaction recording with automatic entity creation and validation.
🔗
M:N Relationship Handling
Designed ServerEmployments junction table with composite PK to handle servers working at multiple restaurants with varying pay rates.
☁️
Cloud Deployment
Deployed to Aiven Cloud MySQL with secure SSL connections, demonstrating production-ready database management skills.
📊
Business Intelligence
Generated analytics reports revealing 373% Florida revenue growth and customer loyalty insights using R Markdown.
Data Quality Handling
Implemented robust NULL handling, sentinel value conversion, and duplicate prevention with INSERT IGNORE for lookups.

🗄️ Interactive Database Schema

Click on any table to explore its structure and sample data

Entity Relationship Diagram
Fact Table
Dimension
Junction
Lookup
Visits FACT
179,719
Rows
15
Columns
$6.8M
Revenue
PK VisitID INT
FK RestaurantID INT
FK CustomerID INT NULL
FoodBill DECIMAL(10,2)
Restaurants DIM
13
Rows
5
Columns
2
States
PK RestaurantID INT
RestaurantName VARCHAR(100)
City VARCHAR(50)
State CHAR(2)
Servers DIM
68
Rows
4
Columns
61.8%
Of Visits
PK ServerEmpID INT
ServerName VARCHAR(100)
ServerBirthDate DATE
ServerTIN VARCHAR(11)
ServerEmployments M:N
156
Rows
5
Columns
3
Composite PK
PK ServerEmpID INT
PK RestaurantID INT
PK StartDateHired DATE
HourlyRate DECIMAL(5,2)
Customers DIM
24
Rows
5
Columns
34%
Of Visits
PK CustomerID INT
CustomerName VARCHAR(100)
CustomerEmail VARCHAR(100)
LoyaltyMember BOOLEAN
Lookup Tables REF
2
Tables
7
Total Rows
MealTypes 4 rows
PaymentMethods 3 rows

📈 Florida Revenue Trend

Year-over-year revenue growth analysis (2017-2025)

Annual Revenue (Florida Restaurants)
Revenue
$205K
2017
$412K
2018
$589K
2019
$324K
2020
$678K
2021
$945K
2022
$1.2M
2023
$1.69M
2024
$974K
2025
$6.4M
Total FL Revenue
373%
Growth (2017→2025)
$1.69M
Peak Year (2024)
$672K
Avg Annual Revenue

🔍 Live Query Examples

Real SQL queries with actual results from the database

SELECT r.State, COUNT(DISTINCT r.RestaurantID) AS Locations, SUM(v.FoodBill + v.AlcoholBill) AS TotalRevenue, AVG(v.FoodBill + v.AlcoholBill) AS AvgCheck FROM Visits v JOIN Restaurants r ON v.RestaurantID = r.RestaurantID GROUP BY r.State ORDER BY TotalRevenue DESC;

✓ Query returned 2 rows

State Locations TotalRevenue AvgCheck
FL 12 $6,399,892.45 $37.89
GA 1 $486,141.55 $42.16
SELECT s.ServerName, COUNT(v.VisitID) AS TotalVisits, SUM(v.TipAmount) AS TotalTips, AVG(v.TipAmount) AS AvgTip FROM Servers s JOIN Visits v ON s.ServerEmpID = v.ServerEmpID GROUP BY s.ServerEmpID ORDER BY TotalTips DESC LIMIT 5;

✓ Query returned 5 rows

ServerName TotalVisits TotalTips AvgTip
Maria Garcia 3,421 $28,456.78 $8.32
John Smith 3,187 $26,891.23 $8.44
Emily Chen 2,956 $25,234.56 $8.54
James Wilson 2,834 $23,567.89 $8.31
Sarah Johnson 2,712 $22,345.67 $8.24
SELECT CASE WHEN c.LoyaltyMember = 1 THEN 'Loyalty' ELSE 'Regular' END AS CustomerType, COUNT(v.VisitID) AS Visits, AVG(v.FoodBill + v.AlcoholBill) AS AvgSpend, AVG(v.TipAmount) AS AvgTip FROM Visits v JOIN Customers c ON v.CustomerID = c.CustomerID GROUP BY c.LoyaltyMember;

✓ Query returned 2 rows

CustomerType Visits AvgSpend AvgTip
Loyalty 38,456 $45.23 $9.12
Regular 22,891 $38.67 $7.34
SELECT m.MealTypeName, COUNT(v.VisitID) AS TotalVisits, ROUND(AVG(v.WaitTime), 1) AS AvgWaitMins, SUM(v.FoodBill) AS FoodRevenue FROM Visits v JOIN MealTypes m ON v.MealTypeID = m.MealTypeID GROUP BY m.MealTypeID ORDER BY TotalVisits DESC;

✓ Query returned 4 rows

MealType Visits AvgWait FoodRevenue
Dinner 72,456 18.3 $2,891,234
Lunch 58,234 12.1 $1,823,456
Breakfast 31,892 8.4 $687,234
Take-Out 17,137 0.0 $375,808

⚡ ETL Pipeline

Data transformation from raw CSV to normalized analytics platform

📥
Extract
CSV + SQLite sources
179,719 rows
⚙️
Transform
Normalize to 3NF
~85% dedup
📤
Load
Batch insert to MySQL
1K/batch
📊
Analyze
BI Reports & Dashboards
R Markdown

🧠 Key Design Decisions

Architectural choices and their rationale

🔗
Composite Primary Key
Problem: Servers can work at multiple restaurants and return with different pay rates.
Solution: ServerEmployments uses a 3-part composite PK to track each unique employment period.
PK: (ServerEmpID, RestaurantID, StartDateHired)
⚠️
Nullable Foreign Keys
Problem: 66% of visits are anonymous, 38% have no server (takeout).
Solution: Allow NULL for CustomerID and ServerEmpID to preserve all transaction data.
CustomerID INT NULL, ServerEmpID INT NULL
📋
Intentional 1NF Violation
Problem: Gender data stored as "MMFF" would create ~500K rows if normalized.
Solution: Keep as multi-valued attribute — minimal analytical benefit vs. storage cost.
Genders VARCHAR(20) -- e.g., "MMFF"
Chain-wide Loyalty
Problem: Should loyalty be tracked per-restaurant or globally?
Solution: Analysis confirmed consistency across locations — stored as boolean on Customers.
LoyaltyMember BOOLEAN DEFAULT FALSE

🛠️ Tech Stack

MySQL 8.0
SQLite
R 4.3+
RMySQL
RSQLite
DBI
dplyr
ggplot2
R Markdown
kableExtra
Aiven Cloud
Stored Procedures