Preview real modeling problems before you commit to signup.
This first content slice shows the authored problem contract in a real UI. You can read the scenario, requirements, and hints before signing in. Progress saving and answer submission still require authentication.
Concept tracks
Find every related modeling problem across packs.
Filter by resolved concept so SCD, grain, key, bridge, and cardinality problems collect in one place instead of staying buried inside separate problem sets.
Dimensional Modeling Foundations
What this set does
Start with a constrained modeling decision, move into a guided sandbox build, then finish with an open scenario where trade-offs matter more than rote correctness.
- 1
Choose the right SCD pattern for price history
Slowly Changing Dimension Type 2 · Quick Decision · 5 min
- 2
Model a product dimension that preserves price history
Slowly Changing Dimension Type 2 · Concept Challenge · 20 min
- 3
Model campaign enrollments with a bridge table
Many-to-Many Modeling · Concept Challenge · 20 min
- 4
Model order-line grain with a degenerate order number
Degenerate Dimensions · Concept Challenge · 20 min
- 5
Model a daily account balance snapshot fact
Fact Table Grain · Concept Challenge · 20 min
- 6
Conform customer identity across orders and support tickets
Conformed Dimensions · Concept Challenge · 25 min
- 7
Design a subscriber health model for a streaming product
Fact vs Dimension Roles · Open Scenario · 30 min
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
This is one of the first places dimensional modeling either preserves truth or quietly destroys it. If you choose the wrong change-tracking pattern, every downstream revenue trend becomes harder to trust.
Success signal
Choose the option that preserves historical attribute versions instead of overwriting them.
- Distinguish Type 1 from Type 2 change tracking.
- Recognize when historical attribute preservation matters.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
This is the first modeling step where students have to move from pattern recognition into an actual schema design. The challenge is not just naming a Type 2 dimension, but making the fact-to-dimension relationship support correct historical analysis.
Success signal
The model includes a dedicated product dimension rather than keeping product history on the fact.
- Model Type 2 tracking with temporal columns that define a full validity window.
- Connect fact foreign keys to a surrogate primary key in a reusable dimension.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Many-to-many relationships are a classic interview trap. A direct edge looks simple, but it hides the row grain and leaves nowhere clean to store relationship-specific facts like enrollment date or status.
Success signal
The model avoids a direct subscriber-to-campaign many-to-many relationship.
- Recognize when an associative structure is required for many-to-many analysis.
- Model a bridge entity that preserves reusable dimensions and relationship-level attributes.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Degenerate dimensions are a small pattern with big interview signal. They test whether a modeler understands fact grain well enough to keep transaction identifiers queryable without adding unnecessary joins.
Success signal
The model has a fact table representing order-line grain rather than order-header grain.
- State the row grain of an order-line fact before adding measures.
- Use a degenerate-dimension column for transaction identifiers that do not deserve their own dimension.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Periodic snapshot facts are a classic grain decision. If learners only store current account balance on a dimension, historical reporting becomes impossible; if they model each transaction instead, every balance query has to recompute state from events.
Success signal
The model has a fact table representing daily balance snapshots rather than only current account state.
- Distinguish a periodic snapshot fact from an event fact.
- Model the snapshot date as part of the fact grain.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Conformed dimensions are where dimensional modeling stops being a set of isolated stars and becomes an analytical system. If each process owns its own customer table, cross-process metrics look easy to build but produce inconsistent segmentation and duplicated identity logic.
Success signal
The model has one customer dimension reused by both business processes.
- Recognize when multiple fact tables should share one conformed dimension.
- Model separate business processes without duplicating dimensional identity.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Open scenarios are where modeling judgment matters more than rote correctness. This prompt forces the student to choose grains, separate attributes from events, and accept that several reasonable schemas may exist with different trade-offs.
Success signal
The design has a defensible fact-table grain for lifecycle and activity reporting.
- Translate business requirements into a multi-entity analytics model.
- Balance dimensional clarity with event-level flexibility.
Grain and Fact Types
What this set does
Start with a constrained modeling decision, move into a guided sandbox build, then finish with an open scenario where trade-offs matter more than rote correctness.
- 1
Choose the fact type for order-line revenue
Fact Table Grain · Quick Decision · 5 min
- 2
Choose the fact type for daily inventory on hand
Fact Table Grain · Quick Decision · 5 min
- 3
Choose the fact type for claim lifecycle milestones
Fact Table Grain · Quick Decision · 5 min
- 4
Model claim lifecycle as an accumulating snapshot
Fact Table Grain · Concept Challenge · 25 min
- 5
Avoid movement-only inventory reporting
Fact Table Grain · Concept Challenge · 30 min
- 6
Model order fulfillment as an accumulating snapshot
Fact Table Grain · Concept Challenge · 25 min
- 7
Model a recruiting pipeline lifecycle
Fact Table Grain · Concept Challenge · 25 min
- 8
Avoid event-only package workflow modeling
Fact Table Grain · Concept Challenge · 25 min
- 9
Add duration measures to a support escalation snapshot
Fact Table Grain · Concept Challenge · 20 min
- 10
Handle out-of-order underwriting milestones
Fact Table Grain · Concept Challenge · 25 min
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
If the model collapses revenue to order-header grain, product mix and charge-level analysis become guesses.
Success signal
The answer chooses a transaction fact rather than a snapshot or workflow row.
- Identify transaction facts when the business process is an event.
- Separate order headers from order-line measurement grain.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Inventory on hand is state. Movement events explain changes, but they do not directly store the end-of-day balance analysts need to trend.
Success signal
The answer chooses a periodic snapshot fact.
- Identify periodic snapshot facts for regular observations of state.
- Explain why transaction events alone do not provide every daily balance.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Workflow analysis fails when every milestone is isolated and analysts cannot see the end-to-end path for one claim.
Success signal
The answer chooses an accumulating snapshot fact.
- Identify accumulating snapshots for workflows with milestone dates.
- Explain how one row can accumulate dates as the process progresses.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Accumulating snapshots are the fact type for pipeline timing. A collection of disconnected milestone facts can answer event counts, but it makes the end-to-end workflow harder to inspect.
Success signal
The model uses one accumulating snapshot fact instead of disconnected milestone facts.
- State an accumulating snapshot grain before adding milestone columns.
- Model workflow milestone dates on one fact row.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Movement facts explain why inventory changed. They do not by themselves store the daily state that operations teams trend, reconcile, and alert on.
Success signal
The model includes a periodic snapshot fact rather than only movement transaction events.
- Choose periodic snapshot grain when the business asks for regular state.
- Protect product plus warehouse plus snapshot-date grain with a key.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Fulfillment is a pipeline. If each milestone is modeled as a disconnected event fact, operations can count events but cannot easily inspect one order line moving through the full workflow.
Success signal
The model uses one accumulating snapshot fact instead of separate milestone facts as the main answer.
- Use an accumulating snapshot for a known order-line workflow.
- Keep milestone timestamps together on one workflow row.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Hiring funnels are workflow processes. A purely event-shaped model can count stage changes, but the operations question is often how long a specific application takes to move through the pipeline.
Success signal
The model has one accumulating snapshot fact at application grain.
- Identify candidate application as the workflow grain.
- Model hiring milestones on one accumulating snapshot row.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Transaction events are useful, but they are not always the best primary answer. Pipeline reporting often needs milestone dates side by side on one workflow row.
Success signal
The main model includes an accumulating snapshot fact rather than only package status events.
- Recognize when a transaction feed is not the serving fact grain.
- Model package lifecycle milestones on an accumulating snapshot.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Milestone timestamps are necessary but not always sufficient for operational reporting. Duration measures make pipeline bottlenecks visible without every dashboard re-deriving the same lag logic.
Success signal
The model keeps one workflow row while adding duration measures.
- Add duration measures without changing the workflow grain.
- Keep ticket escalation milestones together on one accumulating snapshot.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Accumulating snapshots update as milestones arrive. Without event-time milestones and load audit context, late or out-of-order updates quietly rewrite the workflow with no way to explain what changed.
Success signal
The model preserves one workflow row per loan application.
- Separate milestone event time from load audit time.
- Preserve one loan application workflow row despite out-of-order updates.
Grain and Keys
What this set does
Start with a constrained modeling decision, move into a guided sandbox build, then finish with an open scenario where trade-offs matter more than rote correctness.
- 1
Model clickstream events at atomic event grain
Fact Table Grain · Concept Challenge · 20 min
- 2
Choose trip-segment grain for ride-hailing revenue
Fact Table Grain · Concept Challenge · 20 min
- 3
Model subscription billing at invoice-line grain
Degenerate Dimensions · Concept Challenge · 20 min
- 4
Choose product keys while merging two source systems
Surrogate Keys · Concept Challenge · 20 min
- 5
Protect customer identity when source ids get recycled
Surrogate Keys · Concept Challenge · 20 min
- 6
Model support-ticket escalations with an assignment bridge
Many-to-Many Modeling · Concept Challenge · 20 min
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Clickstream models are tempting to aggregate too early. If the fact grain is one session, analysts cannot reconstruct step-level behavior, isolate broken pages, or join experiments to the exact event where a user saw a variant.
Success signal
The model has a dedicated fact table for page events rather than session summaries.
- State an atomic event grain before adding clickstream measures.
- Use a degenerate event identifier to preserve source traceability without inventing an event dimension.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Trip-grain facts look simple until one trip has multiple economically meaningful legs. Choosing the wrong grain forces analysts to average segment-level measures or duplicate dimensions in ways that make driver payouts and marketplace margin unreliable.
Success signal
The model can represent multiple segments for one trip without overwriting fare or driver context.
- Compare trip grain with trip-segment grain against analytical requirements.
- Use a composite grain key when a parent trip contains multiple fact rows.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Billing facts often fail because the modeler chooses invoice-header grain and hides the charge-level details. Keeping invoice number as a degenerate dimension lets analysts reconcile to finance systems without inventing a dimension that has no descriptive attributes.
Success signal
The model has an invoice-line fact rather than only one row per invoice header.
- Use invoice-line grain for mixed recurring, usage, tax, and credit rows.
- Recognize invoice number as a degenerate dimension.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Natural keys feel meaningful until two systems disagree about their meaning. A merged dimension needs a stable warehouse identity while still preserving source identifiers for reconciliation and lineage.
Success signal
The model has a merged product dimension instead of separate analytical product tables per source.
- Separate warehouse surrogate identity from source-system natural keys.
- Preserve source identifiers as attributes without using them as the primary analytical key.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Recycled ids are a quiet analytics failure mode. If source_customer_id is the warehouse primary key, history for two real people can collapse into one row and contaminate retention, support, and revenue reporting.
Success signal
The customer dimension does not use recycled source_customer_id as its sole primary key.
- Use surrogate keys when source identifiers are not durable.
- Preserve source ids as lineage attributes without letting them define analytical identity.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
A direct ticket-to-agent relationship answers only current ownership. Escalation history is a relationship with its own grain, timing, and attributes, so it needs a bridge or factless fact rather than a single foreign key on the ticket.
Success signal
The model can represent multiple agents for one ticket over time.
- Recognize when one-to-many ownership is not enough because relationship history has its own grain.
- Model an assignment bridge between support tickets and agents.
Keys and Late-Arriving Data
What this set does
Start with a constrained modeling decision, move into a guided sandbox build, then finish with an open scenario where trade-offs matter more than rote correctness.
- 1
Reject a smart key that encodes meaning
Surrogate Keys · Quick Decision · 5 min
- 2
Survive a recycled source identifier
Surrogate Keys · Quick Decision · 5 min
- 3
Handle a late-arriving dimension with an inferred member
Surrogate Keys · Concept Challenge · 25 min
- 4
Join a late-arriving fact to the right Type 2 version
Slowly Changing Dimension Type 2 · Concept Challenge · 30 min
- 5
Collapse a composite natural key into one surrogate
Surrogate Keys · Concept Challenge · 20 min
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Smart keys couple identity to mutable business meaning. When the encoded facts change, the key either lies or has to change — both break referential integrity. A meaningless surrogate stays stable.
Success signal
The answer chooses a meaningless surrogate key.
- Identify smart-key anti-patterns.
- Separate stable identity from descriptive, changeable attributes.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
A recycled natural id silently merges two real-world entities if used as the key. A surrogate key plus a validity window keeps the histories apart.
Success signal
The answer uses a surrogate key, not the recycled device_id.
- Recognize recycled natural identifiers.
- Separate identity from a reused operational id.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Late-arriving dimensions are a load-ordering reality. An inferred member — a surrogate row created on the fly and flagged for backfill — keeps the fact loadable and referentially sound without dropping or stalling rows.
Success signal
An order can load before the customer master, referencing an inferred surrogate row.
- Create an inferred dimension member so a fact can load before its dimension.
- Mark inferred rows for later backfill.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Late-arriving facts against a Type 2 dimension fail quietly if they join to the current version. The event time must select the version whose validity window contains it.
Success signal
Each event attributes to the customer version valid when it happened.
- Use an event time to select the correct Type 2 version.
- Reference the historically-correct surrogate key, not the current row.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Composite natural keys spread across every fact and make joins verbose and error-prone. One surrogate key on the dimension keeps facts narrow while the natural columns stay available as attributes.
Success signal
Facts join on one store surrogate key, not a two-column pair.
- Collapse a composite natural key into a single surrogate.
- Keep the natural key parts as attributes for lookup.
SCD Depth
What this set does
Start with a constrained modeling decision, move into a guided sandbox build, then finish with an open scenario where trade-offs matter more than rote correctness.
- 1
Do not Type 2 a current-only directory
Slowly Changing Dimension Type 2 · Quick Decision · 5 min
- 2
Model full product price and category history
Slowly Changing Dimension Type 2 · Concept Challenge · 20 min
- 3
Avoid Type 2 row explosion for loyalty points
Slowly Changing Dimension Type 2 · Concept Challenge · 20 min
- 4
Split demographic buckets into a mini-dimension
Slowly Changing Dimension Type 2 · Concept Challenge · 20 min
- 5
Use Type 3 for one previous territory
Slowly Changing Dimension Type 2 · Quick Decision · 5 min
- 6
Choose SCD strategy per account attribute
Slowly Changing Dimension Type 2 · Concept Challenge · 25 min
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
SCD work is not maximally historical by default. Over-versioning a current-only directory adds operational cost without improving the business workflow.
Success signal
The answer chooses Type 1 overwrite for the current-state directory.
- Choose Type 1 overwrite when history is not analytically required.
- Explain why Type 2 would be waste for a current-only use case.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
A Type 2 dimension is only useful when it preserves both the version identity and the validity window. Without a surrogate row key, effective dates, and current-row signaling, historical joins become fragile.
Success signal
The product dimension has a surrogate primary key for version rows.
- Model Type 2 row versioning with surrogate keys and validity windows.
- Connect facts to versioned dimension rows through explicit foreign keys.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Type 2 is not the right answer for every changing attribute. Versioning the whole customer dimension for every point change creates row explosion and makes stable customer context noisy.
Success signal
The design avoids using Type 2 customer row versions for every loyalty point change.
- Recognize when one volatile attribute should not drive Type 2 row churn.
- Model high-frequency changes as a fact at the event grain.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Mini-dimensions keep volatile profile buckets analyzable without bloating the base customer dimension. The model still needs a time-bound assignment grain so historical campaign analysis is reproducible.
Success signal
The base customer dimension stays stable.
- Separate stable customer identity from high-churn profile buckets.
- Use a snapshot fact to time-bound mini-dimension assignments.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Type 3 is narrow but useful: it supports one prior value without the row-versioning cost of Type 2.
Success signal
The answer chooses a previous_territory attribute on the current row.
- Choose Type 3 when only one prior value is required.
- Explain the tradeoff between Type 3 simplicity and Type 2 history depth.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Real dimensions rarely use one SCD treatment for every attribute. The modeling skill is choosing the lightest history pattern that still supports each business question.
Success signal
Account name can be overwritten as current descriptive context.
- Justify different SCD treatments for different attributes.
- Model Type 2 validity for historical plan tier and Type 3 previous territory in one dimension.
Many-to-Many and Bridge Tables
What this set does
Start with a constrained modeling decision, move into a guided sandbox build, then finish with an open scenario where trade-offs matter more than rote correctness.
- 1
Choose the model for shared account ownership
Many-to-Many Modeling · Quick Decision · 5 min
- 2
Choose the model for stacked promotions
Many-to-Many Modeling · Quick Decision · 5 min
- 3
Choose the model for one current account owner
Relationship Cardinality · Quick Decision · 5 min
- 4
Model shared account ownership with a bridge
Many-to-Many Modeling · Concept Challenge · 30 min
- 5
Avoid duplicating order lines for promotions
Many-to-Many Modeling · Concept Challenge · 30 min
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
A single owner foreign key hides co-ownership and makes ARR double counting likely when credit is split across owners.
Success signal
The answer chooses an account-owner bridge.
- Identify many-to-many ownership as a relationship with its own grain.
- Explain why allocation weight belongs on the bridge row.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
A direct promotion key on the order line can only store one promotion, while duplicated order lines inflate revenue.
Success signal
The answer chooses an order-line promotion bridge.
- Identify stacked promotions as a many-to-many relationship.
- Keep allocated discount on the bridge row where the promotion application happened.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Bridge tables should solve real many-to-many grain. Adding one for a stable one-owner relationship makes simple reporting harder.
Success signal
The answer chooses a direct owner foreign key or relationship.
- Distinguish direct N:1 relationships from true many-to-many relationships.
- Avoid adding bridge tables when the relationship has no independent grain.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Account ownership is not just an attribute when it has dates and split credit. Without bridge grain, ARR either double counts or loses historical ownership.
Success signal
The model can represent multiple owners for one account over time.
- Model many-to-many account ownership as a bridge.
- Protect bridge grain with account, owner, and effective-start keys.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Duplicating the order line for each promotion makes revenue additive in the wrong place. The promotion application needs its own bridge grain.
Success signal
The model includes a bridge instead of duplicating order-line fact rows.
- Separate order-line fact grain from promotion-application bridge grain.
- Use a bridge to represent multiple promotions on one line.
Conformed Dimensions and the Bus Matrix
What this set does
Start with a constrained modeling decision, move into a guided sandbox build, then finish with an open scenario where trade-offs matter more than rote correctness.
- 1
Read the bus matrix: which dimension is shared
Conformed Dimensions · Quick Decision · 5 min
- 2
Avoid duplicating a near-identical customer dimension
Conformed Dimensions · Quick Decision · 5 min
- 3
Conform a date dimension across ordering and shipping
Conformed Dimensions · Concept Challenge · 25 min
- 4
Conform a customer dimension across three processes
Conformed Dimensions · Concept Challenge · 30 min
- 5
Conform a shrunken month dimension for plan vs actual
Conformed Dimensions · Concept Challenge · 25 min
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
The enterprise bus matrix lists business processes as rows and dimensions as columns. A dimension shared by multiple rows is conformed; modeling it once is what makes cross-process questions answerable.
Success signal
The answer chooses the dimension common to both ordering and shipping.
- Identify which dimensions are shared across processes from a bus matrix.
- Distinguish conformed dimensions from process-specific ones.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Duplicated lookalike dimensions are the most common conformance failure. They drift apart, double-count, and make cross-process metrics impossible to trust.
Success signal
The answer consolidates to one conformed customer dimension.
- Name the conformance trap of per-process duplicate dimensions.
- Choose one conformed dimension shared through foreign keys.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Conformed dimensions are what make cross-process analysis possible. If ordering and shipping each carry their own raw dates, the calendars drift and timing comparisons stop being trustworthy.
Success signal
Both facts reference the same date dimension through foreign keys.
- Reuse one date dimension across two separate business processes.
- Connect each fact to the shared dimension through a surrogate-key foreign key.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Three-process conformance is where siloed teams usually fail: each builds its own customer table and the funnels never reconcile. One conformed customer dimension is the fix.
Success signal
All three facts reference the same customer dimension.
- Conform one customer dimension across three distinct processes.
- Model each process as its own fact at its own grain.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
A shrunken conformed dimension is a rollup of a base dimension to a coarser grain. Both monthly facts must share it, or plan and actual land on mismatched periods and variance becomes meaningless.
Success signal
Both monthly facts reference the same month dimension.
- Use a shrunken (rollup) dimension at a coarser grain than the base calendar.
- Conform the month dimension across two monthly summary facts.
Advanced Dimensional Patterns
What this set does
Start with a constrained modeling decision, move into a guided sandbox build, then finish with an open scenario where trade-offs matter more than rote correctness.
- 1
Group low-cardinality flags into a junk dimension
Junk Dimensions · Concept Challenge · 25 min
- 2
Reuse one date dimension in multiple roles
Role-Playing Dimensions · Concept Challenge · 25 min
- 3
Split volatile attributes into a mini-dimension
Slowly Changing Dimension Type 2 · Concept Challenge · 30 min
- 4
When a junk dimension is the wrong tool
Junk Dimensions · Quick Decision · 5 min
- 5
Role-playing dimension or duplicate tables
Role-Playing Dimensions · Quick Decision · 5 min
- 6
Recognize a justified outrigger dimension
Fact vs Dimension Roles · Quick Decision · 5 min
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
A junk dimension collects unrelated low-cardinality flags into one small dimension. It keeps the fact narrow and gives flags a stable home instead of a column sprawl that grows with every new indicator.
Success signal
The flags live in a junk dimension, not as separate fact columns.
- Recognize low-cardinality flags that belong in a junk dimension.
- Model a junk dimension and reference it from the fact with one surrogate key.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Role-playing dimensions reuse one physical dimension in multiple semantic roles through multiple foreign keys. Separate date tables drift apart and break conformance; multiple FKs to one date dimension keep the calendar consistent.
Success signal
One date dimension serves both the order-date and ship-date roles.
- Model multiple foreign keys from one fact to a single shared dimension.
- Name each foreign key for the role it plays (order date, ship date).
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
A mini-dimension isolates rapidly changing, low-cardinality attributes so they version independently of the stable dimension. The fact references both, capturing the band at event time without row explosion.
Success signal
Volatile attributes live in a mini-dimension, not in the product dimension.
- Recognize when rapidly changing attributes should leave the base dimension.
- Model a mini-dimension of attribute bands.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Junk dimensions are for low-cardinality flags. Folding a high-cardinality attribute like email into them explodes the junk dimension to one row per distinct value, defeating the purpose.
Success signal
The answer keeps high-cardinality attributes out of the junk dimension.
- Distinguish low-cardinality flags from high-cardinality attributes.
- Choose the right home for a high-cardinality attribute.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Duplicate per-role dimensions drift apart and break conformance. Role-playing — one dimension, multiple foreign keys — keeps the calendar consistent across every role.
Success signal
The answer reuses one date dimension across the three roles.
- Prefer role-playing over duplicate dimensions for multiple date roles.
- Connect each role with its own foreign key to one shared dimension.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
An outrigger is a dimension referenced by another dimension. Used sparingly for a reused, independently maintained attribute cluster it is fine; used everywhere it snowflakes the model and hurts usability.
Success signal
The answer uses an outrigger only for the reused, independently maintained cluster.
- Recognize a justified outrigger dimension.
- Avoid reflexive snowflaking of every dimension attribute.
Metrics-first Product Modeling
What this set does
Start with a constrained modeling decision, move into a guided sandbox build, then finish with an open scenario where trade-offs matter more than rote correctness.
- 1
Define activation before drawing tables
Fact Table Grain · Quick Decision · 5 min
- 2
Choose atomic events over metric-shaped tables
Fact Table Grain · Quick Decision · 5 min
- 3
Model workspace invite activation from the metric
Fact Table Grain · Concept Challenge · 25 min
- 4
Interview prompt: marketplace listing activation drop
Fact Table Grain · Open Scenario · 35 min
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
If activation is undefined, the schema can preserve the wrong event grain and make every later metric conversation ambiguous.
Success signal
The answer defines numerator, denominator, time window, and breakdown dimensions.
- Identify the metric definition as the first modeling step.
- Connect activation definitions to event grain.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Metric-shaped tables make every new metric a new model. Atomic events keep the source behavior queryable and let metrics evolve.
Success signal
The answer chooses an atomic notification event fact.
- Choose event grain that supports several metrics.
- Separate event facts from serving-layer aggregates.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
The model needs the events behind the rate. If the fact starts as a user-day summary, the team cannot separate sent, accepted, expired, and resent invite behavior.
Success signal
Clarification: States activation as accepted within 7 days of sent and names acquisition channel and workspace plan as required breakdowns.
- Translate activation numerator and denominator into event-grain requirements.
- Model invite events without collapsing to user-day or workspace-day summaries.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
This is the bridge from guided concept practice to interview mode. The schema matters, but the verdict depends on whether the candidate clarifies the product question, defends grain, and explains tradeoffs.
Success signal
Clarification: Identifies actors, events, metric definitions, time windows, exclusions, and edge cases before drawing tables.
- Ask clarifying product and metric questions before modeling.
- Translate activation, conversion, and seller-mix metrics into fact grains.
Meta-style Product Modeling
What this set does
Start with a constrained modeling decision, move into a guided sandbox build, then finish with an open scenario where trade-offs matter more than rote correctness.
- 1
Meta-style messaging engagement model
Fact Table Grain · Concept Challenge · 25 min
- 2
Meta-style marketplace listings model
Conformed Dimensions · Concept Challenge · 25 min
- 3
Meta-style short-video watch-time model
Fact Table Grain · Concept Challenge · 25 min
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Meta-style modeling prompts are product conversations before they are schema exercises. If the model skips event grain or hides thread context, DAU and retention become dashboard numbers without a path to debug product behavior.
Success signal
The fact grain supports both active-user counts and event-volume metrics.
- Translate DAU and retention questions into event-grain modeling requirements.
- Model message engagement events without collapsing activity to user-day summaries.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Marketplace prompts combine lifecycle events with identity modeling. If seller status is overwritten or buyer and seller identity split into unrelated dimensions, conversion and policy-impact analysis become inconsistent.
Success signal
The model can measure listing activation and purchase conversion without mixing fact grains.
- Separate listing lifecycle events from purchase facts with different grains.
- Use a conformed customer dimension across buyer and seller analysis.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Meta-style rounds often test whether the candidate can resist metric-shaped tables. Watch time and unique viewers sound like different facts, but both can be computed from the same atomic view event when the grain and dimensional joins are right.
Success signal
The model can sum watch time by title and count unique viewers by device from the same fact.
- Choose one atomic view-event fact that supports multiple metrics.
- Model watch time as a measure and viewer/device/title context as dimensions.
Amazon-style Warehouse Modeling
What this set does
Start with a constrained modeling decision, move into a guided sandbox build, then finish with an open scenario where trade-offs matter more than rote correctness.
- 1
Design an order management warehouse
Fact Table Grain · Concept Challenge · 30 min
- 2
Model daily warehouse inventory snapshots
Fact Table Grain · Concept Challenge · 30 min
- 3
Conform seller performance across processes
Conformed Dimensions · Concept Challenge · 30 min
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Order models are where normalize-vs-denormalize pressure gets concrete. A single order-header table is tempting for lookup, but it loses item-level measures and product context needed for analytics.
Success signal
Clarification: Clarifies operational lookup paths, analytical metrics, order volume, and whether line-level fulfillment status is required.
- Separate operational order lookup from analytical order-line grain.
- Use degenerate dimensions for high-value operational identifiers.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Inventory can be modeled as movements, snapshots, or both. The stated SLA needs reproducible daily state without replaying every receipt, pick, adjustment, and transfer for every dashboard query.
Success signal
Clarification: Clarifies SLA, warehouse/product grain, snapshot cadence, and whether movements are needed for audit detail.
- Recognize daily inventory as a periodic snapshot fact problem.
- State the warehouse-product-day grain explicitly.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Cross-process performance questions fail when every process owns a lookalike seller table. A conformed seller dimension makes seller health metrics comparable across orders, returns, and disputes.
Success signal
Clarification: Clarifies seller performance metrics, process boundaries, time windows, and marketplace-account identity rules.
- Model separate facts for separate business processes.
- Use a conformed dimension for shared seller context.
Company Tracks: Netflix-style and Stripe-style
What this set does
Start with a constrained modeling decision, move into a guided sandbox build, then finish with an open scenario where trade-offs matter more than rote correctness.
- 1
Netflix-style: one atomic viewing-events fact
Fact Table Grain · Concept Challenge · 30 min
- 2
Netflix-style: version title metadata with Type 2
Slowly Changing Dimension Type 2 · Concept Challenge · 25 min
- 3
Netflix-style: model A/B exposure at its own grain
Fact Table Grain · Quick Decision · 5 min
- 4
Stripe-style: payment-attempt fact with a status lifecycle
Fact Table Grain · Concept Challenge · 30 min
- 5
Stripe-style: monthly MRR snapshot with a Type 2 plan
Slowly Changing Dimension Type 2 · Concept Challenge · 30 min
- 6
Stripe-style: model balances as a double-entry ledger
Fact Table Grain · Quick Decision · 5 min
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
One atomic event fact answers many questions; splitting into a fact per metric duplicates grain decisions and makes cross-metric questions inconsistent. Unique viewers come from distinct-counting a dimension key, not from a second fact.
Success signal
One atomic viewing fact serves both watch-time and unique-viewer questions.
- Choose one atomic event grain over per-metric facts.
- Support unique-viewer questions by distinct-counting dimension keys.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Overwriting metadata makes past viewing look like it had today’s rating or category. A Type 2 title dimension preserves each version with a validity window.
Success signal
The title dimension versions metadata with a validity window.
- Add Type 2 validity columns to the title dimension.
- Keep viewing tied to the title version in effect at view time.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Stamping the variant onto the viewing fact assumes one experiment per view and pollutes its grain. A dedicated exposure fact keeps experiment assignment at its own grain and supports many concurrent experiments.
Success signal
The answer models exposure at its own grain.
- Recognize when experiment assignment needs its own grain.
- Avoid polluting an event fact with experiment columns.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
A successful-charges-only table hides failures and retries. An attempt-grain event fact with a status captures the whole lifecycle and supports authorization-rate and retry analysis.
Success signal
Failed and successful attempts are both analyzable.
- Model payment attempts at event grain with a status.
- Keep amount and an attempt timestamp on the fact.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
MRR is state observed monthly — a periodic snapshot. Joined to a Type 2 plan, each month reflects the plan version then in effect; overwriting the plan rewrites history.
Success signal
MRR is stored per account per month.
- Model MRR as a monthly periodic snapshot.
- Protect the account-per-month grain with a key.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
A running-balance column drifts and cannot explain itself. A double-entry ledger — balanced debit and credit entries per transaction — reconciles by construction and is auditable.
Success signal
The answer chooses a double-entry ledger fact.
- Recognize when a double-entry ledger is required.
- Model balances as entries that sum to zero per transaction.
Expert: Enterprise Modeling Synthesis
What this set does
Start with a constrained modeling decision, move into a guided sandbox build, then finish with an open scenario where trade-offs matter more than rote correctness.
- 1
Retail: conform customer and product across orders and returns
Conformed Dimensions · Concept Challenge · 35 min
- 2
Healthcare: encounter accumulating snapshot with Type 2 provider
Fact Table Grain · Concept Challenge · 35 min
- 3
Fintech: double-entry ledger plus conformed daily balances
Conformed Dimensions · Concept Challenge · 35 min
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Real enterprise models combine patterns: conformance makes orders and returns comparable, while a Type 2 product dimension keeps historical price correct. Getting one without the other produces wrong net revenue.
Success signal
Orders and returns share conformed customer and product dimensions.
- Conform customer, product, and date across two processes.
- Version the product dimension as Type 2.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
This combines an accumulating snapshot (milestone timing) with Type 2 history (point-in-time provider context). Either alone gives wrong attribution or no timing.
Success signal
One row per encounter, updated through its milestones.
- Model an encounter accumulating snapshot with milestone dates.
- Protect the one-row-per-encounter grain.
Preview this problem before signing in. Sign in to save progress and submit work.
Why it matters
Auditability comes from the ledger; reporting speed comes from the snapshot. Both must conform to one account dimension or the two views disagree.
Success signal
Balances reconcile from the ledger and report fast from the snapshot.
- Model a double-entry ledger with entry type and signed amounts.
- Add a daily balance periodic snapshot with a protected grain.