Activity Schema

SINGLE MODELING LAYER

Model: one table with three core columns

1. Timestamp
2. Customer
3. Activity

Activities are building blocks that use consistent definitions and standard structures across the whole company. Examples include started session, paid invoice, and opened email.
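As a rough illustration of the three core columns, the sketch below represents an activity stream as a list of rows in Python. The class name `ActivityRow`, the field names, and the sample customers are assumptions made for this example and are not taken from the Activity Schema spec.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ActivityRow:
    ts: datetime    # when the activity happened
    customer: str   # who did it
    activity: str   # what they did


# Hypothetical sample data: every row has the same three columns.
activity_stream = [
    ActivityRow(datetime(2024, 1, 1, 9, 0), "alice@example.com", "started_session"),
    ActivityRow(datetime(2024, 1, 1, 9, 5), "alice@example.com", "opened_email"),
    ActivityRow(datetime(2024, 1, 2, 14, 30), "bob@example.com", "paid_invoice"),
]


def activities_for(customer: str) -> list[str]:
    """Return one customer's activities in time order."""
    rows = sorted(
        (r for r in activity_stream if r.customer == customer),
        key=lambda r: r.ts,
    )
    return [r.activity for r in rows]


alice_history = activities_for("alice@example.com")
```

Because every activity shares this shape, the same helper works for any customer and any activity without knowing which source system the row came from.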

Learn about activities

Diagram showing a single table (the activity stream) with plots depending on it

New way to query

Query: Temporal Joins (NEW)

Temporal joins let you quickly ask questions across different data sources.

A temporal join appends activities to each subquery, so a query adapts to whichever data is needed.
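To make the idea more concrete, here is a minimal sketch of one possible temporal relationship: for each occurrence of a primary activity, append the first occurrence of another activity by the same customer that happens afterwards. The function name `append_first_after` and the sample rows are hypothetical, and a real implementation would normally run as SQL in the warehouse rather than in Python.

```python
from datetime import datetime

# Hypothetical activity stream rows: (timestamp, customer, activity).
stream = [
    (datetime(2024, 1, 1, 9, 0), "alice", "started_session"),
    (datetime(2024, 1, 1, 9, 20), "alice", "paid_invoice"),
    (datetime(2024, 1, 3, 8, 0), "alice", "started_session"),
    (datetime(2024, 1, 2, 10, 0), "bob", "started_session"),
]


def append_first_after(stream, primary, appended):
    """For each `primary` row, append the time of the first `appended`
    activity by the same customer that occurs later (or None)."""
    result = []
    for ts, customer, activity in stream:
        if activity != primary:
            continue
        later = [
            t for t, c, a in stream
            if c == customer and a == appended and t > ts
        ]
        result.append((customer, ts, min(later) if later else None))
    return result


rows = append_first_after(stream, "started_session", "paid_invoice")
```

The only things relating the two activities are the customer and the ordering in time. No key links a session to an invoice.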

90% of all questions can be answered with this schema.

Learn about temporal joins

A real model

Designed for the reality of data

The Activity Schema was designed on the assumption that data is messy: foreign keys don't exist, sources contain duplicates, customers have multiple identifiers, and so on.

One Table

With one table, everyone knows where the data is. Existing data is simple to find, and new data sources are simple to discover.

Standard Schema

The table is easy to use because it has a standard structure. You can query the data without learning what every column means.

Single Layer Dependency

Raw data goes into the Activity Stream, which is then used to generate any other table. Everything you need is built on top of this one table, making data more reliable and trusted.

Foreign keyless

Activities in the Activity Stream can be related to one another using time and customer. There is no need to predefine all the joins.

Quick to build

Building it requires only that you define your business logic, usually against a single source.

Flexible

Anything can be built using the Activity Schema, and changing, adding, and deleting are easy.

Join the community



Founder story

This is why I started OpLevel

Having worked in data for many years, I found that every question we answered started from scratch. We constantly had to think about what we needed to build, the columns we should add, and the ways a table would be used.

I needed a standard that could work across all companies, answer any question, and capture any data structure. I needed the Activity Schema.

Check out OpLevel.ai



Single Source Of Truth

Data is modeled using independent activities.

One Table

All warehouse data is in a single time-series table

Single Model Layer

All plots and analyses for BI run against a single table

POWERFUL and FLEXIBLE

A Different Way to Model

An Activity Schema data model structures all data in the warehouse as a single time series table.

Data is built from independent activities via temporal joins, instead of from staging tables via standard SQL joins.

Any activity can be combined with any other by using relationships in time instead of foreign keys, allowing for true ad hoc queries.

Learn More

Diagram showing many data tables with dependencies between each other, with a set of plots referring to them

Model Dependencies

Traditional Modeling

Existing data modeling approaches, such as a star schema, have many layers of dependencies.

These are difficult to manage and maintain. The source of truth is not always clear, and the models are harder to debug and require more documentation to use.

SINGLE MODELING LAYER

Activity Schema

An Activity Schema transforms source tables into a single time-series table called an activity stream. All downstream plots, tables, materialized views, etc. used for BI are built directly from that single table, with no other dependencies.
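The transformation step can be sketched as follows. The source tables `invoices` and `email_events`, their column names, and the two transformation functions are all hypothetical. The point is only that each activity is defined independently against one source and all of them land in the same three-column shape.

```python
from datetime import datetime

# Hypothetical source tables with unrelated layouts.
invoices = [
    {"invoice_id": 17, "email": "alice@example.com",
     "paid_at": datetime(2024, 1, 2, 14, 30)},
]
email_events = [
    {"recipient": "alice@example.com", "event": "open",
     "at": datetime(2024, 1, 1, 9, 5)},
    {"recipient": "bob@example.com", "event": "bounce",
     "at": datetime(2024, 1, 1, 9, 6)},
]


def paid_invoice(rows):
    """One activity definition, reading from one source table."""
    return [
        {"ts": r["paid_at"], "customer": r["email"], "activity": "paid_invoice"}
        for r in rows
    ]


def opened_email(rows):
    """Another activity definition; the business logic is the filter."""
    return [
        {"ts": r["at"], "customer": r["recipient"], "activity": "opened_email"}
        for r in rows
        if r["event"] == "open"
    ]


# The activity stream is the union of all activity definitions.
activity_stream = sorted(
    paid_invoice(invoices) + opened_email(email_events),
    key=lambda r: r["ts"],
)
```

Neither function joins to the other source, so a change to the invoice table touches only `paid_invoice`.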

Read the spec

For data engineers

Maintainable Models

Fewer models

One business concept per activity means fewer models to manage, understand, and maintain

Easier to build

No joins between models means no need to tie disparate source systems together

Quickly accommodate source data changes

Changes to source data typically only affect a single activity

Simple data lineage

A single data layer makes tracing data provenance and debugging far easier

Faster updates

Time-series modeling means incremental updates (rather than full rebuilds) by default
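One common way to get incremental updates from time-ordered data is a high-water mark: load only the rows newer than the latest timestamp already in the stream for that activity. The sketch below assumes this approach; the function name `incremental_update` and the row layout are illustrative, and it deliberately ignores late-arriving or corrected records, which a real pipeline would need to handle.

```python
from datetime import datetime


def incremental_update(stream, source_rows, activity):
    """Append only source rows newer than what is already loaded.

    Returns the number of rows added. Assumes rows never arrive late.
    """
    loaded = [r["ts"] for r in stream if r["activity"] == activity]
    high_water = max(loaded) if loaded else datetime.min
    new_rows = [r for r in source_rows if r["ts"] > high_water]
    stream.extend(
        {"ts": r["ts"], "customer": r["customer"], "activity": activity}
        for r in new_rows
    )
    return len(new_rows)


# Hypothetical state: one row already loaded, one new row in the source.
stream = [
    {"ts": datetime(2024, 1, 1), "customer": "alice", "activity": "paid_invoice"},
]
source = [
    {"ts": datetime(2024, 1, 1), "customer": "alice"},
    {"ts": datetime(2024, 1, 5), "customer": "bob"},
]

added_first_run = incremental_update(stream, source, "paid_invoice")
added_second_run = incremental_update(stream, source, "paid_invoice")
```

Running the update twice adds nothing the second time, which is what makes the append safe to repeat.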

No data dictionaries

Fewer models, each covering one concept, are vastly easier to document

POWERFUL and FLEXIBLE

Faster Analysis and Querying

Single source of truth

Each activity represents a single concept (like a 'page view' or 'completed order'), so it's always clear which to use

Query across all data sources

Time-based joins mean any activity can be queried and combined with another without defining foreign keys

Reusable analyses

A standard data model means that any analysis can be reused across companies. A customer acquisition cost calculation for one company can be shared with another.

Autogenerated queries

A standard data model means that queries don't have to be written by hand

True ad-hoc querying

Because all activities are related in time, swapping one activity for another requires no structural changes to queries.
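A small sketch of what swapping activities can look like in practice: the query below takes the two activity names as parameters, so asking a different question changes only the arguments. The function name `conversion_rate` and the sample stream are assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical activity stream rows: (timestamp, customer, activity).
stream = [
    (datetime(2024, 1, 1), "alice", "opened_email"),
    (datetime(2024, 1, 2), "alice", "started_session"),
    (datetime(2024, 1, 3), "alice", "paid_invoice"),
    (datetime(2024, 1, 1), "bob", "opened_email"),
    (datetime(2024, 1, 4), "bob", "started_session"),
]


def conversion_rate(stream, first, then):
    """Share of customers who did `first` and later did `then`."""
    first_seen = {}
    for ts, customer, activity in stream:
        if activity == first:
            first_seen[customer] = min(ts, first_seen.get(customer, ts))
    converted = {
        c for ts, c, a in stream
        if a == then and c in first_seen and ts > first_seen[c]
    }
    return len(converted) / len(first_seen) if first_seen else 0.0


# Same query structure, different activities.
email_to_session = conversion_rate(stream, "opened_email", "started_session")
email_to_invoice = conversion_rate(stream, "opened_email", "paid_invoice")
```

Both calls run the same code path. Only the activity names differ, which is what keeps the query structure unchanged.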

High performance

Queries run substantially faster against an activity stream table, which has fewer columns, requires fewer joins, and can be easily partitioned by activity or time

A global standard for modeling and querying data

Answer anything across different business domains
