Measures
Measures are the aggregations that can result in either your final metric of desire, and/or serve as the foundational building blocks for a metric.
Measures have a few inputs in the data source configuration: name
, description
, aggregation
, expr
, and create_metric
.
Name (Required)
The name
of the measure can be chosen by the user or optionally reference a database column directly. If the name
of the measure is different than the column in the table, you must add an expr
that will denote the column name. The name
of the measure will be referenced when creating a metric. Note that measure names must be unique across all data sources in a project.
Description (Required)
The description section describes the measure that is being calculated. It is strongly recommended to create verbose and human-readable descriptions in this field.
Aggregation (Required)
The aggregation determines how the field will be aggregated. For example, a sum
aggregation type over a granularity of day
would sum the values across a given day.
Aggregations supported by MetricFlow are as such:
Aggregation Types | Description |
---|---|
sum | Sum across the values |
min | Minimum across values |
max | Maximum across values |
average | Average across values |
sum_boolean | A sum for a boolean type |
count_distinct | Distinct count of values |
tip
All of the metrics above, except for count_distinct, are additive measures. MetricFlow can aggregate these agg types sequentially to reuse datasets. As an example, transactions by day could be aggregated to transactions by week. Non-additive measures like count_distinct don't have this property and must be calculated from the source every time. This can be more costly and slower, but the resulting metrics are worth the wait in many cases.
Expr (Optional)
Use the expr
parameter if you have specified a name
parameter that does not match the name of a column in your database table (in this case, name
would effectively serve as an alias to refer to your measure). You may also use the expr
parameter with any valid SQL to manipulate an underlying column name into a specific output.
Note: Always refer to your database-specific SQL when utilizing SQL functions to be used in the expr
parameter as outputs may differ depending on your specific database provider.
caution
If you are using Snowflake and use any week-level function inside the expr
parameter, the function will now return the ISO-standard default of MONDAY.
Examples:
You are running Snowflake warehouse instances with account or session level overrides for the WEEK_START parameter that fix it to a value other than 0 or 1: You will now see MONDAY as the week start in all cases.
You are using the DAYOFWEEK function inside the
expr
parameter with the legacy Snowflake default of WEEK_START = 0: DAYOFWEEK will now return the ISO-standard values of 1 (Monday) through 7 (Sunday) instead of Snowflake’s legacy default values of 0 (Monday) through 6 (Sunday).
Create Metric (optional)
You can optionally choose to create a metric directly from a measure with create_metric: True
as a shortcut for creating a metric.
- Note that the description and owners of the metric created from a measure with
create_metric: True
will thus default to the description and owner of the data source. - You may further specify the display name of the metric via
create_metric_display_name: "Your Metric Display Name"
Note: If you prefer to have a single file location or a single mechanism for storing all of your metric definitions, we do not recommend you use the create_metric: True
shortcut as this allows metric definitions to exist in multiple places across your Transform model.
Example file using various aggregation types
data_source:
name: transactions
description: A record for every transaction that takes place. Carts are considered multiple transactions for each SKU.
owners: support@transform.co
sql_table: schema.transactions
# --- IDENTIFIERS ---
identifiers:
- name: transaction_id
type: primary
- name: customer_id
type: foreign
- name: store_id
type: foreign
- name: product_id
type: foreign
# --- MEASURES ---
measures:
- name: transaction_amount_usd
description: Total USD value of transactions
agg: sum
- name: transaction_amount_usd_avg
description: Average USD value of transactions
expr: transaction_amount_usd
agg: average
- name: transaction_amount_usd_max
description: Maximum USD value of transactions
expr: transaction_amount_usd
agg: max
- name: transaction_amount_usd_min
description: Minimum USD value of transactions
expr: transaction_amount_usd
agg: min
- name: quick_buy_transactions
description: The total transactions bought as quick buy
expr: quick_buy_flag
agg: sum_boolean
- name: distinct_transactions_count
description: Distinct count of transactions
expr: transaction_id
agg: count_distinct
- name: transactions
description: The average value of transactions
expr: transaction_amount_usd
agg: average
create_metric: True
create_metric_display_name: Avg Value of Transactions (Daily)
- name: transactions_amount_usd_valid #Notice here how we use expr to compute the aggregation based on a condition
description: The total USD value of valid transactions only
expr: CASE WHEN is_valid = True then 1 else 0 end
agg: sum
- name: transactions
description: The average value of transactions.
expr: transaction_amount_usd
agg: average
create_metric: True
create_metric_display_name: Avg Value of Transactions (Daily)
# --- DIMENSIONS ---
dimensions:
- name: ds
type: time
expr: date_trunc('day', ts) #expr refers to underlying column ts
type_params:
is_primary: true
time_granularity: day
- name: is_bulk_transaction
type: categorical
expr: case when quantity > 10 then true else false end