What are the identifiers in the data source yaml?
Identifiers are the join key columns in your data source that can be used to join to other data sources.
Within a data source, the required parameters for an identifier are
name refers to either the key’s column name from the underlying data table, or the
name may serve as an alias with the column name referenced in the
The identifier types in Transform are primary, foreign, or unique. Identifiers can also be single identifiers or composite identifiers. Identifiers can also be used as a dimension allowing you to aggregate a metric to the granularity of that identifier.
Note that the MetricFlow framework refers to the identifiers (join keys) in a data source via the
name parameter. Within a data source, identifier names must be unique (you cannot have repeated identifier names within a data source just as you cannot have repeated column names within a data table). However, identifier names do not have to be unique across different data sources.
Transform uses the identifier's type to determine how to join data sources. The join logic depends on the type of identifier. Check out our join logic page for more info on how Transform joins data sources.
Primary: A primary key has one and only one record for each row in the table, and it is also inclusive of every record in the data warehouse.
Unique: A unique key has one and only one record for each row in the table, but it may only have a subset of every record in the data warehouse. They can also have nulls.
Foreign: A foreign key can have zero, one or many of the same records. They can also have nulls.
- name: transaction
- name: order
- name: user
expr: SUBSTRING(id_order FROM 2)
MetricFlow supports configuring data sources that contain composite keys (a combination of two or more columns to uniquely identify an entity occurrence or a table row). Composite keys can be defined in the identifiers section of the data source, just like the single identifiers.
Note: Composite keys can only include primary or foreign identifiers. Unique identifiers cannot be included in a composite key.
For the following example file containing a composite key, assume:
- Suppose we have a table
people.usersthat consists of all users (denoted by the field name
id) and the teams that they belong to (denoted by the field
- In the example configuration file below, we have defined the single identifiers (
id), and then created a composite identifier (
user_team) that refers to the
idcolumns that are combined to uniquely identify a particular user-team row combination.
description: Users and their teams
- name: team_id
- name: id
- name: user_team # Composite key created
type: primary # Composite keys can also be of foreign type
- ref: team_id
- ref: id
There is no limit to the number of identifiers that can be combined into your composite key. Common examples of setting up composite keys include event logs, where tables often require the combination of 2 or more columns (i.e. timestamp, machine_id, event_type) to generate a unique identifier key.
Note: MetricFlow will never implicitly create a composite key. For example, if another data source named
users_v2 also happens to have the identifiers
id, MetricFlow will not assume that the data source
users_v2 can be joined with data source
users via the composite key
user_team unless explicitly defined.
You do not have to explicitly pre-define all identifiers that make up your composite key. Our recommended approach is to consider whether the particular field can stand alone as an identifier, or whether it is only usable as part of a composite key.
Example: The data source
users below has a composite key named
user_message. In the composite key, there is a reference to the identifier
message, which has been previously defined on its own. Notice that the composite key
user_message also references another identifier named
user, which is instead defined within the composite identifier.
description: Users and messages sent
- name: message
- name: user_message # Composite Key Created
type: primary # Composite keys can also be foreign
- name: user
- ref: message
Querying Composite Keys in MetricFlow
In MetricFlow, you can query using a where clause by specifying each specific field that makes up your composite key. For example, if your composite key is made up of
message_id, you can run a query such as.
mf query --metrics messages --dimensions metric_time --where "user_id = 10 and team_id = 7"
mql query --metrics messages --dimensions metric_time --where "user_id = 10 and team_id = 7"