Querying Metrics using Python
MetricFlow Query Language (MQL) is a powerful tool that allows you to query metrics cut by different dimensions. The MQLClient.query()
function takes a number of arguments. At minimum, you need to provide it a metric and a dimension.
Usage
Query Examples
Query Metrics and Dimensions
Assume you want to get the gross sales metric by a dimension of country:
from transform import mql
mql.query(["gross_sales"], ["customer__billing_country"]).style.format({'gross_sales':'${0:,.0f}'})
You can optionally include multiple dimensions. Let's say you want to include gross sales by date and country:
#Query Metrics
df = mql.query(["gross_sales"], ["metric_time", "customer__billing_country"])
#Format results and print first 5 rows
df.head().style.format({'gross_sales':'${0:,.0f}'})
metric_time | customer__billing_country | gross_sales | |
---|---|---|---|
0 | 2022-12-04 00:00:00 | FR | $334,621 |
1 | 2022-12-04 00:00:00 | MX | $75,591 |
2 | 2022-12-04 00:00:00 | DE | $88,653 |
3 | 2022-12-04 00:00:00 | IT | $37,408 |
4 | 2022-12-03 00:00:00 | IT | $35,528 |
Order and Limit
Like a SQL function, you can limit the number of rows and also order by a particular field:
df = mql.query(["gross_sales"], ["metric_time", "customer__billing_country"], limit=100, order=["metric_time"]) # Ascending
df.head().style.format({'gross_sales':'${0:,.0f}'})
metric_time | customer__billing_country | gross_sales | |
---|---|---|---|
0 | 2022-03-01 00:00:00 | GR | $1,207 |
1 | 2022-03-01 00:00:00 | CA | $593 |
2 | 2022-03-01 00:00:00 | MX | $680 |
3 | 2022-03-01 00:00:00 | US | $3,274 |
4 | 2022-03-01 00:00:00 | IT | $589 |
df = mql.query(["gross_sales"], ["metric_time", "customer__billing_country"], limit=100, order=["-metric_time"]) # Descending
df.head().style.format({'gross_sales':'${0:,.0f}'})
metric_time | customer__billing_country | gross_sales | |
---|---|---|---|
0 | 2022-12-04 00:00:00 | US | $377,156 |
1 | 2022-12-04 00:00:00 | MX | $75,591 |
2 | 2022-12-04 00:00:00 | CA | $58,512 |
3 | 2022-12-04 00:00:00 | IT | $37,408 |
4 | 2022-12-04 00:00:00 | DE | $88,653 |
Time Constraints
Use Time Constraints to filter the result set by your primary time dimension:
mql.query(["gross_sales"], ["metric_time"], time_constraint="metric_time BETWEEN '2022-05-01' and '2020-05-02'")
gross_sales | metric_time | |
---|---|---|
0 | 47970.65 | 2022-03-02 |
1 | 428903.34 | 2022-03-03 |
2 | 183257.92 | 2022-03-06 |
3 | 89556.60 | 2022-03-09 |
4 | 209231.03 | 2022-03-11 |
... | ... | ... |
57 | 164481.96 | 2022-04-11 |
58 | 263677.89 | 2022-04-24 |
59 | 201816.49 | 2022-04-27 |
60 | 122477.26 | 2022-04-12 |
61 | 193790.40 | 2022-03-25 |
62 rows × 2 columns
Where constraints
You can also use constraints with the where
argument to filter by other dimensions:
mql.query(["gross_sales"], ["metric_time", "customer__billing_country"], where="customer__billing_country = 'CA'")
metric_time | customer__billing_country | gross_sales | |
---|---|---|---|
0 | 2022-12-01 | CA | 59793.59 |
1 | 2022-11-29 | CA | 57243.58 |
2 | 2022-11-28 | CA | 53543.88 |
3 | 2022-11-24 | CA | 50015.17 |
4 | 2022-11-21 | CA | 47505.91 |
... | ... | ... | ... |
274 | 2022-04-03 | CA | 11708.15 |
275 | 2022-03-30 | CA | 5341.41 |
276 | 2022-03-23 | CA | 5420.79 |
277 | 2022-03-08 | CA | 2804.16 |
278 | 2022-03-02 | CA | 2796.46 |
279 rows × 3 columns
Time Granularity and Time over Time
You can optionally specify the time granularity you want your data to be aggregated at by appending double underscore and the unit of granularity you want against your time dimension. Additionally, you can expand that to do time over time comparison at a given granularity. For example, you can compute a monthly granularity and calculate the month over month growth rate. The granularity options are: day, week, month, quarter year. Similarly, the time over time options are, week over week (wow), month over mom (mom), quarter over quarter (qoq), year over year (yoy). The syntax is as follows:
mql.query([metric_name{__wow/mom/qoq/yoy}], [time_dimension_name{__day|week|month|quarter|year}])
The valid pairs for granularity and time over time are as follows:
- wow: day, week
- mom: day, week, month
- qoq: day, week, month, quarter
- yoy: day, week, month, quarter, year
Note: Querying for more than 1 metric is not supported for Time over time.
Here is an example of aggregating gross sales by week where the time dimension is called date
.
mql.query(["gross_sales"], ["metric_time__week"])
metric_time__week | gross_sales | |
---|---|---|
0 | 2022-11-28T00:00:00.000 | 6993233.85 |
1 | 2022-11-21T00:00:00.000 | 6741860.23 |
2 | 2022-11-14T00:00:00.000 | 6556055.47 |
3 | 2022-10-31T00:00:00.000 | 6216251.04 |
4 | 2022-10-24T00:00:00.000 | 5938554.03 |
5 | 2022-10-17T00:00:00.000 | 5821223.87 |
6 | 2022-10-10T00:00:00.000 | 5509385.52 |
7 | 2022-10-03T00:00:00.000 | 5271520.06 |
8 | 2022-09-26T00:00:00.000 | 5460942.28 |
9 | 2022-09-19T00:00:00.000 | 5003801.63 |
10 | 2022-09-12T00:00:00.000 | 4762374.62 |
11 | 2022-09-05T00:00:00.000 | 4563557.54 |
12 | 2022-08-29T00:00:00.000 | 4379722.93 |
13 | 2022-08-22T00:00:00.000 | 4699312.90 |
14 | 2022-08-15T00:00:00.000 | 4000020.56 |
15 | 2022-08-08T00:00:00.000 | 3916561.45 |
16 | 2022-08-01T00:00:00.000 | 3741304.82 |
17 | 2022-07-25T00:00:00.000 | 3426873.65 |
18 | 2022-07-18T00:00:00.000 | 3674038.07 |
19 | 2022-07-11T00:00:00.000 | 3276383.02 |
20 | 2022-07-04T00:00:00.000 | 3124737.81 |
21 | 2022-06-20T00:00:00.000 | 2774803.06 |
22 | 2022-06-13T00:00:00.000 | 2683989.35 |
23 | 2022-06-06T00:00:00.000 | 2640532.55 |
24 | 2022-05-23T00:00:00.000 | 2162888.77 |
25 | 2022-05-09T00:00:00.000 | 2323299.72 |
26 | 2022-04-11T00:00:00.000 | 1463677.42 |
27 | 2022-04-04T00:00:00.000 | 1459069.75 |
28 | 2022-03-28T00:00:00.000 | 1301202.53 |
29 | 2022-03-21T00:00:00.000 | 1137502.91 |
30 | 2022-03-14T00:00:00.000 | 1093956.64 |
31 | 2022-11-07T00:00:00.000 | 6343797.36 |
32 | 2022-06-27T00:00:00.000 | 2892179.74 |
33 | 2022-05-30T00:00:00.000 | 2350410.88 |
34 | 2022-05-16T00:00:00.000 | 2098895.84 |
35 | 2022-05-02T00:00:00.000 | 1862421.43 |
36 | 2022-04-25T00:00:00.000 | 1733056.41 |
37 | 2022-04-18T00:00:00.000 | 1503046.19 |
38 | 2022-03-07T00:00:00.000 | 1529581.26 |
39 | 2022-02-28T00:00:00.000 | 1060844.34 |
Here is an example of calculating a gross sales month over month comparison with a monthly granularity:
df = mql.query(["gross_sales"], ["metric_time__month"])
df.head().style.format({'gross_sales':'${0:,.0f}'})
metric_time__month | gross_sales | |
---|---|---|
0 | 2022-12-01T00:00:00.000 | $4,082,036 |
1 | 2022-11-01T00:00:00.000 | $27,893,815 |
2 | 2022-10-01T00:00:00.000 | $25,027,190 |
3 | 2022-09-01T00:00:00.000 | $20,841,071 |
4 | 2022-08-01T00:00:00.000 | $18,075,368 |
Cache Mode
The interface allows you to control how it is reading and writing from Transform's cache. This is an optional parameter. By default, we will use 'rw' - which indicates we will both read and write from the cache. The options include:
'r'
- When answering queries, use tables in fast cache, materializations, or in the dynamic cache that have the data.
'w'
- Once a result has been computed, whether to write the result into the dynamic cache to read from later.
'rw'
- Combination of read and write.
'i'
- Don't read from any cached tables, or write to any of them.
mql.query(["gross_sales"], ["metric_time"], cache_mode="r")
gross_sales | metric_time | |
---|---|---|
0 | 47970.65 | 2022-03-02 |
1 | 428903.34 | 2022-03-03 |
2 | 183257.92 | 2022-03-06 |
3 | 89556.60 | 2022-03-09 |
4 | 209231.03 | 2022-03-11 |
... | ... | ... |
274 | 282507.75 | 2022-05-06 |
275 | 227874.18 | 2022-05-17 |
276 | 844662.56 | 2022-11-02 |
277 | 917845.05 | 2022-10-30 |
278 | 928501.11 | 2022-11-24 |
279 rows × 2 columns