stddev smoketest on product metrics + small changes #31

Open
andrew-sha wants to merge 7 commits into main from
ashannon/scopuli/product_timeseries_metrics_smoketest

Conversation

@andrew-sha
Member

Contains a tests/ directory for custom tests. Inside are two things:

  1. A generic test designed to check timeseries data for records in the column_name column which fall outside stddev_coef standard deviations of the mean of the previous lookback records
  2. A singular test which essentially implements the above check specifically for the product_timeseries_metrics table

NOTE: The product_timeseries_metrics records have some quirks--i.e. they can be grouped at different geographic levels (country, province, census division) as well as in different units (lbs, /1kg). For a test of this kind to be meaningful, the records need to be partitioned by all of these columns. Otherwise, we would end up doing weird things like comparing the daily price from a specific census division such as Wellington against the past month's prices in that specific region but also across all of Ontario and all of Canada. I didn't see a clean way to design the generic test to accommodate all these extra dimensions, so I made a singular test instead. The generic test is still included in case we ever want to use it.
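For reference, the rough shape of such a generic dbt test is sketched below. This is an illustrative sketch, not the actual implementation: the parameter names (column_name, stddev_coef, lookback) come from the description above, while the calendar_date ordering column and the file path are assumptions.

```sql
-- tests/generic/stddev_smoketest.sql -- illustrative sketch only
{% test stddev_smoketest(model, column_name, stddev_coef, lookback) %}

with stats as (
    select
        *
        -- rolling mean and stddev over the previous `lookback` records,
        -- excluding the current row (calendar_date ordering is an assumption)
        , avg({{ column_name }}) over (
            order by calendar_date
            rows between {{ lookback }} preceding and 1 preceding
          ) as lookback_mean
        , stddev({{ column_name }}) over (
            order by calendar_date
            rows between {{ lookback }} preceding and 1 preceding
          ) as lookback_stddev
    from {{ model }}
)

-- a dbt test fails on any rows it returns
select *
from stats
where abs({{ column_name }} - lookback_mean) > {{ stddev_coef }} * lookback_stddev

{% endtest %}
```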

Also, my linter isn't working so I will eventually recommit this when I fix it :(

@thecartercodes left a comment

From a design standpoint, I have some ideas.

Let's go ahead and calculate the stddev and 30d lookbacks in the model itself, i.e. product_timeseries_metrics. Let's also calculate the number of listings above or below the stddev on each calendar_date. Then we can configure a native test in the .yml with severity conditions, e.g.

https://docs.getdbt.com/reference/resource-configs/severity

This would solve the issue you mentioned earlier where you can't partition by the combination of groups. What's more, we could then make these test assertions without window clauses in the tests, and we'd get these useful statistics as part of our analytics, which I was interested in releasing anyway.

To solve your issue with more of a hack, you could just change the param partition_column_name to partition_clause and generate a Jinja expression in the .yml config that concatenates the columns together, delimited by `,`.
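A minimal sketch of what that severity-based config could look like (the test name and the thresholds here are hypothetical; the `severity` / `warn_if` / `error_if` keys are dbt's native config):

```yaml
# models/schema.yml -- test name and thresholds are hypothetical
models:
  - name: product_timeseries_metrics
    tests:
      - outlier_listing_count_check:
          config:
            severity: error
            warn_if: ">0"
            error_if: ">20"
```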

@andrew-sha andrew-sha closed this Dec 6, 2023
@andrew-sha andrew-sha reopened this Dec 6, 2023
@andrew-sha
Member Author

> From a design standpoint, I have some ideas.
>
> Let's go ahead and calculate the stddev and 30d lookbacks in the model itself, i.e. product_timeseries_metrics. Let's also calculate the number of listings above or below the stddev on each calendar_date. Then we can configure a native test in the .yml with severity conditions, e.g.
>
> https://docs.getdbt.com/reference/resource-configs/severity
>
> This would solve the issue you mentioned earlier where you can't partition by the combination of groups. What's more, we could then make these test assertions without window clauses in the tests, and we'd get these useful statistics as part of our analytics, which I was interested in releasing anyway.
>
> To solve your issue with more of a hack, you could just change the param partition_column_name to partition_clause and generate a Jinja expression in the .yml config that concatenates the columns together, delimited by `,`.

I'm not sure how a severity test would work in this case. A severity config lets you grade a failure based on the number of failed rows--i.e. if five rows fail the test you can throw a warning, but if more than five fail, the entire test errors. Importantly, I think you still need to define a custom test that retrieves those failing rows in the first place. If you want each row in product_timeseries_metrics to have a column counting the number of outlier listings, I think we could use the out-of-the-box accepted_values test--i.e. we just put an accepted_values condition with values: [0] on that new column in the .yml.
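A sketch of that accepted_values config, assuming the new column is named num_outlier_listings (hypothetical name):

```yaml
models:
  - name: product_timeseries_metrics
    columns:
      - name: num_outlier_listings   # hypothetical column name
        tests:
          - accepted_values:
              values: [0]
              quote: false   # compare 0 as a number, not a string
```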

@thecartercodes
Member

> I'm not sure how a severity test would work in this case. A severity config lets you grade a failure based on the number of failed rows--i.e. if five rows fail the test you can throw a warning, but if more than five fail, the entire test errors. Importantly, I think you still need to define a custom test that retrieves those failing rows in the first place. If you want each row in product_timeseries_metrics to have a column counting the number of outlier listings, I think we could use the out-of-the-box accepted_values test--i.e. we just put an accepted_values condition with values: [0] on that new column in the .yml.

Yeah, it would be a custom test with a severity condition.

@andrew-sha
Member Author

product_timeseries_metrics now has a column that counts the number of listings beyond one stddev of the mean of the price listings within the corresponding geographic region. I didn't add the severity test yet because in principle I'm a bit unsure what it achieves. The number of days/regions that have some large number of outlier listings doesn't seem like something that should be determined in a test, since I don't think the existence of a lot of outlier listings necessarily indicates a failure in the tech. If anything, that might be useful information for the consumer and thus belong outside a test?
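Roughly, the new column is computed along these lines (an illustrative sketch with assumed table, column, and ref names, not the actual model code):

```sql
-- illustrative sketch only; names are assumptions
with stats as (
    select
        *
        , avg(price) over (partition by calendar_date, region_code) as mean_price
        , stddev(price) over (partition by calendar_date, region_code) as stddev_price
    from {{ ref('product_listings') }}  -- hypothetical upstream model
)
select
    calendar_date
    , region_code
    -- count listings more than one stddev from the regional mean
    , count(*) filter (
        where abs(price - mean_price) > stddev_price
      ) as sum_outside_one_stddev
from stats
group by 1, 2
```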

@thecartercodes
Member

It'll be useful in a test. To your point though, the word "test" is probably better substituted with "audit". Specifically, the audit should check the % of outliers relative to total product listings, which should also be in the metrics table. Importantly, lines 46-50 in product_timeseries_metrics still have the issue we found in an earlier pair-programming session: we can't use the grouping set's resulting conditions (i.e. it's grouped by region_code when that column is not null) as part of our case condition for collapsing the residues into a single statistic.

```sql
    , plh.price
    , abs(price - avg(price) over (partition by acd.val, product_id, region_code, census_division_id, currency, unit)) as residue_cd
    , abs(price - avg(price) over (partition by acd.val, product_id, region_code, currency, unit)) as residue_re
    , abs(price - avg(price) over (partition by acd.val, product_id, currency, unit)) as residue_ca
```
@thecartercodes

What's the reason for directly calculating the residuals here?

@andrew-sha

The residuals have to be calculated using those windows, but the average prices need to be calculated using grouping sets. My understanding is that combining windows and grouping sets in a single query requires a bit of finesse, so the logically simplest approach was to compute the windows in a subquery that the next subquery can easily refer to via those case expressions. I suspect there are a few other ways to do it though.
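The structure being described is roughly the following (table and column names are illustrative only):

```sql
-- illustrative structure only
with windowed as (
    -- window functions first, in their own subquery
    select
        calendar_date
        , region_code
        , census_division_id
        , abs(price - avg(price) over (partition by calendar_date, region_code)) as residue_re
    from listings
)
-- then aggregate the precomputed residues with grouping sets
select
    region_code
    , census_division_id
    , avg(residue_re) as avg_residue
from windowed
group by grouping sets ((region_code, census_division_id), (region_code), ())
```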

@andrew-sha
Member Author

> It'll be useful in a test. To your point though, the word "test" is probably better substituted with "audit". Specifically, the audit should check the % of outliers relative to total product listings, which should also be in the metrics table. Importantly, lines 46-50 in product_timeseries_metrics still have the issue we found in an earlier pair-programming session: we can't use the grouping set's resulting conditions (i.e. it's grouped by region_code when that column is not null) as part of our case condition for collapsing the residues into a single statistic.

What exactly is the issue with lines 46-50? The query seemed to execute fine, and I believe the order of execution in a SQL query does grouping before select, so shouldn't those case conditions accurately handle the three different levels of geographic granularity?

@thecartercodes
Member

> What exactly is the issue with lines 46-50? The query seemed to execute fine, and I believe the order of execution in a SQL query does grouping before select, so shouldn't those case conditions accurately handle the three different levels of geographic granularity?

Yeah, you're right. I didn't think the grouping set's returned results were what the case expressions in lines 46-50 are evaluated against, but they are. This snippet also demonstrates the behavior:

```sql
with data as (
    select
      *
    from (
        values
            ('ON', 'Div 1', 1)
            , ('ON', 'Div 2', 2)
            , ('ON', 'Div 3', 2)
    ) as t (region_code, census_division, metric)
)
select
  region_code
  , census_division
  , sum(metric)
  , case
      when region_code is null then 'Region Code is null'
      else 'Region Code is not null'
    end as is_region_code_null
  , case
      when census_division is null then 'Census Division is null'
      else 'Census Division is not null'
    end as is_census_division_null
from data
group by grouping sets ((1,2), (1), ())
```
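For the record, the rolled-up rows produced by the (1) and () grouping sets come back with null census_division (and null region_code for the grand total), so the case expressions in the select list do see the grouping-set nulls. The expected result (row order not guaranteed) looks like:

```text
 region_code | census_division | sum | region_code check       | census_division check
 ON          | Div 1           |  1  | Region Code is not null | Census Division is not null
 ON          | Div 2           |  2  | Region Code is not null | Census Division is not null
 ON          | Div 3           |  2  | Region Code is not null | Census Division is not null
 ON          | (null)          |  5  | Region Code is not null | Census Division is null
 (null)      | (null)          |  5  | Region Code is null     | Census Division is null
```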

@thecartercodes left a comment

lgtm

```yaml
- accepted_values:
    values: ['NL', 'PE', 'NS', 'NB', 'QC', 'ON', 'MB', 'SK', 'AB', 'BC', 'YT', 'NT', 'NU']
```
```yaml
- name: sum_outside_one_stddev
```
@thecartercodes

Suggested change:

```diff
- - name: sum_outside_one_stddev
+ - name: sum_listings_outside_one_stddev
```
