Skip to content
Snippets Groups Projects
Unverified Commit 6b8ddc84 authored by dpsutton's avatar dpsutton Committed by GitHub
Browse files

Semantic types 2 effective type (#15022)


* First pass using coercions

* Coercions

* Handle effective_type coercion_strategy in test data sets

* special-type -> semantic type in sample db

```clojure
user> (def config (metabase.db.spec/h2 {:db (str "/Users/dan/projects/clojure/metabase/resources/sample-dataset.db"
                                                 ";UNDO_LOG=0;CACHE_SIZE=131072;QUERY_CACHE_SIZE=128;COMPRESS=TRUE;"
                                                 "MULTI_THREADED=TRUE;MVCC=TRUE;DEFRAG_ALWAYS=TRUE;MAX_COMPACT_TIME=5000;"
                                                 "ANALYZE_AUTO=100")}))
user> (jdbc/execute! config ["UPDATE _metabase_metadata
                        SET keypath = 'PEOPLE.ZIP.semantic_type'
                        WHERE keypath = 'PEOPLE.ZIP.special_type'" ])
[1]
user> (jdbc/execute! config ["UPDATE _metabase_metadata
                        SET keypath = 'REVIEWS.BODY.semantic_type'
                        WHERE keypath = 'REVIEWS.BODY.special_type'" ])
[1]
```

* Correct mismatch in validation preventing sync

* fixing up alternative date tests

* More passing tests

* Tests for values, nested queries, fetch metadata

* tests

* tests passing

* Fixup mongo qp for coercions

locally i have some failing tests that are off by 1 errors:

Fail in compile-time-interval-test

[36m:mongo[0m Make sure time-intervals work the way they're supposed to. [:time-interval $date -4 :month] should give us something like Oct 01 2020 - Feb 01 2021 if today is Feb 17 2021

expected: [{$match {$and [{:$expr {$gte [$date {:$dateFromString {:dateString 2020-10-01T00:00Z}}]}} {:$expr {$lt [$date {:$dateFromString {:dateString 2021-02-01T00:00Z}}]}}]}} {$group {_id {date~~~day {:$let {:vars {:parts {:$dateToParts {:date $date}}}, :in {:$dateFromParts {:year $$parts.year, :month $$parts.month, :day $$parts.day}}}}}}} {$sort {_id 1}} {$project {_id false, date~~~day $_id.date~~~day}} {$sort {date~~~day 1}} {$limit 1048576}]

  actual: [{"$match"
            {"$and"
             [{:$expr {"$gte" ["$date" {:$dateFromString {:dateString "2020-11-01T00:00Z"}}]}}
              {:$expr {"$lt" ["$date" {:$dateFromString {:dateString "2021-03-01T00:00Z"}}]}}]}}
           {"$group"
            {"_id"
             {"date~~~day"
              {:$let
               {:vars {:parts {:$dateToParts {:date "$date"}}},
                :in {:$dateFromParts {:year "$$parts.year", :month "$$parts.month", :day "$$parts.day"}}}}}}}
           {"$sort" {"_id" 1}}
           {"$project" {"_id" false, "date~~~day" "$_id.date~~~day"}}
           {"$sort" {"date~~~day" 1}}
           {"$limit" 1048576}]
    diff: - [{"$match"
              {"$and"
               [{:$expr {"$gte" [nil {:$dateFromString {:dateString "2020-10-01T00:00Z"}}]}}
                {:$expr {"$lt" [nil {:$dateFromString {:dateString "2021-02-01T00:00Z"}}]}}]}}]
          + [{"$match"
              {"$and"
               [{:$expr {"$gte" [nil {:$dateFromString {:dateString "2020-11-01T00:00Z"}}]}}
                {:$expr {"$lt" [nil {:$dateFromString {:dateString "2021-03-01T00:00Z"}}]}}]}}]

* ee fixes

* UI to set coercion type

* Don't need to populate effective-type here

it actually has knock on effects:
- does more work now as almost every field has an update to do in
`add-extra-metadata`
- we have databases that have state that we don't create. druid for
example has stuff to mimic the dataset in tqpt/with-flattened-dbdef on
checkins but we don't actually create this. And our dbdef has a field
called "date" that is not present in the druid db, so if we attempt to
add metadata it fails and kills the rest of the metadata that we add.
- tests need this metadata to be present and the error causes field
visibilities (for example) to not be set

* Docstrings on shared lib

* Add effective and coercion to redshift expectations

* Fixup google analytics

* Derecordize instead of recordize the expectation

object details didn't work out well here. they added way more stuff
from the db than what is flowing through here.

```clojure
  actual: {:field
           {:name "DATE",
            :parent_id nil,
            :table_id 69,
            :base_type :type/Date,
            :effective_type :type/Date,
            :coercion_strategy nil,
            :semantic_type nil},
           :value {:type :date/all-options, :value "past5days"}}
    diff: - {:field
             {:description nil,
              :database_type "VARCHAR",
              :fingerprint_version 0,
              :has_field_values nil,
              :settings nil,
              :caveats nil,
              :fk_target_field_id nil,
              :custom_position 0,
              :active true,
              :last_analyzed nil,
              :position 1,
              :visibility_type :normal,
              :preview_display true,
              :database_position 0,
              :fingerprint nil,
              :points_of_interest nil}}
```

Object defaults adds quite a bit of stuff such that we'd be dissoc'ing
more than we are currently adding in

Co-authored-by: default avatarCam Saul <1455846+camsaul@users.noreply.github.com>
parent 6b4930fc
Branches
Tags
No related merge requests found
Showing
with 244 additions and 58 deletions
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment