Commit 03373a33, authored by adam-james, committed by GitHub

Group Axes for Multi-series static viz (#26145)

* First pass at grouping axes

This PR uses the `results_metadata` key to look at the fingerprints of numerical columns and tries to determine
whether each series on the Y axis can be sanely represented on the same axis.

This is done by calculating an overlap (a value between 0 and 1) and grouping axes on the LEFT when the overlap
exceeds the threshold, which is also a value between 0 and 1. Ranges that do not overlap at all get an overlap of 0;
whenever there is SOME overlap between the ranges of two axes, it is calculated as:

```clojure
(/ (- (apply min maximums) (apply max minimums))   ; width of the shared region
   (- (apply max maximums) (apply min minimums)))  ; width of the combined range
```

This is done to try to catch situations where one column's range lies entirely inside the other's but is much
narrower (as measured by `(- max min)`); such a case yields a small overlap by the above calculation, which implies
that it might be better to split the axes.
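
The calculation and the grouping rule described above can be sketched on plain `[min max]` vectors. This is only an illustration, not the code from this commit: `range-overlap` and `group-ranges` are hypothetical names, the inputs are bare vectors rather than column metadata with fingerprints, and the guard against a zero-width range is an assumption added here. The numbers are taken from the docstring of `overlap` in the diff, and 0.33 is the value of `axis-group-threshold`.

```clojure
;; Hypothetical helpers for illustration; not part of the commit.

(defn range-overlap
  "Overlap of two [min max] ranges as a ratio between 0 and 1.
  Returns 0 when the ranges are disjoint (or when the combined width
  is zero, a guard that is an assumption of this sketch)."
  [[min-a max-a] [min-b max-b]]
  (let [overlap-width (- (min max-a max-b) (max min-a min-b))
        max-width     (- (max max-a max-b) (min min-a min-b))]
    (if (or (<= overlap-width 0) (zero? max-width))
      0
      (/ overlap-width max-width))))

(defn group-ranges
  "Compare every range against the first one: ranges whose overlap
  exceeds `threshold` go under :left, the rest under :right."
  [ranges threshold]
  (let [reference (first ranges)]
    (group-by #(if (> (range-overlap reference %) threshold) :left :right)
              ranges)))

;; Disjoint ranges: no overlap.
(range-overlap [0 43] [52 75])   ; => 0

;; Partially overlapping ranges: 35 shared out of a combined width of 59.
(range-overlap [0 43] [8 59])    ; => 35/59

;; One range nested inside the other gives the same ratio here.
(range-overlap [0 59] [8 43])    ; => 35/59

;; A narrow range nested inside a wide one scores low and would be split.
(range-overlap [0 100] [40 45])  ; => 1/20

(group-ranges [[0 43] [8 59] [52 75]] 0.33)
;; => {:left [[0 43] [8 59]], :right [[52 75]]}
```

Because every range is compared against the first one only, the first series always lands on the left axis in this sketch.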

* Address feedback.

* Fix shape of data in tests of 2 private fns

* Add test for split axes

* Fixed error in test util

* render-utils uses `with-redefs`, which disallows parallel tests. Makes sense
parent 622b5e6d
......@@ -72,9 +72,9 @@
(.write w html-str))
(.deleteOnExit tmp-file)
(open tmp-file)))
(comment
(render-card-to-png 1)
(render-card-to-png 1)
;; open viz in your browser
(-> [["A" "B"]
[1 2]
......
......@@ -512,14 +512,78 @@
[:img {:style (style/style {:display :block :width :100%})
:src (:image-src image-bundle)}]]}))
(defn- overlap
"calculate the overlap, a value between 0 and 1, of the ranges of 2 columns.
This overlap value can be checked against `axis-group-threshold` to determine when columns can reasonably share a y-axis.
Consider two ranges, with min and max values:
min-a = 0 max-a = 43
*-----------------------------------------*
min-b = 52 max-b = 75
*----------------------*
The overlap above is 0. The mirror case where col-b is entirely less than col-a also has 0 overlap.
Otherwise, overlap is calculated as follows:
min-a = 0 max-a = 43
*-----------------------------------------*
| min-b = 8 | max-b = 59
| *---------------------------------|---------------*
| | | |
| |- overlap-width = (- 43 8) = 35 -| |
| |
|--------- max-width = (- 59 0) = 59 ---------------------|
overlap = (/ overlap-width max-width) = (/ 35 59) = 0.59
Another scenario, with a similar result may look as follows:
min-a = 0 max-a = 59
*---------------------------------------------------------*
| min-b = 8 max-b = 43 |
| *---------------------------------* |
| | | |
| |- overlap-width = (- 43 8) = 35 -| |
| |
|--------- max-width = (- 59 0) = 59 ---------------------|
overlap = (/ overlap-width max-width) = (/ 35 59) = 0.59"
[col-a col-b]
(let [[min-a min-b] (map #(get-in % [:fingerprint :type :type/Number :min]) [col-a col-b])
[max-a max-b] (map #(get-in % [:fingerprint :type :type/Number :max]) [col-a col-b])
non-overlapping? (or (and (< max-a min-b) (< max-a max-b))
(and (> min-a min-b) (> min-a max-b)))]
(if non-overlapping?
0
(let [[a b c d] (sort [min-a min-b max-a max-b])
max-width (- d a)
overlap-width (- c b)]
(/ overlap-width max-width)))))
(defn- group-axes
[cols-meta group-threshold]
(let [cols-by-type (group-by (juxt :base_type :effective_type :semantic_type) cols-meta)]
(when (not= (count cols-by-type) (count cols-meta))
(let [num? (fn [[[col-type]]] (isa? col-type :type/Number))
{num-cols true
other-cols false} (-> (group-by num? cols-by-type)
(update-vals #(mapcat second %)))
first-axis (first num-cols)
grouped-num-cols (-> (group-by #(> (overlap first-axis %) group-threshold) num-cols)
(update-keys {true :left false :right}))]
(merge grouped-num-cols {:bottom other-cols})))))
(defn default-y-pos
"Default positions of the y-axes of multiple and combo graphs.
You kind of hope there's only two but here's for the eventuality"
[viz-settings]
[{viz-settings :viz-settings metadata :results_metadata} group-threshold]
(if (:stackable.stack_type viz-settings)
(repeat "left")
(conj (repeat "right")
"left")))
(let [grouped-axes (-> (group-axes (:columns metadata) group-threshold)
(update-vals count))]
(if (seq grouped-axes)
(mapcat (fn [k] (repeat (get grouped-axes k 1) (name k))) [:left :right])
(conj (repeat "right")
"left")))))
(def default-combo-chart-types
"Default chart type seq of combo graphs (not multiple graphs)."
......@@ -548,6 +612,8 @@
:width :100%})
:src (:image-src image-bundle)}]]})
(def ^:private axis-group-threshold 0.33)
(defn- render-multiple-lab-chart
"When multiple non-scalar cards are combined, render them as a line, area, or bar chart"
[render-type card dashcard {:keys [viz-settings] :as data}]
......@@ -570,7 +636,7 @@
colors (take (count multi-data) colors)
types (replace {:scalar :bar} (map :display cards))
settings (->ts-viz x-col y-col labels viz-settings)
y-pos (take (count names) (default-y-pos viz-settings))
y-pos (take (count names) (default-y-pos data axis-group-threshold))
series (join-series names colors types row-seqs y-pos)]
(attach-image-bundle (image-bundle/make-image-bundle render-type (js-svg/combo-chart series settings)))))
......@@ -614,7 +680,7 @@
(defn- single-x-axis-combo-series
"This munges rows and columns into series in the format that we want for combo staticviz for literal combo displaytype,
for a single x-axis with multiple y-axis."
[chart-type joined-rows _x-cols y-cols viz-settings]
[chart-type joined-rows _x-cols y-cols {:keys [viz-settings] :as data}]
(for [[idx y-col] (map-indexed vector y-cols)]
(let [y-col-key (keyword (:name y-col))
card-name (or (series-setting viz-settings y-col-key :name)
......@@ -627,7 +693,7 @@
(nth default-combo-chart-types idx))
selected-rows (mapv #(vector (ffirst %) (nth (second %) idx)) joined-rows)
y-axis-pos (or (series-setting viz-settings y-col-key :axis)
(nth (default-y-pos viz-settings) idx))]
(nth (default-y-pos data axis-group-threshold) idx))]
{:name card-name
:color card-color
:type card-type
......@@ -640,7 +706,7 @@
This mimics default behavior in JS viz, which is to group by the second dimension and make every group-by-value a series.
This can have really high cardinality of series but the JS viz will complain about more than 100 already"
[chart-type joined-rows _x-cols _y-cols viz-settings]
[chart-type joined-rows _x-cols _y-cols {:keys [viz-settings] :as data}]
(let [grouped-rows (group-by #(second (first %)) joined-rows)
groups (keys grouped-rows)]
(for [[idx group-key] (map-indexed vector groups)]
......@@ -655,7 +721,7 @@
chart-type
(nth default-combo-chart-types idx))
y-axis-pos (or (series-setting viz-settings group-key :axis)
(nth (default-y-pos viz-settings) idx))]
(nth (default-y-pos data axis-group-threshold) idx))]
{:name card-name
:color card-color
:type card-type
......@@ -681,8 +747,8 @@
chart-type)
;; NB: There's a hardcoded limit of arity 2 on x-axis, so there's only the 1-axis or 2-axis case
series (if (= (count x-cols) 1)
(single-x-axis-combo-series enforced-type joined-rows x-cols y-cols viz-settings)
(double-x-axis-combo-series enforced-type joined-rows x-cols y-cols viz-settings))
(single-x-axis-combo-series enforced-type joined-rows x-cols y-cols data)
(double-x-axis-combo-series enforced-type joined-rows x-cols y-cols data))
labels (combo-label-info x-cols y-cols viz-settings)
settings (->ts-viz (first x-cols) (first y-cols) labels viz-settings)]
......
......@@ -474,7 +474,7 @@
[[[10.0] [1]] [[5.0] [10]] [[1.25] [20]]]
[{:name "Price", :display_name "Price", :base_type :type/BigInteger, :semantic_type nil}]
[{:name "NumPurchased", :display_name "NumPurchased", :base_type :type/BigInteger, :semantic_type nil}]
{:series_settings {:NumPurchased {:color "#a7cf7b"}}}))))
{:viz-settings {:series_settings {:NumPurchased {:color "#a7cf7b"}}}}))))
(testing "Check if double x-axis combo series can convert colors"
(is (= [{:name "Bob", :color "#c5a9cf", :type "line", :data [[10.0 123]], :yAxisPosition "left"}
{:name "Dobbs", :color "#a7cf7b", :type "bar", :data [[5.0 12]], :yAxisPosition "right"}
......@@ -486,10 +486,10 @@
[{:base_type :type/BigInteger, :display_name "Price", :name "Price", :semantic_type nil}
{:base_type :type/BigInteger, :display_name "NumPurchased", :name "NumPurchased", :semantic_type nil}]
[{:base_type :type/BigInteger, :display_name "NumKazoos", :name "NumKazoos", :semantic_type nil}]
{:series_settings {:Bob {:color "#c5a9cf"}
:Dobbs {:color "#a7cf7b"}
:Robbs {:color "#34517d"}
:Mobbs {:color "#e0be40"}}})))))
{:viz-settings {:series_settings {:Bob {:color "#c5a9cf"}
:Dobbs {:color "#a7cf7b"}
:Robbs {:color "#34517d"}
:Mobbs {:color "#e0be40"}}}})))))
(deftest series-with-custom-names-test
(testing "Check if single x-axis combo series uses custom series names (#21503)"
......@@ -501,8 +501,9 @@
[{:name "Price", :display_name "Price", :base_type :type/Number}]
[{:name "NumPurchased", :display_name "NumPurchased", :base_type :type/Number}
{:name "NumSold", :display_name "NumSold", :base_type :type/Number}]
{:series_settings {:NumPurchased {:color "#a7cf7b" :title "Bought"}
:NumSold {:color "#a7cf7b" :title "Sold"}}}))))))
{:viz-settings
{:series_settings {:NumPurchased {:color "#a7cf7b" :title "Bought"}
:NumSold {:color "#a7cf7b" :title "Sold"}}}}))))))
(testing "Check if double x-axis combo series uses custom series names (#21503)"
(is (= #{"Bobby" "Dobby" "Robby" "Mobby"}
(set (map :name
......@@ -512,10 +513,11 @@
[{:base_type :type/BigInteger, :display_name "Price", :name "Price", :semantic_type nil}
{:base_type :type/BigInteger, :display_name "NumPurchased", :name "NumPurchased", :semantic_type nil}]
[{:base_type :type/BigInteger, :display_name "NumKazoos", :name "NumKazoos", :semantic_type nil}]
{:series_settings {:Bob {:color "#c5a9cf" :title "Bobby"}
:Dobbs {:color "#a7cf7b" :title "Dobby"}
:Robbs {:color "#34517d" :title "Robby"}
:Mobbs {:color "#e0be40" :title "Mobby"}}})))))))
{:viz-settings
{:series_settings {:Bob {:color "#c5a9cf" :title "Bobby"}
:Dobbs {:color "#a7cf7b" :title "Dobby"}
:Robbs {:color "#34517d" :title "Robby"}
:Mobbs {:color "#e0be40" :title "Mobby"}}}})))))))
(defn- render-waterfall [results]
(body/render :waterfall :inline pacific-tz render.tu/test-card nil results))
......@@ -656,6 +658,26 @@
nil "1,234,543.21%"
"" "1,234,543.21%"))
(deftest reasonable-split-axes-test
(let [rows [["Category" "Series A" "Series B"]
["A" 1 1.3]
["B" 2 1.9]
["C" 3 4 ]]
axes-split? (fn [rows]
(let [text (-> rows first last)]
;; there is always 1 node with the series name in the legend
;; so we see if the series name shows up a second time, which will
;; be the axis label, indicating that there is indeed a split
(< 1 (-> rows
(render.tu/make-viz-data :bar {})
:viz-tree
(render.tu/nodes-with-text text)
count))))]
(testing "Multiple series with close values does not split y-axis."
(is (not (axes-split? rows))))
(testing "Multiple series with far values does split y-axis."
(is (axes-split? (conj rows ["D" 3 70]))))))
(deftest ^:parallel x-and-y-axis-label-info-test
(let [x-col {:display_name "X col"}
y-col {:display_name "Y col"}]
......
......@@ -79,6 +79,25 @@
:unit :default
:base_type (guess-type col-sample)}))
(defn- fingerprint
[vals]
{:global {:distinct-count (-> vals distinct count) :nil% 0.0}
:type {:type/Number {:min (apply min vals)
:max (apply max vals)
:avg (double (/ (reduce + vals) (count vals)))}}})
(defn- base-results-metadata
"Create a basic metadata map for a column, which ends up in a vector at [:data :results_metadata :columns].
This mimics the shape of column-settings data returned from the query processor."
[idx col-name col-vals]
(let [ttype (guess-type (first col-vals))]
{:name col-name
:display_name col-name
:field_ref [:field idx {:base-type ttype}]
:base_type ttype
:effective_type ttype
:fingerprint (when (= :type/Number ttype) (fingerprint col-vals))}))
(defn base-viz-settings
[display-type rows]
(let [header-row (first rows)
......@@ -107,13 +126,18 @@
:pivot {}} display-type)))
(defn make-card-and-data
"Make a basic `card-and-data` map for a given `display-type` key. Useful for buildng up test viz data without the need for `viz-scenarios`."
[rows display-type]
{:card {:display display-type
:visualization_settings (base-viz-settings display-type rows)}
:data {:viz-settings {}
:cols (mapv base-cols-settings (range (count (first rows))) (first rows) (second rows))
:rows (vec (rest rows))}})
"Make a basic `card-and-data` map for a given `display-type` key. Useful for buildng up test viz data without the need for `viz-scenarios`.
The `rows` should be a vector of vectors, where the first row is the header row."
[header-and-rows display-type]
(let [[header & rows] header-and-rows
indices (range (count (first header-and-rows)))
cols (mapv (fn [idx] (mapv #(nth % idx) (vec rows))) indices)]
{:card {:display display-type
:visualization_settings (base-viz-settings display-type header-and-rows)}
:data {:viz-settings {}
:cols (mapv base-cols-settings indices header (first rows))
:rows (vec rows)
:results_metadata {:columns (mapv base-results-metadata indices header cols)}}}))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; validate-viz-scenarios
......