Skip to content
Snippets Groups Projects
Unverified Commit d2c6d11c authored by Mark Bastian's avatar Mark Bastian Committed by GitHub
Browse files

Fixing bad pulses when bad params exist (#39136)

* Fixing bad pulses when bad params exist

Previously, when a dashboard parameter was removed that was depended upon by a subscription, the subscription went into a broken state that was unfixable (the removed param was still attached to the notification but not visible in the UI for editing). Rather than remove that parameter, which could potentially lead to data leaks, we just archive the pulse.

Now, when a dashboard is saved that would break a pulse/subscription, we archive the pulse and send an email the the dashboard and pulse creators to notify them that the pulse has been removed. The email also contains which params were removed as well as a list of who will no longer be receiving the messages.

One code consideration when writing the query logic to determine the blast radius of the bad pulse was whether to write a large join query then reorganize the results or do N+1 queries starting from the pulse id (The logic requires hitting all of pulse, pulse channel, pulse channel recipients, and user tables). I started with a full join query along the lines of what is shown below, but opted for the N+1 query as broken pulses should be a very rare occurrence so readability was preferred over performance.

```clojure
  (let [parameter-ids ["a3d043d5"]]
    (when (seq parameter-ids)
      (t2/query
        (sql/format
          {:with   [[:params {:select [[:id]
                                       [[:raw "json_array_elements(parameters::json) ->> 'id'"]
                                        :parameter_id]]
                              :from   :pulse}]
                    [:pulse_ids {:select-distinct [[:id :pulse_id]]
                                 :where           [:in :parameter_id parameter-ids]
                                 :from            :params}]]
           :select [[:p.id :pulse_id]
                    [:pc.id :pulse_channel_id]
                    [:pcr.id :pulse_channel_recipient_id]
                    [:creator.first_name :creator_first_name]
                    [:creator.last_name :creator_last_name]
                    [:creator.email :creator_email]
                    [:recipient.first_name :recipient_first_name]
                    [:recipient.last_name :recipient_last_name]
                    [:recipient.email :recipient_email]]
           :from   [[:pulse :p]]
           :join   [:pulse_ids [:= :p.id :pulse_ids.pulse_id]
                    [:pulse_channel :pc] [:= :pc.pulse_id :p.id]
                    [:pulse_channel_recipient :pcr] [:= :pcr.pulse_channel_id :pc.id]
                    ;; Get the pulse creator
                    [:core_user :creator] [:= :creator.id :p.creator_id]
                    ;; Get the pulse recipient
                    [:core_user :recipient] [:= :recipient.id :pcr.user_id]]}))))
```

* Fixing inconsistent test results by adding a timeout and expecting 2 messages.

* Fixing NPEs in tests, but not sure why we are not getting results sometimes.

* Trying to fix 500 in dashboard PUT

* Tests run ok locally against mongodb

* Fixing unit tests

* fixing template up

* fixing unit tests

* Adding refined tests to try to detect what is going on with failing tests

* Formatting

* Tracking down test failures and updating email template

* Removing db-specific query

* Simplifying broken pulse logic to not get all pulses twice.

* A couple small simplifications to the logic.

* Adding slack messaging

* Making pulse checks conditional on parameter updates
parent dd6916ef
No related branches found
No related tags found
No related merge requests found
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment