    Fix deadlock in pivot table connection management (#22981) · a15fc4ea
    dpsutton authored
    Addresses part of https://github.com/metabase/metabase/issues/8679
    
Pivot tables can have subqueries that run to create tallies. We do not
hold entire result sets in memory, so we have a bit of an inversion of
control flow: connections are opened, queries run, result sets are
transduced, and then the connection is closed.

The error here was that subsequent queries for the pivot were run while
the first connection was still held open, even though that connection
was no longer needed. Enough pivots running at the same time in a
dashboard can then create a deadlock: the subqueries need a new
connection, but the main queries' connections cannot be released until
the subqueries have completed.
    
Also, rf management is critical. Its completion arity must be called
once and only once. We also have middleware that needs to be composed
on every use (format, etc.) and middleware that can only be composed
once (limit). We have to save the original reducing function before
composition (this is the one that can write to the download writer,
etc.) but compose it each time we use it with `(rff metadata)` so we
get the format and other middleware. Keeping this distinction in mind
will save you lots of time. (The limit middleware will ignore all
subsequent rows if you just grab the output of `(rff metadata)` and not
the rf returned from the `:rff` key on the context.)
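That parenthetical can be demonstrated with plain `clojure.core/take`,
whose transducer keeps its countdown in a volatile inside the composed
rf. This is a simplified sketch, not Metabase's code: the real `rff`
takes metadata and composes the format middleware as well.

```clojure
;; Compose once and reuse: take's internal counter is spent after the
;; first reduction, so later reductions silently drop every row.
(def stale-rf ((take 2) conj))

(reduce stale-rf [] [1 2 3])   ;; => [1 2]
(reduce stale-rf [] [4 5 6])   ;; => [] -- the limit was already hit

;; Compose freshly per use (the rf-factory, "rff", pattern): each call
;; builds a new rf with its own counter, so every query gets its limit.
(defn rff [] ((take 2) conj))

(reduce (rff) [] [1 2 3])      ;; => [1 2]
(reduce (rff) [] [4 5 6])      ;; => [4 5]
```

This is why the code must hold on to the factory (and the original,
uncomposed rf) rather than a single composed rf.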
    
This fix takes the following connection management:
    
    ```
    tap> "OPENING CONNECTION 0"
    tap> "already open: "
      tap> "OPENING CONNECTION 1"
      tap> "already open: 0"
      tap> "CLOSING CONNECTION 1"
      tap> "OPENING CONNECTION 2"
      tap> "already open: 0"
      tap> "CLOSING CONNECTION 2"
      tap> "OPENING CONNECTION 3"
      tap> "already open: 0"
      tap> "CLOSING CONNECTION 3"
    tap> "CLOSING CONNECTION 0"
    ```
    
and properly sequences it so that connection 0 is closed before
connection 1 is opened.
    
It hijacks `executef` to instead pass that function into the `reducef`
step, so we can reduce multiple times and therefore control the
connections ourselves. Otherwise `reducef` happens "inside" of
`executef`, and the connection is only closed when `executef` returns.
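A minimal sketch of that inversion, with illustrative names and
signatures rather than Metabase's actual API: the reducing side drives
the executing function once per query, so each connection is closed
before the next one opens.

```clojure
(defn run-pivot
  "Run each query with its own short-lived connection. open-conn!,
  close-conn!, and run-query stand in for the real driver calls; rf is
  the reducing function; queries is the main query plus its subqueries."
  [open-conn! close-conn! run-query rf queries]
  (letfn [(executef [query f]
            ;; open, hand the rows to f, and close before returning, so
            ;; no two queries ever hold connections at the same time
            (let [conn (open-conn! query)]
              (try
                (f (run-query conn query))
                (finally (close-conn! conn)))))]
    ;; the "reducef" side calls executef once per query, threading one
    ;; accumulator through all of them
    (reduce (fn [acc query]
              (executef query (fn [rows] (reduce rf acc rows))))
            (rf)
            queries)))
```

Because `executef` returns before the next query starts, each
`CLOSING CONNECTION n` would precede `OPENING CONNECTION n+1` in the
tap output.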
    
Care is taken to ensure that:
- the init arity is only called once: subsequent queries have the init
of the rf overridden to just return `init` (the acc passed in) rather
than calling `(rf)`
- the completion arity is only called once: we use `(completing rf)`,
and the reducing function in the subsequent queries has a completion
arity of just `([acc] acc)` that does not call `(rf acc)`

Remember this applies only to the lowest reducing function; all of the
takes, formats, etc. _above_ it will still have their completion arity
called because we are using transduce. The completion arity is what
takes the volatile rows and row counts and actually nests them in the
`{:data {:rows []}}` structure. Without calling it once (and ONLY once)
you end up with no actual results; they just sit in memory.
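The two bullet points above can be sketched as follows (again
illustrative, not the actual middleware): one rf's init and completion
arities fire exactly once, while the inner per-query reductions use an
rf whose init returns the threaded accumulator and whose completion
arity is just `([acc] acc)`.

```clojure
(defn reduce-many
  "Reduce several row sources through one rf, threading a single
  accumulator. rf's init arity runs once (at the start) and its
  completion arity runs once (at the end), no matter how many inner
  transduces happen."
  [rf colls]
  (let [acc (reduce (fn [acc coll]
                      (transduce identity
                                 ;; init returns the threaded acc rather
                                 ;; than calling (rf); completion is just
                                 ;; ([a] a), not (rf a)
                                 (fn ([] acc) ([a] a) ([a x] (rf a x)))
                                 acc
                                 coll))
                    (rf)       ; init: once and only once
                    colls)]
    (rf acc)))                 ; completion: once and only once
```

With an rf whose completion arity nests the accumulated rows under
`{:data {:rows ...}}`, that nesting then happens exactly once, at the
very end.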