Batch Planner

When a transaction submits a query batch for execution, H-Store uses its Batch Planner component to create an execution schedule. This schedule specifies what query plans to execute at the different Partition Executors and the order in which they are executed. To do this, the Batch Planner examines each query in the batch and determines the location of the data that they need to access execute on based on the database’s design specification.

The first case is any INSERT, UPDATE, or DELETE query on a replicate table is always broadcast to all of the partitions in the cluster. This is necessary to ensure that each copy of the table is synchronized.

For INSERT queries on non-replicated tables, the Batch Planner simply examines the values of the input parameters that correspond to the target table’s partitioning columns.

For SELECT, UPDATE, and DELETE queries, the Batch Planner examines the query’s WHERE clause to determine what partitions are needed. Regardless of whether a transaction is executing as a single-partition transaction or not, the Batch Planner assumes that queries are all single-partitioned. It selects the the single-partition plan for each query and computes what partitions the operators in the plan touch. If the Batch Planner determines that the query needs to access multiple partitions, then it recomputes the partitions using the query’s multi-partition plan.

The Batch Planner extracts the expression tree for each scan operator (e.g., index look-up, sequential scan) in a query plan. Since each operator only accesses a single table at a time, the Batch Planner traverses the expression tree to find predicates on the target table’s partitioning column(s). Each predicate is a pair where the first element is partitioning column and the second element is either (1) a query input parameter, (2) a constant value, or (3) another column (from the same table or a different table in case of a join). If second elemnt is an input parameter or a constant value, the Batch Planner hashes its value to determine what partition the predicate references. But if the second element of a predicate pair is another column, then the Batch Planner recursively searches the other predicates to find another predicate pair that references an input parameter or a constant value.

For more information, see the documentation on Query Planning & Execution.