Automatic Database Partitioning

We present a novel approach to automatically partitioning databases for enterprise-class OLTP systems that significantly extends the state of the art by: (1) minimizing the number of distributed transactions, while concurrently mitigating the effects of temporal skew in both the data distribution and accesses, (2) extending the design space to include replicated secondary indexes, (3) organically handling stored procedure routing, and (4) scaling with schema complexity, data size, and number of partitions. This effort builds on two key technical contributions: an analytical cost model that can be used to quickly estimate the relative coordination cost and skew for a given workload and a candidate database design, and an informed exploration of the huge solution space based on large neighborhood search. To evaluate our methods, we integrated our database design tool with a high-performance, parallel, main-memory DBMS and compared our methods against both popular heuristics and a state-of-the-art research prototype.
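As a rough illustration of how a cost model and a large neighborhood search can work together, the Python sketch below scores a candidate design by its fraction of distributed transactions plus a simple load-skew term, then repeatedly relaxes a few tables and re-optimizes them. This is a toy sketch and not the cost model or search procedure from the paper: the function names (`cost`, `large_neighborhood_search`), the skew formula, the integer modulo placement rule, and the two-table workload are all assumptions made for illustration.

```python
import itertools
import random
from collections import Counter


def partitions_touched(txn, design, num_partitions):
    """Return the set of partitions one transaction accesses under a design.

    Assumption: rows are placed by taking the integer value of the table's
    partitioning column modulo the number of partitions.
    """
    return {row[design[table]] % num_partitions for table, row in txn}


def cost(design, workload, num_partitions, skew_weight=1.0):
    """Toy cost: fraction of distributed transactions plus normalized skew."""
    load = Counter()
    distributed = 0
    for txn in workload:
        touched = partitions_touched(txn, design, num_partitions)
        distributed += len(touched) > 1   # multi-partition transaction
        load.update(touched)              # one unit of load per partition touched
    coordination = distributed / len(workload)
    # Skew: distance of the busiest partition from a perfectly even share,
    # scaled to [0, 1] (0 = even load, 1 = all load on one partition).
    even_share = 1.0 / num_partitions
    busiest_share = max(load.values()) / sum(load.values())
    skew = (busiest_share - even_share) / (1.0 - even_share)
    return coordination + skew_weight * skew


def large_neighborhood_search(candidates, workload, num_partitions,
                              iterations=20, max_relax=2, seed=0):
    """Relax a few tables at a time and exhaustively re-pick their columns."""
    rng = random.Random(seed)
    tables = sorted(candidates)
    best = {t: candidates[t][0] for t in tables}   # naive initial design
    best_cost = cost(best, workload, num_partitions)
    for i in range(iterations):
        # Cycle through neighborhood sizes 1..max_relax.
        k = min(len(tables), 1 + i % max_relax)
        relaxed = rng.sample(tables, k)
        for choice in itertools.product(*(candidates[t] for t in relaxed)):
            trial = {**best, **dict(zip(relaxed, choice))}
            trial_cost = cost(trial, workload, num_partitions)
            if trial_cost < best_cost:
                best, best_cost = trial, trial_cost
    return best, best_cost


# Hypothetical workload: each transaction reads one customer and two of
# that customer's orders.
workload = [
    [("customer", {"c_id": k, "c_zip": 7 * k + 3}),
     ("orders", {"o_id": 3 * k + 1, "o_c_id": k}),
     ("orders", {"o_id": 3 * k + 2, "o_c_id": k})]
    for k in range(8)
]
candidates = {"customer": ["c_zip", "c_id"], "orders": ["o_id", "o_c_id"]}

best, best_cost = large_neighborhood_search(candidates, workload, num_partitions=4)
print(best, best_cost)
```

On this toy workload the search settles on partitioning both tables by customer id, which keeps every transaction on a single partition and spreads load evenly. Relaxing only a subset of tables per step keeps each re-optimization small even when the full design space is too large to enumerate.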

Publications:

  • A. Pavlo, C. Curino, and S. Zdonik, "Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP Systems," in SIGMOD ’12: Proceedings of the 2012 international conference on Management of Data, 2012, pp. 61-72.
    @inproceedings{pavlo2012,
      author = {Pavlo, Andrew and Curino, Carlo and Zdonik, Stanley},
      title = {Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel {OLTP} Systems},
      booktitle = {SIGMOD '12: Proceedings of the 2012 international conference on Management of Data},
      year = {2012},
      isbn = {978-1-4503-1247-9},
      pages = {61--72},
      numpages = {12},
      url = {/papers/hstore-partitioning.pdf},
    }