Transaction Prediction Models

The following instructions describe how to generate and use H-Store’s Markov models for a benchmark workload. These models allow H-Store’s TransactionEstimator component to predict what transactions will do before they execute and then apply the correct optimizations automatically.

Need to discuss difference between global and clustered models…

Generating Models

The models are specific to the number of partitions in the cluster, so you need to generate new models for each new cluster configuration.

We will first create a cluster for the TPC-C benchmark with two partitions:

ant hstore-prepare -Dproject=tpcc -Dhosts=localhost:0:0-1
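The -Dhosts values used on this page appear to follow a host:site:partition-range pattern (for instance, localhost:0:0-5 is later paired with a six-partition model file). Assuming that reading is correct, the sketch below builds the value for a single-host, single-site cluster of any size. The function name hosts_spec is hypothetical and not part of H-Store, and the script only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of H-Store): build a -Dhosts value for one
# host and one site with N partitions, assuming the host:site:first-last
# pattern inferred from the examples on this page.
hosts_spec() {
    n=$1
    host=${2:-localhost}
    if [ "$n" -le 1 ]; then
        # A single partition needs no range.
        echo "${host}:0:0"
    else
        echo "${host}:0:0-$((n - 1))"
    fi
}

# Print (rather than run) the prepare command for a two-partition cluster.
echo "ant hstore-prepare -Dproject=tpcc -Dhosts=$(hosts_spec 2)"
```

Printing the command first makes it easy to confirm the partition range before any cluster configuration is rewritten.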

You can then construct a Markov model file from a workload trace with the markov-generate command:

ant markov-generate -Dproject=tpcc \
    -Dworkload=files/workloads/tpcc.8p-1.trace.gz \
    -Dglobal=false

The following loop creates Markov models for multiple cluster sizes and then gzips each one:

export benchmark=seats
# Generate and compress one Markov file per cluster size.
for partitions in 8 16 32; do
  ant hstore-prepare markov-generate \
       -Dproject=${benchmark} \
       -Dhosts=localhost:0:0-$((partitions - 1)) \
       -Dglobal=false \
       -Dworkload=${benchmark}-combined.trace.gz \
       -Doutput=files/markovs/vldb-june2013/${benchmark}-${partitions}p.markov || break
  gzip -v --force --best files/markovs/vldb-june2013/${benchmark}-${partitions}p.markov
done
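After a batch run like this, it can be useful to confirm that every expected archive exists and is intact. The following is a minimal sketch, assuming the <benchmark>-<N>p.markov.gz naming used above; check_markovs is a hypothetical helper, not an H-Store tool, and it relies only on gzip's standard -t integrity test.

```shell
#!/bin/sh
# Hypothetical helper: confirm that a gzipped Markov file exists for every
# cluster size and passes gzip's integrity test. Returns non-zero on the
# first missing or corrupt file.
check_markovs() {
    dir=$1
    benchmark=$2
    shift 2                      # remaining arguments are partition counts
    for partitions in "$@"; do
        f="${dir}/${benchmark}-${partitions}p.markov.gz"
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
        if ! gzip -t "$f" 2>/dev/null; then
            echo "corrupt: $f" >&2
            return 1
        fi
    done
    echo "ok: $# file(s) verified"
}
```

For the loop above, the call would be check_markovs files/markovs/vldb-june2013 seats 8 16 32.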

Executing Benchmarks using Prediction Models

The H-Store supplemental files repository contains several precomputed Markov models for the built-in benchmarks. To start the H-Store cluster using a set of Markov models, execute the hstore-benchmark target with the site.markov_enable and site.markov_path parameters:

ant hstore-prepare -Dproject=tpcc -Dhosts=localhost:0:0-5
ant hstore-benchmark -Dproject=tpcc \
    -Dsite.markov_enable=true \
    -Dsite.markov_path=files/markovs/vldb-august2012/tpcc-6p.markov.gz

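The precomputed files referenced on this page follow a <benchmark>-<N>p.markov.gz naming pattern. Assuming that pattern holds for the other files in the same directory (this page does not confirm it), the sketch below derives the path for a benchmark and partition count and prints the benchmark command only when the file is present. Both markov_path and benchmark_cmd are hypothetical helpers, and the default directory is simply the one used in the examples here.

```shell
#!/bin/sh
# Hypothetical helper: derive the model path for a benchmark and partition
# count, assuming the <benchmark>-<N>p.markov.gz naming seen on this page.
# The optional third argument overrides the directory.
markov_path() {
    echo "${3:-files/markovs/vldb-august2012}/${1}-${2}p.markov.gz"
}

# Hypothetical helper: print the benchmark command only if the model file
# exists, so a mistyped partition count is caught before the cluster starts.
benchmark_cmd() {
    path=$(markov_path "$1" "$2" "$3")
    if [ ! -f "$path" ]; then
        echo "no Markov file at $path" >&2
        return 1
    fi
    echo "ant hstore-benchmark -Dproject=$1 -Dsite.markov_enable=true -Dsite.markov_path=$path"
}
```

Because the models are specific to the partition count, a check like this guards against pairing a cluster with a model built for a different size.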
You can also have the BenchmarkController recompute the probabilities for all of the Markov models at each partition and save them to a file after a benchmark run by using the markov.recompute_end option. Note that the site.markov_singlep_updates and site.markov_dtxn_updates parameters should be set to true so that the vertices in each model are updated as transactions execute:

ant hstore-benchmark -Dproject=tpcc \
    -Dsite.markov_enable=true \
    -Dsite.markov_path=files/markovs/vldb-august2012/tpcc-6p.markov.gz \
    -Dsite.markov_singlep_updates=true \
    -Dsite.markov_dtxn_updates=true \
    -Dmarkov.recompute_end=true

Extracting & Visualizing Markov Graphs

You can extract individual Markov models and generate a Graphviz-compatible file of each model’s structure using the markov-graphviz command. This writes one file per Procedure to the directory defined by global.temp_dir. The procedure parameter is a comma-separated list of the Procedures that you want to extract. The partition parameter selects which partition’s Markov models to retrieve (when using a non-global Markov set).

ant markov-graphviz -Dproject=smallbank \
    -Dmarkov=files/markovs/vldb-august2012/smallbank-2p.markov.gz \
    -Dprocedure="SendPayment,Amalgamate"

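Use Graphviz’s dot tool to generate a PNG (or any other image format) from an extracted Markov file. Pass the path of the file written under global.temp_dir as the final argument; without it, dot reads from standard input:

dot -Tpng -o SendPayment.png PATH_TO_EXTRACTED_FILE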
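Since markov-graphviz writes one file per Procedure, rendering can be done in a loop. The sketch below assumes the extracted files end in .dot, which this page does not confirm, so the suffix can be overridden. render_graphs and the DOT_BIN variable are hypothetical names introduced for illustration; the only Graphviz options used are the standard -Tpng and -o.

```shell
#!/bin/sh
# Hypothetical helper: render every Graphviz file in a directory to PNG.
# Assumes the extracted files end in ".dot"; pass a different suffix as the
# second argument if they do not. DOT_BIN (hypothetical) overrides the
# renderer, which defaults to Graphviz's dot.
render_graphs() {
    dir=$1
    suffix=${2:-.dot}
    count=0
    for src in "$dir"/*"$suffix"; do
        [ -f "$src" ] || continue            # glob matched nothing
        out="${src%"$suffix"}.png"           # SendPayment.dot -> SendPayment.png
        "${DOT_BIN:-dot}" -Tpng -o "$out" "$src" || return 1
        count=$((count + 1))
    done
    echo "rendered $count graph(s)"
}
```

Called with the directory configured as global.temp_dir, this writes each PNG next to its source file.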

Additional Information

See the 2011 VLDB paper for a more thorough discussion of this research.

  • A. Pavlo, E. P. C. Jones, and S. Zdonik, "On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems," Proc. VLDB Endow., vol. 5, no. 2, pp. 85-96, Oct. 2011.