{"id":1185,"date":"2012-01-08T00:26:19","date_gmt":"2012-01-08T05:26:19","guid":{"rendered":"http:\/\/hstore.cs.brown.edu\/?page_id=1185"},"modified":"2012-03-14T11:50:04","modified_gmt":"2012-03-14T15:50:04","slug":"mapreduce","status":"publish","type":"page","link":"https:\/\/hstore.cs.brown.edu\/documentation\/deployment\/mapreduce\/","title":{"rendered":"MapReduce Transactions"},"content":{"rendered":"<p id=\"top\" \/><div class=\"prevpage\"><B>\u00ab<\/B> <a href=\"https:\/\/hstore.cs.brown.edu\/documentation\/deployment\/jvm-snapshots\/\" title=\"OLAP JVM Snapshots\">OLAP JVM Snapshots<\/a><\/div> <div class=\"nextpage\"><a href=\"https:\/\/hstore.cs.brown.edu\/documentation\/configuration\/\" title=\"Configuration\">Configuration<\/a> <B>\u00bb<\/B><\/div><div style=\"clear:both;\"><\/div><\/p>\n<p>This document describes how to use the new experimental MapReduce-style stored procedures to execute distributed, analytical (OLAP) queries in H-Store.<\/p>\n<p><a name=\"overview\"><\/a><\/p>\n<h2>Overview<\/h2>\n<p>MapReduce-style transactions allow H-Store to execute analytical queries that access the entire database without having to incur the cost of distributed transaction coordination. When executing normal distributed transactions (i.e., transactions that need to access multiple partitions), H-Store blocks other transactions that are running or waiting in the queue on other partitions and sends all the data it needs to the base partition. This has been shown to have a significant impact on the throughput of the whole system. With H-Store&#8217;s MapReduce transactions, the transaction is split into many single-partition transactions that execute independently and in parallel on all partitions. 
Although it is invoked as a distributed transaction that blocks all partitions, the <b>PartitionExecutor<\/b> will continue to execute non-MapReduce single-partition transactions, which gives better performance (see the Evaluation section).<\/p>\n<p>When a MapReduce transaction is invoked, the following sequence of phases occurs:<\/p>\n<ol>\n<li><b>Map:<\/b><br \/>\n    The transaction running at the base partition notifies all other partitions to start the Map phase. The <b>ExecutionSite<\/b> at each partition will then invoke a single-partition transaction that executes the <a href=\"#MapInputQuery\">MapInputQuery<\/a> to retrieve data from its local storage. These transactions run separately and do not need to coordinate with each other. The retrieved records are then passed to the <a href=\"#Map\">Map()<\/a> method of the MapReduce stored procedure.<\/p>\n<li><b>Shuffle:<\/b><br \/>\n    After the single-partition Map transaction finishes, it will automatically begin the shuffle phase in a separate, non-blocking thread. Rows in the <em>MapOutputTable<\/em> that share the same key (as defined in the MapReduce stored procedure) are sent to the same destination (i.e., partition), where they become the input of the <a href=\"#Reduce\">Reduce()<\/a> method.<\/p>\n<li><b>Reduce:<\/B><br \/>\n    After all the data has been sent to its destination, the transaction moves to the Reduce phase, where the transaction running at the base partition notifies all other partitions to begin the Reduce phase. This is similar to the Map phase and starts once the <em>ReduceInputTable<\/em> is ready. Tuples in the ReduceInputTable are sorted by the key, which is the first column by default. Tuples with the same key are then passed to the <em>Reduce<\/em> method of the MapReduce stored procedure. 
The output of the Reduce method at each partition is coalesced at the base partition for the transaction and sent back to the client.\n<\/ol>\n<p>Note that H-Store will automatically partition the shuffle data using the first column (the key) of the Map output table.<\/p>\n<p><a name=\"mapreduceAPI\"><\/a><\/p>\n<h2>MapReduce API<\/h2>\n<p>Just as with regular transactions in H-Store, all MapReduce transactions must be pre-defined as stored procedures in the benchmarks. To create a new MapReduce stored procedure, one must create a new Java class that extends the <a href=\"https:\/\/github.com\/apavlo\/h-store\/blob\/master\/src\/frontend\/org\/voltdb\/VoltMapReduceProcedure.java\" class=\"source-java\">VoltMapReduceProcedure<\/a> abstract class. Users must implement its two abstract methods, map and reduce, which are described below. Each implementation of <tt>VoltMapReduceProcedure<\/tt> needs to include the following <b>six<\/b> components:<\/p>\n<ol>\n    <a name=\"ProcInfo\"><\/a><\/p>\n<li><b>ProcInfo<\/b>:<br \/>\n    The MapInputQuery must be named in a ProcInfo annotation placed before the class definition. The system reads this name and matches it against the user-defined query described in the next item. It should be defined in the MapReduce stored procedure as follows:<\/p>\n\n<div class=\"wp_syntax\"><table><tr><td class=\"code\"><pre class=\"java\" style=\"font-family:monospace;\">@ProcInfo<span style=\"color: #009900;\">&#40;<\/span>\n    mapInputQuery <span style=\"color: #339933;\">=<\/span> <span style=\"color: #0000ff;\">&quot;mapInputQuery&quot;<\/span>\n<span style=\"color: #009900;\">&#41;<\/span><\/pre><\/td><\/tr><\/table><\/div>\n\n<\/li>\n<p>    <a name=\"MapInputQuery\"><\/a><\/p>\n<li><b>MapInputQuery<\/b>:<br \/>\n    This is the query that is executed when the transaction starts and provides the input data to the <a href=\"#Map\">Map<\/a> method. The input parameters of the transaction are passed as the input parameters to this query. 
This query is always executed as a local, single-partition query on each partition in the cluster that is executing the Map phase for the transaction. The MapInputQuery, which supplies the input data for the MapReduce job, must not be null, and users should write it as a SQL query as follows:<\/p>\n\n<div class=\"wp_syntax\"><table><tr><td class=\"code\"><pre class=\"java\" style=\"font-family:monospace;\"><span style=\"color: #000000; font-weight: bold;\">public<\/span> SQLStmt mapInputQuery <span style=\"color: #339933;\">=<\/span> <span style=\"color: #000000; font-weight: bold;\">new<\/span> SQLStmt<span style=\"color: #009900;\">&#40;<\/span>\n    <span style=\"color: #0000ff;\">&quot;SELECT A_NAME, COUNT(*) FROM TABLEA WHERE A_AGE &gt;= ? GROUP BY A_NAME&quot;<\/span> <span style=\"color: #666666; font-style: italic;\">\/\/ the &quot;?&quot; is the input parameter<\/span>\n<span style=\"color: #009900;\">&#41;<\/span><span style=\"color: #339933;\">;<\/span><\/pre><\/td><\/tr><\/table><\/div>\n\n<\/li>\n<p>    <a name=\"MapOutputSchema\"><\/a><\/p>\n<li><b>VoltTable.ColumnInfo[] getMapOutputSchema()<\/b>:<br \/>\n    This defines the output table schema of the <a href=\"#Map\">Map<\/a> method, which is used as input to the <a href=\"#Reduce\">Reduce<\/a> method. By default, the data is automatically sent to a particular partition for the <a href=\"#Reduce\">Reduce<\/a> phase based on the hash value of the first column in the output table. 
The Map output table schema can be defined as follows:<\/p>\n\n<div class=\"wp_syntax\"><table><tr><td class=\"code\"><pre class=\"java\" style=\"font-family:monospace;\">@Override\n<span style=\"color: #000000; font-weight: bold;\">public<\/span> VoltTable.<span style=\"color: #006633;\">ColumnInfo<\/span><span style=\"color: #009900;\">&#91;<\/span><span style=\"color: #009900;\">&#93;<\/span> getMapOutputSchema<span style=\"color: #009900;\">&#40;<\/span><span style=\"color: #009900;\">&#41;<\/span> <span style=\"color: #009900;\">&#123;<\/span>\n    <span style=\"color: #000000; font-weight: bold;\">return<\/span> <span style=\"color: #000000; font-weight: bold;\">new<\/span> VoltTable.<span style=\"color: #006633;\">ColumnInfo<\/span><span style=\"color: #009900;\">&#91;<\/span><span style=\"color: #009900;\">&#93;<\/span><span style=\"color: #009900;\">&#123;<\/span>\n        <span style=\"color: #666666; font-style: italic;\">\/\/ this is the key that is hashed by default; the key type here is String<\/span>\n        <span style=\"color: #000000; font-weight: bold;\">new<\/span> VoltTable.<span style=\"color: #006633;\">ColumnInfo<\/span><span style=\"color: #009900;\">&#40;<\/span><span style=\"color: #0000ff;\">&quot;NAME&quot;<\/span>, VoltType.<span style=\"color: #006633;\">STRING<\/span><span style=\"color: #009900;\">&#41;<\/span>,\n        <span style=\"color: #000000; font-weight: bold;\">new<\/span> VoltTable.<span style=\"color: #006633;\">ColumnInfo<\/span><span style=\"color: #009900;\">&#40;<\/span><span style=\"color: #0000ff;\">&quot;COUNTER&quot;<\/span>, VoltType.<span style=\"color: #006633;\">BIGINT<\/span><span style=\"color: #009900;\">&#41;<\/span>,\n    <span style=\"color: #009900;\">&#125;<\/span><span style=\"color: #339933;\">;<\/span>\n<span style=\"color: #009900;\">&#125;<\/span><\/pre><\/td><\/tr><\/table><\/div>\n\n<\/li>\n<p>    <a name=\"Map\"><\/a><\/p>\n<li><b>Map(VoltTableRow row)<\/b>:<br \/>\n    The <tt>Map()<\/tt> method is 
invoked for each record (i.e., tuple) returned by the <a href=\"#MapInputQuery\">MapInputQuery<\/a>. It performs some unit of processing on that row and can pass output for further processing by the <a href=\"#Reduce\">Reduce<\/a> method by invoking <tt>VoltMapReduceProcedure.mapEmit()<\/tt>. This output is passed to the <tt>Shuffle<\/tt> phase after <tt>Map<\/tt> finishes. The Map function for a name-counting MapReduce job can be defined as follows:<\/p>\n\n<div class=\"wp_syntax\"><table><tr><td class=\"code\"><pre class=\"java\" style=\"font-family:monospace;\">@Override\n<span style=\"color: #000000; font-weight: bold;\">public<\/span> <span style=\"color: #000066; font-weight: bold;\">void<\/span> map<span style=\"color: #009900;\">&#40;<\/span>VoltTableRow row<span style=\"color: #009900;\">&#41;<\/span> <span style=\"color: #009900;\">&#123;<\/span>\n    <span style=\"color: #003399;\">String<\/span> key <span style=\"color: #339933;\">=<\/span> row.<span style=\"color: #006633;\">getString<\/span><span style=\"color: #009900;\">&#40;<\/span><span style=\"color: #cc66cc;\">0<\/span><span style=\"color: #009900;\">&#41;<\/span><span style=\"color: #339933;\">;<\/span> <span style=\"color: #666666; font-style: italic;\">\/\/ get key from column 0 by default<\/span>\n    <span style=\"color: #003399;\">Object<\/span> new_row<span style=\"color: #009900;\">&#91;<\/span><span style=\"color: #009900;\">&#93;<\/span> <span style=\"color: #339933;\">=<\/span> <span style=\"color: #009900;\">&#123;<\/span> key, row.<span style=\"color: #006633;\">getLong<\/span><span style=\"color: #009900;\">&#40;<\/span><span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #009900;\">&#41;<\/span> <span style=\"color: #009900;\">&#125;<\/span><span style=\"color: #339933;\">;<\/span>  <span style=\"color: #666666; font-style: italic;\">\/\/ this row will be inserted into the mapOutput table<\/span>\n    <span style=\"color: #000000; font-weight: bold;\">this<\/span>.<span 
style=\"color: #006633;\">mapEmit<\/span><span style=\"color: #009900;\">&#40;<\/span>key, new_row<span style=\"color: #009900;\">&#41;<\/span><span style=\"color: #339933;\">;<\/span> <span style=\"color: #666666; font-style: italic;\">\/\/ emit the intermediate data into the mapOutput table<\/span>\n<span style=\"color: #009900;\">&#125;<\/span><\/pre><\/td><\/tr><\/table><\/div>\n\n<\/li>\n<p>    <a name=\"ReduceOutputSchema\"><\/a><\/p>\n<li><b>VoltTable.ColumnInfo[] getReduceOutputSchema()<\/b>:<br \/>\n    This is very similar to <a href=\"#MapOutputSchema\">MapOutputSchema<\/a>. It defines the output table schema of the <a href=\"#Reduce\">Reduce<\/a> method, which is sent back to the client. MapOutputSchema and ReduceOutputSchema can be very different in other cases, although they are the same in this example.<\/p>\n\n<div class=\"wp_syntax\"><table><tr><td class=\"code\"><pre class=\"java\" style=\"font-family:monospace;\">@Override\n<span style=\"color: #000000; font-weight: bold;\">public<\/span> VoltTable.<span style=\"color: #006633;\">ColumnInfo<\/span><span style=\"color: #009900;\">&#91;<\/span><span style=\"color: #009900;\">&#93;<\/span> getReduceOutputSchema<span style=\"color: #009900;\">&#40;<\/span><span style=\"color: #009900;\">&#41;<\/span> <span style=\"color: #009900;\">&#123;<\/span>\n    <span style=\"color: #000000; font-weight: bold;\">return<\/span> <span style=\"color: #000000; font-weight: bold;\">new<\/span> VoltTable.<span style=\"color: #006633;\">ColumnInfo<\/span><span style=\"color: #009900;\">&#91;<\/span><span style=\"color: #009900;\">&#93;<\/span><span style=\"color: #009900;\">&#123;<\/span>\n        <span style=\"color: #666666; font-style: italic;\">\/\/ The first column (the key) should be the same as in the MapOutputSchema<\/span>\n        <span style=\"color: #000000; font-weight: bold;\">new<\/span> VoltTable.<span style=\"color: #006633;\">ColumnInfo<\/span><span style=\"color: 
#009900;\">&#40;<\/span><span style=\"color: #0000ff;\">&quot;NAME&quot;<\/span>, VoltType.<span style=\"color: #006633;\">STRING<\/span><span style=\"color: #009900;\">&#41;<\/span>,\n        <span style=\"color: #000000; font-weight: bold;\">new<\/span> VoltTable.<span style=\"color: #006633;\">ColumnInfo<\/span><span style=\"color: #009900;\">&#40;<\/span><span style=\"color: #0000ff;\">&quot;COUNTER&quot;<\/span>, VoltType.<span style=\"color: #006633;\">BIGINT<\/span><span style=\"color: #009900;\">&#41;<\/span>,\n    <span style=\"color: #009900;\">&#125;<\/span><span style=\"color: #339933;\">;<\/span>\n<span style=\"color: #009900;\">&#125;<\/span><\/pre><\/td><\/tr><\/table><\/div>\n\n<\/li>\n<p>    <a name=\"Reduce\"><\/a><\/p>\n<li><b>Reduce(Key k, Iterator&lt;VoltTableRow&gt; row)<\/b>:<br \/>\n    The <tt>Reduce()<\/tt> method is invoked at each partition to process the output of the Shuffle phase. Users do not need to implement the Shuffle phase or the internal data transfer. The system prepares the input of the reduce function so that all tuples with the same key can be accessed through the Iterator. 
This Iterator implements the standard Java Iterator interface, but its remove() method does nothing.<\/p>\n\n<div class=\"wp_syntax\"><table><tr><td class=\"code\"><pre class=\"java\" style=\"font-family:monospace;\">@Override\n<span style=\"color: #000000; font-weight: bold;\">public<\/span> <span style=\"color: #000066; font-weight: bold;\">void<\/span> reduce<span style=\"color: #009900;\">&#40;<\/span><span style=\"color: #003399;\">String<\/span> key, Iterator<span style=\"color: #339933;\">&lt;<\/span>VoltTableRow<span style=\"color: #339933;\">&gt;<\/span> rows<span style=\"color: #009900;\">&#41;<\/span> <span style=\"color: #009900;\">&#123;<\/span>\n    <span style=\"color: #000066; font-weight: bold;\">long<\/span> count <span style=\"color: #339933;\">=<\/span> <span style=\"color: #cc66cc;\">0<\/span><span style=\"color: #339933;\">;<\/span>\n    <span style=\"color: #000000; font-weight: bold;\">for<\/span> <span style=\"color: #009900;\">&#40;<\/span>VoltTableRow r <span style=\"color: #339933;\">:<\/span> CollectionUtil.<span style=\"color: #006633;\">iterable<\/span><span style=\"color: #009900;\">&#40;<\/span>rows<span style=\"color: #009900;\">&#41;<\/span><span style=\"color: #009900;\">&#41;<\/span> <span style=\"color: #009900;\">&#123;<\/span>\n        count <span style=\"color: #339933;\">+=<\/span> <span style=\"color: #009900;\">&#40;<\/span><span style=\"color: #000066; font-weight: bold;\">long<\/span><span style=\"color: #009900;\">&#41;<\/span>r.<span style=\"color: #006633;\">getLong<\/span><span style=\"color: #009900;\">&#40;<\/span><span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #009900;\">&#41;<\/span><span style=\"color: #339933;\">;<\/span>\n    <span style=\"color: #009900;\">&#125;<\/span> <span style=\"color: #666666; font-style: italic;\">\/\/ FOR<\/span>\n    <span style=\"color: #003399;\">Object<\/span> new_row<span style=\"color: #009900;\">&#91;<\/span><span style=\"color: 
#009900;\">&#93;<\/span> <span style=\"color: #339933;\">=<\/span> <span style=\"color: #009900;\">&#123;<\/span>key, count<span style=\"color: #009900;\">&#125;<\/span><span style=\"color: #339933;\">;<\/span>\n    <span style=\"color: #000000; font-weight: bold;\">this<\/span>.<span style=\"color: #006633;\">reduceEmit<\/span><span style=\"color: #009900;\">&#40;<\/span>new_row<span style=\"color: #009900;\">&#41;<\/span><span style=\"color: #339933;\">;<\/span> <span style=\"color: #666666; font-style: italic;\">\/\/ emit the row into the reduceOutput table<\/span>\n<span style=\"color: #009900;\">&#125;<\/span><\/pre><\/td><\/tr><\/table><\/div>\n\n<\/li>\n<\/ol>\n<p>Last but not least, the iterator deserves further discussion. That part will be added later.<\/p>\n<p>Finally, a very simple MapReduce stored procedure class that counts names, called <a href=\"https:\/\/database.cs.brown.edu\/svn\/hstore\/branches\/mapreduce-branch\/tests\/frontend\/edu\/brown\/benchmark\/mapreduce\/procedures\/MockMapReduce.java\" class=\"source-java\">MockMapReduce<\/a>, is available for reference. 
It is the complete demo implementation of a simple MapReduce stored procedure.<\/p>\n<p><a name=\"evaluation\"><\/a><\/p>\n<h2>Evaluation<\/h2>\n<ol>\n<li> The first distributed query we tested (Query 1) and a figure of the test results are shown below.\n\n<div class=\"wp_syntax\"><table><tr><td class=\"code\"><pre class=\"sql\" style=\"font-family:monospace;\"><span style=\"color: #993333; font-weight: bold;\">SELECT<\/span> ol_number<span style=\"color: #66cc66;\">,<\/span><span style=\"color: #993333; font-weight: bold;\">SUM<\/span><span style=\"color: #66cc66;\">&#40;<\/span>ol_quantity<span style=\"color: #66cc66;\">&#41;<\/span><span style=\"color: #66cc66;\">,<\/span>\n<span style=\"color: #993333; font-weight: bold;\">SUM<\/span><span style=\"color: #66cc66;\">&#40;<\/span>ol_amount<span style=\"color: #66cc66;\">&#41;<\/span><span style=\"color: #66cc66;\">,<\/span>AVG<span style=\"color: #66cc66;\">&#40;<\/span>ol_quantity<span style=\"color: #66cc66;\">&#41;<\/span><span style=\"color: #66cc66;\">,<\/span>\nAVG<span style=\"color: #66cc66;\">&#40;<\/span>ol_amount<span style=\"color: #66cc66;\">&#41;<\/span><span style=\"color: #66cc66;\">,<\/span><span style=\"color: #993333; font-weight: bold;\">COUNT<\/span><span style=\"color: #66cc66;\">&#40;<\/span><span style=\"color: #66cc66;\">*<\/span><span style=\"color: #66cc66;\">&#41;<\/span>\n<span style=\"color: #993333; font-weight: bold;\">FROM<\/span> order_line\n<span style=\"color: #993333; font-weight: bold;\">GROUP<\/span> <span style=\"color: #993333; font-weight: bold;\">BY<\/span> ol_number\n<span style=\"color: #993333; font-weight: bold;\">ORDER<\/span> <span style=\"color: #993333; font-weight: bold;\">BY<\/span> ol_number<\/pre><\/td><\/tr><\/table><\/div>\n\n<p><a href=\"https:\/\/hstore.cs.brown.edu\/wordpress\/wp-content\/uploads\/2012\/01\/query1.png\" rel=\"lightbox[1185]\"><img loading=\"lazy\" src=\"https:\/\/hstore.cs.brown.edu\/wordpress\/wp-content\/uploads\/2012\/01\/query1-300x160.png\" alt=\"Query 
tested for MapReduce transaction\" title=\"Query tested for MapReduce transaction\" width=\"300\" height=\"160\" class=\"alignright size-medium wp-image-1407\" srcset=\"https:\/\/hstore.cs.brown.edu\/wordpress\/wp-content\/uploads\/2012\/01\/query1-300x160.png 300w, https:\/\/hstore.cs.brown.edu\/wordpress\/wp-content\/uploads\/2012\/01\/query1-150x80.png 150w, https:\/\/hstore.cs.brown.edu\/wordpress\/wp-content\/uploads\/2012\/01\/query1.png 693w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><\/li>\n<p>This query is a good example of a MapReduce transaction performing better than a normal H-Store distributed transaction. The X axis is the H-Store deployment and the Y axis is the execution time of Query 1 in milliseconds.<\/p>\n<p>It is clear from the figure that the MapReduce transaction maintains better performance as the number of partitions increases. Doubling the number of partitions does not mean that the input data doubles. Even so, the performance of the MapReduce transaction declines much less than that of the normal distributed transaction. A normal distributed transaction blocks other transactions on other partitions and sends the data it needs to the base partition to perform the aggregate operations, which takes a considerable amount of time. A MapReduce transaction instead runs as many single-partition transactions, one on each partition.<\/p>\n<p>After the shuffle phase, the data is partitioned so that each partition can perform its part of the aggregation locally, which spreads the work across every partition instead of leaving it to a single base partition. This is why the MapReduce transaction performs better across the cluster in this test.<\/p>\n<p>More evaluation results will be published soon (contact me if you are interested: xin at cs.brown.edu).\n<\/li>\n<\/ol>\n<p><a name=\"futurework\"><\/a><\/p>\n<h2>Future Work<\/h2>\n<ol>\n<li>It would be nice to run queries that JOIN two or more tables and use several aggregate operations. 
Right now, either H-Store does not support this or I am not sure how to express it in H-Store. Such a query might look like the following:\n\n<div class=\"wp_syntax\"><table><tr><td class=\"code\"><pre class=\"sql\" style=\"font-family:monospace;\"><span style=\"color: #993333; font-weight: bold;\">SELECT<\/span> ol_number<span style=\"color: #66cc66;\">,<\/span> <span style=\"color: #993333; font-weight: bold;\">SUM<\/span><span style=\"color: #66cc66;\">&#40;<\/span>ol_amount<span style=\"color: #66cc66;\">&#41;<\/span><span style=\"color: #66cc66;\">,<\/span> AVG<span style=\"color: #66cc66;\">&#40;<\/span>ol_quantity<span style=\"color: #66cc66;\">&#41;<\/span>\n<span style=\"color: #993333; font-weight: bold;\">FROM<\/span> order_line<span style=\"color: #66cc66;\">,<\/span> item\n<span style=\"color: #993333; font-weight: bold;\">WHERE<\/span> order_line<span style=\"color: #66cc66;\">.<\/span>ol_i_id <span style=\"color: #66cc66;\">=<\/span> item<span style=\"color: #66cc66;\">.<\/span>i_id\n<span style=\"color: #993333; font-weight: bold;\">GROUP<\/span> <span style=\"color: #993333; font-weight: bold;\">BY<\/span> ol_number <span style=\"color: #993333; font-weight: bold;\">ORDER<\/span> <span style=\"color: #993333; font-weight: bold;\">BY<\/span> ol_number<\/pre><\/td><\/tr><\/table><\/div>\n\n<\/li>\n<li> We would really like to know how the scale of the input data affects the performance of these two kinds of transactions. 
By increasing the cluster size together with the input data, I can test and verify the first conclusion in the summary.<\/li>\n<li> Measure the effect on TPC-C throughput rather than the execution time of these two kinds of transactions.<\/li>\n<li> More details about the project can be found on this page: <a href=\"https:\/\/github.com\/apavlo\/h-store\/issues\/22\">GitHub H-Store<\/a><\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>This document describes how to use the new experimental MapReduce-style stored procedures to execute distributed, analytical (OLAP) queries in H-Store. Overview MapReduce-style transactions allow H-Store to execute analytical queries that access the entire database without having to incur the cost of distributed transaction coordination. 
When executing normal distributed transactions (i.e., transactions that need to [&hellip;]<\/p>\n","protected":false},"author":6,"featured_media":0,"parent":626,"menu_order":9999,"comment_status":"closed","ping_status":"open","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/hstore.cs.brown.edu\/wp-json\/wp\/v2\/pages\/1185"}],"collection":[{"href":"https:\/\/hstore.cs.brown.edu\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/hstore.cs.brown.edu\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/hstore.cs.brown.edu\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/hstore.cs.brown.edu\/wp-json\/wp\/v2\/comments?post=1185"}],"version-history":[{"count":61,"href":"https:\/\/hstore.cs.brown.edu\/wp-json\/wp\/v2\/pages\/1185\/revisions"}],"predecessor-version":[{"id":1195,"href":"https:\/\/hstore.cs.brown.edu\/wp-json\/wp\/v2\/pages\/1185\/revisions\/1195"}],"up":[{"embeddable":true,"href":"https:\/\/hstore.cs.brown.edu\/wp-json\/wp\/v2\/pages\/626"}],"wp:attachment":[{"href":"https:\/\/hstore.cs.brown.edu\/wp-json\/wp\/v2\/media?parent=1185"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}