Adding New System Statistics

The following documentation describes how to collect new internal statistics from the runtime system. H-Store uses a built-in stats management system that is exposed to the outside world through the @Statistics system procedure.

For the purpose of this documentation, we will refer to the component in the DBMS that collects the statistics for a specific aspect of the system as a StatsSource. The sub-system that aggregates multiple StatsSources together (either within a single partition or at a single site) is known as a StatsAgent. The StatsAgents can be used to coalesce data in a lazy manner without needing to retrieve information directly from the StatsSource each time. All of the StatsAgents are already created in the system for you. If you are reading this documentation because you want to collect additional information about the DBMS, then you are trying to add a new StatsSource.

There are several considerations that one must make before adding a new StatsSource:

  1. Where in the system is the stats data being generated?
  2. How often should you collect this stats data?
  3. Should the stats always be collected or on demand?

There are two areas in the system where you can collect stats information: (1) the Java front-end layer and (2) the C++ execution engine layer. We will now describe how to add new sources for each of these.

Java Stats Collection

The first step is to create a new class that extends the StatsSource base class. You will then need to hook your new source into the proper component of the overall system. There is no right or wrong way to do this, since it depends on what data you want to collect. In general, it is best to avoid a centralized source if multiple threads are going to be updating it. For example, if there is information that you want to collect from the PartitionExecutor, then it is best to have a separate source per partition and then aggregate their results only when needed. If the information is global (e.g., the amount of memory used by the JVM), then it is sufficient to use a single source (since multiple threads will not need to write to it).
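The per-partition advice above can be sketched as follows. This is a minimal illustration only: PartitionCounterSource and CounterAggregator are hypothetical stand-ins invented for this example and do not extend or reproduce the real StatsSource API. Each partition thread writes to its own counter without contention, and the values are summed only when a stats request arrives.

```java
import java.util.List;

// Hypothetical stand-in for a per-partition stats source.
// It does NOT extend the real H-Store StatsSource class.
class PartitionCounterSource {
    private final int partitionId;

    // Written only by the owning partition thread. volatile makes the
    // latest value visible to whichever thread later reads the stats.
    private volatile long txnCount = 0;

    PartitionCounterSource(int partitionId) {
        this.partitionId = partitionId;
    }

    int getPartitionId() {
        return partitionId;
    }

    // Hot path: safe without locking only because there is a single writer.
    void recordTxn() {
        txnCount = txnCount + 1;
    }

    long getTxnCount() {
        return txnCount;
    }
}

// Hypothetical helper that combines the per-partition values on demand.
class CounterAggregator {
    static long totalTxns(List<PartitionCounterSource> sources) {
        long total = 0;
        for (PartitionCounterSource source : sources) {
            total += source.getTxnCount();
        }
        return total;
    }
}
```

A single shared counter would instead force every partition thread to synchronize on each update, which is the cost that the separate-source design avoids.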

After you have created your new source class, you will then need to have it register with the StatsAgent at runtime. To do this, you need to create a new entry in the SysProcSelector enum. As an example, let's create a new stats source called “TigerStyle” that retrieves information for a single partition. We will want to create our new entry in SysProcSelector like so:

public enum SysProcSelector {
    ...
    TIGER_STYLE // The amount of tiger style in the system!
}

You must then register the source with the HStoreSite's global StatsAgent. To do so, get the StatsAgent handle from the HStoreSite and register the source:

TigerStyleSource tss = new TigerStyleSource(partitionId);
hstore_site.getStatsAgent()
           .registerStatsSource(SysProcSelector.TIGER_STYLE, partitionId, tss);
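To make the registration step more concrete, here is a simplified model of an agent that files sources under a selector and a partition id and reads them only when asked. MiniStatsAgent, MiniSelector, and RowSource are hypothetical names invented for this sketch; the real StatsAgent is more involved and its internals may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, shortened stand-in for SysProcSelector.
enum MiniSelector { TABLE, TIGER_STYLE }

// Hypothetical source interface: produces one row of stats on request.
interface RowSource {
    Object[] currentRow();
}

// Hypothetical agent: selector -> partition id -> registered sources.
class MiniStatsAgent {
    private final Map<MiniSelector, Map<Integer, List<RowSource>>> registered =
            new EnumMap<>(MiniSelector.class);

    synchronized void registerStatsSource(MiniSelector selector, int partitionId, RowSource source) {
        registered.computeIfAbsent(selector, s -> new HashMap<>())
                  .computeIfAbsent(partitionId, p -> new ArrayList<>())
                  .add(source);
    }

    // Sources are consulted lazily, only when a stats request arrives.
    synchronized List<Object[]> getStats(MiniSelector selector, int partitionId) {
        List<Object[]> rows = new ArrayList<>();
        Map<Integer, List<RowSource>> byPartition = registered.get(selector);
        if (byPartition == null) {
            return rows; // nothing registered under this selector
        }
        List<RowSource> sources =
                byPartition.getOrDefault(partitionId, Collections.<RowSource>emptyList());
        for (RowSource source : sources) {
            rows.add(source.currentRow());
        }
        return rows;
    }
}
```

Keying on both the selector and the partition id is what lets one selector such as TIGER_STYLE have an independent source on every partition.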

The last step is to hook the new SysProcSelector into the @Statistics sysproc so the system knows where to get the information that you need at runtime. You must first add new FragmentId entries into SysProcFragmentId. There will already be existing entries for other stats sources, so you just need to append yours to that list. Make sure that you use a unique id. See the documentation on adding new sysprocs for additional information.
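Because a duplicated fragment id is easy to introduce by accident, a small reflective check can help. The sketch below is an illustration under stated assumptions: ExampleFragmentIds is a hypothetical stand-in for SysProcFragmentId, and its constant names and values are invented for this example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for SysProcFragmentId; names and values are invented.
class ExampleFragmentIds {
    public static final int PF_tableData = 11;
    public static final int PF_tableAggregator = 12;
    public static final int PF_tigerStyleData = 13;
    public static final int PF_tigerStyleAggregator = 14;
}

class FragmentIdChecker {
    // Returns a description of a duplicated id, or null if every
    // static final int constant in the holder class has a distinct value.
    static String findDuplicate(Class<?> holder) {
        Map<Integer, String> seen = new HashMap<>();
        for (Field field : holder.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (field.getType() != int.class
                    || !Modifier.isStatic(mods)
                    || !Modifier.isFinal(mods)) {
                continue; // only constant ids are of interest
            }
            try {
                field.setAccessible(true);
                int value = field.getInt(null);
                String previous = seen.put(value, field.getName());
                if (previous != null) {
                    return field.getName() + " duplicates " + previous + " (id " + value + ")";
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return null;
    }
}
```

A check like this could run in a unit test, so that an id clash is reported before the sysproc is ever invoked.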

Next, in the Statistics class, add the new FragmentIds that you just created. Use the addStatsFragments() helper method to automatically configure everything. This ensures that the stats request is sent to every partition and that the output from each partition is returned. Be sure to also register the FragmentIds at runtime in initImpl(). Finally, you will need to add the logic to retrieve your data from each partition’s stats source in executePlanFragment().

C++ Stats Collection

For collecting stats from H-Store's C++ execution engine (EE), all of the infrastructure for copying the data from C++ into Java is in place. You just need to implement a new class that extends StatsSource. To do this, add a new entry to the StatisticsSelectorType in types.h and to the SysProcSelector in Java. It is important that the values of these two entries match; otherwise you will be unable to connect them together.

For example, suppose that the SysProcSelector.TIGER_STYLE entry from above has index 21 in the SysProcSelector enum. The corresponding entry in the C++ StatisticsSelectorType enum must then also have the value 21:

enum StatisticsSelectorType {
    ...
    STATISTICS_SELECTOR_TYPE_TIGER_STYLE = 21 // The amount of tiger style in the system!
};
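One way to keep the two enums from drifting apart is a guard on the Java side. The sketch below assumes that the Java layer identifies a selector to the EE by its ordinal(); that is an assumption made for illustration, so confirm how the value is actually passed before relying on it. ExampleSysProcSelector and SelectorValueGuard are hypothetical names, and the enum is shortened to three entries. Note that Java ordinals start at zero, so the third entry has ordinal 2.

```java
// Hypothetical, shortened stand-in for SysProcSelector.
enum ExampleSysProcSelector {
    TABLE,        // ordinal 0
    INDEX,        // ordinal 1
    TIGER_STYLE   // ordinal 2: the third entry, because ordinals start at zero
}

class SelectorValueGuard {
    // Value assumed to be assigned to the matching C++ entry in this toy example.
    static final int EE_TIGER_STYLE = 2;

    // Throws if the Java entry and the expected C++ value disagree.
    static void check(Enum<?> javaEntry, int eeValue) {
        if (javaEntry.ordinal() != eeValue) {
            throw new IllegalStateException(
                    javaEntry.name() + " has ordinal " + javaEntry.ordinal()
                    + " but the EE expects " + eeValue);
        }
    }
}
```

Calling SelectorValueGuard.check(ExampleSysProcSelector.TIGER_STYLE, SelectorValueGuard.EE_TIGER_STYLE) at startup or in a unit test would flag a mismatch as soon as someone inserts a new entry in the middle of either enum.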

Next, you need to register your new StatsSource with the EE’s StatsAgent. This is done in the main class for the execution engine through StatsAgent::registerStatsSource().

Finally, you need to process a stats request in the EE that arrives from the Java layer. A single method, VoltDBEngine::getStats(), is responsible for retrieving data from the StatsAgent and storing it in a VoltTable. In this method, you will see a switch statement that processes the “selector” variable passed in from Java. This flag corresponds to the StatisticsSelectorType. Thus, you will need to add a case statement to handle the new selector flag that you added.

Retrieving Statistics

To be written…