Heroic Documentation

Data Model

A series is identified by a key together with a unique set of tags and resource identifiers.

Tags are indexed data stored with a given time-series. Tags can be used with both filters and aggregations.

Resource Identifiers are additional, non-indexed data stored with a given time-series. They were added to help address issues with ephemerality. In a system where "host" (for example) can change often, one may want to store "host" as a resource identifier to avoid losing track of time-series data every time "host" changes. Currently, Resource Identifiers can be used with aggregations but not with filters.
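As a rough illustration of the distinction above, the following Python sketch models a series as a key plus tags and resource identifiers. The `Series` class, the `matches` helper, and the two host names are hypothetical and are not part of Heroic's API. The sketch only assumes that filters consult tags, while resource identifiers are stored alongside them without being indexed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Series:
    """Hypothetical model of a series: a key, tags, and resource identifiers."""
    key: str
    tags: tuple = ()       # sorted (name, value) pairs; indexed, usable in filters
    resource: tuple = ()   # sorted (name, value) pairs; stored but not indexed

    @staticmethod
    def of(key, tags=None, resource=None):
        return Series(key,
                      tuple(sorted((tags or {}).items())),
                      tuple(sorted((resource or {}).items())))


def matches(series, tag_filter):
    """Return True if every (name, value) in tag_filter is a tag on the series.

    Only tags are consulted. Resource identifiers are ignored, mirroring the
    statement that they cannot currently be used with filters.
    """
    tags = dict(series.tags)
    return all(tags.get(name) == value for name, value in tag_filter.items())


# Two series that differ only in the (hypothetical) host they ran on.
before = Series.of("system", {"site": "gew", "what": "cpu-idle-percentage"},
                   {"host": "a.example.com"})
after = Series.of("system", {"site": "gew", "what": "cpu-idle-percentage"},
                  {"host": "b.example.com"})

# The same tag filter keeps selecting the data after "host" has changed.
selected = [s for s in (before, after)
            if matches(s, {"what": "cpu-idle-percentage"})]
```

Because "host" lives outside the indexed tags in this sketch, a query written against `what` does not need to change when the host does.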

Keys, tag keys, and tag values can contain any valid Unicode string; internally they are stored as UTF-8.

A series represents something over time, where that "something" is currently a set of data points.

Points

com.spotify.heroic.metric.Point

Each data point stores the timestamp at which it was sampled and the value it carries.

The timestamp is stored as a 64-bit integer (long), which represents the number of milliseconds since the Unix epoch.

The value is stored as a 64-bit floating point number (double).

A data point is typically represented as a JSON array with two elements.


[<timestamp>, <value>]
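A minimal Python sketch of producing such a pair is shown here. The `to_point` helper is a hypothetical name, and the sample date and value are arbitrary. It assumes only what is stated above: an integer count of milliseconds since the Unix epoch and a floating point value.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_point(sampled_at, value):
    """Build a [timestamp, value] pair: epoch milliseconds and a float.

    Integer division of two timedeltas avoids floating point rounding
    in the timestamp; sub-millisecond precision is discarded.
    """
    millis = (sampled_at - EPOCH) // timedelta(milliseconds=1)
    return [millis, float(value)]


point = to_point(datetime(2024, 1, 1, tzinfo=timezone.utc), 42)
encoded = json.dumps(point)   # '[1704067200000, 42.0]'
```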

Semantic Series

We strongly encourage the concept of semantic series.

The idea behind semantic series is to move away from obscure identifiers and introduce metrics that are structured in a way that makes it easier for a human and a computer to reason about.

For example, a series for the idle CPU utilization of a host could be identified as follows.


{
  "key": "system",
  "tags": {
    "site": "gew",
    "what": "cpu-idle-percentage",
    "system-component": "cpu",
    "cpu-type": "idle",
    "unit": "%"
  },
  "resource": {
    "podname": "pod-example-123-abc",
    "host": "database.example.com"
  }
}

This can also be represented in a more compact, human-readable format, as shown below.


system { host=database.example.com, site=gew, what=cpu-idle-percentage, ... }
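The compact form can be derived mechanically from the JSON form. The sketch below is one possible rendering, not Heroic's own formatter. The `compact` function is a hypothetical name, and merging tags with resource identifiers and sorting them by name are assumptions, since the ordering is not specified here.

```python
def compact(series):
    """Render a series as 'key { name=value, ... }'.

    Assumption: tags and resource identifiers are merged and sorted by name.
    """
    pairs = {**series.get("tags", {}), **series.get("resource", {})}
    body = ", ".join(f"{name}={value}" for name, value in sorted(pairs.items()))
    return f"{series['key']} {{ {body} }}"


series = {
    "key": "system",
    "tags": {"site": "gew", "what": "cpu-idle-percentage"},
    "resource": {"host": "database.example.com"},
}

line = compact(series)
# system { host=database.example.com, site=gew, what=cpu-idle-percentage }
```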

The need for semantic metrics becomes more apparent when you start to reason about how to model series for certain use cases using a traditional, hierarchical model.

Assume that the series above were stored in a hierarchical time series database under a name like the following.

database.example.com.system.cpu-idle-percentage

A hierarchy like this has several drawbacks:

- The lack of keys makes deciphering a hierarchy challenging.
- The growth in the number of branches in the hierarchy becomes an organizational burden.
- Growth in the number of series limits discovery.
- The structure of the hierarchy determines how things are discovered.
- Filtering or selecting series is limited (e.g. to wildcards).

Promoting the use of tags, together with a convention for which tags to use and how, makes the problem more manageable.

Instead of a strict hierarchy where discovery and expression are limited, you can have a multi-dimensional system that enables strong correlations and natural groupings.
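To make the multi-dimensional idea concrete, the following Python sketch selects series by any combination of tags and then groups them by another tag. The `select` and `group_by` helpers and the `disk-used-percentage` series are hypothetical, invented for illustration, and do not reflect Heroic's query language.

```python
from collections import defaultdict

series = [
    {"key": "system", "tags": {"site": "gew", "what": "cpu-idle-percentage", "unit": "%"}},
    {"key": "system", "tags": {"site": "lon", "what": "cpu-idle-percentage", "unit": "%"}},
    {"key": "system", "tags": {"site": "lon", "what": "disk-used-percentage", "unit": "%"}},
]


def select(all_series, wanted):
    """Keep series whose tags contain every (name, value) pair in wanted."""
    return [s for s in all_series
            if all(s["tags"].get(name) == value for name, value in wanted.items())]


def group_by(all_series, tag):
    """Group series by the value of one tag (None when the tag is absent)."""
    groups = defaultdict(list)
    for s in all_series:
        groups[s["tags"].get(tag)].append(s)
    return dict(groups)


# Any tag can serve as a selection dimension, in any combination.
cpu_idle = select(series, {"what": "cpu-idle-percentage"})
in_lon = select(series, {"site": "lon", "unit": "%"})

# Natural grouping: one bucket per site.
by_site = group_by(cpu_idle, "site")
```

In a hierarchical name, the same selection would depend on where "site" and "what" happen to sit in the path. Here, neither dimension is privileged.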

Moreover, if a given convention is followed, an administrator who learns what a specific tag such as "what" means will find it easier to navigate unfamiliar contexts where that tag is used.

References

Metrics 2.0 "An emerging set of conventions, standards and concepts around timeseries metrics metadata" by Dieter Plaetinck