Time Series Compression

What Is Time Series Compression

In HAKOM terminology, time series compression is a data storage technique in which individual time series values are combined into fixed-size blocks that are stored binary-encoded in the database. The size of a block depends on the resolution of the underlying data.

What Data Is Compressed

In a conventional (non-compressed) time series, each data row consists of a timestamp, a value and a flag. A compressed time series, by contrast, does not store a timestamp for each value. Instead, adjacent value-flag pairs are combined into a block, which saves storage space and reduces the size of the database index.
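The idea of dropping per-value timestamps can be illustrated with a minimal sketch. The binary layout used here (one 8-byte float value plus one 1-byte flag per interval) and the function names are assumptions chosen for illustration only; they do not describe the actual HAKOM encoding.

```python
import struct
from datetime import datetime, timedelta

# Hypothetical layout: little-endian float64 value + uint8 flag per interval.
PAIR = struct.Struct("<dB")

def pack_block(pairs):
    """Concatenate (value, flag) pairs into one binary block, without timestamps."""
    return b"".join(PAIR.pack(value, flag) for value, flag in pairs)

def unpack_block(block_start, step, blob):
    """Rebuild (timestamp, value, flag) rows; each timestamp follows from the position."""
    return [
        (block_start + i * step, value, flag)
        for i, (value, flag) in enumerate(PAIR.iter_unpack(blob))
    ]

# One day of quarter-hourly data: 96 value-flag pairs in a single block.
pairs = [(float(i), 0) for i in range(96)]
blob = pack_block(pairs)
rows = unpack_block(datetime(2024, 1, 1), timedelta(minutes=15), blob)
```

Only the block start and the interval length are needed to recover every timestamp, so a single database row and index entry can stand in for 96 conventional rows in this sketch.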

Block Sizes

Raster with common (predefined) interval length

Interval length	Block size
Quarter hour	Day
Half hour	Day
Hour	Day
Day	Month
Week	Year
Month	Year
Quarter	Year
Year	Year
Half year	Year

Raster with interval length in multiples of seconds

Interval length	Block size
< 60 Seconds	60 Seconds
< 900 Seconds	Hour
>= 900 Seconds	Day
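The two tables above can be expressed as a small lookup, sketched here in Python. The function and constant names are hypothetical, and the sketch assumes the rows of the second table are read from top to bottom, so that interval lengths from 60 up to 899 seconds map to an hour block.

```python
# Predefined interval lengths and their block sizes (first table).
PREDEFINED_BLOCKS = {
    "Quarter hour": "Day",
    "Half hour": "Day",
    "Hour": "Day",
    "Day": "Month",
    "Week": "Year",
    "Month": "Year",
    "Quarter": "Year",
    "Half year": "Year",
    "Year": "Year",
}

def block_size_for_seconds(interval_seconds):
    """Block size for a raster given in multiples of seconds (second table)."""
    if interval_seconds <= 0:
        raise ValueError("interval length must be positive")
    if interval_seconds < 60:
        return "60 Seconds"
    if interval_seconds < 900:
        return "Hour"
    return "Day"

def values_per_day_block(interval_seconds):
    """Number of values in one day block, ignoring daylight saving time shifts."""
    return 86400 // interval_seconds
```

For a quarter-hourly raster (900 seconds), `values_per_day_block(900)` gives 96, so one day block stands in for 96 individual values.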

Compression Availability

Compression is supported for all synchronous intervals, such as "Begin" and "End" time series, where a fixed number of intervals can be represented by a compressed block.

For asynchronous intervals, such as "Spontaneous" time series, several timestamps would be missing from a compressed block. Filling in the missing timestamps to ensure block integrity would be counterproductive, so compression would not provide any benefit. For this reason, compression is currently not supported for asynchronous time series.

How to Enable Compression

To enable compression for a specific time series, follow the instructions below.

Compression should be enabled when the time series is created, or at the latest before the first data is saved to it.

Note

If quotation is enabled for a time series that already contains data, the existing data may need to be migrated into quotation-enabled tables.

TSM App:

Click Load in the TSM ribbon
Press the Search... button in the Time series area of the TSM window
Search for the required time series by entering search parameters and pressing the Search button
Select the time series in the results grid
Select the Edit tab in the Time Series Search window > the definition of the selected time series is loaded
Select the Compression checkbox in the Standard section > the corresponding compression tables (for data, historical and quotation data tables) are set automatically
Press the Save button

WebTSM Services API:

In any time series definition path (/repositories/:repository/timeseries or /repositories/:repository/timeseriescollections/definition), set the property Compression to true in the definition body.

JSON

{
    "Name": "MyFirstTimeSeries",
    "Type": 2,
    "Interval": {
        "Value": "Minute",
        "Multiplier": 15
    },
    "Unit": "KWh",
    "Compression": true
}
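When definitions are generated by a script, the same body can be produced programmatically. The following is a minimal sketch that only builds the JSON body. The helper name `with_compression` is hypothetical, and sending the request (HTTP method, host, authentication) is intentionally left out because it depends on the installation.

```python
import json

def with_compression(definition):
    """Return a copy of a time series definition with Compression enabled."""
    updated = dict(definition)
    updated["Compression"] = True
    return updated

# Same example definition as above, initially without the Compression property.
definition = {
    "Name": "MyFirstTimeSeries",
    "Type": 2,
    "Interval": {"Value": "Minute", "Multiplier": 15},
    "Unit": "KWh",
}

# Serialized body for one of the time series definition paths.
body = json.dumps(with_compression(definition), indent=4)
print(body)
```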

Why Use Compression

When used properly, compression provides the following advantages:

Reduced database storage usage.
Example: a quarter-hourly time series can potentially use 80% less storage than in the non-compressed format.
Increased data read speed.
Example: for one year of quarter-hourly time series data, the read rate is up to 20% higher than with the non-compressed format.
Massively increased data write speed.
Example: for one year of quarter-hourly time series data, the write rate is up to 80% higher than with the non-compressed format.

Compression Use Cases

The performance gain is highest when time series values are accessed in time blocks that are similar or identical to the blocks used for compression. For example, for quarter-hourly or hourly time series, accessing whole days, months or years gives the best results.

When only a part of a compressed block is accessed (for example, a single quarter hour of a quarter-hourly time series), compression does not lead to a performance gain. On the contrary, because of the small overhead of compressing or decompressing a full block, a non-compressed time series will perform slightly better.
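The effect of partial block access can be made concrete with a short sketch. It assumes day blocks (as used for quarter-hourly data) and assumes that every block overlapping the requested range is decoded in full. The function names are hypothetical.

```python
from datetime import datetime, timedelta

def day_blocks_touched(start, end):
    """Start dates of the day blocks overlapping the half-open range [start, end)."""
    if end <= start:
        return []
    last = (end - timedelta(microseconds=1)).date()
    days = (last - start.date()).days + 1
    return [start.date() + timedelta(days=i) for i in range(days)]

def decoded_vs_requested(start, end, step=timedelta(minutes=15)):
    """Values decoded from full blocks versus values actually requested."""
    per_block = timedelta(days=1) // step
    decoded = len(day_blocks_touched(start, end)) * per_block
    requested = (end - start) // step
    return decoded, requested

# A single quarter hour still requires decoding the whole day block.
single = decoded_vs_requested(datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 15))
# A whole day uses every decoded value.
full_day = decoded_vs_requested(datetime(2024, 1, 1), datetime(2024, 1, 2))
```

In this sketch the single quarter hour decodes 96 values to deliver 1, while the full-day request decodes 96 values and uses all of them.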

Our Recommendation

Compression provides a performance gain if at least 75% of at least one block can be filled or read in a single request. If this is not the case (for example, if you usually write one value at a time to a given time series), the cost of compression will be slightly higher (around 10% of an already very short processing time) than for a non-compressed time series.

We recommend using compressed time series in the following cases:

If storage space utilization is of high importance
To boost save operations with volumes higher than 75% of a compression block
To boost read operations with volumes higher than 75% of a compression block
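The 75% rule of thumb can be written as a simple check. This is a sketch only: the function name and parameters are hypothetical, and the threshold is taken directly from the recommendation above.

```python
def compression_recommended(values_per_request, values_per_block, threshold=0.75):
    """True if a typical request fills or reads at least `threshold` of one block."""
    if values_per_block <= 0:
        raise ValueError("values_per_block must be positive")
    return values_per_request / values_per_block >= threshold

# Quarter-hourly series with day blocks: 96 values per block.
whole_day = compression_recommended(96, 96)     # full block per request
three_quarters = compression_recommended(72, 96)  # exactly 75% of a block
single_value = compression_recommended(1, 96)   # one value per request
```

Here `whole_day` and `three_quarters` are true, while `single_value` is false, which matches the case where a non-compressed time series performs slightly better.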