Time Series Compression
What Is Time Series Compression
In HAKOM terminology, time series compression is a data storage technique in which individual time series values are combined into fixed-size blocks that are stored binary-encoded in the database. The block size depends on the resolution of the underlying data.
What Data Is Compressed
In a conventional (non-compressed) time series, each data row consists of a timestamp, a value and a flag. A compressed time series, by contrast, does not store a timestamp for each value. Instead, adjacent value-flag pairs are combined into a block, which saves storage space and reduces the database index size.
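The difference between the two layouts can be illustrated with a minimal Python sketch. The binary encoding HAKOM actually uses is not described here, so the 9-byte slot layout (a 64-bit float value followed by a one-byte flag) and the names `uncompressed_rows`, `pack_block` and `unpack_block` are assumptions chosen purely for illustration.

```python
import struct
from datetime import datetime, timedelta

# Assumed slot layout: little-endian float64 value + uint8 flag, no padding.
SLOT_FORMAT = "<dB"


def uncompressed_rows(start, step, values, flags):
    """Conventional layout: one (timestamp, value, flag) row per value."""
    return [(start + i * step, v, f) for i, (v, f) in enumerate(zip(values, flags))]


def pack_block(values, flags):
    """Compressed layout: only value-flag pairs are stored, in slot order."""
    return b"".join(struct.pack(SLOT_FORMAT, v, f) for v, f in zip(values, flags))


def unpack_block(block_start, step, payload):
    """Rebuild the rows; each timestamp follows from the slot position."""
    return [
        (block_start + i * step, v, f)
        for i, (v, f) in enumerate(struct.iter_unpack(SLOT_FORMAT, payload))
    ]


# One day of quarter-hourly data: 96 rows become a single block.
day_start = datetime(2024, 1, 1)
quarter_hour = timedelta(minutes=15)
day_values = [float(i) for i in range(96)]
day_flags = [0] * 96

rows = uncompressed_rows(day_start, quarter_hour, day_values, day_flags)
block = pack_block(day_values, day_flags)
restored = unpack_block(day_start, quarter_hour, block)
```

In this sketch the database would hold one indexed entry (the block start) instead of 96 indexed timestamps, which is the effect on index size described above.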
Block Sizes
Raster with common (predefined) interval length
Interval length | Block size |
---|---|
Quarter hour | Day |
Half hour | Day |
Hour | Day |
Day | Month |
Week | Year |
Month | Year |
Quarter | Year |
Year | Year |
Half year | Year |
Raster with interval length in multiples of seconds
Interval length | Block size |
---|---|
< 60 Seconds | 60 Seconds |
< 900 Seconds | Hour |
>= 900 Seconds | Day |
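The two tables above can be expressed as a simple lookup. The following Python sketch only restates the table contents; the names `PREDEFINED_BLOCK_SIZES` and `block_size_for_seconds` are hypothetical and not part of any HAKOM API.

```python
# Block size per predefined interval length, copied from the first table.
PREDEFINED_BLOCK_SIZES = {
    "Quarter hour": "Day",
    "Half hour": "Day",
    "Hour": "Day",
    "Day": "Month",
    "Week": "Year",
    "Month": "Year",
    "Quarter": "Year",
    "Half year": "Year",
    "Year": "Year",
}


def block_size_for_seconds(interval_seconds):
    """Block size for a raster given in multiples of seconds (second table)."""
    if interval_seconds <= 0:
        raise ValueError("interval length must be a positive number of seconds")
    if interval_seconds < 60:
        return "60 Seconds"
    if interval_seconds < 900:
        return "Hour"
    return "Day"
```

For example, a 300-second raster falls into the second row of the table and would be stored in hourly blocks, while a 900-second raster is stored in daily blocks.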
Compression Availability
Compression is supported for all synchronous intervals, such as "Begin" and "End" time series, where a fixed number of intervals can be represented by a compressed block.
For asynchronous intervals, such as "Spontaneous" time series, a compressed block would be missing several timestamps. Filling in the missing timestamps to ensure block integrity would be counterproductive, so compression would not provide any benefit. For this reason, compression is currently not supported for asynchronous time series.
How to Enable Compression
To enable compression for a specific time series, follow the instructions below.
Compression should be enabled when the time series is created, or at the latest before the first data is saved to it.
Note
If quotation is enabled for a time series that already contains data, the existing data needs to be migrated (if necessary) into quotation-enabled tables.
TSM App:
- Click on Load in the TSM Ribbon
- Press the Search... button in the Time series area of the TSM window
- Search for the required time series by entering search parameters and pressing the Search button
- Select the time series in the results grid
- Select the Edit tab in the Time Series Search window > the definition of the selected time series is loaded
- Select the Compression checkbox in the Standard section > the corresponding compression tables (for data, historical data and quotation data) are set automatically
- Press the Save button
WebTSM Services API:
In any time series definition path (/repositories/:repository/timeseries or /repositories/:repository/timeseriescollections/definition), set the property Compression to true in the definition body:
{
  "Name": "MyFirstTimeSeries",
  "Type": 2,
  "Interval": {
    "Value": "Minute",
    "Multiplier": 15
  },
  "Unit": "KWh",
  "Compression": true
}
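As a rough illustration, the definition above could be sent from Python using only the standard library. This is a sketch under several assumptions: the base URL and the repository name are placeholders, the HTTP method (POST) is a guess because it is not stated here, and authentication is omitted. The helper name `build_definition_request` is hypothetical.

```python
import json
import urllib.request

# Placeholder host; replace with the address of your WebTSM Services instance.
BASE_URL = "https://example.invalid/api"


def build_definition_request(repository, definition, method="POST"):
    """Build (but do not send) a request carrying a time series definition.

    The HTTP method is an assumption; check the API reference for the
    method your definition path expects.
    """
    url = f"{BASE_URL}/repositories/{repository}/timeseries"
    body = json.dumps(definition).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method=method)


definition = {
    "Name": "MyFirstTimeSeries",
    "Type": 2,
    "Interval": {"Value": "Minute", "Multiplier": 15},
    "Unit": "KWh",
    "Compression": True,  # enables compression for this time series
}

request = build_definition_request("MyRepository", definition)
# To actually send it: urllib.request.urlopen(request)
```

The request is only constructed here, not sent, so the sketch can be run without a server.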
Why Use Compression
Compression, if properly utilized, provides the following advantages:
- Reduced database storage usage.
  Example: a quarter-hourly time series can potentially save 80% of the storage used by the non-compressed format.
- Increased data read speed.
  Example: for one year of quarter-hourly time series data, the data rate is up to 20% higher than with the non-compressed format.
- Massively increased data write speed.
  Example: for one year of quarter-hourly time series data, the data rate is up to 80% higher than with the non-compressed format.
Compression Use Cases
The performance gain is highest when time series values are accessed in the same (or similar) time blocks as those used for compression. For example, for quarter-hourly or hourly time series, accessing whole days, months or years provides the best results.
When only part of a compressed block is accessed (for example, a single quarter hour of a quarter-hourly time series), compression does not lead to a performance gain. On the contrary, because of the small overhead of compressing or decompressing a full block, a non-compressed time series performs slightly better.
Our Recommendation
Compression provides a performance gain if at least 75% of at least one block can be filled or will be read in one request. If this is not the case (e.g. if you usually write one value at a time to a given time series), compression costs will be slightly higher (around 10% of a very short processing time) than for a non-compressed time series.
We recommend using compressed time series in the following cases:
- If storage space utilization is of high importance
- To boost save operations with volumes higher than 75% of a compression block
- To boost read operations with volumes higher than 75% of a compression block
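The 75% rule of thumb can be checked with a few lines of arithmetic. The Python sketch below assumes the number of values per block is known (for example 96 for a quarter-hourly time series with a daily block); the function name `compression_pays_off` is hypothetical and not part of any HAKOM tool.

```python
BLOCK_FILL_THRESHOLD = 0.75  # share of a block, taken from the recommendation


def compression_pays_off(values_per_request, values_per_block):
    """Return True if a typical request covers at least 75% of one block."""
    if values_per_block <= 0:
        raise ValueError("values_per_block must be positive")
    return values_per_request >= BLOCK_FILL_THRESHOLD * values_per_block


# Quarter-hourly time series, daily block: 24 * 4 = 96 values per block.
QUARTER_HOURLY_VALUES_PER_DAY = 24 * 4

whole_day = compression_pays_off(96, QUARTER_HOURLY_VALUES_PER_DAY)
single_value = compression_pays_off(1, QUARTER_HOURLY_VALUES_PER_DAY)
```

With 96 values per block the threshold is 72 values per request, so reading or writing whole days qualifies, while writing single values does not.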