Data Quality Annotation Using Flags
Concept of Flags
The quality of individual data points (whether a value is valid, missing, faulty, manually replaced, estimated, and so on) is annotated in HAKOM TSM per value in the so-called "Flag" information. Flags can be inherited. For example, a formula time series inherits the flags of the underlying time series referenced in the formula. Flags are also inherited in raster aggregations (for example, when aggregating hourly data to a whole day). In both cases (aggregating multiple time series and/or raster aggregation), the worst flag for a given time period, i.e. the one with the highest priority, dominates the flags of all time series values involved.
Supported Flags and Their Priorities
The following table lists the currently supported flags and their priorities:
ID | Name | Inheritance / Aggregation Priority |
---|---|---|
0 | NoValue | 0 |
20 | Accounted | 10 |
5 | Manually Replaced | 20 |
9 | Valid | 30 |
12 | Schedule | 40 |
21 | Estimated | 50 |
7 | Faulty | 60 |
22 | Interpolated | 70 |
19 | Missing | 80 |
The inheritance priority can differ for specific formula functions with configurable flag handling, such as the TSAA function. Additionally, the prioritization of the flags can be adjusted in the HAKOM.config.
You can read more about overriding flag priorities here: Configuration - search for the keyword ShiftManuallyReplacedPriority.
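The "worst flag dominates" rule can be illustrated with a minimal Python sketch. It is not HAKOM TSM code: the names `FLAG_PRIORITY`, `dominant_flag` and `sum_series` are hypothetical, and the sketch assumes the default priorities from the table above, without any overrides from HAKOM.config or functions with configurable flag handling.

```python
# Default priorities copied from the table above (flag name -> priority).
# Assumption: no overrides via HAKOM.config or configurable formula functions.
FLAG_PRIORITY = {
    "NoValue": 0,
    "Accounted": 10,
    "Manually Replaced": 20,
    "Valid": 30,
    "Schedule": 40,
    "Estimated": 50,
    "Faulty": 60,
    "Interpolated": 70,
    "Missing": 80,
}


def dominant_flag(flags):
    """Return the flag with the highest priority among the given flags."""
    return max(flags, key=FLAG_PRIORITY.__getitem__)


def sum_series(series_a, series_b):
    """Sum two aligned series of (value, flag) pairs.

    Each result value carries the dominant flag of its two inputs.
    """
    return [
        (value_a + value_b, dominant_flag([flag_a, flag_b]))
        for (value_a, flag_a), (value_b, flag_b) in zip(series_a, series_b)
    ]


# Hypothetical input data for illustration.
series_a = [(10, "Valid"), (20, "Valid"), (30, "Valid")]
series_b = [(0, "Missing"), (20, "Accounted"), (30, "Valid")]
result = sum_series(series_a, series_b)
# result == [(10, "Missing"), (40, "Valid"), (60, "Valid")]
```

With these priorities, Missing (80) outranks Valid (30), and Valid outranks Accounted (10), so a Valid value combined with an Accounted value yields a Valid result.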
Flag Aggregation Examples
Flag behaviour when aggregating (summing) two time series:
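Time stamp | Input time series A (value) | Input time series A (flag) | Input time series B (value) | Input time series B (flag) | Aggregation result (value) | Aggregation result (flag) |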
---|---|---|---|---|---|---|
00:00:00 | 10 | Valid | 0 | Missing | 10 | Missing |
01:00:00 | 20 | Valid | 20 | Accounted | 40 | Valid |
02:00:00 | 30 | Valid | 30 | Valid | 60 | Valid |
Flag behaviour in a raster aggregation (sum from 15 min to 1 h) of one time series:
Time stamp (original) | Input value | Input flag | Time stamp (aggregated) | Aggregated value | Aggregated flag |
---|---|---|---|---|---|
00:00:00 | 0 | Missing | 00:00:00 | 60 | Missing |
00:15:00 | 10 | Valid | | | |
00:30:00 | 20 | Valid | | | |
00:45:00 | 30 | Accounted | | | |
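The raster aggregation above can be reproduced with a short Python sketch. It is an illustration only, not the HAKOM TSM implementation: the function name `aggregate_hourly_sum`, the tuple layout and the calendar date are assumptions, and the priorities are the defaults from the flag table.

```python
from datetime import datetime
from itertools import groupby

# Default priorities from the flag table (assumption: no configuration overrides).
FLAG_PRIORITY = {
    "NoValue": 0, "Accounted": 10, "Manually Replaced": 20, "Valid": 30,
    "Schedule": 40, "Estimated": 50, "Faulty": 60, "Interpolated": 70,
    "Missing": 80,
}


def truncate_to_hour(timestamp):
    """Map a timestamp to the start of its hour."""
    return timestamp.replace(minute=0, second=0, microsecond=0)


def aggregate_hourly_sum(points):
    """Sum (timestamp, value, flag) points per hour.

    The points must be sorted by timestamp. Each hourly result carries
    the highest-priority flag found among its input values.
    """
    result = []
    for hour, group in groupby(points, key=lambda p: truncate_to_hour(p[0])):
        group = list(group)
        total = sum(value for _, value, _ in group)
        flag = max((f for _, _, f in group), key=FLAG_PRIORITY.__getitem__)
        result.append((hour, total, flag))
    return result


# The four quarter-hour values from the table; the date is arbitrary.
points = [
    (datetime(2024, 1, 1, 0, 0), 0, "Missing"),
    (datetime(2024, 1, 1, 0, 15), 10, "Valid"),
    (datetime(2024, 1, 1, 0, 30), 20, "Valid"),
    (datetime(2024, 1, 1, 0, 45), 30, "Accounted"),
]
hourly = aggregate_hourly_sum(points)
# hourly == [(datetime(2024, 1, 1, 0, 0), 60, "Missing")]
```

A single Missing value among the four quarter-hour values is enough to flag the whole hourly sum as Missing.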
NoValue Flag with Aggregation Rule "Average"
When calculating the average of a dataset, it makes a difference whether zero values that stand in for missing data are included in the result. The aggregation rule "Average" includes all values (including zero values with the Missing flag, since this flag does not necessarily mean that no data exists in the given time range), but values with the flag "NoValue" (i.e. timestamps that were never set to any value) are excluded from the average.
Time stamp (original) | Input value | Input flag | Time stamp (aggregated) | Aggregated value | Aggregated flag |
---|---|---|---|---|---|
00:00:00 | 0 | Missing | 00:00:00 | 10 | Missing |
00:15:00 | 10 | Valid | | | |
00:30:00 | 20 | Valid | | | |
00:45:00 | 0 | NoValue | | | |
Explanation: the value 10 is calculated as (0 (Missing) + 10 (Valid) + 20 (Valid)) / 3 = 10, because the fourth value is flagged NoValue and is therefore excluded from the average.
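The same calculation can be written as a small Python sketch. The function name `average_with_flags` is hypothetical, and the result returned for an interval that contains only NoValue entries is an assumption, since the text above does not describe that case.

```python
# Default priorities from the flag table (assumption: no configuration overrides).
FLAG_PRIORITY = {
    "NoValue": 0, "Accounted": 10, "Manually Replaced": 20, "Valid": 30,
    "Schedule": 40, "Estimated": 50, "Faulty": 60, "Interpolated": 70,
    "Missing": 80,
}


def average_with_flags(points):
    """Average a list of (value, flag) pairs, skipping NoValue entries.

    Values flagged Missing are included, as described for the
    aggregation rule "Average".
    """
    included = [(value, flag) for value, flag in points if flag != "NoValue"]
    if not included:
        # Assumption: an interval without any set value stays NoValue.
        return None, "NoValue"
    average = sum(value for value, _ in included) / len(included)
    flag = max((f for _, f in included), key=FLAG_PRIORITY.__getitem__)
    return average, flag


quarter_hours = [(0, "Missing"), (10, "Valid"), (20, "Valid"), (0, "NoValue")]
average, flag = average_with_flags(quarter_hours)
# average == 10.0 and flag == "Missing": (0 + 10 + 20) / 3
```

If the NoValue entry were counted as well, the divisor would be 4 and the average would drop to 7.5, which is why the distinction between Missing and NoValue matters here.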
We recommend the following video under Video Tutorials:
- Behaviour of Flags During Data Aggregation