Description of recovery scenarios
November 30, 2020 ยท View on GitHub
Setup
- all data is z-score normalized
- normalization occurs before trimming the length for tests where N_test < N
- when percentage or division is mentioned, the result is floored down to nearest integer
Scenarios
N = lentgh of time series M = number of time series
W = 10% * N
I. Batch End
missingpercentage:
- N = max; M = max;
- size of the missing block varies between 10% and 80% of the series; position: end of 1st series.
length:
- M = max; N varies between 20% and 100% of max(N);
- size of the missing block = W; position: end of 1st series.
columns:
- N = max; M = varies between min(10% of columns, 4) and 100% the number of columns;
- size of the missing block = W; position: end of 1st series.
blackout:
- N = max; M = max;
- Missing data varies between 10 and 100 values of each series; position: end of series.
multicol-increasing:
- N = max; M = max;
- number of incomplete series varies between 10% and 100% of max(M); size of missing blocks= W; position: at the end of an incomplete time series.
mcar-ts-block*:
- N = max; M = max;
- number of incomplete series varies between 10% and 100% of max(M); Missing block: 1 per incomplete series, size missing block = 10% * W, position: random within W from the end.
mcar-ts-block2*:
- same as ts-block, but only every second incomplete time series.
mcar-ts-multiblock*:
- N = max; M = max;
- number of incomplete series varies between 10% and 100% of Max(M); sub-blocks of 2% * W are removed at random from each incomplete series until a total of 10% * W of all points in the series are missing. position: random within W from the end.
mcar-ts-multiblock2*:
- same as ts-multiblock, but only every second incomplete time series.
mcar-columns*:
- N = max; M = varies between max(10% of columns, 4) and 100% of columns;
- Missing blocks: 10% * W of every second series total in sub-blocks of 1% * W, start = random within W from the end.
mcar-length*:
- M = max; N varies between 20% and 100% of max(N);
- Missing blocks: 10% * W of every second series total in sub-blocks of 1% * W, start = random within W from the end.
II. Streaming End
missingpercentage:
- N = max; M = max;
- size of the missing block varies between 10% and 80% of the series; position: end of 1st series.
length:
- M = max; N varies between 20% and 100% of max(N);
- size of the missing block = W; position: end of 1st series.
columns:
- N = max; M = varies between min(10% of columns, 4) and 100% the number of columns;
- size of the missing block = W; position: end of 1st series.
blackout:
- N = max; M = max;
- Missing data varies between 10 and 100 values of each series; position: end of series.
multicol-increasing:
- N = max; M = max;
- number of incomplete series varies between 10% and 100% of max(M); size of missing blocks= W; position: at the end of an incomplete time series.
mcar-ts-block*:
- N = max; M = max;
- number of incomplete series varies between 10% and 100% of max(M); Missing block: 1 per incomplete series, size missing block = 10% * W, position: random within W from the end.
mcar-ts-block2*:
- same as ts-block, but only every second incomplete time series.
mcar-ts-multiblock*:
- N = max; M = max;
- number of incomplete series varies between 10% and 100% of Max(M); sub-blocks of 2% * W are removed at random from each incomplete series until a total of 10% * W of all points in the series are missing. position: random within W from the end.
mcar-ts-multiblock2*:
- same as ts-multiblock, but only every second incomplete time series.
mcar-columns*:
- N = max; M = varies between max(10% of columns, 4) and 100% of columns;
- Missing blocks: 10% * W of every second series total in sub-blocks of 1% * W, start = random within W from the end.
mcar-length*:
- M = max; N varies between 20% and 100% of max(N);
- Missing blocks: 10% * W of every second series total in sub-blocks of 1% * W, start = random within W from the end.
Remarks
* mcar scenario uses random number generator with fixed seed and will produce the same blocks every run