I have time series data on the daily usage of a computer program. Here is an example:
- 2017-11-10: 0
- 2017-11-09: 14
- 2017-11-08: 0
- 2017-11-07: 6
- 2017-11-06: 102
- 2017-11-05: 0
- 2017-11-04: 0
As you can see, 11-06 has a spike of 102. Because of the way we gather this data, we know that this value is probably erroneous, and it is clearly out of line with the other values.
So we need to clean these dirty values.
Is there a mathematical way to do this? Is there a Python library that can help us?
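To make the goal concrete, here is a minimal sketch of one possible approach, using only the standard library: flag values outside Tukey's fences (1.5 times the interquartile range beyond the quartiles) and replace them with the median of the remaining values. The names `tukey_fences` and `clean`, the multiplier `k=1.5`, and the choice of the median as a replacement are all assumptions made for illustration, not an established method for this data.

```python
from statistics import median, quantiles

# Sample from the question, keyed by ISO date string.
usage = {
    "2017-11-10": 0,
    "2017-11-09": 14,
    "2017-11-08": 0,
    "2017-11-07": 6,
    "2017-11-06": 102,
    "2017-11-05": 0,
    "2017-11-04": 0,
}


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) fences: quartiles widened by k * IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clean(series, k=1.5):
    """Replace values outside the fences with the median of the rest.

    Returns the cleaned mapping and the set of flagged dates.
    """
    lower, upper = tukey_fences(list(series.values()), k)
    flagged = {d for d, v in series.items() if v < lower or v > upper}
    # Assumption: the median of the unflagged values is an acceptable fill-in.
    fallback = median(v for d, v in series.items() if d not in flagged)
    cleaned = {d: (fallback if d in flagged else v) for d, v in series.items()}
    return cleaned, flagged


cleaned, flagged = clean(usage)
print(sorted(flagged))
print(cleaned)
```

On this sample only 2017-11-06 falls outside the fences, so the 14 on 11-09 is kept. With a longer series, the same idea could be applied over a rolling window, and pandas offers `Series.quantile` and `Series.interpolate` for the quantile and fill-in steps.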