Price repair fixes and improvements
Fixes:
- fix reconstruction mis-calibration with tiny DataFrames
- fix detecting last-active-trading-interval when NaNs in DataFrame
- redesign mapping 100x signals to ranges:
  - no change for signals before last-active-trading-interval
  - but for signals after last-active-trading-interval, process in reverse order
Improvements:
- increase max reconstruction depth from 1 to 2. E.g. now 1wk can be repaired with 1h (1wk->1d->1h)
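A minimal sketch of what a depth limit of 2 means for the 1wk->1d->1h example. The `FINER_INTERVAL` table and `repair_chain` helper are hypothetical names for illustration, not the library's API:

```python
# Hypothetical table: which finer interval is fetched to reconstruct a coarser one.
FINER_INTERVAL = {"1wk": "1d", "1d": "1h"}


def repair_chain(interval, max_depth=2):
    """Return the intervals visited when reconstructing `interval`,
    following at most `max_depth` hops to finer data."""
    chain = [interval]
    for _ in range(max_depth):
        finer = FINER_INTERVAL.get(chain[-1])
        if finer is None:
            break  # no finer interval known in this sketch
        chain.append(finer)
    return chain
```

With `max_depth=1` a bad 1wk row can only be rebuilt from 1d data; with `max_depth=2` a bad 1d row inside that week can itself be rebuilt from 1h first.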
Several improvements to price repair
Repair 100x and split errors:
- Handle stocks that were recently suspended - use latest ACTIVE trading as baseline
- Improve error identification:
  - Restrict repair to no older than 1 year before oldest split
  - To reduce false positives when checking for multiday split errors,
    only analyse 'Open' and 'Close' and use average change instead of nearest-to-1
  - For weekly intervals reduce threshold to 3x standard deviation (5x was too high),
    and for monthly increase to 6x
  - For multiday intervals, if errors only detected in 1 column then assume false positive => ignore
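The interval-dependent threshold can be sketched roughly as below. This is an illustration only: the function name, the 5x default for other intervals, and the use of a plain mean and standard deviation are assumptions, not the actual implementation.

```python
import numpy as np

# Thresholds in standard deviations, taken from the notes above.
SD_THRESHOLD = {"1wk": 3.0, "1mo": 6.0}
DEFAULT_SD_THRESHOLD = 5.0  # assumption: 5x for all other intervals


def flag_large_changes(open_prices, close_prices, interval):
    """Flag period-to-period changes that look like split errors.

    Only 'Open' and 'Close' are analysed, and their changes are averaged.
    """
    o = np.asarray(open_prices, dtype=float)
    c = np.asarray(close_prices, dtype=float)
    # Ratio of each price to the previous one, averaged over Open and Close.
    change = (o[1:] / o[:-1] + c[1:] / c[:-1]) / 2.0
    sd = change.std()
    if sd == 0:
        return np.zeros(change.shape, dtype=bool)  # perfectly flat series
    threshold = SD_THRESHOLD.get(interval, DEFAULT_SD_THRESHOLD)
    return np.abs(change - change.mean()) > threshold * sd
```

Element `i` of the result refers to the change from row `i` to row `i + 1`. The same 2x jump in a 31-row series is flagged at the weekly threshold but not at the monthly one.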
Repair missing div-adjust:
- Fix repair of multiday intervals containing dividend
Price reconstruction:
- Move to after repairing 100x and split errors, so calibration works properly
- Fix maximum depth and reduce to 1
- Restrict calibration to 'Open' and 'Close', because 'Low' and 'High' can differ significantly between e.g. 1d and day-of-1h
Miscellaneous:
- Deprecate repair='silent', the logging module handles this
- Improve tests for 100x and split errors
- New test for 'repair missing div adjust'
Improve split-repair of multi-day intervals. Because a split error can occur within a multi-day interval, e.g. mid-way through a week, each OHLC column needs to be repaired separately.
Increase robustness of repair 'Adj Close'
Limit price-repair recursion depth to 2
Main change is fixing price-repair of 1d 'Adj Close'. 1d repair uses 1h data, but 1h is never div-adjusted. For correct 1d 'Adj Close', extract div-adjustment from the good 1d data, and calculate it for any bad 1d data. A new unit test ensures correctness.
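A rough sketch of the idea, assuming the div-adjustment is expressed as the ratio 'Adj Close' / 'Close'. The helper name, the `bad_rows` mask, and filling the factor from the nearest good row are assumptions made for illustration; the actual calculation may differ.

```python
import numpy as np
import pandas as pd


def rebuild_adj_close(df, bad_rows):
    """Recompute 'Adj Close' for rows rebuilt from unadjusted 1h data.

    `bad_rows` is a boolean mask marking rows whose 'Adj Close' cannot be trusted.
    """
    out = df.copy()
    # Div-adjustment factor from the good rows only.
    factor = (out["Adj Close"] / out["Close"]).mask(bad_rows)
    # Assumption: borrow the factor from the next good row, else the previous one.
    factor = factor.bfill().ffill()
    out.loc[bad_rows, "Adj Close"] = out.loc[bad_rows, "Close"] * factor[bad_rows]
    return out
```

Note that the true factor changes on ex-dividend dates, so borrowing from a neighbouring row is only an approximation near a dividend.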
Other changes:
- bug fix in split-repair logic to handle price=0
- improve unit test coverage on price dividend
- add 1wk interval to split-repair unit test
Stumbled upon another type of 100x price error - Yahoo may switch a symbol from e.g. cents -> $ on some recent date, so recent prices are 100x different. The new split-repair is perfect for this - set change to 100 and ignore 'Volume'.
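As an illustration of this kind of error, the sketch below finds a single date where 'Close' steps by roughly 100x and rescales the older rows, leaving 'Volume' alone. The function name, the tolerance, and the choice to treat the most recent rows as correct are assumptions, not the library's split-repair code.

```python
import numpy as np
import pandas as pd


def repair_100x_switch(df, change=100.0, tol=0.2):
    """Rescale older prices when a ~`change`x unit switch is found on one date."""
    price_cols = [c for c in ("Open", "High", "Low", "Close", "Adj Close") if c in df.columns]
    out = df.copy()
    close = out["Close"].to_numpy(dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        step = close[1:] / close[:-1]  # newer / older
    dropped = np.abs(step * change - 1.0) < tol  # e.g. cents -> $
    jumped = np.abs(step / change - 1.0) < tol   # e.g. $ -> cents
    hits = np.flatnonzero(dropped | jumped)
    if hits.size != 1:
        return out  # this sketch only handles one clear switch
    i = hits[0]
    factor = 1.0 / change if dropped[i] else change
    older = out.index[: i + 1]
    # 'Volume' is deliberately not in price_cols, so it is left untouched.
    out.loc[older, price_cols] = out.loc[older, price_cols] * factor
    return out
```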
An out-of-range dividend was breaking merge with 1mo-prices, so fixed that. Also replaced the mega-loop with Numpy, much clearer now. Improved its tests.
New logging in `download` stores the tracebacks, but the logic was faulty; this fixes that.
Also improves error handling in `download`.
Unit tests should have detected this so improved them:
- add/improve `download` tests
- disable tests that require Yahoo decryption (because it is broken)
- fix logging-related errors
- improve session use