I have a dataset of financial stock data. Some features are shared across stocks: for example, the daily gold price is the same for every stock on a given day, while each stock's own price is of course different.
When I split 80/10/10 randomly, the model is effectively "cheating": rows from the same day end up in training, validation, and testing, so shared features like the gold price leak across the splits. Validation/test accuracy looks great, but real-world live performance is bad.
When I split sequentially (i.e., first 8 years of data for training, the next year for validation, the last year for testing), the accuracy is bad, and live performance is also bad.
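For concreteness, here is roughly how I'm doing the two splits. This is a minimal sketch: `df`, the `date` column, and the cutoff dates are placeholders for my actual data.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# df is a placeholder for my actual DataFrame: one row per (stock, day),
# sorted by a "date" column, with shared features like "gold_price".

# Split 1: random 80/10/10 -- rows from the same day can land in train,
# validation, AND test, so shared features leak across the splits.
train, rest = train_test_split(df, test_size=0.2, random_state=42)
val, test = train_test_split(rest, test_size=0.5, random_state=42)

# Split 2: sequential -- first 8 years train, next year validation,
# last year test (cutoff dates are made up for illustration).
train = df[df["date"] < "2016-01-01"]
val = df[(df["date"] >= "2016-01-01") & (df["date"] < "2017-01-01")]
test = df[df["date"] >= "2017-01-01"]
```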
What I want to ask is: should I split just the first 9 years randomly into training and validation, and then test separately on the last year?
Or is the sequential split as good as it's going to get, and I simply can't predict the future?
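To make the first option concrete, this is what I have in mind, again as a sketch with the same placeholder `df` and made-up cutoff date:

```python
from sklearn.model_selection import train_test_split

# Hold out the last year chronologically; it is never touched until the end.
past = df[df["date"] < "2017-01-01"]        # first 9 years
final_test = df[df["date"] >= "2017-01-01"] # last year, testing only

# Random train/validation split within the first 9 years only.
train, val = train_test_split(past, test_size=0.1, random_state=42)
```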