The model below reads data from a CSV file (date, open, high, low, close, volume), arranges it into windows and then builds an LSTM model that tries to predict whether the next period's close will be higher than the current one, based on a number of previous close values.
However, validation accuracy sits at about 53.8% no matter whether I:
- change the hyperparameters
- make it a deeper (stacked) model
- use many more features than just the close
To test whether I had made a simple mistake, I generated another data source: a signal I created by adding sine, cosine and a little noise, so that I KNEW a model should be trainable on it. And it was: the model below got about 94% validation accuracy on that signal without any tuning.
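For reference, the synthetic signal was generated roughly like this (a minimal sketch; the exact frequencies, amplitudes and noise level here are placeholders I picked, not the original values):

import numpy as np
t = np.arange(100000)
# two periodic components plus a little Gaussian noise
signal = np.sin(0.02 * t) + np.cos(0.007 * t) + np.random.normal(0, 0.05, t.size)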
With that in mind, how come the same model doesn't seem to work when I run it on actual data (EURUSD 1-minute bars)?
Does anyone see an error, or can someone point me in the right direction?
import pandas as pd
import numpy as np
fpath = 'Data/'
fname = 'EURUSD_M1_1'
df = pd.read_csv(fpath + fname + '_clean.csv')
# file contains date, open, high, low, close, volume
# "y" is whether the next period's Close value is higher or lower than current Close value
outlook = 1
df['y'] = df['Close']<df['Close'].shift(-outlook)
# Drop all NAN's
df.dropna(how="any",inplace=True)
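# Caveat (my note, not in the original): Close.shift(-outlook) is NaN on the
# last row, and comparing against NaN yields False rather than NaN, so the
# dropna() above does not remove that final, meaningless label.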
# Get X and y. To keep it simple, just use Close
X_df = df['Close']
# "y" is whether the next period's Close value is higher or lower than current Close value
outlook = 1
y_df = df['Close']<df['Close'].shift(-outlook)
# Train/test split
def train_test_split(X_df, y_df, train_perc):
    idx = int(train_perc / 100 * X_df.shape[0])
    X_train_df = X_df.iloc[0:idx]
    X_test_df = X_df.iloc[idx:]
    y_train_df = y_df.iloc[0:idx]
    y_test_df = y_df.iloc[idx:]
    # .as_matrix() was removed in pandas 1.0; .to_numpy() is the replacement
    return X_train_df.to_numpy(), X_test_df.to_numpy(), y_train_df.to_numpy(), y_test_df.to_numpy()
X_train, X_test, y_train, y_test = train_test_split(X_df, y_df, 90)
# Scaling
def scale(X):
    Xmax = max(X)
    Xmin = min(X)
    return (X - Xmin) / (Xmax - Xmin)
X_train_scaled = scale(X_train)
X_test_scaled = scale(X_test)
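# Variant worth trying (my suggestion, not the original behavior): reuse the
# training min/max for the test split so both splits end up in the same units:
# Xmin, Xmax = X_train.min(), X_train.max()
# X_test_scaled = (X_test - Xmin) / (Xmax - Xmin)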
# Build the model
import tensorflow as tf
from keras.models import Sequential
from keras.layers import LSTM, Dense
# ### Constants
num_time_steps = 5     # length of the input window (number of previous closes fed to the LSTM)
num_features = 1       # number of features per time step (just Close)
num_neurons = 97
num_outputs = 1        # single sigmoid output: probability that the next close is higher
learning_rate = 0.0001 # note: defined here but never passed to the optimizer below, so 'adam' runs with its default
nb_epochs = 10         # number of passes over the training data; you can play with this
batch_size = 32
# Reshaping: slide a window of num_time_steps closes over the series;
# the label for each window is the up/down flag of the step right after it
X_train_scaled = np.reshape(X_train_scaled, [-1, 1])
nb_samples_train = X_train_scaled.shape[0] - num_time_steps
X_train_scaled_reshaped = np.zeros((nb_samples_train, num_time_steps, num_features))
y_train_reshaped = np.zeros((nb_samples_train))
for i in range(nb_samples_train):
    y_position = i + num_time_steps
    X_train_scaled_reshaped[i] = X_train_scaled[i:y_position]
    y_train_reshaped[i] = y_train[y_position]
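# Cross-check (assumes NumPy >= 1.20; not in the original code): the same
# windows can be built without the loop via a sliding window view
from numpy.lib.stride_tricks import sliding_window_view
windows = sliding_window_view(X_train_scaled.ravel(), num_time_steps)[:-1]
assert np.array_equal(windows.reshape(-1, num_time_steps, num_features), X_train_scaled_reshaped)
assert np.array_equal(y_train[num_time_steps:], y_train_reshaped)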
model = Sequential()
stacked = False
if stacked:
    model.add(LSTM(num_neurons, return_sequences=True, input_shape=(num_time_steps, num_features), activation='relu', dropout=0.5))
    model.add(LSTM(num_neurons, activation='relu', dropout=0.5))
else:
    model.add(LSTM(num_neurons, input_shape=(num_time_steps, num_features), activation='relu', dropout=0.5))
model.add(Dense(units=num_outputs, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
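# If the custom learning rate above is meant to be used, pass an optimizer
# object instead of the 'adam' string (sketch for the standalone Keras used here):
# from keras.optimizers import Adam
# model.compile(optimizer=Adam(lr=learning_rate), loss='binary_crossentropy', metrics=['accuracy'])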
history = model.fit(X_train_scaled_reshaped,
                    y_train_reshaped,
                    batch_size=batch_size,
                    epochs=nb_epochs,
                    validation_split=0.3)
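# For completeness (my sketch, mirroring the training windowing; not in the
# original post): evaluate on the held-out test split
X_test_scaled = np.reshape(X_test_scaled, [-1, 1])
nb_samples_test = X_test_scaled.shape[0] - num_time_steps
X_test_reshaped = np.zeros((nb_samples_test, num_time_steps, num_features))
y_test_reshaped = np.zeros((nb_samples_test))
for i in range(nb_samples_test):
    y_position = i + num_time_steps
    X_test_reshaped[i] = X_test_scaled[i:y_position]
    y_test_reshaped[i] = y_test[y_position]
test_loss, test_acc = model.evaluate(X_test_reshaped, y_test_reshaped, batch_size=batch_size)
print('test accuracy:', test_acc)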