Modelling Correlation Between Multiple Interrelated Time Series

Problem Definition:

Model the inter-relation between stocks and predict each stock's next price, given minute-by-minute data on stock prices.

Stock values are correlated: events in one stock give information about events in another stock, which in turn inform a third, and so on. These 2nd-order and 3rd-order relations can be seen in the historical stock prices. Of course, the situation is further complicated because global events also affect stock prices and may prevent these 2nd- and 3rd-order effects from playing out.

Example of 2nd order correlation – Sales of Cadbury being indicative of recession.
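One simple way to look for such lead-lag relations in historical prices is a lagged correlation between two return series. The sketch below is only an illustration of that idea, not the model proposed in this post: the function names (`pearson`, `lagged_correlation`, `best_lag`) and the toy numbers are made up for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / math.sqrt(vx * vy)


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])


def best_lag(leader, follower, max_lag):
    """Return the lag in 0..max_lag with the strongest absolute correlation."""
    return max(range(max_lag + 1),
               key=lambda k: abs(lagged_correlation(leader, follower, k)))


# Toy returns (made-up numbers): stock_b repeats stock_a two steps later.
stock_a = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2, -0.3, 0.1, 0.05, -0.15]
stock_b = [0.0, 0.0] + stock_a[:-2]
print(best_lag(stock_a, stock_b, max_lag=4))
```

A pairwise check like this only captures linear, fixed-lag relations between two stocks, which is one reason to look at a learned model for the general case.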

Can you model the stock price to capture short-term and long-term dependencies on other stocks?

Modelling:

Handling data:

We have minute-by-minute data over several days.

  1. We should also capture trade volume, the number of shares, and buy/sell information.
  2. For each stock, we can group the data into 30-minute segments, to make sure we are not modelling noise. This will also help us handle interrelations across longer sequences.
  3. For each stock, we can create an encoding of the events in a segment. The encoding will be a dense vector representation that describes the events that happened in that time segment for the stock.
  4. Day breaks should also be represented in the data sequence as a special token. Similarly, weekend breaks and special events can have their own tokens.
  5. Within a day, each segment will have a position associated with it; we will model this position using positional embeddings. For example, the start of the day may have position 0.
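The data-handling steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: a 09:30 market open, hypothetical token names `<DAY_BREAK>` and `<WEEKEND_BREAK>`, and a hand-crafted open/high/low/close/volume summary standing in for the learned dense encoding of each segment.

```python
from datetime import datetime, timedelta

SEGMENT_MINUTES = 30               # segment length suggested in the text
MARKET_OPEN_MINUTE = 9 * 60 + 30   # assumed 09:30 open; adjust per exchange
DAY_BREAK = "<DAY_BREAK>"          # hypothetical token names
WEEKEND_BREAK = "<WEEKEND_BREAK>"


def build_sequence(bars):
    """Convert time-sorted (timestamp, price, volume) minute bars into a
    list of segment summaries interleaved with break tokens."""
    sequence, current, prev_day = [], None, None
    for ts, price, volume in bars:
        day = ts.date()
        minute_of_day = ts.hour * 60 + ts.minute
        # Position 0 is the first segment after the open.
        position = (minute_of_day - MARKET_OPEN_MINUTE) // SEGMENT_MINUTES
        if current is None or current["key"] != (day, position):
            if current is not None:
                sequence.append(current)
            if prev_day is not None and day != prev_day:
                # Any gap longer than one calendar day is treated as a
                # weekend here; holidays would need their own token.
                gap_days = (day - prev_day).days
                sequence.append(WEEKEND_BREAK if gap_days > 1 else DAY_BREAK)
            current = {"key": (day, position), "position": position,
                       "open": price, "high": price, "low": price,
                       "close": price, "volume": 0}
            prev_day = day
        current["high"] = max(current["high"], price)
        current["low"] = min(current["low"], price)
        current["close"] = price
        current["volume"] += volume
    if current is not None:
        sequence.append(current)
    return sequence


def minute_bars(start, count, price=100.0):
    """Generate made-up minute bars with a slowly rising price."""
    return [(start + timedelta(minutes=i), price + 0.01 * i, 100)
            for i in range(count)]


# Friday 09:30-10:29 (two segments), then Monday 09:30-09:59 (one segment).
bars = (minute_bars(datetime(2024, 1, 5, 9, 30), 60)
        + minute_bars(datetime(2024, 1, 8, 9, 30), 30))
sequence = build_sequence(bars)
for item in sequence:
    print(item if isinstance(item, str) else (item["position"], item["volume"]))
```

The `position` field is the index that a positional embedding would look up, and the break tokens sit in the same sequence as the segments so the sequence model sees the gaps explicitly.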

Modelling:

I will attempt to make a simple model here.

  • Make an encoding of each stock. This can be a separate model that tries to capture the characteristics of the stock and puts them into an embedding –

    1. We can capture the industry in which the company operates.
    2. Market cap – the kind of investors the company has
    3. Performance-based analysis of the company.
  • Each stock will be modelled with its own sequence model. Here I am using a traditional RNN model for ease of understanding; we could look at transformers as well –

    1. The sequence model will be responsible for generating a hidden state representation: a long-term context for the stock and a short-term context vector. The context vector should be able to summarize the time sequence for the company.
    2. The sequence model will get as input the company embedding, the position embedding for the time segment, its own context vectors, and an attention vector generated from the other stocks' context vectors.
    3. The hidden state can be used to make a prediction for the next price segment, as well as a long-term price prediction. It should also be able to indicate the confidence of the prediction.
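To make the data flow concrete, here is a rough sketch of one update step for a single stock, written in plain Python. It is a simplification under several assumptions: a single hidden state stands in for both the long-term and short-term context, the attention is ordinary scaled dot-product attention, the weights are random rather than trained, the segment encoding is added to the inputs listed above, and all names and dimensions (`attend`, `rnn_step`, `H`) are hypothetical. The prediction and confidence heads are left out.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, contexts):
    """Scaled dot-product attention: a weighted average of the other
    stocks' context vectors, weighted by similarity to `query`."""
    scale = math.sqrt(len(query))
    scores = [sum(q * c for q, c in zip(query, ctx)) / scale
              for ctx in contexts]
    weights = softmax(scores)
    dim = len(contexts[0])
    mixed = [sum(w * ctx[i] for w, ctx in zip(weights, contexts))
             for i in range(dim)]
    return mixed, weights


def rnn_step(W, U, b, x, h):
    """Elman RNN update: h_new = tanh(W x + U h + b)."""
    return [math.tanh(sum(w * xi for w, xi in zip(W[j], x))
                      + sum(u * hi for u, hi in zip(U[j], h))
                      + b[j])
            for j in range(len(h))]


def rand_matrix(rows, cols):
    """Random weights as a stand-in for trained parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)]
            for _ in range(rows)]


random.seed(0)
H = 4                                   # context size (illustrative)
company_emb = [0.2, -0.1]               # made-up company embedding
position_emb = [0.0, 1.0]               # made-up position embedding
segment_enc = [0.3, 0.1, -0.2]          # made-up segment encoding
own_context = [0.0] * H                 # this stock's previous context
other_contexts = [[0.1, 0.4, -0.3, 0.2],   # contexts of two other stocks
                  [-0.2, 0.0, 0.5, 0.1]]

# Attention vector built from the other stocks' context vectors.
attention_vec, weights = attend(own_context, other_contexts)

# Concatenate all inputs and run one recurrent update.
x = company_emb + position_emb + segment_enc + attention_vec
W, U, b = rand_matrix(H, len(x)), rand_matrix(H, H), [0.0] * H
new_context = rnn_step(W, U, b, x, own_context)
print(weights, new_context)
```

In a full model, every stock would run this step at each time segment, so each stock's new context feeds the attention of all the others at the next segment.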

Other approaches:

For modelling correlation between multiple time series:


Want to connect? Reach out @varuntul22.