1 Storing Data

  • Consider a data set of words

      'one', 'two', 'three', 'four', 'five'
    
  • The first step is to join the words together
  • A delimiter such as a full stop ‘.’ is required so we know where one word ends and the next begins

      one.two.three.four.five
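
The joining step can be sketched in a few lines of Python. This is a minimal illustration, assuming the delimiter never occurs inside a word; the names `words`, `DELIMITER`, `text` and `recovered` are hypothetical.

```python
# Join a list of words into one string, using '.' as the delimiter.
words = ['one', 'two', 'three', 'four', 'five']
DELIMITER = '.'  # assumption: this character never appears inside a word

text = DELIMITER.join(words)
print(text)  # one.two.three.four.five

# Splitting on the same delimiter recovers the original words.
recovered = text.split(DELIMITER)
```

Because the delimiter marks every word boundary, the split is an exact inverse of the join.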
    

2 Construct Neural Network

  • One strategy for modelling this data with a neural network is to predict each word from the three words that precede it in the dataset
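
As a rough sketch of how such training samples might be built, the snippet below slides a three-word window over the data and pairs each window with the word that follows it. The helper `make_samples`, its `context_size` parameter and the window step of one are assumptions made for illustration, not something the notes specify.

```python
def make_samples(tokens, context_size=3):
    """Pair every run of `context_size` tokens with the token that follows it."""
    return [
        (tokens[i:i + context_size], tokens[i + context_size])
        for i in range(len(tokens) - context_size)
    ]

# Recover the word list from the delimited string, then build the samples.
tokens = 'one.two.three.four.five'.split('.')
samples = make_samples(tokens)

for context, target in samples:
    print(context, '->', target)
```

With only five words there are just two samples; a realistic dataset would yield one sample per window position.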

2.1 Layers:

Three layers can be implemented. The idea is to make sure each word is assessed in the context of the words that precede it. The function of each layer is summarised in the following table:

Layer    Function
------   --------
First    Uses only the first word as input. It is the embedding layer, which maps the input word to a hidden state.
Second   Takes the embedding of the second word plus the output of the first layer. Connects one hidden state to the next.
Third    Takes the embedding of the third word plus the output of the second layer. Outputs the fourth word with an associated probability.

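Another important feature of the algorithm is that all three layers share a single weight matrix. This ensures that the way a word updates the hidden state built from the preceding words does not depend on its position in the sequence, so the layer must learn to handle every position rather than just one.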
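
The forward pass described above can be sketched in plain Python. This is an untrained toy under stated assumptions, not the book's implementation: the weights are random, the hidden size `n_hidden` and all function names are hypothetical, and a practical version would use a tensor library such as PyTorch. The point it illustrates is that `hidden_w` is the one matrix applied at all three steps.

```python
import math
import random

random.seed(0)

vocab = ['one', 'two', 'three', 'four', 'five']
word_to_idx = {w: i for i, w in enumerate(vocab)}
n_hidden = 4  # hypothetical hidden size

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Untrained (random) weights, created once and reused at every step.
embedding = rand_matrix(len(vocab), n_hidden)  # input word -> hidden
hidden_w = rand_matrix(n_hidden, n_hidden)     # hidden -> hidden, shared by all layers
output_w = rand_matrix(len(vocab), n_hidden)   # hidden -> one score per vocabulary word

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def relu(vector):
    return [max(0.0, v) for v in vector]

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(context):
    """Return a probability for each vocabulary word being the next word."""
    hidden = [0.0] * n_hidden
    for word in context:
        # Add this word's embedding to the state carried over from earlier words,
        # then apply the same hidden-to-hidden weights as every other layer.
        emb = embedding[word_to_idx[word]]
        hidden = [h + e for h, e in zip(hidden, emb)]
        hidden = relu(matvec(hidden_w, hidden))
    return softmax(matvec(output_w, hidden))

probs = predict(['one', 'two', 'three'])
best = max(range(len(vocab)), key=lambda i: probs[i])
print(vocab[best], round(probs[best], 3))
```

Since the weights are random, the printed prediction is meaningless until the model is trained; only the structure of the computation is of interest here.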

The visual depiction of this network is shown below:

[Figure: diagram of the three-layer network]

Source: FastAI Book Chapter 12