Why Sequence Models?

In speech recognition, you are given an input audio clip X and asked to map it to a text transcript Y. Both the input and the output are sequence data: X is an audio clip that plays out over time, and Y is a sequence of words.

Notation

Motivating Example

Let's say you want a sequence model to automatically tell you where the people's names are in a given sentence. This problem is called named-entity recognition. Search engines use it, for example, to index all the people mentioned in the news articles of the last 24 hours.

$x^{(i)<t>}$ := the t-th element of the input sequence of the i-th training example

$y^{(i)<t>}$ := the t-th element of the output sequence of the i-th training example

$T^{(i)}_{x}$ := length of the input sequence of the i-th training example

$T^{(i)}_{y}$ := length of the output sequence of the i-th training example
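
To make the notation concrete, here is a minimal Python sketch for the named-entity recognition setting. The two sentences, their labels, and the helper names (`x_it`, `y_it`, `T_x`, `T_y`) are hypothetical and chosen only for illustration; a label of 1 marks a word that is part of a person's name. The helpers use the 1-based indices of the notation, while Python lists are 0-based.

```python
# Hypothetical toy training set: each example is (input words, per-word labels).
# Label 1 = the word is part of a person's name, 0 = it is not.
dataset = [
    ("Alice Smith met Bob Jones in Paris yesterday".split(),
     [1, 1, 0, 1, 1, 0, 0, 0]),
    ("Carol called".split(),
     [1, 0]),
]


def x_it(i, t):
    """x^{(i)<t>}: t-th input element of the i-th example (both 1-based)."""
    return dataset[i - 1][0][t - 1]


def y_it(i, t):
    """y^{(i)<t>}: t-th output element of the i-th example (both 1-based)."""
    return dataset[i - 1][1][t - 1]


def T_x(i):
    """T_x^{(i)}: input sequence length of the i-th example."""
    return len(dataset[i - 1][0])


def T_y(i):
    """T_y^{(i)}: output sequence length of the i-th example."""
    return len(dataset[i - 1][1])


if __name__ == "__main__":
    print(T_x(1), T_y(1))          # lengths of the first example
    print(x_it(1, 4), y_it(1, 4))  # fourth word of the first example and its label
```

In this toy task every word gets exactly one label, so the input and output lengths of an example coincide. That is a property of this example, not of sequence models in general.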

Representing Words

Vocabulary: the list of words used in the representation.
Vocabulary sizes range from about 50,000 words to millions of words.
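
As a rough sketch of how a vocabulary can be used, the snippet below builds one from a tiny hypothetical corpus and maps each word to its position in the list. The corpus, the `lookup` helper, and the `<UNK>` placeholder for out-of-vocabulary words are assumptions made for illustration; a real vocabulary would be far larger.

```python
# Hypothetical toy corpus; a real one would contain many more words.
corpus = ["a new spell", "a new book and a quill"]

# Collect the distinct words and sort them so every word has a stable index.
vocabulary = sorted({word for line in corpus for word in line.split()})

# Assumption: reserve one extra entry for words missing from the vocabulary.
vocabulary.append("<UNK>")

word_to_index = {word: index for index, word in enumerate(vocabulary)}


def lookup(word):
    """Return the index of `word`, or the index of <UNK> if it is unknown."""
    return word_to_index.get(word, word_to_index["<UNK>"])


if __name__ == "__main__":
    print(vocabulary)
    print(lookup("new"), lookup("dragon"))
```

Sorting is only one way to fix the order; what matters is that each word keeps the same index everywhere the vocabulary is used.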

One-hot representation
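
A one-hot vector has one entry per vocabulary word, with a 1 at the index of the represented word and 0 everywhere else. The sketch below illustrates this with a hypothetical five-word vocabulary and a hypothetical `one_hot` helper; plain Python lists stand in for the vectors.

```python
# Hypothetical five-word vocabulary, used only for illustration.
vocabulary = ["a", "and", "book", "new", "spell"]


def one_hot(word, vocabulary):
    """Return a list with a 1 at the index of `word` and 0 elsewhere.

    Raises ValueError if `word` is not in the vocabulary.
    """
    vector = [0] * len(vocabulary)
    vector[vocabulary.index(word)] = 1
    return vector


if __name__ == "__main__":
    sentence = "a new spell".split()
    for word in sentence:
        print(word, one_hot(word, vocabulary))
```

With a vocabulary of tens of thousands of words or more, each vector would be equally long and almost entirely zeros, so practical code usually stores only the index and expands it when needed.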