DecoderAttention
DecoderAttention is a subclass of the PyTorch nn.Module class.
This part of the neural network takes the context_vector from the Encoder and produces the attention_vector.
- DecoderAttention.__init__(embedding, context_size, attention_size, num_layers=1, dropout=0.1, bidirectional=False, LSTM=False)[source]
Attention decoder for retrieving attention from the context vector.
- Parameters:
embedding (nn.Embedding) – Embedding layer to use.
context_size (int) – Size of context to expect as input.
attention_size (int) – Size of attention vector.
num_layers (int, default=1) – Number of recurrent layers to use.
dropout (float, default=0.1) – Dropout rate to apply.
bidirectional (boolean, default=False) – If True, use a bidirectional recurrent layer.
LSTM (boolean, default=False) – If True, use an LSTM instead of a GRU.
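A minimal construction sketch is shown below. The import path, vocabulary size, embedding dimension, and hyperparameter values are illustrative assumptions, not values taken from this documentation; only the constructor signature above is documented.

```python
import torch.nn as nn

# from <your_package>.decoders import DecoderAttention  # assumption: adjust
# the import to wherever DecoderAttention lives in your installation.

# Hypothetical embedding layer: vocabulary of 100 tokens, 128-dimensional.
embedding = nn.Embedding(num_embeddings=100, embedding_dim=128)

decoder_attention = DecoderAttention(
    embedding      = embedding,
    context_size   = 10,     # length of the context to expect as input
    attention_size = 10,     # size of the produced attention vector
    num_layers     = 1,
    dropout        = 0.1,
    bidirectional  = False,
    LSTM           = False,  # keep the default GRU; True switches to an LSTM
)
```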
Forward
The forward() function takes the context_vector and produces the attention_vector.
This method is also invoked through the __call__ method, i.e. when the object is called directly; see the usage sketch after the parameter list below.
- DecoderAttention.forward(context_vector, previous_input=None)[source]
Compute attention based on the context vector and previous input.
- Parameters:
context_vector (torch.Tensor of shape=(n_samples, hidden_size)) – Context vector from which to compute attention.
previous_input (torch.Tensor of shape=(n_samples, embedding_dim), optional, default=None) – Previous input from which to compute attention.
- Returns:
attention (torch.Tensor of shape=(n_samples, context_size)) – Computed attention.
context_vector (torch.Tensor of shape=(n_samples, hidden_size)) – Updated context vector.
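A short usage sketch for forward(), assuming the decoder_attention instance from the construction example above and the documented tensor shapes; the batch size (32) and hidden_size (128) are illustrative assumptions.

```python
import torch

# Hypothetical batch of 32 context vectors with hidden_size=128,
# matching the decoder_attention instance constructed above.
context_vector = torch.randn(32, 128)

# Calling the module directly dispatches to forward() via __call__.
attention, context_vector = decoder_attention(context_vector)

print(attention.shape)       # torch.Size([32, 10])  -> (n_samples, context_size)
print(context_vector.shape)  # torch.Size([32, 128]) -> (n_samples, hidden_size)
```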