DecoderAttention

DecoderAttention is a subclass of PyTorch's nn.Module. This part of the neural network takes the context_vector produced by the Encoder and produces the attention_vector.
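Conceptually, the decoder turns a context vector into an attention distribution over the context positions. The following is a minimal sketch of that mechanism (a single recurrent step followed by a softmax-normalised linear projection). It illustrates the idea only and is not the library's actual implementation; the class name TinyAttentionDecoder and all dimensions are made up for illustration:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyAttentionDecoder(nn.Module):
        """Illustrative sketch only: one GRU step + linear projection + softmax."""

        def __init__(self, embedding_dim, hidden_size, context_size):
            super().__init__()
            self.gru  = nn.GRU(embedding_dim, hidden_size, batch_first=True)
            self.attn = nn.Linear(hidden_size, context_size)

        def forward(self, context_vector, previous_input):
            # context_vector: (n_samples, hidden_size)  -- treated as the GRU hidden state
            # previous_input: (n_samples, embedding_dim) -- embedded previous output
            _, hidden = self.gru(previous_input.unsqueeze(1),
                                 context_vector.unsqueeze(0))
            hidden    = hidden.squeeze(0)                     # (n_samples, hidden_size)
            attention = F.softmax(self.attn(hidden), dim=-1)  # (n_samples, context_size)
            return attention, hidden
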

class context_builder.decoders.DecoderAttention(*args: Any, **kwargs: Any)
DecoderAttention.__init__(embedding, context_size, attention_size, num_layers=1, dropout=0.1, bidirectional=False, LSTM=False)

Attention decoder for retrieving attention from context vector.

Parameters:
  • embedding (nn.Embedding) – Embedding layer to use.

  • context_size (int) – Size of context to expect as input.

  • attention_size (int) – Size of attention vector.

  • num_layers (int, default=1) – Number of recurrent layers to use.

  • dropout (float, default=0.1) – Dropout rate to use.

  • bidirectional (bool, default=False) – If True, use a bidirectional recurrent layer.

  • LSTM (bool, default=False) – If True, use an LSTM instead of a GRU.
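
A decoder could be constructed as follows. This is a usage sketch only; the vocabulary size and all dimension values are made-up assumptions, not values prescribed by the library:

    import torch.nn as nn

    from context_builder.decoders import DecoderAttention

    # Hypothetical dimensions, chosen purely for illustration.
    vocab_size     = 100   # number of distinct events in the vocabulary
    embedding_dim  = 128   # dimension of each embedded event
    context_size   = 10    # length of the context to attend over
    attention_size = 10    # size of the produced attention vector

    # Embedding layer shared with the rest of the network.
    embedding = nn.Embedding(vocab_size, embedding_dim)

    decoder = DecoderAttention(
        embedding      = embedding,
        context_size   = context_size,
        attention_size = attention_size,
        num_layers     = 1,
        dropout        = 0.1,
        bidirectional  = False,
        LSTM           = False,   # default: use a GRU rather than an LSTM
    )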

Forward

The forward() method takes the context_vector and produces the attention_vector. As with any nn.Module, it is also invoked through the __call__ method, i.e. when the object itself is called.

DecoderAttention.forward(context_vector, previous_input=None)

Compute attention based on the context vector and the previous input.

Parameters:
  • context_vector (torch.Tensor of shape=(n_samples, hidden_size)) – Context vector from which to compute attention.

  • previous_input (torch.Tensor of shape=(n_samples, embedding_dim), optional) – Input from which to compute attention.

Returns:

  • attention (torch.Tensor of shape=(n_samples, context_size)) – Computed attention.

  • context_vector (torch.Tensor of shape=(n_samples, hidden_size)) – Updated context vector.
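
Calling the decoder object directly dispatches to forward(). A usage sketch, continuing the instantiation example above (n_samples and hidden_size are illustrative assumptions; hidden_size must match the context vector produced by the Encoder):

    import torch

    # Hypothetical batch of context vectors, e.g. as produced by the Encoder.
    n_samples      = 32
    hidden_size    = 128
    context_vector = torch.randn(n_samples, hidden_size)

    # Calling the module directly invokes forward().
    attention, context_vector = decoder(context_vector)

    print(attention.shape)        # expected: (n_samples, context_size)
    print(context_vector.shape)   # expected: (n_samples, hidden_size)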