DecoderAttention

DecoderAttention is a subclass of PyTorch's nn.Module. This part of the neural network takes the context_vector produced by the Encoder and produces the attention_vector.
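Conceptually, the decoder turns a context vector into an attention distribution over the context positions. The following is a minimal sketch of that mechanism (a single recurrent step followed by a softmax-normalised linear projection). It illustrates the idea only and is not the library's actual implementation; the class name TinyAttentionDecoder and all dimensions are made up for illustration:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyAttentionDecoder(nn.Module):
        """Illustrative sketch only: one GRU step + linear projection + softmax."""

        def __init__(self, embedding_dim, hidden_size, context_size):
            super().__init__()
            self.gru  = nn.GRU(embedding_dim, hidden_size, batch_first=True)
            self.attn = nn.Linear(hidden_size, context_size)

        def forward(self, context_vector, previous_input):
            # context_vector: (n_samples, hidden_size)  -- treated as the GRU hidden state
            # previous_input: (n_samples, embedding_dim) -- embedded previous output
            _, hidden = self.gru(previous_input.unsqueeze(1),
                                 context_vector.unsqueeze(0))
            hidden    = hidden.squeeze(0)                     # (n_samples, hidden_size)
            attention = F.softmax(self.attn(hidden), dim=-1)  # (n_samples, context_size)
            return attention, hidden
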

class context_builder.decoders.DecoderAttention(*args: Any, **kwargs: Any)
DecoderAttention.__init__(embedding, context_size, attention_size, num_layers=1, dropout=0.1, bidirectional=False, LSTM=False)

Attention decoder for retrieving attention from context vector.

Parameters:
  • embedding (nn.Embedding) – Embedding layer to use.

  • context_size (int) – Size of context to expect as input.

  • attention_size (int) – Size of attention vector.

  • num_layers (int, default=1) – Number of recurrent layers to use.

  • dropout (float, default=0.1) – Dropout rate to use.

  • bidirectional (bool, default=False) – If True, use a bidirectional recurrent layer.

  • LSTM (bool, default=False) – If True, use an LSTM instead of a GRU.
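
A decoder could be constructed as follows. This is a usage sketch only; the vocabulary size and all dimension values are made-up assumptions, not values prescribed by the library:

    import torch.nn as nn

    from context_builder.decoders import DecoderAttention

    # Hypothetical dimensions, chosen purely for illustration.
    vocab_size     = 100   # number of distinct events in the vocabulary
    embedding_dim  = 128   # dimension of each embedded event
    context_size   = 10    # length of the context to attend over
    attention_size = 10    # size of the produced attention vector

    # Embedding layer shared with the rest of the network.
    embedding = nn.Embedding(vocab_size, embedding_dim)

    decoder = DecoderAttention(
        embedding      = embedding,
        context_size   = context_size,
        attention_size = attention_size,
        num_layers     = 1,
        dropout        = 0.1,
        bidirectional  = False,
        LSTM           = False,   # default: use a GRU rather than an LSTM
    )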

Forward

The forward() method takes the context_vector and produces the attention_vector. As with any nn.Module, it is also invoked through the __call__ method, i.e. when the object itself is called.

DecoderAttention.forward(context_vector, previous_input=None)

Compute attention based on the context vector and the previous input.

Parameters:
  • context_vector (torch.Tensor of shape=(n_samples, hidden_size)) – Context vector from which to compute attention.

  • previous_input (torch.Tensor of shape=(n_samples, embedding_dim), optional) – Input from which to compute attention.

Returns:

  • attention (torch.Tensor of shape=(n_samples, context_size)) – Computed attention.

  • context_vector (torch.Tensor of shape=(n_samples, hidden_size)) – Updated context vector.
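
Calling the decoder object directly dispatches to forward(). A usage sketch, continuing the instantiation example above (n_samples and hidden_size are illustrative assumptions; hidden_size must match the context vector produced by the Encoder):

    import torch

    # Hypothetical batch of context vectors, e.g. as produced by the Encoder.
    n_samples      = 32
    hidden_size    = 128
    context_vector = torch.randn(n_samples, hidden_size)

    # Calling the module directly invokes forward().
    attention, context_vector = decoder(context_vector)

    print(attention.shape)        # expected: (n_samples, context_size)
    print(context_vector.shape)   # expected: (n_samples, hidden_size)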