Abstract

A system for transducing an input sequence into a target sequence is described. The system includes a sequence transduction neural network for transducing an input sequence having a respective network input at each of a plurality of input positions in an input order into an output sequence having a respective network output at each of a plurality of output positions in an output order. The sequence transduction neural network includes an encoder neural network and a decoder neural network. The encoder neural network is configured to receive the input sequence and generate a respective encoded representation of each of the network inputs in the input sequence. The encoder neural network includes a sequence of one or more encoder subnetworks, each of which is configured to receive a respective encoder subnetwork input for each of the plurality of input positions and to generate a respective encoder subnetwork output for each of the plurality of input positions. Each encoder subnetwork includes an encoder localized self-attention module that is configured to receive the encoder subnetwork input for each of the plurality of input positions. For each particular input position in the input order, the module applies a localized self-attention mechanism over the encoder subnetwork inputs at input positions within a fixed-size window of the particular input position to generate a respective output for that position. The decoder neural network is configured to receive the encoded representations and generate the output sequence.
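
The localized self-attention step described above can be illustrated with a minimal sketch. Everything beyond the windowing idea is an assumption made for illustration and is not taken from the disclosure: the function name `localized_self_attention`, a symmetric window given as a half-width `window`, scaled dot-product scoring with a softmax, and the use of the raw inputs as queries, keys, and values (a trained module would normally apply learned projections first).

```python
import math


def localized_self_attention(inputs, window):
    """Attend from each position only to positions within `window` steps of it.

    `inputs` is a list of equal-length float vectors, one per input position.
    `window` is a hypothetical half-width: position i sees i-window .. i+window.
    """
    n = len(inputs)
    dim = len(inputs[0])
    scale = 1.0 / math.sqrt(dim)  # assumed scaled dot-product scoring
    outputs = []
    for i in range(n):
        # Clip the window at the sequence boundaries.
        lo = max(0, i - window)
        hi = min(n, i + window + 1)
        neighbours = range(lo, hi)
        # Assumption: queries and keys are the raw inputs (no learned projections).
        scores = [
            scale * sum(q * k for q, k in zip(inputs[i], inputs[j]))
            for j in neighbours
        ]
        # Numerically stable softmax over the in-window scores only.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output is the weighted sum of in-window inputs (used as values).
        outputs.append(
            [
                sum(w * inputs[j][d] for w, j in zip(weights, neighbours))
                for d in range(dim)
            ]
        )
    return outputs


if __name__ == "__main__":
    xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
    for position, vector in enumerate(localized_self_attention(xs, window=1)):
        print(position, vector)
```

Because each position scores at most `2 * window + 1` neighbours, the cost of this sketch grows linearly with sequence length for a fixed window, whereas attending over every position grows quadratically.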

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
