Music Transformer's Improved Relative Positional Encoding
This post contains my notes on the improved relative positional encoding method from the paper “Music Transformer” (Huang et al., 2018). Unless otherwise stated, “the paper” refers to that work. I assume familiarity with the original Transformer and with relative positional encoding. The paper's improved method is a significant contribution: it reduces the intermediate memory requirement of relative positional encoding in self-attention (as introduced in “Self-Attention with Relative Position Representations” (Shaw et al., 2018)) from O(L²D) to O(LD), where L is the sequence length and D is the head dimension.
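To make the idea concrete before diving in, here is a minimal numpy sketch of the “skewing” trick as I understand it from the paper, for the masked (causal) case: compute Q·Eᵀ directly against the (L, D) relative-embedding matrix, then pad, reshape, and slice to align each entry to its relative distance. The function name `skew` and the distance indexing convention are my own choices for illustration.

```python
import numpy as np

def skew(rel):
    # rel: (L, L) matrix of q_i . e_r products, with column c holding
    # the embedding for relative distance c - (L - 1), i.e. -(L-1)..0.
    L = rel.shape[0]
    padded = np.pad(rel, ((0, 0), (1, 0)))  # (L, L+1): dummy zero column on the left
    reshaped = padded.reshape(L + 1, L)     # the pad shifts each row by one slot
    return reshaped[1:, :]                  # (L, L): S_rel[i, j] = q_i . e_{(j-i)+(L-1)}

# Sanity check against the direct O(L^2 D) gather, causal positions only.
rng = np.random.default_rng(0)
L, D = 4, 8
Q = rng.standard_normal((L, D))
E = rng.standard_normal((L, D))  # row r embeds relative distance r - (L - 1)
S = skew(Q @ E.T)
for i in range(L):
    for j in range(i + 1):
        assert np.allclose(S[i, j], Q[i] @ E[(j - i) + L - 1])
```

Note that only one (L, L) product is ever materialized, rather than the per-position (L, L, D) gather of Shaw et al.; entries above the diagonal come out garbage but are discarded by the causal mask anyway.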