Self-Instruct: Aligning Language Models with Self-Generated Instructions
In this post we review the paper “Self-Instruct”, which introduces a framework for improving the instruction-following capabilities of pretrained language models by bootstrapping off their own generations.
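To give a feel for the bootstrapping loop, here is a minimal, illustrative sketch of the idea: grow a pool of instructions by prompting the model with a few existing ones and keeping only candidates that are sufficiently novel. The functions `lm_generate` and `rouge_l` are hypothetical placeholders, not the paper's actual implementation.

```python
import random

def lm_generate(prompt: str) -> str:
    """Placeholder for a call to the pretrained language model."""
    raise NotImplementedError

def rouge_l(a: str, b: str) -> float:
    """Placeholder for ROUGE-L similarity between two instructions."""
    raise NotImplementedError

def self_instruct(seed_instructions, num_rounds=100, similarity_threshold=0.7):
    pool = list(seed_instructions)
    for _ in range(num_rounds):
        # Prompt the model with a few sampled instructions as in-context examples.
        examples = random.sample(pool, k=min(8, len(pool)))
        prompt = "Come up with a new task:\n" + "\n".join(examples)
        candidate = lm_generate(prompt)
        # Keep the candidate only if it is not too similar to anything already in the pool.
        if all(rouge_l(candidate, existing) < similarity_threshold for existing in pool):
            pool.append(candidate)
    return pool  # the grown pool is then used to finetune the model
```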
Attention Is All You Need
In this post we will review "Attention Is All You Need", a groundbreaking paper that introduced the Transformer architecture, a neural network model for NLP tasks that relies entirely on attention mechanisms to process input sequences. The paper's contributions have had a significant impact on deep learning and have inspired extensive follow-up research.
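The core operation is the paper's scaled dot-product attention, softmax(QKᵀ/√d_k)·V. Below is a small NumPy sketch of that formula, meant only as an illustration rather than the Transformer's full multi-head implementation.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over keys
    return weights @ V                                    # weighted sum of values

# Toy example: 3 queries, keys, and values of dimension 4
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((3, 4)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)        # (3, 4)
```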
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
In this post we review the paper “BERT” (Bidirectional Encoder Representations from Transformers), which introduces a new language representation model. It pre-trains deep bidirectional representations from unlabelled text by jointly conditioning on both left and right context in all layers.
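A quick way to see the bidirectional conditioning in action is masked-token prediction: the model fills in a blank using context on both sides of it. The sketch below assumes the Hugging Face `transformers` library and a pretrained `bert-base-uncased` checkpoint.

```python
from transformers import pipeline

# Fill-mask pipeline over a pretrained BERT model.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

# The right-hand context ("to buy some milk") steers the prediction toward "store",
# which a purely left-to-right model could not see at this position.
for prediction in unmasker("The man went to the [MASK] to buy some milk."):
    print(prediction["token_str"], round(prediction["score"], 3))
```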
Neural Machine Translation by Jointly Learning to Align and Translate
In this post we review the paper “Neural Machine Translation by Jointly Learning to Align and Translate”, which introduces an attention mechanism as a way of enhancing encoder-decoder architectures. It argues that traditional encoder-decoder architectures are bottlenecked in performance by compressing the entire source sentence into a fixed-length vector.
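The paper's remedy is to build a fresh context vector at every decoding step as a weighted sum of all encoder states, with alignment weights produced by a small learned scorer. Here is a minimal NumPy sketch of that additive-attention idea; the parameters `W_s`, `W_h`, and `v` are illustrative stand-ins for weights that would be learned jointly with the rest of the network.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention_context(decoder_state, encoder_states, W_s, W_h, v):
    """Bahdanau-style additive attention over the encoder states (sketch)."""
    # Score each source position against the current decoder state ...
    scores = np.array([v @ np.tanh(W_s @ decoder_state + W_h @ h) for h in encoder_states])
    weights = softmax(scores)                  # alignment weights over source positions
    # ... and build a per-step context vector instead of one fixed-length summary.
    return weights @ encoder_states            # weighted sum of encoder states

# Toy example: 5 encoder states of size 8, one decoder state of size 8
rng = np.random.default_rng(0)
encoder_states = rng.standard_normal((5, 8))
decoder_state = rng.standard_normal(8)
W_s, W_h, v = rng.standard_normal((8, 8)), rng.standard_normal((8, 8)), rng.standard_normal(8)
print(additive_attention_context(decoder_state, encoder_states, W_s, W_h, v).shape)  # (8,)
```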