Self-Instruct: Aligning Language Models with Self-Generated Instructions

Authors: Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi
Submitted: 20 Dec 2022
Citation: arXiv:2212.10560

Introduction:

  • Large Language Models (LLMs) that have been “instruction-tuned” (i.e., finetuned to respond to instructions) demonstrate a remarkable ability to generalize zero-shot to new tasks.

  • Zero-shot classification, for example, is a natural language processing task where a model trained on a set of labeled examples can then classify new examples from previously unseen classes (see the short sketch after this list).

  • This ability, however, depends heavily on human-written instruction data, which is limited in quantity, diversity, and creativity, and this in turn limits the quality and generality of the fine-tuned models.

  • This paper introduces SELF-INSTRUCT, a framework for improving the instruction-following capabilities of pretrained language models by bootstrapping off their own generations.

  • Applying this method to vanilla GPT-3, the authors demonstrate a 33% absolute improvement over the original model on SUPER-NATURALINSTRUCTIONS, on par with the performance of InstructGPT-001, which was trained with private user data and human annotations.

  • SELF-INSTRUCT provides an almost annotation-free method for aligning pretrained language models with instructions, and the authors release a large synthetic dataset to facilitate future studies on instruction tuning.
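As a concrete illustration of zero-shot classification (separate from the paper itself), the sketch below uses the Hugging Face transformers pipeline; the specific model and candidate labels are assumptions chosen only for this example.

```python
from transformers import pipeline

# Illustrative only: the model choice and labels are assumptions for this example,
# not part of the Self-Instruct paper.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "The new graphics card renders 4K games at 120 frames per second.",
    candidate_labels=["technology", "cooking", "politics"],  # classes never seen in training
)
print(result["labels"][0])  # most likely label, here "technology"
```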

How does it work?

The Self-Instruct process is an iterative bootstrapping algorithm that starts with a seed set of manually-written instructions and uses them to prompt the language model to generate new instructions and corresponding input-output instances. These generations are then filtered to remove low-quality or similar ones, and the resulting data is added back to the task pool. This process can be repeated multiple times, resulting in a large collection of instructional data that can be used to fine-tune the language model to follow instructions more effectively.
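The sketch below is a minimal, simplified rendering of that loop, not the authors' released code. The `generate` helper stands in for any completion call to the pretrained language model (e.g. the GPT-3 API), the prompt templates are invented for illustration, and the ROUGE-L novelty check reflects the paper's filter that discards instructions whose overlap with existing ones is 0.7 or higher.

```python
import random
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=False)

def generate(prompt: str) -> str:
    """Placeholder for a completion call to the pretrained language model."""
    raise NotImplementedError

def is_novel(candidate: str, pool: list[str], threshold: float = 0.7) -> bool:
    # Keep an instruction only if its ROUGE-L overlap with every existing one is low.
    return all(
        scorer.score(existing, candidate)["rougeL"].fmeasure < threshold
        for existing in pool
    )

def self_instruct(seed_instructions: list[str], iterations: int = 100) -> list[dict]:
    instructions = list(seed_instructions)
    tasks = []
    for _ in range(iterations):
        # 1. Prompt the model with a few in-context examples to propose a new instruction.
        examples = "\n".join(random.sample(instructions, k=min(8, len(instructions))))
        new_instruction = generate(
            f"Come up with a new task instruction.\n{examples}\nNew instruction:"
        ).strip()

        # 2. Filter out low-quality or near-duplicate instructions.
        if not new_instruction or not is_novel(new_instruction, instructions):
            continue

        # 3. Generate an input-output instance for the surviving instruction.
        instance = generate(
            f"Instruction: {new_instruction}\nGenerate one input and its output."
        )

        # 4. Add the result back to the task pool for the next round.
        instructions.append(new_instruction)
        tasks.append({"instruction": new_instruction, "instance": instance})
    return tasks
```

The resulting task collection is then used to finetune the same model, which is what closes the bootstrapping loop.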

Here is an overview of the Self-Instruct pipeline:

[Figure: the Self-Instruct pipeline — a seed task pool prompts the language model to generate new instructions, classification tasks are identified, input-output instances are generated, and filtered results are added back to the task pool]
