Emergent Abilities of Large Language Models

Citation

Authors: Jason Wei et al.
Year: 2022
Venue: Transactions on Machine Learning Research (TMLR)
URL: https://arxiv.org/abs/2206.07682

Abstract

This paper documents and classifies abilities that appear unpredictably with scale—present in larger models but absent in smaller ones.

Summary

Empirical documentation and classification of emergent abilities across prompting paradigms (few-shot, chain-of-thought, instruction-following).

Key Contributions

  1. Definition of “emergent ability” — cannot be predicted by extrapolating smaller models
  2. Documentation of emergent abilities across benchmarks
  3. Identification of phase transitions in capability emergence
  4. Classification by prompting paradigm

Core Concepts & Definitions

Emergent Ability

An ability is emergent if it is not present in smaller models but is present in larger models. Formally: “cannot be predicted simply by extrapolating the performance of smaller models.”
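A minimal sketch of what "cannot be predicted by extrapolating" means in practice: fit a trend on small-model scores and compare the extrapolation to the large-model score. All numbers below are synthetic and purely illustrative, not results from the paper.

```python
import numpy as np

# Hypothetical accuracies for a task with emergent behavior:
# near-random for small models, then a jump at large scale (synthetic data).
params = np.array([1e8, 1e9, 1e10, 1e11, 1e12])      # model sizes (illustrative)
accuracy = np.array([0.02, 0.03, 0.04, 0.35, 0.70])  # not real benchmark numbers

# Fit a linear trend in log10(params) on the three smallest models only.
slope, intercept = np.polyfit(np.log10(params[:3]), accuracy[:3], deg=1)

# Extrapolate that trend to the largest model and compare to the observed score.
predicted = slope * np.log10(params[-1]) + intercept
actual = accuracy[-1]
print(f"extrapolated: {predicted:.2f}, observed: {actual:.2f}")
# → extrapolated: 0.06, observed: 0.70
```

The large gap between the extrapolated and observed scores is the operational signature of emergence under this definition.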

Phase Transition

Sharp performance increase at critical scale. Distinction between:

  • Slow emergence: gradual on a linear scale, but appears sharp on the standard log-scale axis
  • Truly discontinuous transitions
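One way to see the slow-emergence case: a performance curve that is smooth in raw parameter count can still concentrate most of its improvement in a single step when models are sampled at equal intervals on a log axis. The logistic curve and its parameters below are assumptions for illustration, not fits from the paper.

```python
import numpy as np

# Smooth logistic improvement in raw parameter count N (synthetic curve).
def acc(n, n0=5e10, width=2e10):
    return 1.0 / (1.0 + np.exp(-(n - n0) / width))

# Sample at equal steps on a log axis (one hypothetical model per half-decade).
log_n = np.arange(8.0, 12.5, 0.5)
a = acc(10 ** log_n)

# Fraction of the total improvement that falls in a single log-axis step:
gains = np.diff(a)
print(f"largest single-step gain: {gains.max():.2f} of {a[-1] - a[0]:.2f} total")
# → largest single-step gain: 0.64 of 0.92 total
```

Most of the improvement lands between two adjacent sampled scales, so the curve looks like a phase transition on the log axis even though it is smooth in N.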

Prompting Paradigms

  • Few-shot prompting: In-context learning with exemplars
  • Chain-of-thought (CoT): Intermediate reasoning steps before final answer
  • Instruction-following: Zero-shot task completion from natural language
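The three paradigms differ only in how the prompt is constructed. A sketch with hypothetical templates on one toy task (two-digit addition); the exemplars and wording are illustrative, not taken from the paper's benchmarks.

```python
# Hypothetical prompt templates for the three paradigms (illustrative only).
question = "What is 23 + 48?"

# Few-shot: in-context exemplars, answers only.
few_shot = (
    "Q: What is 12 + 31?\nA: 43\n"
    "Q: What is 54 + 17?\nA: 71\n"
    f"Q: {question}\nA:"
)

# Chain-of-thought: exemplars include intermediate reasoning steps.
chain_of_thought = (
    "Q: What is 12 + 31?\n"
    "A: 12 + 31 means adding 10 + 30 = 40 and 2 + 1 = 3, so 43.\n"
    f"Q: {question}\nA:"  # the model is expected to reason before answering
)

# Instruction-following: zero-shot natural-language instruction, no exemplars.
instruction = f"Add the two numbers and reply with only the sum. {question}"

for name, prompt in [("few-shot", few_shot), ("CoT", chain_of_thought),
                     ("instruction", instruction)]:
    print(f"--- {name} ---\n{prompt}\n")
```

The paper's classification asks, for each paradigm, at what scale (if any) the corresponding ability appears.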

Main Results

  1. Emergence is task-dependent: some tasks show smooth scaling, while others show sharp transitions
  2. Documents emergence across: few-shot, CoT, instruction following, task composition
  3. Raises questions about future capabilities with continued scaling

Relevance to Project

High — Foundational empirical work:

  • Establishes the phenomenon our theory aims to explain
  • Prompting paradigms relate to skill elicitation methods
  • Phase transitions connect to Michaud’s monogenic/polygenic distinction