Emergent Abilities of Large Language Models
Citation
Authors: Jason Wei et al.
Year: 2022
Abstract
This paper documents and classifies abilities that appear unpredictably with scale—present in larger models but absent in smaller ones.
Summary
Empirical documentation and classification of emergent abilities across prompting paradigms (few-shot, chain-of-thought, instruction-following).
Key Contributions
- Definition of “emergent ability”: an ability whose appearance cannot be predicted by extrapolating the performance of smaller models
- Documentation of emergent abilities across benchmarks
- Identification of phase transitions in capability emergence
- Classification by prompting paradigm
Core Concepts & Definitions
Emergent Ability
An ability is emergent if it is not present in smaller models but is present in larger models. Formally: “cannot be predicted simply by extrapolating the performance of smaller models.”
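A minimal numeric sketch of this definition, using made-up accuracy figures (the sizes, scores, and benchmark are hypothetical, not from the paper): a trend fit to small-model points predicts near-baseline performance at large scale, while the imagined observed value jumps well above it.

```python
import numpy as np

# Hypothetical benchmark accuracies (%) for models of increasing size.
# Small-model points hover just above a 25% random-guess baseline.
params = np.array([1e8, 1e9, 1e10])      # model sizes (parameters)
accuracy = np.array([25.0, 25.5, 26.0])  # small-model accuracies

# Extrapolate a linear trend in log10(parameters) out to 1e12 parameters.
slope, intercept = np.polyfit(np.log10(params), accuracy, 1)
predicted = slope * np.log10(1e12) + intercept

observed = 55.0  # hypothetical large-model accuracy after emergence

print(f"extrapolated: {predicted:.1f}%  observed: {observed:.1f}%")
```

The extrapolation lands near 27%, far below the assumed observed 55%, which is exactly the failure mode the definition names.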
Phase Transition
A sharp performance increase at a critical scale. Two cases are distinguished:
- Slow emergence: gradual improvement on a linear parameter scale that appears sharp on a log scale
- Truly discontinuous transitions
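One illustrative mechanism for apparent sharpness (a sketch of a common explanation, not a claim made by the paper): on a multi-step task scored by exact match, task accuracy is the product of per-step success rates, so smooth per-step improvement yields a steep jump in end-to-end accuracy. The 10-step task here is invented.

```python
# All-or-nothing scoring on a hypothetical 10-step task:
# the task counts as solved only if every step succeeds.
k = 10
for p in [0.5, 0.7, 0.9, 0.95, 0.99]:  # per-step success probability
    task_acc = p ** k                  # probability all k steps succeed
    print(f"per-step {p:.2f} -> task accuracy {task_acc:.3f}")
```

Per-step reliability rising smoothly from 0.5 to 0.99 moves task accuracy from roughly 0.001 to 0.904, which reads as a phase transition on the task-level metric.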
Prompting Paradigms
- Few-shot prompting: In-context learning with exemplars
- Chain-of-thought (CoT): Intermediate reasoning steps before final answer
- Instruction-following: Zero-shot task completion from natural language
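The three paradigms can be made concrete with prompt templates; the question, exemplars, and wording below are invented for illustration, not taken from the paper's benchmarks.

```python
question = "If a train travels 60 miles in 1.5 hours, what is its speed?"

# Few-shot: prepend worked input/output exemplars before the query.
few_shot = (
    "Q: What is 12 * 4?\nA: 48\n"
    "Q: What is 7 + 15?\nA: 22\n"
    f"Q: {question}\nA:"
)

# Chain-of-thought: exemplars show intermediate reasoning steps
# before the final answer.
cot = (
    "Q: What is 12 * 4?\n"
    "A: 12 * 4 means 12 added 4 times: 12 + 12 + 12 + 12 = 48. "
    "The answer is 48.\n"
    f"Q: {question}\nA:"
)

# Instruction-following: zero-shot, the task stated in natural language.
instruction = f"Answer the following question concisely.\n{question}"

print(few_shot, cot, instruction, sep="\n---\n")
```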
Main Results
- Emergence is task-dependent: some tasks scale smoothly, others show sharp transitions
- Documents emergence across: few-shot, CoT, instruction following, task composition
- Raises open questions about what further capabilities continued scaling may unlock
Relevance to Project
High — Foundational empirical work:
- Establishes the phenomenon our theory aims to explain
- Prompting paradigms relate to skill elicitation methods
- Phase transitions connect to Michaud’s monogenic/polygenic distinction