Scaling Laws
Definition
Scaling laws are empirical power-law relationships between model scale (parameters, data, compute) and performance (loss).
In the Literature
Chinchilla Law (Hoffmann et al., 2022)

$$L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}$$

Where:
- $L$ = cross-entropy loss
- $N$ = parameters
- $D$ = training tokens
- $E$ = irreducible loss; $A$, $B$, $\alpha$, $\beta$ are fitted constants
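The parametric loss above can be evaluated directly. A minimal sketch, using the fitted constants reported in the Chinchilla paper ($E \approx 1.69$, $A \approx 406.4$, $B \approx 410.7$, $\alpha \approx 0.34$, $\beta \approx 0.28$):

```python
# Sketch of the Chinchilla parametric loss, with the fitted
# constants reported by Hoffmann et al. (2022).
E, A, B = 1.69, 406.4, 410.7      # irreducible loss and fitted coefficients
alpha, beta = 0.34, 0.28          # fitted exponents

def chinchilla_loss(N: float, D: float) -> float:
    """Predicted cross-entropy loss for N parameters and D training tokens."""
    return E + A / N**alpha + B / D**beta

# e.g. a 70B-parameter model trained on 1.4T tokens (Chinchilla's own budget)
print(chinchilla_loss(70e9, 1.4e12))   # ~1.94
```

Note how both correction terms shrink toward the irreducible loss $E$ as $N$ and $D$ grow, which is the source of diminishing returns from scaling either axis alone.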
General Form (Ganguli et al.)
- Compute: $L(C) = (C_c / C)^{\alpha_C}$
- Data: $L(D) = (D_c / D)^{\alpha_D}$
- Parameters: $L(N) = (N_c / N)^{\alpha_N}$
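In practice these exponents are estimated by linear regression in log-log space, since $\log L = \alpha_N (\log N_c - \log N)$ is linear in $\log N$. A minimal sketch on synthetic data (the constants $N_c$ and $\alpha_N$ are illustrative, not from this project):

```python
import numpy as np

# Recover a scaling exponent alpha_N by log-log linear regression,
# the standard way these power laws are fit to measured losses.
N_c, alpha_N = 8.8e13, 0.076          # illustrative constants
N = np.logspace(6, 10, 20)            # parameter counts from 1e6 to 1e10
L = (N_c / N) ** alpha_N              # synthetic losses from the power law

slope, intercept = np.polyfit(np.log(N), np.log(L), 1)
print(-slope)   # recovered exponent, ~= alpha_N
```

With noisy real measurements the same fit applies; only the residuals grow.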
Michaud’s Quantization Explanation
If quanta use frequencies follow Zipf’s law $p_k \propto k^{-(\alpha + 1)}$:
- Parameter scaling: $L \propto N^{-\alpha}$
- Data scaling: $L \propto D^{-\alpha/(\alpha+1)}$ (single-epoch training)
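The parameter-scaling claim can be checked numerically: if a model has learned the $n$ most frequent quanta, its loss comes only from the Zipfian tail, $L(n) \propto \sum_{k>n} p_k \propto n^{-\alpha}$. A sketch under the model's assumptions (the cutoff and $\alpha$ are arbitrary illustration choices):

```python
import numpy as np

# Quantization-model sketch: quanta ranked by use frequency follow
# p_k ∝ k^{-(alpha+1)}; learning the n most frequent quanta leaves
# loss proportional to the tail mass sum_{k>n} p_k ~ n^{-alpha}.
alpha = 0.5
k = np.arange(1, 10_000_001, dtype=np.float64)
p = k ** -(alpha + 1.0)
p /= p.sum()                       # normalized Zipfian frequencies

tail = p[::-1].cumsum()[::-1]      # tail[n] = sum of p_k for k > n

def loss(n: int) -> float:
    """Residual loss after learning the n most frequent quanta."""
    return tail[n]

# Power-law check: L(10n)/L(n) should approach 10**-alpha
print(loss(100_000) / loss(10_000), 10**-alpha)
```

The ratio approaches $10^{-\alpha}$ as the cutoff grows, matching the predicted parameter-scaling exponent.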
Arora & Goyal’s Key Insight
10× scaling ≈ 2× increase in composable skill count
When loss drops from $\ell$ to $\ell'$, performance on $t$-tuples of skills equals previous single-skill performance.
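A back-of-the-envelope version of this insight (not Arora & Goyal's actual argument, which uses a random bipartite skill graph): if each skill fails independently with rate $\theta$, a $t$-tuple succeeds with probability $(1-\theta)^t$, so the largest $t$ whose tuple performance matches the old single-skill performance is $t = \log(1-\theta_{\text{old}}) / \log(1-\theta_{\text{new}})$:

```python
import math

# Illustrative arithmetic only: independent per-skill failure rate theta,
# t-tuple success probability (1 - theta)**t. Solve for the t at which
# tuple success equals the previous single-skill success rate.
def composable_t(theta_old: float, theta_new: float) -> float:
    return math.log(1 - theta_old) / math.log(1 - theta_new)

# Halving the failure rate roughly doubles the composable tuple size:
print(composable_t(0.10, 0.05))   # ~2.05
```

This is why modest loss reductions translate into multiplicative growth in the number of skills that can be composed reliably.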
In This Project
Scaling laws constrain our framework:
- Determine which skills in the skill set are “realized” at a given scale
- Set the emergence boundary
- Explain why ontological-expansion is scale-dependent
Related Concepts
- emergence — What scaling laws predict
- skills — Units whose competence scales
- ontological-expansion — Scale-dependent ontology growth