Scaling Laws

Definition

Scaling laws are empirical power-law relationships between model scale (parameters, data, compute) and performance (loss).

In the Literature

Chinchilla Law

$$L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}$$

Where:

  • $L$ = cross-entropy loss
  • $N$ = parameters
  • $D$ = training tokens
  • $E$, $A$, $B$, $\alpha$, $\beta$ = fitted constants ($E$ is the irreducible loss)
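The Chinchilla form is straightforward to evaluate. A minimal sketch, using the constants fitted by Hoffmann et al. (2022); the model size and token count in the example are illustrative choices, not values from this document:

```python
# Chinchilla parametric loss; constants as fitted by Hoffmann et al. (2022).
E, A, B = 1.69, 406.4, 410.7   # irreducible loss and fitted coefficients
alpha, beta = 0.34, 0.28       # fitted exponents for parameters and data

def chinchilla_loss(N: float, D: float) -> float:
    """Predicted cross-entropy loss for N parameters and D training tokens."""
    return E + A / N**alpha + B / D**beta

# Illustrative point: a 70B-parameter model trained on 1.4T tokens.
loss = chinchilla_loss(70e9, 1.4e12)
```

Note that the prediction floors at $E$: no amount of extra parameters or data pushes the loss below the irreducible term.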

General Form (Ganguli et al.)

  • Compute: $L(C) = (C_0 / C)^{\alpha_C}$
  • Data: $L(D) = (D_0 / D)^{\alpha_D}$
  • Parameters: $L(N) = (N_0 / N)^{\alpha_N}$
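Each of these is a single-variable power law of the same shape, so a fixed multiplicative increase in the resource yields a fixed multiplicative drop in loss. A minimal sketch; the exponent below is an illustrative placeholder, not a fitted value:

```python
# Single-factor scaling form L(x) = (x0 / x)**alpha, where x is one resource
# (compute, data, or parameters). The exponent 0.076 is illustrative only.
def power_law_loss(x: float, x0: float, alpha: float) -> float:
    """Loss predicted from a single resource x under a pure power law."""
    return (x0 / x) ** alpha

# A 10x increase in the resource multiplies the loss by 10**(-alpha),
# independent of where on the curve you start.
ratio = power_law_loss(10.0, 1.0, 0.076) / power_law_loss(1.0, 1.0, 0.076)
```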

Michaud’s Quantization Explanation

If quanta frequencies follow Zipf’s law $p_k \propto k^{-(\alpha + 1)}$:

  • Parameter scaling: $L \propto N^{-\alpha}$ (model capacity sets how many quanta are learned)
  • Data scaling: $L \propto D^{-\alpha/(\alpha + 1)}$ (a quantum of frequency $p_k$ needs on the order of $1/p_k$ samples to be learned)
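The parameter-scaling prediction can be checked numerically: if the model has learned the $n$ most frequent quanta, the residual loss is the total frequency of the unlearned tail, which scales as $n^{-\alpha}$. A toy sketch; `alpha` and the quanta count `K` are illustrative choices, not values from the paper:

```python
# Toy check: with Zipfian quanta frequencies p_k ∝ k**-(alpha + 1), learning
# the n most frequent quanta leaves a residual loss scaling as n**(-alpha).
alpha = 0.5
K = 10**6                                    # total quanta in the toy model
p = [k ** -(alpha + 1) for k in range(1, K + 1)]
Z = sum(p)                                   # normalization constant

def tail_loss(n: int) -> float:
    """Residual loss: total frequency of quanta the model has not learned."""
    return sum(p[n:]) / Z

# Doubling the number of learned quanta multiplies the loss by about 2**(-alpha).
ratio = tail_loss(2000) / tail_loss(1000)
```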

Arora & Goyal’s Key Insight

10× scaling ≈ 2× increase in composable skill count

When the loss drops from $\ell$ to $\ell/k$, performance on $k$-tuples of skills equals the previous single-skill performance.
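The arithmetic behind this can be illustrated with a crude union-bound model (an assumption for illustration, not the paper’s actual argument): if each skill fails independently at a small rate, a $k$-tuple fails at roughly $k$ times that rate, so cutting the per-skill rate by a factor of $k$ restores the old reliability on $k$-tuples.

```python
# Illustration only (union-bound sketch, not the paper's proof): if each skill
# fails independently with small rate eps, a k-tuple fails at roughly k * eps.
def tuple_failure_rate(eps: float, k: int) -> float:
    """Approximate failure rate of a k-tuple of skills via the union bound."""
    return k * eps

eps, k = 0.10, 4
# Cutting the per-skill failure rate to eps / k makes k-tuples as reliable
# as single skills were before the improvement.
improved = tuple_failure_rate(eps / k, k)
```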

In This Project

Scaling laws constrain our framework:

  • Determine which skills are “realized” at a given scale
  • Set the emergence boundary
  • Explain why ontological-expansion is scale-dependent

References