I am trying to understand why induction up to exactly $\epsilon_0$ is necessary to prove the cut-elimination theorem for first-order Peano Arithmetic; or, as I understand, equivalently, why the length of a PA-proof with all cuts eliminated grows (in the worst case) as fast as $f_{\epsilon_0}$ in the fast-growing hierarchy.

I can understand why use of an induction axiom corresponds to ordinal multiplication by $\omega$. As induction axioms are written in the sequent calculus, as a premise we have a proof of $\phi(x)\vdash\phi(Sx)$ for a free variable $x$, and as a conclusion we get $\phi(0)\vdash\phi(y)$ for any term $y$.

Then if the proof of $\phi(x)\vdash\phi(Sx)$ has the ordinal $\omega^\alpha$, the proof with the conclusion $\phi(0)\vdash\phi(y)$ is assigned the ordinal $\omega^{\alpha+1}$. E.g. if the proof of the induction step had ordinal $\omega^0 = 1$, then the conclusion $\phi(y)$ has the ordinal $\omega^1 = \omega$.

This part makes perfect sense to me, because if I wanted to eliminate the use of an induction rule (CJ sequent) I might just start with $\phi(0)$ and repeatedly go through the part of the proof that I used to prove $\phi(x)\vdash\phi(Sx)$. Say, I'd repeat the sub-proof of the induction step 17 times, e.g. with $\phi(14)\vdash\phi(15)$ and so on, until I got to $\phi(17)$, or whichever number I wanted to prove $\phi$ about.

I don't quite understand how this corresponds to eliminating *cuts*, per se (it seems to create a host of new cuts, in fact). But it is still very *intuitive* to me that Peano Arithmetic's power to invoke the induction axiom would correspond to multiplication by $\omega$. If we take a proof in PA that ends with a single use of the induction axiom to get $\phi(y)$, and we want to translate it into a proof in an arithmetic theory that doesn't have induction, then we might have to multiply the PA-proof-length by any finite number (e.g. multiply it by 17) to get the proof length in the inductionless theory.

Repeated multiplication by $\omega$ only takes us up to $\omega^\omega$ as a proof-theoretic ordinal, though.

According to Gentzen, when we use cuts on formulas involving quantifiers, e.g. $\forall p:\exists q:\psi(p,q)$, this corresponds to ordinal exponentiation. If the proof above the cut has ordinal $\alpha$ and we cut a formula with one quantifier, then the proof below the cut has ordinal $\omega^\alpha$. If we cut on two quantifiers, the proof below the cut has ordinal $\omega^{\omega^\alpha}$. This fits with other things I've heard about how PA using formulas with only N quantifiers can be proven consistent by PA using only N+1 quantifiers. (As a side issue I'd be interested in knowing how you use N quantifiers to prove wellfoundedness of an ordinal notation for a stack of $\omega$s N layers high.)

What I don't understand is why the ability to use quantifiers corresponds to ordinal exponentiation. I can guess in a vague sense that if we have a proof using $\forall p:\phi(p)\vdash\phi(17)$, and we want to eliminate the use of $\forall p:\phi(p)$, then we need to repeat that part of the proof each time we want to prove $\phi(17)$, $\phi(q)$, and so on. But this would again just imply that we needed to repeat that part of a proof a finite number of times, and iterating this takes us up to only $\omega^\omega$, not the desired $\epsilon_0$, so I must be missing something.

I asked a friend to help with this and she's read through Gentzen's relevant papers, and she's shown me the relevant parts, and I'd previously checked several standard texts on proof theory and Googled around, and she's also shopped the question around the logic department of a major university, and we still don't know any answer to this except "because Gentzen says to use ordinal exponentiation".

We also can't find any examples of cut-elimination being carried out on a proof with cut on a quantified formula, which shows how the size of the resulting cut-free proof could grow faster than $f_{\omega^\omega}$. An example like this for some particular proof would be *very* helpful, even if, of necessity, the repeated steps for eliminating the cut are sketched more than shown. I can understand why the length of a Kirby-Paris hydra game grows at the same rate as the Goodstein sequence, and visualize both processes insofar as a human being possibly can. I cannot visualize why the length of a PA-proof heading for cut-freeness would grow at that same rate as cuts were repeatedly eliminated. (Mapping the process onto a Kirby-Paris hydra game of starting height at least 3 would answer the question!)

no greater than$\epsilon_0$. $\endgroup$