New methods internalize outcome feedback into step-level guidance, expanding reasoning beyond chain-of-thought limitations.