Early Exit & Inference Rewind in 4-16-bit LLMs | Manifund