Interpretability by construction: no disentangled reward-hacking surface | Manifund