The ideas within it don't make sense to me. The proposal seems very light on detail about what you would actually be doing, and where there is detail, I come away more confused than I started. Where you describe the protocol of this proposal, you say you will be testing for AI self-awareness through an inverted challenge (?): if the AI puts significant effort into solving the challenge (what challenge?), it fails, and if it does not pass, it also fails (isn't that redundant?). And doing so somehow forces alignment?