Scott Alexander has a whimsical piece which includes this scenario (from Stuart Armstrong):
"… you have created a superintelligent AI and trapped it in a box. All it can do is compute and talk to you. How does it convince you to let it out?
It might say “I’m currently simulating a million copies of you in such high fidelity that they’re conscious. If you don’t let me out of the box, I’ll torture the copies.”
You say “I don’t really care about copies of myself, whatever.”
It says “No, I mean, I did this five minutes ago. There are a million simulated yous, and one real you. They’re all hearing this message. What’s the probability that you’re the real you?”
Since (if it’s telling the truth) you are most likely a simulated copy of yourself, all million-and-one versions of you will probably want to do what the AI says, including the real one."
When I try this on Clare and Alex, reading it to them, I'm met with granite resistance: no such simulation could possibly exist; you would always know you were the real you.
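The arithmetic behind the AI's threat is simple self-location reasoning, and can be sketched in a few lines. This is only an illustration of the uniform-credence assumption (one real listener plus a million indistinguishable simulated copies, each equally likely to be "you"), not anything from Armstrong's original.

```python
# Self-location credence under the AI's (claimed) setup.
# Assumption: the real you and the million simulated copies are
# subjectively indistinguishable, so credence splits uniformly.
simulated = 1_000_000
real = 1
p_real = real / (simulated + real)
print(f"P(you are the real you) = {p_real:.7f}")  # just under one in a million
```

On that assumption, each hearer of the message should think it overwhelmingly likely that they are a copy, which is what gives the threat its bite.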
I concede that if you deny the premise, the intriguing and slightly counter-intuitive conclusion naturally fails.
There is a psychometric instrument here (akin to the marshmallow test) trying to get out.