Monday, November 24, 2008

The Deception Theorem

The following describes something I believe to be correct, but would need some work to make really rigorous.

“The Deception Theorem states that a First-Order Intentional Agent can never know it has been deceived. First we explain the theorem, then we sketch the proof.”

“What’s an Intentional Agent?”

“Put simply, it’s an entity which we can describe in terms of its ‘beliefs’ and ‘wants’. So when we see an ant dragging a dead aphid back to the nest, we can say that the ant believes that the aphid is a useful ant-resource (food), and wants to get it back to the nest.

"A First-Order Intentional Agent (call it a FOIA) is an agent which may be accurately described in terms of its beliefs and wants. What makes it first-order is that the FOIA doesn’t itself take account of any beliefs and wants of intentional agents in its environment. Specifically, an ant doesn’t operate as if it thought that you had any beliefs and/or wants relevant to it. Of course, it could be mistaken about that.”

“So you’re saying insects are FOIAs? What about higher animals, mammals for example?”

“Well, humans are certainly capable of a ‘theory of mind’ – the recognition that other entities out there have points of view (another way of expressing beliefs and wants). People with autism perhaps excepted. I think most of us assume that cats and dogs, for example, operate as if they believe we have beliefs and goals they can manipulate. Hard proof is more difficult!”

“OK, so perhaps now you could define ‘deceive’?”

“Right. First we define a Higher-Order Intentional Agent (HOIA). This, as you would suspect, is an entity which can indeed ascribe to other entities beliefs and wants, and which can actively work to alter them.

"Naturally in order to be a deceiver, you have to be a HOIA - a FOIA can’t deceive as it has no concept that other entities have opinions in the first place, so it can’t conceive of manipulating them.

"So if X is a HOIA and Y is any kind of intentional agent, we say as a definition:

X deceives Y if:

  1. Y doesn’t believe something (at some point in time)

  2. X wants Y to come to believe that something, even though it isn’t true

  3. X executes a plan by which Y comes to believe that thing, even though it isn’t true.
Y has now been deceived.

We can say it more clearly in symbols.

X deceives Y iff

  1. ~B(Y, φ) and

  2. W(X,◊[B(Y, φ) and ~ φ]) and

  3. Executes(X, plan(B, φ)) → ◊[B(Y, φ) and ~ φ].
where B = Believes, W = Wants, ~ = not, ◊ = eventually, and φ is the false belief. The purpose of X executing the plan is to generate a set of sensory impressions in Y so that Y comes to update its beliefs (falsely) in the way X intended.
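The three conditions can be checked mechanically over a record of Y's beliefs through time. What follows is only a minimal sketch under stated assumptions: the names `deceives`, `eventually`, `x_wants`, `y_trace` and `true_facts` are hypothetical, X's want is encoded as a simple tagged pair, and condition 3 is read in the sense of the prose version (X really does execute its plan, and Y ends up with the false belief).

```python
def eventually(trace, pred):
    """The ◊ operator: pred holds in at least one state of the trace."""
    return any(pred(state) for state in trace)


def deceives(x_wants, x_executed_plan, y_trace, true_facts, phi):
    """Check the three conditions for 'X deceives Y' about proposition phi.

    x_wants         -- set of goals held by X (hypothetical encoding)
    x_executed_plan -- True if X actually carried out its plan
    y_trace         -- list of Y's belief sets, ordered in time
    true_facts      -- set of propositions that are true in the world
    phi             -- the proposition X wants Y to believe
    """
    phi_is_false = phi not in true_facts
    # 1. ~B(Y, phi): Y does not believe phi at the start.
    cond1 = phi not in y_trace[0]
    # 2. W(X, ◊[B(Y, phi) and ~phi]): X wants Y to come to believe the falsehood.
    cond2 = ("Y falsely believes", phi) in x_wants
    # 3. X executes its plan and Y later believes phi although phi is false.
    cond3 = (x_executed_plan and phi_is_false
             and eventually(y_trace[1:], lambda beliefs: phi in beliefs))
    return cond1 and cond2 and cond3


# Illustrative data only: a trap-setter X and a wasp Y.
phi = "the jam is safely obtainable"
wasp_trace = [set(), {phi}]                      # Y's beliefs before and after
trap_setter_wants = {("Y falsely believes", phi)}
world = {"the bottle contains jam"}              # phi itself is not true

print(deceives(trap_setter_wants, True, wasp_trace, world, phi))
```

If φ were actually true in the world, the same call would return False: Y would have been informed, not deceived.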

Now we can sketch the proof of the Deception Theorem. We suppose that the deceived agent, Y is a FOIA.

1. The FOIA Y believes it is surrounded by objects which behave in essentially self-contained, purely reactive ways. Note that we’re talking about intrinsic behaviour here. If an ant kicks a stone, the direction the stone moves clearly depends on the ant's opinion as to where the stone should go. But the stone’s behaviour is not caused by the ant’s opinion per se, but by its foot: a well-defined reaction out of physics.

In more technical language, Y has a non-intentional understanding of its external environment.

2. Now Y can certainly make mistakes – it has limited perception, after all. If its subsequent perceptions indicate it has misread a situation, and fallen into error, it will experience cognitive dissonance. For example, the ant drags the fly but an unobserved sharp stone snags it and drags it out of the ant’s grasp. What a surprise!

In such a situation Y will apply whatever repair mechanism its design specifies: could be repetition, random behaviour, avoidance or something else. Whatever the mechanism, the design intent is to get back on track to securing whatever is now Y’s primary ‘want’.

3. Assume X now tries to deceive Y. The mechanism of deception, according to our definition above, is to provide sensory input which generates belief change in Y in a false direction. It may or may not work, but if it does, then for Y to realise it has been fooled, Y has to accept that there is some agent out there fitting the definition of an agent of deception above.

Specifically, Y has to come to believe that there is an X such that:

X wants Y to come to believe something even though it isn’t true, i.e.

∃X.W(X,◊[B(Y, φ) and ~ φ])

"This is a statement of higher-order intentionality, which a FOIA – by definition – cannot conceptualise.

4. So a First-Order Intentional Agent can never know it has been deceived. QED.”
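Step 3 of the proof turns on how deeply the operators B and W are nested in the statement Y would need to believe. The sketch below makes that count explicit. It is an illustration under assumptions, not a formalisation from agent theory: the class names, `intentional_depth` and `foia_can_hold` are hypothetical, a FOIA is modelled as an agent whose own beliefs contain no B or W operators at all, and the existential quantifier over X is dropped in favour of a placeholder name.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    text: str            # a plain fact about the world

@dataclass(frozen=True)
class Not:
    body: "Formula"

@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class Eventually:        # the ◊ operator
    body: "Formula"

@dataclass(frozen=True)
class Believes:          # B(agent, body)
    agent: str
    body: "Formula"

@dataclass(frozen=True)
class Wants:             # W(agent, body)
    agent: str
    body: "Formula"

Formula = Union[Atom, Not, And, Eventually, Believes, Wants]


def intentional_depth(f):
    """Count how deeply Believes/Wants operators are nested in f."""
    if isinstance(f, Atom):
        return 0
    if isinstance(f, (Not, Eventually)):
        return intentional_depth(f.body)
    if isinstance(f, And):
        return max(intentional_depth(f.left), intentional_depth(f.right))
    if isinstance(f, (Believes, Wants)):
        return 1 + intentional_depth(f.body)
    raise TypeError(f"not a formula: {f!r}")


def foia_can_hold(f):
    """Assumed model: a FOIA's beliefs mention no beliefs or wants of others."""
    return intentional_depth(f) == 0


phi = Atom("the food is safely obtainable")
# W(X, ◊[B(Y, φ) and ~φ]), with X as a placeholder for the unknown deceiver.
realisation = Wants("X", Eventually(And(Believes("Y", phi), Not(phi))))

print(intentional_depth(realisation))   # 2
print(foia_can_hold(phi))               # True
print(foia_can_hold(realisation))       # False
```

On this model the FOIA can hold φ itself, and can later drop φ when its perceptions contradict it, but the formula describing the deceiver's intent falls outside anything it can represent.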

“Can you give me an example?”

“Consider a simple wasp trap: a jam bottle with a little jam and water in the bottom, and a small hole in the lid. This contraption sends a signal to the wasp ‘there’s good food here which is safely obtainable’. The wasps follow the odour trail into the bottle, but can never get out. They drown, deceived, but never know that they have been deceived.”

“Well, that’s maybe not very convincing. No-one thinks wasps are particularly smart anyhow.”

“They've managed to get by for two hundred million years so they obviously know something. But as another example, you could equally deduce that an autistic person can’t tell when they’re being deceived. Is that a big enough deal for you?”


Professor Michael Wooldridge at Liverpool University is a leader in agent theory here in the UK. For more, see his website here.


NOTE: if you got this far, you may believe the result to be quite trivial. I partially agree with you, although I would claim in defence that the sight-seeing during the journey more than makes up for the attractions of the final destination.

The motivation for writing it is that I need this result for a plot device in my current fiction writing here.