Friday, May 18, 2018

AI: inference and causality

Franz Kafka statue in Prague

When Google Lens tells me the picture above is the Statue of Franz Kafka in Prague, glossed by Wikipedia as:
"The Statue of Franz Kafka is an outdoor 2003 sculpture by Jaroslav Róna, installed on Vězeňská street in Prague, Czech Republic. It is based on a scene in Franz Kafka's first novel, Amerika, in which a political candidate is held on the shoulders of a giant man during a campaign rally, and carried through the streets,"
Google's app is doing something really complex, leveraging artificial neural nets trained on massive datasets. But it's fundamentally inference:
The world we live in |= the pixel map of the photo and Google Lens's summary text.
the pixel map of the photo |- Google Lens's summary text.
Satisfiability and entailment.

All AI systems need to map their sensor/effector primary data to internal representations which allow inference (deductive, inductive, abductive etc) - regardless of the engineering mechanisms they adopt. Neural nets training their weights are optimising the probability of valid inferences about the world.


Judea Pearl has a new book out arguing for the introduction of causality into AI systems. In a recent Quanta interview he said:
"All the impressive achievements of deep learning amount to just curve fitting."
Here's the book.

Amazon link

Causality is one of those constructs like Free Will, Consciousness and the intentional stance which don't exist in the underlying physics (the theories which the universe satisfies as far as we can tell), but which are emergent in a world of self-aware agents.

They usefully describe relationships between belief-and-goal-driven entities; they succinctly encode the effects of the second law of thermodynamics plus boundary conditions. We use these concepts .. and AI will have to if we are ever to create socially-competent artificial agents.

What Pearl is really asking for is an AI which utilises the intentional stance (specifically including reasoning about cause and effect).


To find out more about Judea Pearl's work (without buying the book!) view the recommended slides at his website. For a review of Pearl's substantive contribution to causality theory, see here.

My own subjective response? I find his dense notation and cluttered semantics worthy but unexciting.


Update: my further impressionistic, superficial and under-researched thoughts.

The difference between a mere association between P and Q (which could be a spurious correlation, or the result of an independent cause of both) and a causation, P causes Q, is captured by the modal operator of necessity []. See here for a detailed discussion.

To check P causes Q we need to check: [](P → Q) and [](¬Q → ¬P).

So we're in possible world semantics and we look to 'neighbouring worlds' to check the truth status of P and Q. But we have the usual problem that such worlds are way too 'big', too full of irrelevancies.
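A minimal sketch of the necessity check above, assuming a small hand-picked set of 'neighbouring worlds', each reduced to a truth assignment for P and Q. The names `implies`, `necessarily` and `worlds` are illustrative only and not taken from Pearl or any library.

```python
def implies(a, b):
    """Material implication a -> b within a single world."""
    return (not a) or b


def necessarily(formula, worlds):
    """[]formula: the formula holds in every accessible world."""
    return all(formula(w) for w in worlds)


# Hypothetical neighbouring worlds, stripped of everything except P and Q.
worlds = [
    {"P": True, "Q": True},
    {"P": False, "Q": False},
    {"P": False, "Q": True},
]

box_p_implies_q = necessarily(lambda w: implies(w["P"], w["Q"]), worlds)
box_notq_implies_notp = necessarily(
    lambda w: implies(not w["Q"], not w["P"]), worlds
)

print(box_p_implies_q, box_notq_implies_notp)
```

Adding a world where P holds and Q fails makes both checks fail. Within any single world the two conditionals are classical contrapositives, so in a sketch this simple they always agree; the real work lies in choosing which worlds count as 'neighbouring'.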

So like Situation Semantics, Pearl takes the engineering approach of restricting his worlds to just those entities and actions which seem relevant to the causality under investigation. These are his causal diagrams which he intends to counterfactually 'mutilate'.

A philosophical strategy similar to the modal analysis of epistemics etc .. and of similar utility.
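The 'mutilation' of a causal diagram mentioned above can be illustrated with a toy model. This is a sketch under stated assumptions, not Pearl's own formalism: a hypothetical three-variable diagram in which Z is a common cause of P and Q and there is no arrow from P to Q. The names `MECHANISMS`, `mutilate` and `prob_q_given_p` are made up for the example.

```python
from fractions import Fraction

# Hypothetical toy diagram: Z -> P and Z -> Q, with no P -> Q arrow.
PRIOR_Z = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# Each endogenous variable is computed from the variables already set.
MECHANISMS = {
    "P": lambda v: v["Z"],
    "Q": lambda v: v["Z"],
}
ORDER = ["P", "Q"]


def mutilate(mechanisms, var, value):
    """Cut the arrows into var and pin it to value, i.e. do(var = value)."""
    cut = dict(mechanisms)
    cut[var] = lambda v: value
    return cut


def worlds(mechanisms):
    """Enumerate every world of the model together with its probability."""
    for z, prob in PRIOR_Z.items():
        v = {"Z": z}
        for name in ORDER:
            v[name] = mechanisms[name](v)
        yield v, prob


def prob_q_given_p(mechanisms, p):
    """P(Q = 1 | P = p), computed exactly by enumeration."""
    num = sum(pr for v, pr in worlds(mechanisms) if v["P"] == p and v["Q"] == 1)
    den = sum(pr for v, pr in worlds(mechanisms) if v["P"] == p)
    return num / den


# Observation: P and Q look perfectly associated.
print(prob_q_given_p(MECHANISMS, 1), prob_q_given_p(MECHANISMS, 0))

# Intervention: forcing P leaves Q untouched, so the association was spurious.
print(prob_q_given_p(mutilate(MECHANISMS, "P", 1), 1),
      prob_q_given_p(mutilate(MECHANISMS, "P", 0), 0))
```

In the unmutilated model, seeing P tells you everything about Q. In the mutilated one, setting P tells you nothing, which is the distinction between association and causation that the diagram is meant to capture.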


  1. I think that we should agree with Pearl's distinction between the two forms of AI. In a recent AI debate I have just watched, it was remarked that ANN systems take 750 (training) photos to recognise a cat. Clearly not a model of human (cat) learning...

    The issue of causality in contemporary physics could be sharpened a little: In GR there are definitions of e.g. "Local Causality" and related topics which are clearly mathematical and useful for developing the theory. At the moment these causality topics remain valid, but presumably subject to continual experimental validation.

    Quantum Mechanics has introduced a probabilistic theory which eschews causality in the GR sense, and there has been observational evidence of long range entanglement (Hanbury-Brown and Twiss effect). Also, as Lubos writes, the off-shell paths in path integral calculations need to be included to get the right results in QFT. Yet we still have macroscopic causality at least, it seems.

    Clearly the tension between these two theories is evident with the topic of causality.

    On the Intentional Stance, I wonder whether it can be separated from the "Weak AI" assumption also made by Dennett, which conflates two maybe distinct topics.

    All in all, "Cause and Effect Reasoning" seems like a form of classical physics reasoning (naive physics) extended to the wider social world. This will be useful for social robots, but robots expected to work in a quantum world will need to understand Schrodinger's Equation (or its descendants) and their applications.

    1. Pure data-driven blank-slate issues (750 photos to get 'cat') are indeed well-rehearsed in the literature. I don't think Pearl's approach is the only, or even optimal, game in town. I'm more in sympathy with the view that AI systems simply lack practical, experiential knowledge of the world, something humans acquire only through lengthy situated, embodied experiences and incremental training.

    2. I have looked at the Slides now, and yes this is in the analytical tradition of AI, with emphasis on the Causal aspects, with lots of probability and probability based inference. There is the old question here as to how fundamental such probability based approaches are to AI Foundations - you might argue that an "absolute" Inference mechanism has more to do with Foundations. Pearl (and many others in the "Data Centric" view of AI) might argue that Probability gives results.

    3. Like most philosophical approaches which are rooted in concept analysis of 'ordinary language' usage, I don't think Pearl's (et al.) approach is foundational at all. By reifying 'causality', it cuts it off from any deeper analysis. If there is one thing physics tells us, it's that while reality is structured and not chaotic, the concept of causality adds nothing further, simply being an epiphenomenal metalanguage for beings such as ourselves.

    4. I still think that "causality adds nothing further" understates its current value in physics. QM paths admittedly introduce probability into physics, but the most likely path is the classical path, thus the classically causal one. The wider "problem of time" remains unresolved (especially in Quantum Cosmology), but its resolution may provide a deeper explanation for physics causality, and when (if ever) to expect its violation.

    5. Time reversal completely undermines causality. GR is time symmetrical ("To put it in more formal terminology, the field equations of GR are form-invariant under diffeomorphisms, and time-reversal is a diffeomorphism. Therefore any solution is also a solution under time-reversal. Reference"). Newtonian mechanics is time reversible. For QM there's CPT invariance.

      I submit that causality is really about 'free will' and the belief that the causative agent could have done otherwise - that's where the (necessary) modal counterfactuals enter the story.

    6. "Time reversal asymmetry" (ie the observed time symmetry break giving one, but not both, solutions observed) is part of the "problem of time". Thus the challenge is to "explain" this phenomenon: either by directly observing events - or parts of the Universe - running in reverse time; or by excluding time reversal (the time reversal solutions) from the wider theory.

      A roughly analogous situation exists with the essentially complete lack of antimatter in the Universe: these basic theories predict that 50% of material should be antimatter. So again either find that missing "Anti-Universe" or find a further reason as to why it cannot (Cosmologically) exist.

      For reverse time the usual suggestions are: (1) Thermodynamics (applied Cosmologically); (2) T violation (= CP violation under CPT invariance) in some elementary particles (this might also relate to the antimatter problem). This latter observation seems too weak to explain everything, but it is an indication that there might yet be a physics explanation - even a particle physics one.

      One of the obvious problems with the "successful" General Relativity is that it gives far too many mathematical solutions - for example the Gödel closed timelike curve solution, and lots of others which are not observed: should these be excluded from the theory somehow - making their "physical properties" merely artefacts - or do they "observably exist" somewhere - making their "physical properties" (e.g. CTCs, naked singularities, etc) actually real?

      Issues of determinism (or otherwise) even enter into GR - usually via the singularities. Certain types of singularity mean that (in regions of the Universe) the future is not determined by the past in the way one might have expected from a merely gravitational theory.

    7. Perhaps you saw the Quanta story, Lubos liked it :-).

  2. This is interesting and indeed relates to my previous remark on the ongoing research into that area of GR. (I wonder whether the C0 aspect of that paper relates in any way to a remark I made on your blog about differential equations recently?)

    Hopefully in a future blog you will be able to summarise this area clarifying: Strong Cosmic Censorship, Weak Cosmic Censorship; Chronology Protection; Non-determinism and Cauchy Horizons...

