Saturday, February 24, 2007

Google + Cyc

I was researching something on Google, and bemoaning the limitations of keyword search. 'If only Google understood my query', I thought to myself. Then I recalled Doug Lenat's Cyc program, under development now for decades (since 1984 in fact).

Cyc is an attempt to formally codify all the knowledge of the world in a knowledge-representation language (based on predicate calculus) and then to use automated theorem-proving techniques - machine inference - to bridge the gap between questions and answers. The application to Internet search is obvious.
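To make the idea concrete, here is a toy sketch of what 'facts plus machine inference' can look like. It is emphatically not Cyc's engine or its CycL language: the triple format, the two rules and the helper names (`unify`, `match_all`, `forward_chain`, `ask`) are all my own assumptions for illustration, with predicate names that only loosely echo Cyc's vocabulary. It uses naive forward chaining, which is just one simple inference strategy.

```python
# Facts are (predicate, subject, object) triples. Illustrative only, not CycL.
facts = {
    ("isa", "Fido", "Dog"),
    ("genls", "Dog", "Mammal"),     # every Dog is a Mammal
    ("genls", "Mammal", "Animal"),  # every Mammal is an Animal
}

# Each rule is (premises, conclusion); terms starting with "?" are variables.
rules = [
    ([("isa", "?x", "?c"), ("genls", "?c", "?d")], ("isa", "?x", "?d")),
    ([("genls", "?a", "?b"), ("genls", "?b", "?c")], ("genls", "?a", "?c")),
]


def unify(pattern, fact, bindings):
    """Match one pattern against one fact; return extended bindings or None."""
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            # Bind the variable, or check it agrees with an earlier binding.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def match_all(premises, known, bindings):
    """Yield every set of bindings that satisfies all premises."""
    if not premises:
        yield bindings
        return
    first, rest = premises[0], premises[1:]
    for fact in known:
        extended = unify(first, fact, bindings)
        if extended is not None:
            yield from match_all(rest, known, extended)


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new facts can be derived."""
    known = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            for b in match_all(premises, known, {}):
                derived = tuple(b.get(term, term) for term in conclusion)
                if derived not in known:
                    new.add(derived)
        if not new:
            return known
        known |= new


def ask(query, facts, rules):
    """Answer a yes/no question by checking the inferred closure."""
    return query in forward_chain(facts, rules)


print(ask(("isa", "Fido", "Animal"), facts, rules))
```

Nobody ever typed in 'Fido is an animal', yet the question can be answered by chaining the rules. A keyword search would find no match for it, which is the gap that inference is supposed to bridge.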

'I know', I thought, 'I'll write a post about this'.

Prudently, I first checked on Cycorp's website to see what Doug and the guys had to say. And there I found this video (here) where Doug talks to Google staff about exactly this concept.

So rather than me making it up, listen to Doug. The video is great: interesting and amusing, but it does run for more than an hour. It's clear, by the way, that this is all about Cycorp making a pitch to Google. If I had been Google, I would have sent someone down to Cycorp just to check them out in detail. However, given the video date of May 2006 and that we haven't heard anything since, I would guess that Cyc is still not ready for general-purpose, prime-time use. Most of Doug's examples of clients were DoD applications, and commercialisation still seems to be a problem for them.

Over the years, I have wobbled in my assessment of Cyc. Initially I thought it was a grandiose folly - a superhuman attempt to encode all of human knowledge by manual means in a project lasting decades. Well, it has been decades, and perhaps they are getting close to critical mass. Nothing else over the last 20 or 30 years has solved the problem of automating the practical use of real-world knowledge, so maybe Cyc is it. I just hope that, like power from fusion, it's not always 10 or 20 years away from being truly useful.