Thursday, February 17, 2011

IBM Watson leads way to Singularity but will have to learn Roboethics and obey Asimov’s 3 (and some) rules of Robotics to behave

Update 2: IBM's Watson site

Update 1:
Here's a YouTube video on how they built Watson:
IBM is already working on applying Watson technologies to Healthcare. From IBM: "... a doctor considering a patient's diagnosis could use Watson's analytics technology, in conjunction with Nuance's voice and clinical language understanding solutions, to rapidly consider all the related texts, reference materials, prior cases, and latest knowledge in journals and medical literature to gain evidence from many more potential sources than previously possible. This could help medical professionals confidently determine the most likely diagnosis and treatment options."

Original post below...

This subject has been beaten to death by now, and every geek is smiling with pride that a computer beat two human beings in the Jeopardy! challenge. It is an amazing feat for humans to create such a machine, teach it all that knowledge, and teach it how to "listen" to answers and find the best question (yeah – that's how Jeopardy! is played). Check out Day 2 on YouTube, where Alex also shows some of the hardware behind Watson.

This was a massive effort on IBM's part, where they spent close to $100M on hardware and software. The hardware specs are impressive: 90 IBM Power 750 servers using 15 terabytes of RAM and 2,880 processor cores, taking up almost as much space as ENIAC, one of the first electronic computers, built in the 1940s. (Just to keep things in perspective, folks: in the 70 years since, we have arrived at machines which almost act like humans, so imagine what the next 30 years will bring! Read on – the Singularity is almost here!)

The software is even more astonishing! Remember, you have to follow all the rules of Jeopardy! This means that you have to know:

  • What are the categories?
  • What's the value of each question?
  • Whose turn is it?
  • How much to wager on a "Daily Double" question?
  • Should I answer if the other contestants' answers were not correct?
  • How much to wager for the "Final Jeopardy! Question"?
For the most part, these are simple algorithms to implement. The bigger challenge is to understand Alex Trebek's spoken instructions and answers, filter out any jokes and other chatter, and turn what remains into a machine-understandable question that Watson can go find an answer to.
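To give a flavor of how simple the rule-keeping side can be, here is a minimal Python sketch. It is purely illustrative and is not how IBM built Watson: the names `GameState`, `should_buzz` and `final_wager`, the 0.5 confidence threshold, and the wagering rule itself are all made-up assumptions.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    """Hypothetical snapshot of the scores going into Final Jeopardy!"""
    my_score: int
    best_rival_score: int


def should_buzz(confidence: float, threshold: float = 0.5) -> bool:
    """Buzz in only when the estimated confidence clears an (assumed) threshold."""
    return confidence >= threshold


def final_wager(state: GameState) -> int:
    """Toy wagering rule, not Watson's actual strategy.

    If we lead, bet just enough to stay ahead even if the best rival
    doubles their score; if we trail or are tied, bet everything we have.
    """
    if state.my_score > state.best_rival_score:
        needed = 2 * state.best_rival_score - state.my_score + 1
        return max(0, min(state.my_score, needed))
    return max(0, state.my_score)


if __name__ == "__main__":
    print(should_buzz(0.7))
    print(final_wager(GameState(my_score=20000, best_rival_score=12000)))  # 4001
```

The hard part, of course, is producing that confidence number in the first place, which is exactly the language-understanding problem described above.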

Watson did pretty well with factual questions, but it struggled when a clue depended on some kind of association or wordplay. (I really cannot explain it well, but here's an example: the category was Computer Keys and the answer was "___ is where the heart is!" The question is "What is Home?" Watson's potential matches were totally out of context.)

Here's where our ability to "program" a computer to learn context and references falls short. How do you teach that? Well – how do you teach kids such things? Is this a simile? Will it understand idioms and figures of speech? How about rhetorical expressions?

Well – those types of advanced "emotional" and "lateral thinking" aspects will take time to program into a computer. However, that time is not too far off. Hopefully, in my lifetime, we will see an event known as the Technological Singularity.

As per Wikipedia:

"A Technological singularity is a hypothetical event occurring when technological progress becomes so rapid that it makes the future after the singularity qualitatively different and harder to predict. Many of the most recognized writers on the singularity, such as Vernor Vinge and Ray Kurzweil, define the concept in terms of the technological creation of superintelligence, and allege that a post-singularity world would be unpredictable to humans due to an inability of human beings to imagine the intentions or capabilities of superintelligent entities."

Time magazine has covered this subject well in their latest issue.,8599,2048138,00.html

The gist of it is that by 2045 we will have reached the Technological Singularity (TS), and beyond that we cannot imagine (unless you really stretch your brain muscles!) where new innovations will take us. Very interesting read – I highly recommend it. However, one of the key points which TIME didn't do enough justice to is ethics and morals for the time when we achieve TS and machines become self-aware and start building and innovating in unimaginable ways. It is like raising kids – you have to teach them right and wrong, good and bad, good and evil, etc. Presumably our yardsticks of good/bad and right/wrong are generally accepted and will remain so in the future too! But the point is that somebody will have to teach computers these things so that when we get to TS, we are not staring at Skynet from Terminator, which knowingly or unknowingly tries to eliminate humankind.

This brings me to Isaac Asimov's Three Laws of Robotics:

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey any orders given to it by human beings, except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
These are very simple rules, and if you program them into the robot's OS (or rather, put these instructions in the CPU itself!), then we should not have to worry too much. Now, there will always be cases where a robot has to "pick the lesser of two evils": a situation where every course of action it can take may hurt humans, and it has to calculate the path of least damage. That is the most painful kind of choice for human beings, and I am not sure how to teach it to a machine.
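One naive way to picture "the path of least damage" is to rank the available options by the three laws in priority order: least harm to humans first, then obedience, then self-preservation. The Python sketch below does only that. The `Action` class, its fields, and the harm scores are hypothetical, invented here for illustration; nothing like this appears in Asimov's stories or in any real robot.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Action:
    """Hypothetical candidate action with made-up cost estimates."""
    name: str
    human_harm: float      # First Law: estimated harm to humans (lower is better)
    disobeys_order: bool   # Second Law: True if the action ignores a human order
    self_damage: float     # Third Law: estimated damage to the robot itself


def choose_action(actions: Iterable[Action]) -> Action:
    """Pick the action that is best under a strict First > Second > Third ordering.

    Tuples compare element by element, so human harm always dominates,
    obedience only breaks ties on harm, and self-damage only breaks ties on both.
    Raises ValueError if no actions are given.
    """
    return min(actions, key=lambda a: (a.human_harm, a.disobeys_order, a.self_damage))


if __name__ == "__main__":
    options = [
        Action("swerve", human_harm=2.0, disobeys_order=False, self_damage=5.0),
        Action("brake", human_harm=1.0, disobeys_order=True, self_damage=0.0),
        Action("continue", human_harm=3.0, disobeys_order=False, self_damage=0.0),
    ]
    print(choose_action(options).name)  # brake
```

Notice that the ranking itself is one line. All the difficulty is hidden in coming up with those harm numbers, which is precisely the part I don't know how to teach a machine.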

Anyways, the Wikipedia article goes on to introduce additional rules including a zeroth rule:

  1. A robot may not harm humanity, or, by inaction, allow humanity to come to harm.
Fourth law:

  1. A robot must establish its identity as a robot in all cases.
And fifth law:

  1. A robot must know it is a robot.
Please read the Wikipedia article for full details on these laws and the kinds of scenarios that led various authors to introduce them. Very interesting and, at times, mind-bending reading!

So, these laws have evolved into Roboethics and eventually into the Ethics of Artificial Intelligence.

Bottom line is that these fields will have to merge, and a good reference framework for good vs bad and right vs wrong will have to be defined without prejudice about attributes of human beings such as religion, sex, shape, size or color. Maybe we should start learning these things ourselves first, before we can effectively teach them to our kids and eventually to the artificial intelligence in the machines.

Another thought that occurred to me about learning: how do babies and kids learn? The mechanics of learning involve being exposed to various stimuli, including parents, teachers, TV, books, movies, radio, the environment and now the Internet, among many other factors. Watson had to be fed entire encyclopedias, knowledge bases and other such works, and it built complex data structures which were efficient for searching. We human beings go through a similar exercise right from the time we are born. It is just that the breadth and depth of the subjects taught to us are not as great as what Watson was taught! So, theoretically, you could record ALL the stimuli a baby is exposed to and keep teaching machines from that experience. This knowledge, augmented with the established knowledge of humankind (like what was fed to Watson), could be an awesome resource for humans. It is like that Matrix movie where Trinity learns to fly a helicopter in a matter of seconds by simply downloading the entire manual and flying instructions!

Gordon Bell of Microsoft is already recording everything he is exposed to using a video camera and a microphone. Check out Your Life, Uploaded: The Digital Way to Better Memory, Health, and Productivity. The title says it all. So, kids, start early! Record everything on Facebook and Twitter, and then we will find a way to tie those learnings to what you learn at school and what others have learned through history, geography and other subjects, and then you can pass exams easily. Hey – but why would you need an exam then? Well – you will need exams to ensure that you can use this stuff effectively! What's the point if you cannot search for the answer to 2x2?

Anyways, in conclusion, I am sure a time will come when we will be able to easily tap into the vast knowledge of humanity using highly contextual searches (natural language search), and that will make our lives better. Let's just make sure that we teach our kids the right stuff so that they can program those machines accordingly!

Back to Watson and how it can help us today. IBM is thinking up many applications where Watson can help. First and foremost is the medical field: once Watson has been fed everything about human anatomy, symptoms, diseases, medicines and so on, you would just have to tell it what health issues you are having, and it would diagnose the problem and recommend a course of action. Similar approaches could be applied in many other fields, including insurance and fraud detection. I am sure IBM is going to get 10 times the return on the investment they made in Watson!

(Phew! This was a long rambling…).

For more details on how Watson came about and some of the implementation behind it, check out the following links:

CNN - Behind-the-scenes with IBM's 'Jeopardy!' computer, Watson By John D. Sutter, CNN (February 7, 2011 8:29 a.m. EST)

Also, some good details around algorithms and implementation of DeepQA can be found here:

The actual whitepaper - "Building Watson: An Overview of the DeepQA Project" - is available here:

YAGO is one of the knowledge bases that were fed to Watson. YAGO is a knowledge base developed at the Max-Planck-Institute Saarbrücken. The knowledge base contains information harvested from Wikipedia and linked to WordNet. It knows more than 2 million entities (like persons, organizations, cities, etc.), and knows 20 million facts about these entities. YAGO has a manually confirmed accuracy of 95%. It can be queried online. The YAGO ontology is licensed under the GNU Free Documentation License.

I imagine that they would have fed it Encyclopaedia Britannica along with the CIA World Factbook and the Internet Movie Database (IMDb). How about the complete catalog of the Library of Congress?