[ weird things ] | why a search engine is playing a trivia game

Watson can easily win a trivia contest, but it needs to play anyway to learn how to talk to humans.

If you were watching Jeopardy last night, you probably saw a machine wipe the floor with two humans with an exceptional grasp of random trivia, answering a wide assortment of questions and seldom getting it wrong or missing a beat. You could almost say that it was performing at a virtually superhuman level, buzzing in faster than the men between whom the screen displaying its avatar stood. And that’s precisely the point. Watson is a search engine which can understand natural language and deconstruct it into keywords and commands. The entire purpose of this cluster of 90 fine-tuned servers is to answer questions, and it probably had at least a few terabytes of reference data from which it could pick an answer. Basically, it’s a self-contained trivia encyclopedia, and pitting humans against it is like asking them to go against an optimized Google search. It will win just because it needs milliseconds to retrieve the information and just a few more milliseconds to hit its buzzer. It’s going to dominate any trivia contest similar to Jeopardy. But that’s not what’s being tested. IBM is interested in its interaction with humans, which is why it went out of its way to present it as a real entity.

Watson is an implementation of IBM’s DeepQA project, an attempt to create a search engine which can listen to normal, everyday human language, understand what it’s being asked to find, and retrieve an answer to the query quickly and efficiently. It takes a lot of processing oomph because, just like us, it has to filter what it can pick up with its microphones into keywords which will ultimately define what it will be doing. Those keywords can come in many varieties, be grammatically incorrect, and deal with very specialized realms of knowledge, and they have to be carefully arranged into the correct final query that will be used to search the data source to which the computer is hooked up. Compared to the computing power required to do that in real time, actually answering a question about one of the works of an obscure painter who lived a few centuries ago is, no pun intended, trivial. But a game show is probably the best way to objectively test a system like that because there are very simple metrics by which you can judge its performance. Answers are either right or wrong, and to win, you will generally need to answer more questions correctly. And while Watson did strategize with its wagers, the fact of the matter is that it did answer the most questions, it answered them with a notably high degree of accuracy, and it showed itself to be a very good and capable natural language processor.
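To make the idea of boiling a question down to keywords and then matching them against reference data a bit more concrete, here is a deliberately tiny Python sketch. It is not how DeepQA works; the stopword list, the three-entry `REFERENCE` dictionary, and the function names `extract_keywords`, `answer`, and `accuracy` are all made up for illustration, and the matching is nothing more than counting shared words.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {
    "a", "an", "the", "of", "in", "on", "is", "was", "who", "what",
    "which", "this", "that", "by", "to", "for", "and", "did",
}

# Toy stand-in for the terabytes of reference data a real system would hold.
REFERENCE = {
    "Johannes Vermeer": "dutch painter of girl with a pearl earring in the 17th century",
    "Isaac Newton": "english physicist who formulated the laws of motion and gravity",
    "Marie Curie": "physicist and chemist who discovered radium and polonium",
}


def extract_keywords(question):
    """Lowercase the question, split it into words, and drop filler words."""
    words = re.findall(r"[a-z0-9']+", question.lower())
    return [w for w in words if w not in STOPWORDS]


def answer(question):
    """Return the reference entry sharing the most keywords, or None."""
    keywords = set(extract_keywords(question))

    def score(name):
        return len(keywords & set(REFERENCE[name].split()))

    best = max(REFERENCE, key=score)
    return best if score(best) > 0 else None


def accuracy(questions_and_expected):
    """Game-show style metric: the fraction of answers that are exactly right."""
    correct = sum(1 for q, expected in questions_and_expected if answer(q) == expected)
    return correct / len(questions_and_expected)
```

Calling `answer("Which painter made Girl with a Pearl Earring?")` picks the Vermeer entry because it shares the most keywords with the question. The hard part a real system has to solve, and this sketch skips entirely, is getting from messy everyday language to the right keywords in the first place.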

So where does a system like Watson go from here? Well, considering that it’s basically a server farm, it’s not exactly going to be affordable for the average user anytime soon and will be restricted to researchers in high-budget labs. You could say that we already do have something like Watson for the typical search query with a tool like Wolfram Alpha, but there’s a fundamental difference between the two. Whereas Alpha gushes with a stream of information and tries to perform elaborate correlations and calculations, Watson is intended to give just one correct answer to a specific question, and it’s designed to do that just by listening to you talk. Alpha’s prowess is limited to what and how well you type, and according to its creator, it’s less than 90% effective in computing its answers, while Watson seems to have a much higher accuracy. It’s probably possible for it and Alpha to carry out similar functions, and I’m sure that Watson’s algorithms could work just as well on typed text as they do with spoken words, so putting it behind the scenes of a search engine is something that we may see in the very near future. According to IBM, things like tracking down medical information and relevant facts and figures lost in vast databases would be the prime candidates for Watson’s applications in the real world, but its size and the costs of running and maintaining it would make it far too expensive for anyone but the companies which operate the world’s biggest data centers. Unless Watson could be miniaturized, that is…
