Help
English 日本語 Русский Svenska


Question Answering

Automatic Question Answering typically falls within the fields of Computer Science and Natural Language Processing. The current state of the art aims to answer simple factual questions that require short, fairly unambiguous answers, such as:

How high is Mount Everest?
Who is the current President of the United States?
When did the Kursk sink?

The demonstration question answering system you see on these pages is the result of an ongoing research project by Ed Whittaker in the Furui Laboratory at Tokyo Institute of Technology. In contrast to many other systems, which are often rule-based and have been tuned and perfected over many man-years, we adopt a purely statistical, data-driven and non-linguistic approach to the problem of question answering. Many other systems also use statistical components, but they tend to combine them with various degrees of linguistic processing to extract features, ranging from surface-level information such as part-of-speech tags to deeper structures such as parse trees of the questions. Our system remains rooted in surface features, in particular the words themselves; currently we don't even use part-of-speech information. (For the Japanese system, however, we did have to compromise and use a rule-based morphological tagger to separate continuous character sequences into word-like units; we are currently investigating ways of making such segmentation unnecessary for languages such as Japanese and Chinese.)

We get round the need for extensive linguistic processing by using large amounts of data instead. This is often described as exploiting data redundancy. Rather than converting the question into a form that might be observed in the data, we use a large amount of data in the hope that somewhere there is some text in a form that more or less matches our question and allows us to extract the answer. We still have to surmount the problem of identifying what kind of answer is expected (e.g. a name for a "Who..." question and a date for a "When..." question), but the occurrence of the correct answer in many documents greatly aids this process.
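To make the idea of data redundancy concrete, here is a minimal sketch in Python. It is not the system described on these pages: the stop-word list, the function names and the four toy snippets are all invented for illustration, and simple vote counting stands in for whatever statistical model the real system uses. It only shows how an answer that recurs across many matching texts can rise to the top using nothing but surface words.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "in", "is", "was", "did",
              "on", "and", "to", "it", "at"}


def tokenize(text):
    """Lower-case the text and split it into word tokens (surface features only)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_candidates(question, snippets):
    """Rank candidate answer words by the number of snippets they occur in.

    A snippet only votes if it shares at least one content word with the question.
    """
    question_words = set(tokenize(question)) - STOP_WORDS
    votes = Counter()
    for snippet in snippets:
        words = set(tokenize(snippet))
        if not (words & question_words):
            continue  # snippet has nothing in common with the question
        # Every remaining word that is not part of the question is a candidate.
        for word in words - question_words - STOP_WORDS:
            votes[word] += 1
    return votes.most_common()


# Toy snippets written for this example; a real system would retrieve text from the web.
snippets = [
    "The Kursk sank in August 2000 in the Barents Sea.",
    "Russian submarine Kursk was lost in 2000.",
    "In 2000 the Kursk sank after an explosion.",
    "Mount Everest is the highest mountain on Earth.",
]

ranked = rank_candidates("When did the Kursk sink?", snippets)
print(ranked[0])  # ('2000', 3)
```

The year appears in all three relevant snippets and so collects the most votes, while the unrelated Everest snippet never votes at all. Note that the sketch makes no attempt to decide that a "When..." question expects a date; that part of the problem is left out entirely.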

Automatic Statistical Question Answering

Other Languages

An advantage of this data-driven, statistical approach is that it's pretty easy to extend it to new languages. In fact, all we need is a set of questions and their corresponding answers, plus access to a fairly large amount of text data in that language (usually the web suffices). For several languages such question-and-answer examples can easily be found (for example English, Russian, European languages, etc.).

Anyway, suffice to say, we're continuously working on other languages and you'll find them on the main page eventually. In the meantime if you know of any question-and-answer databases for languages that are not yet covered, please let us know.

Speech Interface

Ok, you might think we're getting a bit ahead of ourselves by combining a question answering system that doesn't work perfectly with a speech recognition system that doesn't exactly work perfectly either, but that's what we're going to do! Heh, that's what research is about. And besides, a speech interface to a question answering system is a far better interface (in theory) than keyboard input. When we get it up and running you'll find a link to it here.

Mobile Interface

Access for mobile phones is provided through the system's address http://asked.jp. A browser detection script aims to provide a faster-loading minimalist interface to the system. If you get the page designed for a larger browser you can force the browser to the correct start page by entering: http://asked.jp/edw/i/index.html or http://asked.jp/edw/i/index_j.html.
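As a rough illustration of what such a browser detection step might look like, here is a small Python sketch. It is an assumption-laden example, not the script actually used on the site: the user-agent markers and the function name are hypothetical, and treating the "_j" page as the Japanese start page is a guess based on its file name. Only the three URLs come from the text above.

```python
# Hypothetical substrings that suggest a mobile-phone browser;
# they are not taken from the real detection script.
MOBILE_MARKERS = ("docomo", "kddi", "softbank", "vodafone", "mobile")

# Minimalist start pages quoted in the text. Which one is Japanese is an
# assumption based on the "_j" suffix.
MOBILE_PAGES = {
    "en": "http://asked.jp/edw/i/index.html",
    "ja": "http://asked.jp/edw/i/index_j.html",
}
FULL_PAGE = "http://asked.jp"


def choose_start_page(user_agent, language="en"):
    """Return the minimalist page for mobile-looking browsers, else the full page."""
    agent = user_agent.lower()
    if any(marker in agent for marker in MOBILE_MARKERS):
        # Fall back to the English page for languages without a mobile page.
        return MOBILE_PAGES.get(language, MOBILE_PAGES["en"])
    return FULL_PAGE


print(choose_start_page("DoCoMo/2.0 N905i", "ja"))
print(choose_start_page("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```

Substring matching on the user-agent string is inherently unreliable, which is why a direct address for the minimalist page is still worth having as a fallback.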

Publications

  • Rapid Development of Multiple Web-based Monolingual Question Answering Systems
    E.W.D. Whittaker, J. Hamonic, T. Klingberg, D. Yang, S. Furui
    Accepted to the 28th European Conference on Information Retrieval, April 2006.


  • A Unified Approach to Japanese and English Question Answering
    E.W.D. Whittaker, J. Hamonic and S. Furui
    In Proceedings of the 5th NTCIR Workshop, December 2005


  • TREC 2005 Question Answering Experiments at Tokyo Institute of Technology
    E.W.D. Whittaker, P. Chatain, S. Furui and D. Klakow
    In Proceedings of the Fourteenth Text Retrieval Conference (TREC), November 2005.


  • A Statistical Pattern Recognition Approach to Question Answering Using Web Data
    E.W.D. Whittaker, S. Furui and D. Klakow
    In Proceedings of Cyberworlds, November 2005.

Other QA Systems on the Web

The following are a few QA systems that are available on the web, so you can get an idea of what the "state of the art" is.

  • AnswerBus
  • BrainBoost
  • Language Computer Corporation
  • START (MIT)
  • Arizona State University system (Dmitri Roussinov)

The following is a small selection of the most popular web search engines, which might also answer your question.

  • Google
  • MSN
  • AskJeeves
  • Yahoo!

QA Resources

  • Translation of the first 200 factoid questions from the TREC 2003 evaluation into Swedish (questions and answers)
  • Pseudo-translation of the first 200 factoid questions from the TREC 2003 evaluation into Chinese (questions and answers)

© 2005 Ed Whittaker
Funded by the Japanese Government 21st-century COE programme:
Framework for Systematization and Application of Large-scale Knowledge Resources.