Question Answering
Automatic Question Answering typically falls within the fields of Computer
Science and Natural Language Processing. The current
state of the art aims to answer simple factual questions that require
short, fairly unambiguous answers such as:
How high is Mount Everest?
Who is the current President of the United States?
When did the Kursk sink?
The demonstration question answering
system you see on these pages is the result of an on-going research
project by Ed Whittaker in the Furui Laboratory at Tokyo Institute of Technology. In
contrast to many other systems, which are often rule-based and have been tuned
and perfected over many man-years, we adopt a purely statistical,
data-driven and non-linguistic approach to the problem of question
answering. Many other systems also use statistical components, but they
tend to add varying degrees of linguistic processing to extract
features, ranging from surface-level information such as
part-of-speech tags to deeper structures such as parse trees of the
questions. Our system remains rooted in surface features and in
particular the words themselves; currently we don't even use
part-of-speech information. (For the Japanese system, however, we did
have to compromise and use a rule-based morphological tagger in order
to separate continuous character sequences into word-like units; we
are currently investigating ways of making such segmentation unnecessary
for languages such as Japanese and Chinese.) We get round the need for
extensive linguistic processing by using large amounts of data
instead. This is often called data redundancy. Rather than converting
the question into a form that might be observed in the data, we use a
large amount of data in the hope that somewhere there is some text in
a form that more-or-less matches our question and allows us to extract
the answer. We still have to solve the problem of
identifying what kind of answer is expected (e.g. a name for a
"Who..." question and a date for a "When..." question), but the
occurrence in many documents of the correct answer greatly aids this
process.
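To make the idea of data redundancy concrete, here is a minimal sketch in Python. It is not the system's actual model: the function name rank_answers, the hand-written ANSWER_FILTERS regexes and the toy snippets are all assumptions made for illustration (the real system learns its preferences statistically rather than from rules). The sketch only shows the voting effect: a candidate that turns up in many retrieved texts rises to the top.

```python
import re
from collections import Counter

# Hypothetical, hand-written answer-type filters keyed on the question word.
# They stand in for the learned, statistical answer-type model and exist
# only to keep this illustration short.
ANSWER_FILTERS = {
    "when": re.compile(r"^\d{4}$"),       # a four-digit year
    "who": re.compile(r"^[A-Z][a-z]+$"),  # a capitalised word
}


def rank_answers(question, snippets):
    """Rank candidate answer words by the number of snippets containing them."""
    question_words = set(re.findall(r"\w+", question.lower()))
    answer_filter = ANSWER_FILTERS.get(question.split()[0].lower())
    counts = Counter()
    for snippet in snippets:
        # Count each candidate once per snippet so that a single repetitive
        # document cannot dominate the vote.
        candidates = set()
        for token in re.findall(r"\w+", snippet):
            if token.lower() in question_words:
                continue  # words from the question itself are not answers
            if answer_filter and not answer_filter.match(token):
                continue  # wrong kind of answer for this question word
            candidates.add(token)
        counts.update(candidates)
    return counts.most_common()


# Toy snippets written for this example; a real system would retrieve
# a much larger set of texts from the web.
snippets = [
    "The Kursk sank in the Barents Sea in August 2000.",
    "In 2000 the Russian submarine Kursk was lost with all hands.",
    "Kursk sank on 12 August 2000 after an explosion.",
    "An article from 2001 discusses the Kursk salvage.",
]
ranked = rank_answers("When did the Kursk sink?", snippets)
print(ranked)
```

With these four snippets the year 2000 appears in three of them and 2001 in only one, so 2000 is ranked first. The more text is retrieved, the more such agreement can compensate for the lack of linguistic analysis.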
Automatic Statistical Question Answering
Other Languages
An advantage of this data-driven, statistical
approach is that it's pretty easy to extend it to new languages. In
fact, all we need is a set of questions and their corresponding
answers and access to a fairly large amount of text data in that
language (usually the web suffices). For several languages such
question and answer examples can easily be found (for example English, Russian and other European
languages).
Anyway, suffice to say, we're continuously working on other languages
and you'll find them on the main page eventually. In the meantime if
you know of any question-and-answer databases for languages that are
not yet covered, please let us know.
Speech Interface
OK, you might think we're getting a bit ahead of ourselves by combining a
question answering system that doesn't work perfectly with a speech
recognition system that doesn't exactly work perfectly either, but that's
what we're going to do! Heh, that's what research is about. And
besides, a speech interface to a question answering system is (in theory) a far
better interface than keyboard input. When we get it up
and running you'll find a link to it here.
Mobile Interface
Access for mobile phones is provided through the system's address http://asked.jp. A browser detection
script aims to provide a faster-loading minimalist interface to the
system. If you get the page designed for a larger browser you can
force the browser to the correct start page by entering: http://asked.jp/edw/i/index.html or http://asked.jp/edw/i/index_j.html.
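As a rough illustration of what such a browser detection script might do, here is a minimal Python sketch. It is not the site's actual script: the function name start_page, the MOBILE_MARKERS list and the reading of index_j.html as the Japanese page are all assumptions. Only the three URLs come from the text above.

```python
# URLs quoted in the text above. Treating index_j.html as the Japanese
# start page is an assumption based on the "_j" suffix.
MAIN_PAGE = "http://asked.jp"
MOBILE_PAGES = {
    "en": "http://asked.jp/edw/i/index.html",
    "ja": "http://asked.jp/edw/i/index_j.html",
}

# Hypothetical substrings that suggest a mobile phone browser.
MOBILE_MARKERS = ("docomo", "kddi", "softbank", "mobile")


def start_page(user_agent, lang="en"):
    """Pick a start page from the browser's User-Agent string."""
    ua = user_agent.lower()
    if any(marker in ua for marker in MOBILE_MARKERS):
        # Unknown languages fall back to the default mobile page.
        return MOBILE_PAGES.get(lang, MOBILE_PAGES["en"])
    return MAIN_PAGE


print(start_page("DoCoMo/2.0 N905i", lang="ja"))
print(start_page("Mozilla/5.0 (Windows NT 10.0)"))
```

Substring matching on the User-Agent is simple but easy to fool, which is why entering the mobile address by hand remains useful as a fallback.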
Publications
Rapid Development of Multiple Web-based Monolingual Question Answering Systems
E.W.D. Whittaker, J. Hamonic, T. Klingberg, D. Yang, S. Furui
Accepted to the 28th European Conference on Information Retrieval, April 2006.
A Unified Approach to Japanese and English Question Answering
E.W.D. Whittaker, J. Hamonic and S. Furui
In Proceedings of the 5th NTCIR Workshop, December 2005
TREC 2005 Question Answering Experiments at Tokyo Institute of Technology
E.W.D. Whittaker, P. Chatain, S. Furui and D. Klakow
In Proceedings of the Fourteenth Text Retrieval Conference (TREC), November 2005.
A Statistical Pattern Recognition Approach to Question Answering Using Web Data
E.W.D. Whittaker, S. Furui and D. Klakow
In Proceedings of Cyberworlds, November 2005.
Other QA Systems on the Web
The following are a few QA systems that are available on the web,
so you can get an idea of what the "state-of-the-art" is.
AnswerBus
BrainBoost
Language Computer Corporation
START (MIT)
Arizona State University system (Dmitri Roussinov)
The following links are a small selection of the most popular web search engines which might also answer your question.
Google
MSN
AskJeeves
Yahoo!
QA Resources
Translation of the first 200 factoid questions from the TREC 2003 evaluation into Swedish (questions and answers)
Pseudo-translation of the first 200 factoid questions from the TREC 2003 evaluation into Chinese (questions and answers)
© 2005 Ed Whittaker
Funded by the Japanese Government 21st-century COE programme:
Framework for Systematization and Application of Large-scale Knowledge Resources.