AI-driven Virtual Assistants: development challenges
In my experience as an AI system developer, digital
assistants are complex systems that require successful implementations of
algorithms from several areas of the Natural Language Processing (NLP) field.
Any development process for a conversational system also needs to be data-driven.
System performance metrics are highly sensitive to the breadth and depth of domain
knowledge in different application areas. Extensive initial testing and constant
monitoring of machine learning outcomes are needed to prevent the algorithms from
drifting into unwanted performance characteristics during system operation.
As a result, semi-supervised, manual processes need to be present for the
duration of system operations.
Virtual Assistants are probably one of the most exciting
areas of application for AI systems. We have all seen futuristic movies or
advertisements in which intelligent robots or crazy computers (e.g., HAL 9000)
interact with humans using natural language, image recognition, logic, and
emotions. On the other hand, the infamous paperclip assistant that Microsoft
shipped with Office 97 left a long-lasting bad taste. Despite
many setbacks, this AI application area will likely flourish and develop into a
leading delivery system for online search, question answering, or even
e-commerce. It is hard to overstate the importance of conversational
human-computer interfaces.
https://ieeexplore.ieee.org/document/880078
Today, several promising technologies are making rapid progress in the digital assistant field. One example is IPsoft's Amelia.
The concept behind using an avatar to drive the user experience is not new. One of the most significant steps in this direction was the Ms. Dewey system implemented by Microsoft over a decade ago as an addition to their search business model.
At that time, a system based on pre-recorded video clips
had no chance of being flexible enough to become a viable human-computer
interface. However, had Microsoft transitioned this concept to a CGI
platform, it could have been a game-changer. We now know that Microsoft's
subsidiaries in the digital entertainment domain could have provided such technology.
Had the project continued, it could have set the tone in the virtual
assistant field. Most current systems use CGI-based avatars, and
it is a safe bet that near-perfectly realistic avatars will arrive soon.
The gaming industry is almost there, with flagship games that exploit powerful
video cards.
Today, handling unstructured information is still a hard AI problem.
Some of the critical areas that need to be addressed are:
- Domain knowledge.
Building a digital virtual assistant requires a knowledge base that, at
present, has to be developed manually. There is research into
automated knowledge base development, but in practice it remains a long,
painstaking process.
- Contextual understanding of basic facts.
Some sources call this capability "naïve logic." Maintaining an
active conversation is problematic because it might relate to many facts that
are not associated with the primary knowledge base. Therefore, a foundation of
common sense reasoning is essential. Our experience is that building a
conversational system requires a broad contextual knowledge base that is about
ten times larger than the domain knowledge base. Typically, contextual understanding is
developed through extensive dialogue analysis and development involving
focus groups and human testers.
- Affective intelligence.
Digital assistants need to have emotions and understand some of the raw human
emotions expressed during a conversation. Surprisingly, the affective reasoning
field in AI has a lot to show for decades of research.
https://www.questia.com/magazine/1P3-30494056/autonomous-agents-as-synthetic-characters
Affective reasoning is a powerful tool in business settings and might make or break
customer acceptance and widespread use inside and outside organizations.
Studies show that users are willing to accept limited CGI effects or imperfect
text-to-speech capabilities if the conversation's content is engaging, surprising,
or intriguing.
- Machine learning.
Perhaps the most challenging task is allowing the system to learn. The concept
of automated extraction of knowledge from unstructured text is well researched.
However, even the largest organizations with access to the most extensive data
banks have not produced a general-purpose conversational system. A few notable
attempts have been made over the past several decades. The most successful
implementation, the Cyc project by Cycorp, was primarily a manual endeavor.
https://www.cyc.com/
Today, learning from unstructured information requires a carefully designed
process that allows for efficient monitoring of the algorithm "drift"
and knowledge base extensions. The garbage in, garbage out principle still very
much applies. Unwanted data fed into the system might lead to
unacceptable conversations that the dialogue management system is not designed
to handle.
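One way to make "drift" monitoring concrete is to compare how often each intent shows up in a trusted baseline against a recent window of conversations. The sketch below is only an illustration under my own assumptions: the intent labels, the function names, and the 0.25 review threshold (a common rule of thumb for the population stability index) are hypothetical and are not taken from any particular product.

```python
import math
from collections import Counter


def label_distribution(labels, categories, smoothing=1e-6):
    """Return smoothed relative frequencies of each category in labels."""
    counts = Counter(labels)
    total = len(labels) + smoothing * len(categories)
    return {c: (counts.get(c, 0) + smoothing) / total for c in categories}


def population_stability_index(baseline, recent):
    """PSI between two label samples; 0 means identical distributions."""
    categories = sorted(set(baseline) | set(recent))
    p = label_distribution(baseline, categories)
    q = label_distribution(recent, categories)
    return sum((q[c] - p[c]) * math.log(q[c] / p[c]) for c in categories)


def needs_review(baseline, recent, threshold=0.25):
    """Flag the recent window for human review (threshold is an assumption)."""
    return population_stability_index(baseline, recent) > threshold


# Hypothetical intent labels produced by the assistant's classifier.
baseline = ["billing"] * 50 + ["shipping"] * 50
recent = ["billing"] * 10 + ["shipping"] * 40 + ["offtopic"] * 50

if needs_review(baseline, recent):
    print("Intent mix has shifted; route this window to a human reviewer.")
```

A check like this does not say what went wrong. It only tells the human reviewers where to look, which fits the semi-supervised process described above.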
Legal ramifications
Last but not least, we live in a complex legal environment. From the technical
point of view, the legal system is based on interpreting a continually growing
body of mostly unstructured information in various statutes and laws. Ideally, a
dialogue management system should access a legal knowledge base and
steer clear of legal trouble.
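As a rough illustration of that idea, the sketch below checks a candidate reply against a tiny, hand-written rule table before it is sent. Everything here is hypothetical: the rule names, the keywords, and the fallback wording are placeholders I made up, and a real legal knowledge base would be far larger and would need review by actual lawyers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    keywords: tuple


# Hypothetical stand-in for a legal knowledge base.
RESTRICTED = (
    Rule("medical_advice", ("dosage", "diagnosis")),
    Rule("financial_advice", ("guaranteed return",)),
)


def violated_rules(reply, rules=RESTRICTED):
    """Return the names of all rules whose keywords appear in the reply."""
    lowered = reply.lower()
    return [r.name for r in rules if any(k in lowered for k in r.keywords)]


def safe_reply(reply, fallback="I can't help with that, but a specialist can."):
    """Send the reply unchanged unless it trips a rule; otherwise fall back."""
    return fallback if violated_rules(reply) else reply
```

Plain keyword matching is crude and will both over-block and under-block. The point is only that the check sits between the dialogue manager and the user.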