Can human-machine dialogue cross the hurdle of "soul communication"?

Author: Half-Month Talk New Media  Time: 2022.08.25

Dialogue

Huang Minlie (Associate Professor, Department of Computer Science, Tsinghua University; Deputy Director of the Natural Language Generation and Intelligent Writing Committee of the Chinese Information Processing Society)

Liu Qun (Chief Scientist, Huawei Noah's Ark Laboratory; Fellow of the Association for Computational Linguistics)

Wang Bin (Chairman of the Xiaomi Group Technical Committee and Director of its AI Laboratory)

Still from the movie "Robot Manager"

When smart speakers and assistants such as "Xiao Ai", "Xiaobing" and "Xiaodu" receive our spoken requests, can they truly understand human intentions? How do we judge whether an artificial intelligence is smart enough and easy to use? What technology lies behind it?

In June of this year, Google engineer Blake Lemoine made public his conversations with the chatbot LaMDA, and his assertion that it had become self-aware once again triggered public discussion. Although Lemoine's judgment was rejected, in the face of machines that "talk about empty words", the question of how artificial intelligence (AI) technology can build a better human dialogue experience deserves in-depth exploration.

Language interaction ability is the standard for judging whether a machine is "smart"

Half-Month Talk: AI dialogue systems have been integrated into all aspects of people's lives. A wake-up phrase such as "Hi, Siri" or "Small Small Document" can free our hands and have machines carry out instructions for us. What possibilities can AI voice technology bring to humans? What are its main application scenarios at present?

Huang Minlie: AI dialogue systems originated with the Turing test in the 1950s and are one of the most important research directions in the AI field. The degree of language interaction ability a machine has can even serve as the standard for deciding whether the machine is "intelligent": the original Turing test was set up as a human-machine dialogue. In a sense, solving the problem of open-domain human-machine dialogue is tantamount to passing the Turing test.

The "ancestor" of dialogue systems is ELIZA, which was born in 1966. It could converse with humans by following hand-written scripts, but it did not understand the content of the dialogue; it only looked up a suitable reply through pattern matching. With the development of deep learning, AI dialogue systems have progressed from a first generation based on rules, through a second generation built on traditional machine learning, to a third generation driven by big data and large models. This has brought revolutionary changes, such as striking dialogue ability on open-ended topics.
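The pattern-based reply lookup described for ELIZA can be illustrated with a minimal sketch. The rules, the `reply` function and the fallback sentence below are hypothetical and written only for illustration; they are not taken from the original ELIZA script, which was considerably richer.

```python
import re

# Hypothetical script: each rule pairs a regular expression with a reply
# template. These rules are illustrative, not the original ELIZA rules.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bbecause\b", re.IGNORECASE), "Is that the real reason?"),
]
DEFAULT_REPLY = "Please tell me more."


def reply(utterance: str) -> str:
    """Return the reply of the first rule whose pattern matches the input."""
    # Drop surrounding whitespace and trailing punctuation before matching.
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Echo the captured fragment back; no understanding is involved.
            return template.format(*match.groups())
    return DEFAULT_REPLY


print(reply("I need a holiday."))
print(reply("The weather is nice."))
```

The program only echoes fragments of the input back through fixed templates, which is why such a system can appear conversational without understanding anything that is said.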

Half-Month Talk: One view holds that ELIZA is version 1.0 of the dialogue system, that voice assistants represented by Siri are version 2.0, and that social (chat) robots are version 3.0.

Huang Minlie: That is roughly right, but the analogy is not accurate enough. At present, AI dialogue systems can be divided into two types. One is the task-oriented type, which helps users complete specific tasks, such as mobile phone assistants and customer service robots. The other is the open-domain dialogue system, that is, the chatbot.

In 2011, Apple launched the voice assistant Siri, and AI dialogue systems entered the era of smart assistants. In 2014, Microsoft launched the first social robot, Microsoft Xiaobing (Xiaoice), with which users can chat. From 2017 to 2019, in three consecutive Alexa Prize competitions, the best dialogue systems could chat with human users for more than 10 minutes, with the content not limited by domain or topic. Many large-scale pre-trained models appeared in 2020, including Google's Meena, FAIR's Blender and Baidu's PLATO, and research on dialogue systems reached a new peak.

With the support of big data and large-scale computing power, more advanced AI dialogue systems can not only answer user questions but also discuss topics in interesting ways. As the technology develops, service robots and social robots will become new members of a smart society. The level of development of human-machine dialogue technology determines whether humans and machines can coexist in harmony.

Chinese and foreign AI dialogue systems are at the same level

Half-Month Talk: How should the level of human-machine dialogue be assessed? How can we judge and reflect a machine's ability to communicate with people in natural language?

Huang Minlie: We classify AI dialogue systems as follows. L0 has no dialogue ability at all or cannot produce reasonably high-quality dialogue; L1 can complete reasonably high-quality dialogue in a single scenario, but cannot handle context dependencies between scenarios; L2 can complete high-quality dialogue in multiple scenarios at the same time, handling cross-scenario context dependencies and switching naturally between scenarios, but cannot complete high-quality dialogue in new scenarios; L3 can also complete high-quality dialogue in new scenarios; L4 not only achieves high-quality dialogue in new scenarios but is also highly anthropomorphic; L5 is highly anthropomorphic, capable of active and continual learning, and also has the ability to perceive and express.
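The grading scheme above can be read as a ladder in which each level adds one capability to the previous one. The sketch below encodes that reading; it assumes the levels are strictly cumulative and that the unlabeled clause about new scenarios describes L3. The names `DialogueLevel`, `Capabilities` and `classify` are hypothetical and are not part of any published standard.

```python
from dataclasses import dataclass
from enum import IntEnum


class DialogueLevel(IntEnum):
    L0 = 0  # no usable dialogue ability
    L1 = 1  # high-quality dialogue in a single scenario
    L2 = 2  # multiple scenarios with cross-scenario context and switching
    L3 = 3  # high-quality dialogue in new scenarios (assumed label)
    L4 = 4  # new scenarios plus a high degree of anthropomorphism
    L5 = 5  # active and continual learning, perception and expression


@dataclass
class Capabilities:
    """Hypothetical capability flags, one per rung of the ladder."""
    single_scene: bool = False
    multi_scene: bool = False
    new_scene: bool = False
    anthropomorphic: bool = False
    learns_and_perceives: bool = False


def classify(c: Capabilities) -> DialogueLevel:
    """Return the highest level whose requirements, and all lower ones, hold."""
    ladder = [
        c.single_scene,
        c.multi_scene,
        c.new_scene,
        c.anthropomorphic,
        c.learns_and_perceives,
    ]
    level = 0
    for satisfied in ladder:
        if not satisfied:
            break  # a missing capability caps the level here
        level += 1
    return DialogueLevel(level)


print(classify(Capabilities(single_scene=True, multi_scene=True)).name)
```

Because the ladder is cumulative in this sketch, a system that handles new scenarios but fails at cross-scenario context is still graded L1.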

Half-Month Talk: By this grading standard, at which level are current domestic chatbots? What is the international level?

As early as 2016, the University of Science and Technology of China officially released "Jiajia", its distinctive interactive-experience robot. "Jiajia" has preliminary capabilities including human-machine dialogue, facial micro-expressions, coordinated mouth shapes and body movements, autonomous positioning and navigation in large-scale dynamic environments, and cloud services. Photo by Liu Junxi

Liu Qun: The industry is generally between L2 and L3, but this also depends on the scenario. For general-purpose dialogue alone, reaching this level is not so difficult. In some new scenarios, however, high-quality dialogue is harder to achieve.

Wang Bin: Judging from current industrial applications, there is no significant difference between China's AI dialogue systems and foreign ones. Overall, they are at the same level.

Being truly "human-like" is difficult, and advanced AI anthropomorphism is the future goal

Half-Month Talk: Sometimes we say something to an AI and it replies, "I can't understand what you are talking about." At present, what are the main challenges that hinder smooth human-machine communication?

Liu Qun: Dialogue robots have a limited level of understanding and a limited grasp of human knowledge. In some simple domains, the system can model the problem. However, some questions, especially in open-ended chat, are difficult for robot dialogue systems to answer.

In complex scenarios, it is very difficult to make robots fully understand human intentions. Just like between two people, if the cultural background is different, there will be many communication difficulties. We need to inject more knowledge and more scenarios into the dialogue system.

In addition, it is difficult to make AI maintain consistency, which requires the dialogue system to have memory and good modeling. In reality, some contextual "inconsistencies", such as contradictions, are very vague. At present, it is difficult to deal with such vague contradictions.

Half-Month Talk: In many science fiction works, artificial intelligence plays the role of a powerful savior. Can such a "type" be achieved one day in the future?

Huang Minlie: The human-like machines in science fiction movies are highly anthropomorphic and have multimodal perception and expression abilities. Traditional human-computer interaction mainly processes data as text, but in the future machines must be truly "human-like", especially for the metaverse, with improved abilities in areas such as human emotion.

Liu Qun: The highest level of AI dialogue system handles complex emotional tasks. The question is how to promote the application of artificial intelligence in emotional companionship, virtual humans and the metaverse, so as to greatly reduce the cost of manpower and material resources and bring cutting-edge technology into the public's daily life. At present, many manufacturers are exploring anthropomorphism, for example by giving AI dialogue products capabilities such as emotion analysis, emotional guidance and persona settings, so that they show a certain degree of anthropomorphism. Such simple anthropomorphic features are relatively easy to accomplish.

Wang Bin: Multimodal perception and expression are not as easy as expected. In real systems, the relationships between different modalities are very complicated, and how to make them reinforce one another is a difficult point. A higher degree of anthropomorphism requires the machine to reach a unified understanding and consistent expression of more implicit content.

In open scenarios, a high-end AI dialogue system requires the machine to take initiative, keep learning, and evolve and grow. Judging from current technological evolution and development trends, this is a goal that AI will pursue throughout its iterative evolution, and it is also a huge challenge.

Source: "Half Moon Talk Internal Edition" 2022, No. 8

Half-Month Talk reporter: Zhang Manzi (intern: Li Yaoqi) | Editor: Zhang Xi
