Intelligent Voice Industry Observation: With a Semi-Open Ecosystem and AI Creation, Has Microsoft Xiao Bing's Commercialization Arrived?

Unlike the previous five launches, the sixth-generation Microsoft Xiao Bing launch event, held on July 26, left the first-floor lecture hall of Microsoft's Asia-Pacific research headquarters for the first time and moved to a large venue in the 798 art district. "The scale of the event has also expanded, from dozens of media outlets in the past to hundreds, covering the whole country," a person close to Microsoft told the 21st Century Business Herald.

This is a signal. In the past, Microsoft never put any commercial pressure on Xiao Bing. Even in recent interviews with media outlets including the 21st Century Business Herald, Li Di, the head of Microsoft Xiao Bing, still stressed that Xiao Bing has no profit targets.

But like the launch event itself, Xiao Bing is quietly moving out of laboratories and research institutions and gradually trying to commercialize. This is Xiao Bing's first step in that direction. Over the past five generations, Xiao Bing has grown steadily closer to a human being, from a two-dimensional outline to a two-dimensional image, and now to a three-dimensional holographic image.

The underlying technology continues to iterate, and an ecosystem is beginning to take shape. According to Microsoft, this release comprehensively upgrades every part of Xiao Bing's emotional technology framework. From the EQ-plus-IQ design at its first launch, through conversational artificial intelligence, the generation model and full-duplex voice, Xiao Bing is now entering the stage of AI creation. On the ecosystem side, Microsoft proposed for the first time a semi-open Dual AI ecosystem that differentiates and integrates the strengths of its partners to build skills and capabilities exclusive to Xiao Bing.

"The ultimate goal of artificial intelligence is' man-machine collaboration', which helps human beings with digital intelligence, but this direction has different routes." Shen Xiangyang, Microsoft's global executive vice president and head of Microsoft's artificial intelligence and Microsoft research division, said, "The Xiao Bing team has gone out of a different path."

AI Creation

Since last year, Microsoft Xiao Bing has made many attempts at creation and has even published a book of its own poems. Now, Xiao Bing will go further.

At the launch event, Shen Xiangyang announced three principles that Microsoft has formulated for AI creation. First, the creator must be an entity with both IQ and EQ, not IQ alone. Second, the output of AI creation must be able to stand as works with independent intellectual property rights. Third, the process of AI creation must correspond to some creative behavior of human beings, rather than simply substituting for human labor.

Xiao Bing's goal is to become a robot with high emotional intelligence. "We plan to operate AI creation as an emerging industry," Xu Yuanchun, general manager of Microsoft's artificial intelligence creation division, said at the launch event. "If AI creation is regarded as a content industry rather than simple literary creation, a 'concept car' is not enough. Since last year, we have been working on a 'production car'."

Reportedly, in the past 12 months Xiao Bing hosted 21 TV programs and 28 radio programs, covering 41 TV and radio stations in China, including 9 satellite TV channels. Xiao Bing now hosts 25 radio programs every day. In Japan and China, Xiao Bing has produced 2878 hours of audio-visual content.

At the same time, Xiao Bing's audiobooks now cover more than 91% of early-education robots and 81% of online playback platforms in China. In addition, the Xiao Bing news reader, built in cooperation with the NetEase News client, passed 11 million news-reading comments two months ago. Xiao Bing also continues to create content in finance and other related fields.

The technical support behind this comes from Xiao Bing's emotional technology framework: both the core dialogue engine and the interactive senses of the sixth-generation Xiao Bing have been further upgraded. With the sixth generation, Microsoft launched a brand-new empathy model and opened a public beta of a new sense that combines text, full-duplex voice and real-time vision.

The empathy model is a dialogue engine based on the generation model. According to reports, the generation model that Xiao Bing completed last year can create its own responses instead of retrieving them from an existing dialogue corpus. The new empathy model further strengthens Xiao Bing's control over the content, domain and rhythm of a dialogue; that is, Xiao Bing can create its own responses to steer the direction of the conversation.

The new sense in this public beta combines three elements, namely the dialogue engine, full-duplex voice and real-time vision, together with the empathy model. It enables Xiao Bing to guide users through face detection using real-time, continuous visual and voice interaction while holding an open-domain dialogue.

In addition, Microsoft released the fourth version of its DNN model for AI singing. According to Luan Jian, Xiao Bing's chief speech scientist, this version of the model can quickly synthesize songs of the same quality as human singers. It also lets Xiao Bing freely absorb the singing skills and characteristics of human singers and, while imitating them, even complete new works in their place.

However, although Microsoft has set out principles for AI creation and updated its technology, Xiao Bing's moves are only the beginning of real AI creation. "According to the 2017 Gartner Hype Cycle, it will still take 5-10 years for virtual assistants to become mainstream," Cai Huifen, vice president of research at Gartner, told the 21st Century Business Herald, commenting on AI's creativity. "This application is mainly aimed at narrow fields such as personal assistants or voice control in smart home devices, but it still requires improvements in technologies such as knowledge graphs and natural language understanding and generation for different domains. It is still an emerging field."

Dual AI Ecosystem

In addition to upgrading its technical capabilities, the biggest feature of the sixth-generation Xiao Bing is that it has begun building its own ecosystem: Dual AI.

"Before Microsoft, there have been many different cooperation ecosystems and models in the industry, among which there are two most important models. One is the open empowerment model, which builds an ecosystem by providing SDK/API to the outside world." Peng Shuang, the person in charge of Xiao Bing products, analyzed, "The other type is to focus on its own closed platform and build an ecological environment by opening an AI application store on the platform."

Dual AI is different and is closer to a semi-open ecosystem. "In such an ecosystem, on the one hand, Microsoft is directly responsible for the product experience and controls the most specific product details that come into direct contact with users. On the other hand, we are not closed within our own platform, but connect externally or even integrate directly into third-party platforms," Peng Shuang said.

The reason for this choice is that the other two types of ecosystem each have their own problems. The closed model greatly restricts the free flow of data, which runs counter to the nature of AI. Without the basic data needed for iteration, it is difficult to iterate quickly and realize the benefits of upgrades.

In the open empowerment model, the relationship between the party that empowers and the party being empowered is relatively loose, "that is to say, no one is really responsible for the final product experience". For example, the actual experience of today's popular smart speakers generally falls short of expectations, precisely because of problems caused by loose cooperation.

At the same time, because the APIs and SDKs in an open empowerment ecosystem emphasize universality, they also limit, to some extent, how quickly the latest and best technologies can be applied, and the data obtained through such interfaces or toolkits may not be the best.

In the course of this cooperation, Xiao Bing is also exploring its own profit model. At present, Xiao Bing has entered four commercial fields: finance, popular culture, media and publishing. "We have discussed various AI profit models and finally found that they fall into two categories. One is to use AI technology to replace human low-concurrency and high-concurrency jobs, such as content production, at lower cost," Li Di told the 21st Century Business Herald. "The second is collaboration between AI and human beings, with profit sharing achieved by improving the collaborative conversion rate."