Alibaba Cloud's large-model "Family Bucket" is here! Hands-on with Tongyi Qianwen 2.0, which beat more than 80% of Python users
Author | Sanbei
Edit | Mo Ying
Six months on, Alibaba Cloud's large model benchmarked against GPT-4 has finally arrived!
Today, at the 2023 Yunqi Conference, Alibaba Cloud launched Tongyi Qianwen 2.0, a model at the hundred-billion-parameter scale that surpasses GPT-3.5 on multiple evaluations and is closing in on GPT-4.
For example, Zhidong used an intelligent code assistant built on Tongyi Qianwen 2.0 to solve a Python problem: "return the length of the last word in the given string". The output was verified as correct on an authoritative testing platform and beat 83.17% of Python 3 users.
Alongside the model, Alibaba Cloud rolled out a full AI "family bucket", flexing its muscles at every layer, from IaaS (Infrastructure as a Service) and PaaS (Platform as a Service) to MaaS (Model as a Service):
1. Tongyi Qianwen 2.0, with parameters at the hundred-billion scale, was released as Alibaba Cloud accelerates its pursuit of GPT-4, and the Tongyi Qianwen app and official website went live.
2. Eight industry models trained on the Tongyi large model were launched.
3. The one-stop large-model application development platform "Alibaba Cloud Bailian" was released. Developers can build a large-model application in 5 minutes and "refine" an enterprise-exclusive model in a few hours.
4. Within a year, model downloads in the ModelScope community have exceeded 100 million, and the community has contributed a total of 30 million hours of free GPU computing power to developers.
5. As the first technology giant in China to open-source large models, Alibaba Cloud remains firmly committed to open source and previewed that it will open-source the 72-billion-parameter version of Tongyi Qianwen.
6. Alibaba Cloud announced that every college student in China will receive one cloud server.
7. Alibaba Cloud's artificial intelligence platform PAI received a major upgrade, and half of China's leading large-model companies run on Alibaba Cloud.
Zhou Jingren told Zhidong and other media that the global AI wave has only just begun and that Alibaba Cloud firmly believes this transformation will be far-reaching. In essence, he said, this AI transformation is a comprehensive upgrade of the entire computing technology system behind it.
So, facing the AI wave, how strong are Alibaba Cloud's large-model products and services? As the third-largest cloud service provider in the world and the largest in China, what are its plans for AI infrastructure?
Through a conversation with Zhou Jingren and hands-on testing of the newly released Tongyi Qianwen 2.0, Zhidong explored these questions in depth.
"Clear wins over GPT-3.5, and a mix of wins and losses against GPT-4." This is Alibaba Cloud's official assessment of Tongyi Qianwen 2.0, released today.
On 10 mainstream evaluation sets, including MMLU, AGIEval, and C-Eval, the overall performance of Tongyi Qianwen 2.0 exceeds GPT-3.5 and is closing in on GPT-4. Since the Tongyi Qianwen large model was first released in April this year, the second-generation version has grown to the scale of hundreds of billions of parameters and improved in complex instruction understanding, creative writing, mathematics, logic, and other abilities.
General evaluation sets matter, but so does how the model feels in actual use.
Today, the Alibaba Cloud Tongyi Qianwen app was officially released, making Tongyi Qianwen 2.0 available to everyone. Zhidong tried it right away.
In Zhidong's test of Tongyi Qianwen 2.0, the first surprise was its ability to understand images. When I uploaded a photo with the prompt "please describe the picture", it gave a concise description of the elements in it: seawater, mountains, blue sky, boats, and young people.
When I pressed further and asked which sea the photo showed, Tongyi Qianwen inferred the correct answer, the waters off Thailand, from the long-tail boat in the picture. This startled me, because most people would find it hard to tell. This combination of image recognition and reasoning is exactly the multimodal interaction upgrade added in Tongyi Qianwen 2.0.
Programming is also worth mentioning, since it places higher demands on logic. Zhidong entered an SQL requirement: "Calculate the count-distinct value of field b in table t, grouped by field a. What are the ways to implement this?" Tongyi Qianwen answered immediately, and its answer matched the correct one.
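The article does not reproduce the model's answer. As a minimal sketch of what such a query could look like, the Python snippet below uses the standard `sqlite3` module with an in-memory table `t`. The helper name `distinct_counts` and the sample rows are assumptions made for illustration, and the two formulations shown are common approaches rather than the ones Tongyi Qianwen necessarily returned.

```python
import sqlite3


def distinct_counts(rows):
    """Count distinct b values per a in table t, using two SQL formulations."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a TEXT, b INTEGER)")
    conn.executemany("INSERT INTO t (a, b) VALUES (?, ?)", rows)

    # Method 1: COUNT(DISTINCT ...) combined with GROUP BY.
    direct = dict(conn.execute(
        "SELECT a, COUNT(DISTINCT b) FROM t GROUP BY a"
    ).fetchall())

    # Method 2: deduplicate (a, b) pairs in a subquery, then count rows.
    # Equivalent to method 1 as long as b contains no NULLs.
    dedup = dict(conn.execute(
        "SELECT a, COUNT(*) FROM (SELECT DISTINCT a, b FROM t) GROUP BY a"
    ).fetchall())

    conn.close()
    return direct, dedup


# Hypothetical sample data, not taken from the article.
sample = [("x", 1), ("x", 1), ("x", 2), ("y", 3)]
direct, dedup = distinct_counts(sample)
print(direct)  # {'x': 2, 'y': 1}
print(dedup)   # {'x': 2, 'y': 1}
```

The two methods differ only in how they treat NULL values in `b`: `COUNT(DISTINCT b)` ignores them, while the subquery version counts NULL as one more distinct value.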
I then asked a programmer friend to set it a challenge.
After the generated code was run locally, a mini game appeared.
In text input and generation, both the ability of Tongyi Qianwen 2.0 to learn from examples and the quality of its output felt better to me.
For example, I asked Tongyi Qianwen 2.0 to write a livestream sales script, gave it a script for a mechanical keyboard as a reference, and specified a modern Chinese dictionary as the product. It produced a script that could be used as-is, one that both suited the livestream format and reflected the characteristics of the dictionary itself. This reportedly draws on a combination of abilities such as comprehension, memory, and logic.
Tongyi Qianwen 2.0 can also handle everyday text expansion, for instance of the sentence "building a future intelligent network is a major demand for driving the development of AI." The expanded text covers several key elements such as "devices, systems, data, and users", and the overall logic holds up reasonably well, though some "nonsense literature" (filler) remains.
Beyond expansion, Tongyi Qianwen 2.0 also gets memes: it can clearly explain relatively new internet catchphrases such as "Shuan Q" and "morning F late E".
It can also help with social media copywriting. For example, I asked it to write a post on "Climbing Laoshan in Qingdao" in the style of Xiaohongshu. As long as you follow the provided prompt template and state your needs clearly, you get content that fits them very well.
In hands-on testing, Tongyi Qianwen 2.0 has improved in both "intelligence" and "emotional intelligence". According to Alibaba Cloud insiders, Tongyi Qianwen 2.0, built on a larger parameter scale and more advanced alignment techniques, performs strongly in complex instruction comprehension, literary creation, general mathematics, knowledge recall, and resistance to hallucination.
Large models ultimately have to prove their worth in real applications. This time Alibaba Cloud launched eight industry models covering eight fields, including finance, healthcare, law, software, and personalized creation, to bring cost reduction, efficiency gains, or a better experience to each industry.
Take code programming, a demanding scenario and the "Shangganling" (the hardest-fought battleground) of the large-model competition, as an example. The intelligent code assistant "Tongyi Lingma" supports "generating a Snake mini game in less than a minute" and "generating over 100 lines of code in a few seconds", and "even operations staff who don't understand programming can write front-end pages". It is expected to greatly boost development efficiency in the software industry.
In Zhidong's hands-on test, I personally felt the "talent" of Tongyi Lingma for code. For example, I gave it the requirement "return the length of the last word in the given string", and Tongyi Lingma wrote a correct answer. On the authoritative testing platform, the code it wrote was judged correct and beat 83.17% of Python 3 users.
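The article does not show the code the assistant generated. For readers unfamiliar with the problem, here is a minimal sketch of one common Python solution. The function name `length_of_last_word` is chosen for illustration, and this is not the assistant's actual output.

```python
def length_of_last_word(s: str) -> int:
    """Return the length of the last whitespace-separated word in s.

    Returns 0 when the string contains no words at all.
    """
    # str.split() with no arguments splits on runs of whitespace and
    # drops leading/trailing whitespace, so trailing spaces are harmless.
    words = s.split()
    return len(words[-1]) if words else 0


print(length_of_last_word("Hello World"))                  # 5
print(length_of_last_word("   fly me   to   the moon  "))  # 4
```

Splitting the whole string is the simplest approach. A version that scans backward from the end of the string would avoid building the intermediate list, which may matter for runtime rankings on judging platforms.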
The personalized character creation platform "Tongyi Xingchen" also stands out. Talking to a customized chatbot created through Tongyi Xingchen feels like a conversation with a vivid, real person. Tongyi Xingchen also supports third-party role definitions: users can quickly generate a personalized character by giving the large model past conversation material.
The work-and-study AI assistant Tongyi Tingwu has accumulated more than 1 million users and processes more than 50,000 audio and video files every day.
Zhou Jingren told Zhidong that Alibaba Cloud's real goal is not to build consumer (to-C) applications but to put the power of large models in the hands of developers and customers. Alibaba Cloud will provide integration methods such as webpage embedding and API and SDK calls to speed up application deployment.
At this conference, Alibaba Cloud also released a one-stop large-model application development platform called "Alibaba Cloud Bailian", which supports not only Alibaba Cloud's Tongyi Qianwen series of large models but also third-party large models. It is a toolchain that lowers the barrier to large-model development for developers.
With Alibaba Cloud Bailian, developers can build a large-model application in 5 minutes and "refine" a dedicated model in a few hours. Users can develop applications through one-click model selection, secondary training, or "drag and drop", which greatly improves development efficiency while keeping the process secure.
At present, enterprises such as CCTV, Langxin Technology, and Asia Information Technology have taken the lead in developing exclusive models and applications on Alibaba Cloud Bailian.
Langxin Technology is a leading enterprise in the field of electricity and energy consumption. On the Alibaba Cloud Bailian platform, it trained a specialized large model for the power industry and developed an "Intelligent Assistant for Power Bill Interpretation" and an "Assistant for Policy Analysis/Data Analysis of the Power Industry", improving customer reception by 50% and reducing complaints by 70%.
In addition to Alibaba Cloud Bailian, at this Yunqi Conference Alibaba Cloud also unveiled a newly upgraded artificial intelligence platform, PAI, which can greatly improve the efficiency with which enterprises train large models and run inference on them. The underlying layer of PAI reportedly adopts the HPN7.0 next-generation AI cluster network architecture and supports clusters that scale to 100,000 cards (GPUs). The linear scaling efficiency of large-scale training reaches 96%, far above the industry level, and in large-model training it can save more than 50% of computing resources.
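The article does not say how the 96% figure was measured. Under the commonly used definition, linear scaling efficiency is the throughput actually achieved on N accelerators divided by N times the single-accelerator throughput. The sketch below illustrates that definition only; the function name and all numbers are hypothetical and are not Alibaba Cloud's data.

```python
def linear_scaling_efficiency(single_card_throughput: float,
                              n_cards: int,
                              cluster_throughput: float) -> float:
    """Ratio of measured cluster throughput to ideal (perfectly linear) throughput."""
    if single_card_throughput <= 0 or n_cards <= 0:
        raise ValueError("throughput and card count must be positive")
    ideal = single_card_throughput * n_cards  # what perfect linear scaling would give
    return cluster_throughput / ideal


# Hypothetical example: one card processes 100 samples/s, and a
# 1,000-card cluster processes 96,000 samples/s in total.
eff = linear_scaling_efficiency(100.0, 1000, 96_000.0)
print(f"{eff:.0%}")  # 96%
```

An efficiency close to 100% means that adding cards adds nearly proportional training throughput, so little capacity is lost to communication and synchronization overhead.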
Notably, with the PAI platform as a foundation, half of China's leading large-model companies now run on Alibaba Cloud. A large number of leading enterprises and institutions, including Baichuan Intelligent, Zhipu AI, Zero One Everything (01.AI), Kunlun Wanwei, vivo, and Fudan University, are reportedly training large models on Alibaba Cloud.
Wang Xiaochuan, CEO of Baichuan Intelligent, shared the little-known reasons behind the company's "release of 7 large models within six months". One of them is the support of cloud computing infrastructure. Wang Xiaochuan said that Baichuan and Alibaba Cloud have cooperated closely, and that through their joint efforts Baichuan completed its thousand-card model training task and effectively reduced the cost of model inference.
As the industrialization of AI deepens, whoever has computing power will win the world. With demand for inference set to explode, Alibaba Cloud is expected to provide a stronger foundation for the industrialization of large AI models.
Conclusion: Alibaba Cloud's AI "All Universe" Erupts as AI Infrastructure Evolves Across the Board
As the "Hundred Model Battle" enters deep water, internet giants, AI startups, and industry leaders have all handed in their latest answers. This time, Alibaba Cloud not only released the latest version of the Tongyi Qianwen large model but also launched eight industry models and a large-model application development platform, and made a comprehensive push at the AI infrastructure level. The release can be described as the eruption of Alibaba Cloud's AI "All Universe".
The essence of this AI transformation is a comprehensive upgrade of the entire computing technology system behind it. Developing and deploying large models is a systems engineering effort, and only a powerful cloud computing system can produce high-quality large models and drive the further development of domestic AI.