About Qwen

Alibaba Cloud offers the Tongyi Qianwen (Qwen) model series to the open-source community, including:

Qwen, a large language model (LLM);
Qwen-VL, a large vision language model;
Qwen-Audio, a large audio language model.

These models are pre-trained on multilingual data spanning various industries and domains. Qwen-72B, for example, was trained on 3 trillion tokens.

Qwen models excel in multimodal understanding and generation, offer state-of-the-art image processing, and are available through fully managed APIs that support innovative generative AI applications.

Qwen2, the second-generation release, comes in five model sizes: 0.5B, 1.5B, 7B, 57B-A14B (a mixture-of-experts model with 14B active parameters), and 72B. It brings significant improvements in coding and mathematics, along with stronger multilingual capabilities across 29 languages, particularly Asian languages. Comprehensive evaluations have tested Qwen2’s performance in areas such as language understanding, coding, reasoning, multilingual proficiency, human preference, long context understanding, safety, and responsibility.

Qwen2 vs. LLaMA 3 and Others Benchmark

Qwen Model Suite Advantages

Leading Performance Across Multiple Dimensions
Qwen surpasses other open-source models of similar sizes in benchmark tests, excelling in natural language understanding, mathematical problem-solving, coding, and more.
Effortless and Cost-Effective Customization
Qwen models can be deployed in PAI-EAS with just a few clicks and fine-tuned for industry- or enterprise-specific tasks using your own data, whether it is stored on Alibaba Cloud or in external sources.
Applications for the Generative AI Era
Qwen APIs enable the creation of generative AI applications across various scenarios, including writing, image generation, and audio analysis, enhancing work efficiency and transforming customer experiences.

Qwen Model Family

Qwen

Qwen models are pre-trained on diverse datasets spanning multiple domains and languages, and they support a context length of 32,768 tokens. They excel at content creation, text summarization and translation, coding, math problem-solving, tool use, and acting as agents.
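To make the 32,768-token context length concrete, here is a minimal Python sketch of budgeting a prompt against it. It assumes that the prompt and the generated reply share the same context window, as is typical for decoder-only LLMs. The names max_prompt_tokens and truncate_to_budget are illustrative helpers, not part of any Qwen library, and the token IDs are assumed to come from whichever tokenizer you use.

```python
QWEN_CONTEXT_LENGTH = 32_768  # context length stated above, in tokens


def max_prompt_tokens(max_new_tokens: int,
                      context_length: int = QWEN_CONTEXT_LENGTH) -> int:
    """Return how many prompt tokens fit once room is reserved for the reply."""
    if max_new_tokens < 0 or max_new_tokens > context_length:
        raise ValueError("max_new_tokens must be between 0 and context_length")
    return context_length - max_new_tokens


def truncate_to_budget(token_ids: list, max_new_tokens: int) -> list:
    """Keep the most recent tokens so that prompt plus reply fit the context."""
    budget = max_prompt_tokens(max_new_tokens)
    # Guard against budget == 0, because token_ids[-0:] would return everything.
    return token_ids[-budget:] if budget > 0 else []
```

For example, reserving 1,024 tokens for the reply leaves 31,744 tokens for the prompt. Keeping the most recent tokens is only one possible policy; summarizing or dropping older turns may suit some applications better.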

Qwen-VL

Qwen-VL, the large vision language model in the Qwen series, generates content from images, text, and bounding boxes. It delivers top-tier performance on a range of evaluation benchmarks and excels at fine-grained text recognition in both Chinese and English. It can also compare and analyze images, create stories, solve math problems, and answer questions.
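As an illustration of working with bounding boxes, the Python sketch below parses grounded text into pixel coordinates. It assumes the format described in the Qwen-VL paper: a label wrapped in <ref>...</ref>, followed by <box>(x1,y1),(x2,y2)</box>, with coordinates normalized to the range [0, 1000). Treat that format as an assumption and check it against the model version you use. The function name parse_boxes is hypothetical.

```python
import re

# Assumed format: <ref>label</ref><box>(x1,y1),(x2,y2)</box>,
# with coordinates normalized to the range [0, 1000).
BOX_PATTERN = re.compile(
    r"<ref>(.*?)</ref><box>\((\d+),(\d+)\),\((\d+),(\d+)\)</box>"
)


def parse_boxes(text: str, image_width: int, image_height: int) -> list:
    """Extract labeled boxes and rescale them to pixel coordinates."""
    boxes = []
    for label, x1, y1, x2, y2 in BOX_PATTERN.findall(text):
        boxes.append({
            "label": label.strip(),
            # Map normalized coordinates onto the actual image size.
            "box": (
                int(x1) * image_width // 1000,
                int(y1) * image_height // 1000,
                int(x2) * image_width // 1000,
                int(y2) * image_height // 1000,
            ),
        })
    return boxes
```

Text that contains no box markup simply yields an empty list, so the same function can be applied to any model reply.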

Qwen-Audio

Qwen-Audio is the large audio language model in the Qwen series. It processes text and various audio inputs, including human speech, natural sounds, music, and songs, and delivers text-based outputs. Qwen-Audio demonstrates impressive performance on the AISHELL-1, CochlScene, ClothoAQA, and VocalSound test sets, even without task-specific fine-tuning.

Qwen-Agent

Qwen-Agent is a framework designed for building LLM applications that leverage the instruction-following, tool usage, planning, and memory capabilities of Qwen models. It offers a range of components for LLMs, prompts, and agents. Follow this tutorial to learn how to use the Assistant component, enabling you to integrate customized tools and rapidly develop an agent that utilizes these tools.
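The idea of integrating customized tools can be sketched in plain Python. The example below is not the Qwen-Agent API: the registry TOOLS, the decorator local_tool, and the function dispatch are hypothetical stand-ins showing the general pattern, in which tools are registered by name and then invoked from a JSON tool call such as an LLM might emit.

```python
import json
from typing import Callable, Dict

# Hypothetical registry for illustration only; not part of Qwen-Agent.
TOOLS: Dict[str, Callable[..., str]] = {}


def local_tool(name: str):
    """Decorator that adds a function to the registry under `name`."""
    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = func
        return func
    return decorator


@local_tool("word_count")
def word_count(text: str) -> str:
    """Example custom tool: count whitespace-separated words."""
    return str(len(text.split()))


def dispatch(tool_call_json: str) -> str:
    """Run the tool named in a JSON tool call and return its text result."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        return f"Unknown tool: {name}"
    return TOOLS[name](**call.get("arguments", {}))
```

In a real agent, the tool result would be passed back to the model as context for its next turn. A framework such as Qwen-Agent handles that loop, along with prompt construction and planning.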