Photo: Screenshot from official website of Jimeng AI
The entry of Chinese tech company ByteDance into the text-to-video industry underscores the intense competition between Chinese and US companies in the rapidly evolving field of AI, particularly after US-based OpenAI showcased its standout creation, Sora.
Jimeng AI, developed by Faceu Technology under ByteDance, has now been launched on the Apple App Store for users in China. The app was earlier released on Android on July 31, according to media reports.
Industry observers said its creative output is in line with industry benchmarks, showing that it can keep pace with cutting-edge international technology.
Following the unveiling in February of OpenAI's blockbuster text-to-video model Sora, which so far remains unavailable for public use, similar models have been released in China in recent months.
Following the release of Sora, many other developers, including Pika and Runway, quickly caught up. China's domestically produced text-to-video or multimodal large models began to gain momentum around June this year, Li Baiyang, a professor from the data management innovation research center of Nanjing University, told the Global Times.
"Comparing the multimodal large models of China and the US, we are not falling behind at all, and in many parameters and concepts, we are already leading," Li said.
Chinese AI startup Zhipu AI also introduced its own video-generating product Ying in July. Kuaishou's Kling AI has been recognized by overseas professionals for its consistent performance and ability to simulate the characteristics of the physical world to a high degree of accuracy, in addition to its strong conceptual combination ability and imagination.
Notably, Vidu, developed by a Chinese tech firm, has been made available to users, with core functions for generating videos and images from text. In just 30 seconds, it can produce a 4-second video with a resolution of up to 1080P. Users can register directly with their email to try the product.
Zhu Jun, Vidu's chief scientist and deputy director of the Institute for AI at Tsinghua University, said, "After the release of Sora, we found that it is just highly consistent with our technical route, which also makes us firmly further promote our own research."
According to Shen Yang, a professor studying AI and media at Tsinghua University in Beijing who frequently tests all kinds of AI products, China and other leading countries are basically synchronized in terms of text-to-video products, with a relatively small gap in terms of the consistency of characters in the video, the causality of the macro physical world, and the cognition of these associations.
Shen told the Global Times on Wednesday that, specifically, Runway currently excels in artistic expression with a strong cinematic sense, while China's Kuaishou demonstrates a slight advantage in real-world video production. Shen has not yet tested Jimeng AI.
Products developed by Chinese firms exhibit a significant level of accuracy in understanding Chinese semantics, in contrast to those developed by US tech companies. For instance, scenarios such as creating traditional Chinese clothing or reliving childhood memories are all based on Chinese language data and adhere to Chinese aesthetic preferences, Li said.
Text-to-video processing is highly resource-intensive. As the duration of the videos increases, the demand for computing power also rises. To balance their costs, Chinese and US tech firms have started charging fees for the service. The high costs associated with this technology pose a common challenge for all involved, particularly given the limited duration of the generated videos, industry experts said.
In the new round of global scientific and technological competition, China will chart a "China path," one that is more closely integrated with industry, or even directly derived from the industrial field, observers said.