Generative AI (AIGC) has opened a new era of general-purpose artificial intelligence, and the competition around large models is fierce. Computing infrastructure is the primary focus of that competition, and the awakening of storage power has increasingly become an industry consensus.

In the era of large AI models, new data storage bases promote the digital intelligence transition of education and scientific research

In the new era, large models are moving from a single modality to multiple modalities, parameter counts and training data sets are growing exponentially, and massive volumes of unstructured data require high-performance support for mixed workloads. At the same time, data-intensive paradigms have become popular, and application scenarios such as supercomputing and high-performance computing (HPC) are developing in depth. Existing data storage bases can no longer keep up with these continually rising requirements.

If computing power, algorithms, and data are the "troika" driving the development of artificial intelligence, then amid huge changes in the external environment, the three urgently need to regain a dynamic balance. The gain in "soft power" from better algorithm models and the gain in "hard power" from an optimized supply of computing power both need further support: the "transport capacity" of data transmission and the "storage capacity" of data storage must improve as well. As a source of that power, new data storage bases will break out of the cocoon in the course of meeting these challenges.

Application scenarios with complex and continuously evolving requirements are the best touchstone for new data storage bases. In this sense, the teaching and scientific research sector is a typical representative: computing power and data are key elements of digital transformation in this field, and interdisciplinary scientific computing is as important as data-based decision support. Moving from HPC to HPDA (High Performance Data Analysis) is a big step toward more efficient teaching and research, and AI can help solve problems that in the past could not be calculated at all, could not be calculated accurately, or were impractical to calculate.

At the recent 2023 World Artificial Intelligence Conference, the HPC and AI storage base that Shanghai Jiao Tong University built with the help of Huawei's OceanStor Pacific distributed storage was officially launched. The "Jiaowosuan" unified data base will expand by another 25PB this year. It is expected to become a new benchmark for the digital and intelligent transformation of teaching and scientific research, and to mark a milestone in the exploration of new data storage bases.

The evolution of the relationship between data and computing power and the derived challenges

As the digital transformation of thousands of industries enters its most difficult phase, and emerging technologies such as artificial intelligence and big data take off together, the relationship between data and computing power is changing in subtle ways.

The field of education and scientific research is at the forefront of the digital economy and is quite sensitive to this change. In the past, data had to follow computing power. To solve complex scientific and engineering problems numerically and quickly, the education and research community long focused on building the most powerful computing capability it could, while data was treated only as a supporting facility for computing power.

Nowadays, "computing power revolves around data" has gradually become the new trend. The rise of new applications, the growth of data volume, and the increasing prominence of data security issues have all placed greater emphasis on the value of data itself. Driven by breakthroughs in AI, big data, and other technologies, traditional supercomputing is evolving into data-intensive supercomputing, and multiple kinds of heterogeneous computing power need to be built around the same data storage base.

Lin Xinhua, deputy director of the Network Information Center of Shanghai Jiao Tong University, believes that this reversal of dominance between data and computing power not only provides an opportunity to build a data-intensive supercomputing platform, but also brings many new challenges to the construction of a unified data storage base.

First, the explosive growth of data has significantly increased the demand for storage capacity. According to statistics, the data on the "Jiaowosuan" platform has been growing by 7PB a year. Data volumes in application scenarios such as meteorology and oceanography, energy exploration, satellite remote sensing, gene sequencing, cryo-electron microscopy, AI autonomous driving, manufacturing CAE, and animation rendering have all reached the petabyte level, and it is not easy for a single data infrastructure to accommodate such a huge amount of data.

Second, new workloads keep emerging and demand higher storage performance. The accelerating spread of AI, especially the rapid succession of large models and multi-modal models, poses a severe challenge to IO performance. Data sets of hundreds of terabytes have become the norm, natural language processing and multi-modal applications are accelerating the growth of data volume, and efficient access to training data sets made up of small files requires storage performance to reach a new level.

Third, storage is shared by multiple clusters across campuses, and moving data between heterogeneous clusters can cause problems such as data loss and slow operation. The "Jiaowosuan" platform provides a variety of heterogeneous computing power, including ARM clusters, x86 clusters, and AI clusters. Only when data flows freely and is integrated across these clusters can the full value of computing power and data be released.

Finally, with traditional AI training on local disks and highly concurrent data analysis, breaking the IO wall has become urgent. The IO bottleneck caused by repeated data migration is very prominent: the traditional read and write path is lengthy, loading data involves three data migrations, and writing a checkpoint involves two more. The efficiency lost in this process cannot be ignored.

The breakthrough path: a distributed, unified, and integrated data base

To cope with the above challenges, Shanghai Jiao Tong University and Huawei Storage have cooperated in depth since 2019 to jointly build the "Jiaowosuan" data-intensive supercomputing platform. Drawing on Huawei's deep experience in technology and application innovation, the OceanStor Pacific distributed storage products help "Jiaowosuan" build a unified data base that supports the various heterogeneous computing platforms across the university.

Building a distributed, unified, and integrated data base is the only way for "Jiaowosuan" to embrace emerging data applications. Based on a horizontally scalable distributed storage architecture, the storage capacity and bandwidth of the "Jiaowosuan" platform can be expanded on demand. First, performance and capacity grow linearly, with a single cluster able to reach EB-level capacity. Second, high-density, large-capacity hardware saves cabinet space. Third, large-ratio erasure coding (EC), combined with scenario-based compression, improves disk utilization.

It is understood that the "Jiaowosuan" platform grew from an initial 2PB of capacity and 6GB/s of bandwidth to 20PB and 60GB/s in 2020, and then to 40PB and 120GB/s in 2022. The capacity is expected to be expanded by another 25PB in 2023. At the same time, Huawei's OceanStor Pacific distributed storage has an ultra-high-density design with 120 disk slots in 5U. Combined with a large-ratio EC data redundancy protection algorithm, it can raise hard disk space utilization to 91.6% while still meeting high reliability requirements.
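The figures above lend themselves to a quick sanity check. The following is a minimal Python sketch, not Huawei's implementation: `ec_utilization` computes the usable fraction of raw capacity for an N+M erasure-coding layout, and `bandwidth_per_pb` checks whether the reported capacity and bandwidth milestones grew in step. The article does not say which EC ratio is used; the 22+2 layout below is an assumption, chosen only because 22/24 is about 91.67%, which is consistent with the 91.6% utilization quoted above. The function names and the milestone table are illustrative.

```python
from fractions import Fraction


def ec_utilization(data_fragments: int, parity_fragments: int) -> Fraction:
    """Usable fraction of raw capacity for an N+M erasure-coding layout."""
    if data_fragments <= 0 or parity_fragments < 0:
        raise ValueError("need at least one data fragment and no negative parity")
    return Fraction(data_fragments, data_fragments + parity_fragments)


# Milestones as reported in the article: (label, capacity in PB, bandwidth in GB/s).
MILESTONES = [
    ("initial", 2, 6),
    ("2020", 20, 60),
    ("2022", 40, 120),
]


def bandwidth_per_pb(capacity_pb: int, bandwidth_gbs: int) -> Fraction:
    """Bandwidth (GB/s) available per PB of capacity."""
    return Fraction(bandwidth_gbs, capacity_pb)


if __name__ == "__main__":
    # Assumed EC layouts for comparison; none of these ratios is stated in the article.
    for n, m in [(4, 2), (8, 2), (22, 2)]:
        print(f"EC {n}+{m}: {float(ec_utilization(n, m)):.2%} usable")

    # A constant ratio across milestones indicates bandwidth scaled linearly with capacity.
    for label, capacity_pb, bandwidth_gbs in MILESTONES:
        ratio = bandwidth_per_pb(capacity_pb, bandwidth_gbs)
        print(f"{label}: {capacity_pb}PB, {bandwidth_gbs}GB/s, {float(ratio):.1f} GB/s per PB")
```

All three reported milestones work out to 3GB/s per PB, which is what linear scaling of bandwidth with capacity looks like in practice. Wider EC stripes raise utilization, but the ratio actually deployed on the platform is not given in the article.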

Distributed all-flash hardware is the cornerstone on which "Jiaowosuan" solves its storage performance problems. With the help of Huawei OceanStor Pacific, the "Jiaowosuan" platform uses all-flash hardware acceleration to significantly improve bandwidth and IOPS. Each node delivers 800,000 IOPS and 20GB/s of bandwidth, which can meet high-performance requirements under mixed workloads.

Unified, global management of distributed storage across campuses is a good way to solve the problem of sharing storage among multiple clusters. By using a global file system to manage multiple storage systems across domains, the "Jiaowosuan" platform builds a unified data base that spans campuses. With the support of Huawei's OceanStor Pacific distributed storage products, it achieves multiple goals, including a global file view, data management and scheduling, global data flow, and unified streaming metadata.

Accelerated data analysis, lossless interoperability across multiple access protocols, and high efficiency without data relocation are the tools with which "Jiaowosuan" breaks the IO wall. Based on Huawei's AI-oriented storage solution and the OceanStor Pacific capability of "one copy of data, accessed through multiple protocols", the "Jiaowosuan" platform relies on external storage to reduce data relocation, which greatly improves analysis efficiency and saves storage space.

The future picture of HPDA and AI in the era of large models

From the evolution of the "Jiaowosuan" platform, and from its work with Huawei Storage to create a new distributed, unified, and integrated data base, it is not difficult to see that data-intensive scenarios are evolving ever faster.

From early HPC to later HPDA, and then to HPDA taking flight with AI, application scenarios in teaching and scientific research have continued to multiply, and the demands placed on storage products and data bases have continued to rise. In fact, teaching and scientific research are just the tip of the iceberg in the digitalization of thousands of industries, and the era of data storage is coming.

The arrival of the large-model era will further reshape IT infrastructure, including storage, and storage products designed with AI in mind are expected to become the new favorite in the industry's digital upgrade. On July 14, Huawei's launch event for new AI storage products in the large-model era, with the theme "New Data Paradigm Unleashing New Momentum of AI," will be held online. Whether you are deploying AI in your enterprise or developing applications with AI capabilities, the solutions released at the event are intended to offer a better technical architecture and better products to help you keep pace with the times.

The spread of general-purpose artificial intelligence has begun. The leader of the storage industry has been the first to sound the clarion call, and every move that follows is worth looking forward to.
