On March 11, Stardust AI, a leading international AI data technology company, held its 2024 spring product launch in Beijing and released MorningStar, a closed-loop data product for AI. MorningStar is the first AI data platform focused on data value discovery. Unlike traditional data management tools, it is a platform for AI data discovery, management, collaboration, and iteration that is easy to operate and rich in features. It is designed to surface the value in data, accelerate model iteration, and address the problem of AI data debt. It supports the key steps of efficient enterprise AI data iteration and helps avoid problems such as accumulating data debt, money wasted on low-value data, and long feedback loops between model training and real-world results.
▲ MorningStar officially released
The MorningStar data management platform is now open for applications. Visit the official website to learn more and submit your requirements.
1. What is MorningStar?
▲ MorningStar Data Closed Loop
MorningStar is an all-round tool for the data management needs of the AI 2.0 era. It aims to make unstructured data management more efficient for algorithm engineers, saving companies data asset management costs and shortening the time needed to iterate models in production. Its strengths include leading data life cycle management, comprehensive data mining tools, powerful indicator tracking and difficult case discovery, and efficient, compliant data asset management. The company says these strengths put it well ahead of similar products at home and abroad, making algorithm development smoother and more agile and allowing the value of data to be fully released.
2. Who are the users of MorningStar?
By creating a data-centered collaborative environment, MorningStar aims to eliminate enterprise AI data debt. It mainly serves three types of users: machine learning algorithm engineers, business personnel, and technical managers. It covers a range of usage scenarios for each group: difficult case discovery, model iteration, and indicator tracking for algorithm engineers; data value mining, business effect feedback, and operational testing for business personnel; and data element management and enterprise value accumulation for technical managers.
3. Why choose MorningStar?
Data technology has driven three shifts in artificial intelligence. In the era of large models, every industry needs to build "super employees" from its own data to improve productivity. Models and computing power can be purchased, but data requires refined, end-to-end management to unlock its real value. Enterprises need to build a data pipeline that is discoverable, manageable, collaborative, and iterable so that they can acquire data, produce data, and continuously iterate on it, and they need to promote data-centered collaboration internally to gain core competitiveness in the AI 2.0 era.
MorningStar is the only closed-loop data product on the market designed specifically for enterprises in the AI 2.0 era. It covers the full closed loop of data management, iteration, optimization, and mining for AI algorithms from training to production. It is committed to helping enterprises establish efficient closed-loop data systems, maximize data value, optimize model performance, and build differentiated competitive barriers.
▲ MorningStar product advantages
(1) Leading data life cycle management
Algorithm engineers can use MorningStar to manage the life cycle of AI data, with stronger data version control, fast data slicing, traceable data lineage, and security controls. The platform's automated workflows ensure that data is properly managed and optimized at every stage.
▲Data slicing
The flexible data slicing function lets algorithm engineers select, with one click, the data to target in the next algorithm iteration and pass it on to subsequent data processing steps.
▲Data flow: Record the version production process of data containing different semantic information
With data process orchestration and scheduling, algorithm engineers can easily record data processing steps and their semantic results, manage them as versions, and capture data information across the full life cycle, ensuring that data is traceable and operations are reproducible.
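To make the idea of versioned, traceable data processing concrete, here is a minimal sketch in Python. It is not MorningStar's SDK or its actual design. The `DataVersion` and `LineageLog` names, the content-hash version ids, and the sample records are all assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DataVersion:
    """One recorded step in a dataset's history (hypothetical structure)."""
    version_id: str            # short content hash of the records
    parent_id: Optional[str]   # version this one was derived from, if any
    operation: str             # human-readable description of the step


class LineageLog:
    """Hypothetical in-memory log of dataset versions and their parents."""

    def __init__(self) -> None:
        self._versions: Dict[str, DataVersion] = {}

    def commit(self, records: List[dict], operation: str,
               parent_id: Optional[str] = None) -> str:
        # Hash a canonical JSON dump so identical data yields the same id.
        payload = json.dumps(records, sort_keys=True).encode("utf-8")
        version_id = hashlib.sha256(payload).hexdigest()[:12]
        self._versions[version_id] = DataVersion(version_id, parent_id, operation)
        return version_id

    def trace(self, version_id: str) -> List[str]:
        # Walk parent links back to the root, newest step first.
        steps: List[str] = []
        current: Optional[str] = version_id
        while current is not None:
            version = self._versions[current]
            steps.append(version.operation)
            current = version.parent_id
        return steps


# Illustrative data only.
log = LineageLog()
raw = [{"id": 1, "label": None}, {"id": 2, "label": None}]
v1 = log.commit(raw, "import raw frames")
labeled = [{"id": 1, "label": "car"}, {"id": 2, "label": "truck"}]
v2 = log.commit(labeled, "add labels", parent_id=v1)
print(log.trace(v2))  # ['add labels', 'import raw frames']
```

Deriving the version id from the content means that re-running the same processing on the same data reproduces the same id, which is one simple way to check reproducibility. A production system would also persist the log and store the processing parameters for each step.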
▲Data flow: data source and data delivery
Algorithm engineers can compare model predictions against ground truth on the platform, use a series of data tracing, model debugging, and analysis generation tools to discover difficult data, and send it to the Rosetta data annotation system with one click.
(2) Comprehensive data mining tool
MorningStar supports in-depth mining of data value, including fine-grained visualization, indicator calculation, data distribution exploration, and cross-modal data retrieval. Through manual supervision, semantic retrieval, feature generation, and data augmentation, users can arrive at the optimal algorithm at lower cost, and the visual data mining logic helps them discover and solve problems in model training.
▲Distribution visualization
The above figure shows MorningStar's ability to find difficult case data and data with abnormal label distribution through visual data mining logic, a capability that is highly extensible.
▲Data exploration
Algorithm engineers can use MorningStar to conduct data retrieval across various scenarios and dimensions, quickly grasp the state of the data, and formulate ideas for algorithm experiments.
MorningStar supports various types of multi-modal data visualization and semantic retrieval, making it easier and faster to directionally mine the high-value data required.
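As a rough illustration of what semantic retrieval involves, the sketch below ranks items by cosine similarity between embedding vectors. This is a generic technique, not a description of how MorningStar implements retrieval. The item ids, the three-dimensional vectors, and the `top_k` helper are hypothetical, and in practice the vectors would come from an embedding model shared across modalities.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], index: Dict[str, Sequence[float]],
          k: int = 3) -> List[Tuple[str, float]]:
    """Return the k items most similar to the query, best match first."""
    scored = [(item_id, cosine_similarity(query, vec))
              for item_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings for two images and one text snippet (made-up values).
index = {
    "img_001": [0.9, 0.1, 0.0],
    "img_002": [0.0, 1.0, 0.0],
    "txt_003": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]
results = top_k(query, index, k=2)
print([item_id for item_id, _ in results])  # ['img_001', 'txt_003']
```

Because images and text share one vector space in this sketch, a single query can return items of either modality, which is the basic idea behind cross-modal retrieval.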
(3) Powerful indicator tracking and difficult case discovery capabilities
As the first closed-loop data product to integrate difficult case discovery strategies, MorningStar ensures that the model training process is reliable, traceable, and iterable. Through a series of data tracing, model debugging, and analysis generation tools, it helps build and maintain high-quality, reproducible AI models.
▲Data traceability: Through data flow, the data used for algorithm evaluation can be traced to the source at any time.
▲Version comparison
By selecting different data versions, users can compare the algorithm's predictions with the ground truth and use the visualization function to locate and analyze difficult case data.
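The comparison described here can be sketched in a few lines of Python. This is a simplified illustration, not the platform's actual logic. It assumes classification labels keyed by sample id, and the `find_hard_cases` helper, the sample ids, and the labels are invented for the example.

```python
from typing import Dict, List


def find_hard_cases(truth: Dict[str, str],
                    predictions: Dict[str, str]) -> List[str]:
    """Return ids whose prediction is missing or differs from the ground truth."""
    return sorted(sample_id for sample_id, label in truth.items()
                  if predictions.get(sample_id) != label)


# Ground truth and predictions from two versions (illustrative values only).
truth = {"s1": "car", "s2": "truck", "s3": "pedestrian", "s4": "car"}
preds_v1 = {"s1": "car", "s2": "car", "s3": "cyclist", "s4": "car"}
preds_v2 = {"s1": "car", "s2": "truck", "s3": "cyclist", "s4": "truck"}

hard_v1 = find_hard_cases(truth, preds_v1)
hard_v2 = find_hard_cases(truth, preds_v2)

# Samples that are wrong in both versions are candidates for re-annotation
# or targeted data collection; samples wrong only in v2 are regressions.
persistent = sorted(set(hard_v1) & set(hard_v2))
regressions = sorted(set(hard_v2) - set(hard_v1))

print(hard_v1)      # ['s2', 's3']
print(hard_v2)      # ['s3', 's4']
print(persistent)   # ['s3']
print(regressions)  # ['s4']
```

For detection or segmentation tasks the equality check would be replaced by a task-specific metric such as IoU against a threshold, but the version-to-version set comparison stays the same.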
▲Indicator tracking and effect detection
MorningStar uses an SDK to connect the model training environment, training data analysis and management, and the indicator analysis environment, making algorithm iteration convenient.
(4) Efficient and compliant data asset management
MorningStar supports comprehensive analysis of data sets and helps business managers achieve enterprise-level data element management and analysis, presenting asset information at a glance across dimensions such as data asset scale, content distribution, and ownership.
▲Data Compliance Audit
Teams can consolidate data assets and share their value through MorningStar. Permission management and usage records speed up data circulation between departments while keeping data secure.
▲Data asset display
In addition, MorningStar integrates multi-source, multi-format, heterogeneous data, manages very large data volumes, and enables visual modeling of enterprise assets. It supports multi-dimensional, fine-grained classification and inventory of data, promoting a deeper understanding of data within the enterprise and improving the efficiency of data flow in cross-department collaboration.
The above figure shows MorningStar's popularity ranking of data sets. The value of data assets to algorithm iteration is evaluated through usage counts, scene labels, annotation results, and similar signals, supporting analysis of the economic benefits of data elements.
(5) More functions
As an excellent algorithm engineer, are you still using home-grown tools, ad hoc tools, or even Excel to process data? As a professional AI data discovery, management, collaboration, and iteration platform, MorningStar not only supports the advanced operations above but also offers a wealth of practical functions! For example, it supports unified management of multi-source, multi-format, heterogeneous structured data, and it provides an SDK for model performance evaluation and monitoring that produces a comprehensive model evaluation report.
It is worth mentioning that CIF-Bench automated evaluation, created by Stardust Data and the Hong Kong University of Science and Technology, will soon be launched on MorningStar! The leaderboard covers 28 models, evaluates 20 basic dimensions, and examines each model's ability to follow instructions on 150 types of tasks. Leaderboard link: https://yizhillll.github.io/CIF-Bench/.
An autonomous driving algorithm engineer reported that difficult cases that used to take a day to discover can now be found through the platform in just 1-2 hours, greatly improving iteration efficiency.
MorningStar will continue to be updated iteratively. Users are welcome to share suggestions and work with us to reshape closed-loop data management and make AI algorithm iteration more efficient!
5. MorningStar is officially released
According to Stardust Data founder and CEO Zhang Lei: "In the AI 2.0 era, mastering your own data means mastering your own model." The core of enterprise data value lies in defining, managing, and iterating data. As AI technology keeps evolving, continuous management, optimization, and iteration of data will become a key factor for enterprises to stand out in the AI 2.0 era. If your company hopes to use its own data and large models with tens of billions of parameters to create its own super employees, MorningStar sincerely invites you to get in touch. Whatever kind of AI data management needs you have, MorningStar can provide comprehensive solutions and flexible delivery options, including SaaS, private enterprise deployment, and customized software development.
Product official website address: https://stardust.ai/MorningStar
Requirement submission address: https://stardust.ai/contact