Article Search - php.cn

Meta launches AI language model LLaMA, a large-scale language model with 65 billion parameters

Article Introduction：According to news on February 25, Meta announced on Friday local time that it will release a new AI-based large language model for the research community, joining Microsoft, Google, and other companies that ChatGPT has spurred into the artificial intelligence race. Meta's LLaMA is short for "Large Language Model Meta AI" and is available under a non-commercial license to researchers and entities in government, community, and academia. The company will make the underlying code available to users so they can tweak the model themselves and use it for research-related use cases. Meta stated that the model’s requirements for computing power

2023-04-14 comment 0 1313

An in-depth comparison of two popular AI language models, ChatGPT and GPT-3

Article Introduction：Translator | Zhu Xianzhong. Reviewer | Sun Shujuan. Introduction: Language models are an important part of natural language processing (NLP), a subfield of artificial intelligence (AI) that focuses on enabling computers to understand and generate human language. ChatGPT and GPT-3 are two popular AI language models developed by OpenAI, a leading artificial intelligence research institution. In this article, we will look at the features and capabilities of each of these two models and discuss how they differ. ChatGPT: 1. ChatGPT Overview. ChatGPT is a state-of-the-art conversational language model that has been trained on large amounts of text data from a variety of sources.

2023-04-14 comment 0 1728

AMD premieres Instinct MI300X GPU: Designed for large language models and AI computing

Article Introduction：According to news on June 16, AMD demonstrated its latest Instinct MI300X GPU at the data center and AI technology premiere held on Tuesday. AMD did not reveal many details during its keynote, but according to Hoang Anh Phu's findings, the total board power (TBP) of the MI300X is 750 watts, while the TBP of the previous-generation MI250X is only 500-560 watts. According to the editor's understanding, the MI300X is a pure GPU version that uses AMD CDNA 3 technology and is equipped with up to 192GB of HBM3 high-bandwidth memory, designed to accelerate large language models and generative AI calculations. MI300X and its CDNA architecture are designed for large language models and other advanced AI models,

2023-06-16 comment 0 715

Meta officially announces an advanced large-scale language model, escalating the AI war

Article Introduction：Zuckerberg said on social media that LLaMA, developed by Facebook AI Research, is "currently the highest level" large-scale language model, with the goal of helping researchers advance their work in the field of artificial intelligence (AI). Large language models (LLMs) can digest large amounts of text data and infer relationships between words in the text. With advances in computing power and the continuous expansion of input data sets and parameter spaces, the capabilities of LLMs have increased accordingly. LLMs have already been shown to perform a variety of tasks efficiently, including text generation, question answering, and summarizing written materials. Zuckerberg said that LLMs also have great development prospects in more complex areas such as automatically proving mathematical theorems and predicting protein structures. value

2023-04-12 comment 0 380

What is the TII Falcon 180B open source language model?

Article Introduction：The Technology Innovation Institute (TII) has made a significant contribution to the open source community with the introduction of a new large language model (LLM) called Falcon. With an impressive 180 billion parameters, the model is a generative LLM available in various versions, including Falcon 180B, 40B, 7.5B and 1.3B parameter AI models. When Falcon 40B was launched, it quickly gained recognition as the world's top open source AI model. This version of Falcon, with 40 billion parameters, was trained on one trillion tokens. In the two months after its launch, Falcon 40B topped Hugging Face’s open source large language model (LLM) rankings. What makes Falcon 40B different?

2023-09-12 comment 0 663

360 will launch the 100-billion-level large language model 360GLM and strategically cooperate with Zhipu AI

Article Introduction：IT House reported on May 16 that the official WeChat public account of 360 announced today that it had reached a strategic cooperation with Zhipu AI to jointly develop the 100-billion-level large language model 360GLM. 360 officially stated that the 100-billion-level large language model 360GLM developed by it and Zhipu AI has reached the level of a new generation of cognitive intelligence general model. 360 is committed to implementing the artificial intelligence development strategy of "two wings flying together + four channels concurrently". It claims that through 360's intelligent core technology and advantageous scenarios, large model technology can obtain more extensive and in-depth implementation scenarios, empowering more industries. Zhou Hongyi said that China should establish an industry-research collaborative innovation model of large technology companies + key scientific research institutions, and create China's "Microsoft + OpenAI" combination to lead large-model technology research. He said,

2023-05-27 comment 0 603

Another 'strong player' has been added to the AI field, Meta releases a new large-scale language model LLaMA

Article Introduction：Since ChatGPT became popular, AI applications built around ChatGPT have emerged one after another, letting people feel the power of artificial intelligence. Recently, Facebook parent company Meta released a large AI language model, Large Language Model Meta AI, referred to as LLaMA. Zuckerberg said on social media: "The LLaMA model developed by the FAIR team is currently the world's highest level large-scale language model. The goal is to help researchers advance their work in the field of artificial intelligence!" Like other large models, Meta's LLaMA works by taking a sequence of words as "input" and predicting the next word to recursively generate text. According to reports

2023-04-25 comment 0 714

Six pitfalls to avoid with large language models

Article Introduction：From security and privacy concerns to misinformation and bias, large language models come with risks and rewards. There have been incredible advances in artificial intelligence (AI) recently, largely due to advances in developing large language models. These are at the core of text and code generation tools such as ChatGPT, Bard, and GitHub’s Copilot. These models are being adopted across all sectors. But how they are created and used, and how they can be misused, remains a source of concern. Some countries have decided to take a drastic approach and temporarily ban specific large language models until appropriate regulations are in place. Here’s a look at some of the real-world adverse effects of large language model-based tools, and some strategies to mitigate them.

2023-05-12 comment 0 867

Chinese language models rush to take the exam: SenseTime, Shanghai AI Lab and others newly release 'Scholar·Puyu'

Article Introduction：Released by Heart of Machine (Heart of Machine Editorial Department). Today, the annual college entrance examination officially kicked off. What is different from previous years is that, as candidates across the country head to the examination rooms, some large language models have also become special participants in this competition. As large AI language models increasingly approach human-level intelligence, difficult and comprehensive exams designed for humans are increasingly being used to evaluate the intelligence level of language models. For example, in the technical report on GPT-4, OpenAI mainly tests the model's ability through examinations in various fields, and the excellent "test-taking ability" displayed by GPT-4 was unexpected. How do Chinese large language models perform on the college entrance examination papers? Can they catch up with ChatGPT? Let's take a look at a "

2023-06-07 comment 0 686

The difference between large language models and word embedding models

Article Introduction：Large language models and word embedding models are two key concepts in natural language processing. They can both be applied to text analysis and generation, but the principles and application scenarios are different. Large-scale language models are mainly based on statistical and probabilistic models and are suitable for generating continuous text and semantic understanding. The word embedding model can capture the semantic relationship between words by mapping words to vector space, and is suitable for word meaning inference and text classification. 1. Word embedding model The word embedding model is a technology that processes text information by mapping words into a low-dimensional vector space. It converts words in a language into vector form so that computers can better understand and process text. Commonly used word embedding models include Word2Vec and GloVe. These models are widely used in natural language processing tasks

2024-01-23 comment 965
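
The snippet above describes word embeddings as a mapping from words to vectors that captures semantic relationships. A minimal sketch of that idea follows, using only the Python standard library. The three-dimensional vectors are hypothetical, hand-picked values for illustration; they do not come from Word2Vec, GloVe, or any trained model.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
# Real Word2Vec or GloVe vectors are learned from text and far larger.
embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Words with related meanings are expected to lie closer in the vector space.
related = cosine_similarity(embeddings["king"], embeddings["queen"])
unrelated = cosine_similarity(embeddings["king"], embeddings["apple"])
print(f"king vs queen: {related:.3f}")
print(f"king vs apple: {unrelated:.3f}")
```

With these toy values, "king" scores higher against "queen" than against "apple", which is the kind of relationship a trained embedding model is meant to capture.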

Zuoyebang released the Galaxy model in the field of education: supports AI problem solving and multi-language AI question and answer

Article Introduction：Kuai Technology reported on September 4 that Zuoyebang officially released its self-developed Galaxy model at the 2023 China International Fair for Trade in Services. It is understood that the Galaxy model supports AI problem-solving, multilingual AI Q&A, and other capabilities, and is said to be proficient in poetry, prose, and common sense. It also supports AI writing functions, which can be used to improve writing skills, optimize writing structure, and provide article polishing, grammatical error correction, and creative inspiration. On evaluation benchmarks, the Zuoyebang Galaxy large model performed extremely well, ranking first in C-Eval with an average score of 73.7 points. It also ranked first in the five-shot and zero-shot evaluations of the CMMLU list, with average scores of 74.03 points and 73.85 points respectively, becoming the first model to top both authoritative lists at the same time.

2023-09-08 comment 0 359

Linguists are back! Start learning from 'pronunciation': this time the AI model has to teach itself

Article Introduction：Making computers understand human language has long been a major difficulty in the field of artificial intelligence. Early natural language processing models usually used hand-designed features, requiring specialized linguists to write patterns manually. However, the final results were not ideal, and AI research even fell into a cold winter. "Every time I fire a linguist, the performance of the speech recognizer goes up." (Frederick Jelinek) With statistical models and large-scale pre-trained models, feature extraction is no longer necessary, but

2023-04-08 comment 0 502

Ollama on Windows: A new tool for running large language models (LLM) locally

Article Introduction：Recently, both OpenAI Translator and NextChat have begun to support large language models running locally in Ollama, which gives newcomers and enthusiasts a new way to experiment. Moreover, the launch of Ollama on Windows (preview version) changes the way AI development is done on Windows devices, offering a clear path for both explorers in the field of AI and ordinary users who are testing the waters. What is Ollama? Ollama is an artificial intelligence (AI) and machine learning (ML) tool platform that greatly simplifies the development and use of AI models. In the technical community, the hardware configuration and environment setup of AI models have always been a thorny issue.

2024-02-28 comment 0 735

Intelligent customer service enters the AI 2.0 era. Ronglian Cloud releases a large language model 'Red Rabbit'

Article Introduction：Author: Sun Yan. Source: IT Times. On July 8, at the World Artificial Intelligence Conference (WAIC 2023) Generative Marketing Services and Large Model Forum, Ronglian Cloud released the vertical-industry multi-level large language model "Chitu Large Model", the generative integrated intelligent marketing service workspace "Doraemon", and a generative integrated intelligent customer service platform. The "2023 China Intelligent Customer Service Market Report" released by global consulting agency Sullivan states that China's intelligent customer service market reached 6.68 billion yuan in 2022 and is expected to grow to 18.13 billion yuan by 2027. Long before the emergence of large models, the customer service industry was already a battleground for AI voice customer service. How will large models change AI customer service? Large model + customer service from cost reduction and efficiency improvement to value creation

2023-07-16 comment 0 289

The 2023 World Artificial Intelligence Conference 'AI Generation and Vertical Large Language Model' forum is coming!

Article Introduction：The current exponential development of AI generation and large language models has brought new growth engines to related industry chains and opened up new possibilities for applying AI in practice. On the morning of July 7, under the guidance of the World Artificial Intelligence Conference Organizing Committee Office and the Shanghai Pudong New Area Committee of the Communist Youth League, the Shanghai Pudong New Area Youth Federation, Daguan Data, and Youked will co-host the forum "The Infinite Charm of AI Generation and Vertical Large Language Models". Preview of the forum's "Three Highlights": 01. Big names from industry, academia, and research gather to talk about cutting-edge AIGC technology applications. This conference brings together many heavyweight guests from fields such as large language models and AI generation to discuss the latest technology development trends and important challenges. Chai Hongfeng, academician of the Chinese Academy of Engineering and director of the Financial Technology Research Institute of Fudan University (to be invited); Shanghai

2023-06-08 comment 0 534

Linguistics in Artificial Intelligence: Language Models in Python Natural Language Processing

Article Introduction：Natural language processing (NLP) is a field of computer science that focuses on enabling computers to communicate effectively using natural language. Language models play a crucial role in NLP. They learn probability distributions over language in order to perform various processing tasks on text, such as text generation, machine translation, and sentiment analysis. Types of language models: there are two main types. An n-gram language model considers the previous n words to predict the probability of the next word; n is called the order. A neural language model uses neural networks to learn complex relationships in language. Language models in Python: there are many libraries in Python that can implement language models, including: nltk.lm: provides an implementation of n-gram language models. ge

2024-03-21 comment 994
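
The n-gram idea described in the snippet above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the `nltk.lm` implementation the snippet mentions: it uses only the standard library, conditions on a single preceding word (a bigram model), applies no smoothing, and the function names `train_bigram` and `next_word_prob` and the toy corpus are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_prob(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 for unseen contexts."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return followers[nxt] / total if total else 0.0


# Toy corpus, made up for illustration.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)

# "the" is followed by "cat" twice and "mat" once in the toy corpus.
print(next_word_prob(model, "the", "cat"))
print(next_word_prob(model, "the", "mat"))
```

Because the estimate is a raw relative frequency, any word pair absent from the corpus gets probability zero; practical n-gram models usually add smoothing to avoid this.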

Article Introduction：Large AI models are artificial intelligence models trained using large-scale data and powerful computing power. These models usually have high accuracy and strong generalization capabilities and can be applied to fields such as natural language processing, image recognition, and speech recognition. Training large AI models requires a large amount of data and computing resources, and a distributed computing framework is usually needed to accelerate the training process. The training process is very complex and requires in-depth research and optimization of data distribution, feature selection, model structure, and so on. Large AI models have a wide range of applications and can be used in scenarios such as smart customer service, smart homes, and autonomous driving. In these applications, large AI models can help people complete tasks more quickly and accurately and improve work efficiency.

2023-06-29 comment 0 4348

Meta open source AI language model MusicGen can convert text and melodies into complete music

Article Introduction：IT House reported on June 12 that Meta recently open sourced its AI language model MusicGen on GitHub. The model is based on the Transformer architecture introduced by Google in 2017. As its name indicates, MusicGen is mainly used for music generation. It can convert text and existing melodies into complete music. The R&D team said: “We used 20,000 hours of authorized music to train the model, and used Meta’s EnCodec encoder to decompose the audio data into smaller units for parallel processing, making MusicGen’s computing efficiency and generation speed better than those of AI models of the same type." In addition, MusicGen also supports the combination of text and melody.

2023-06-13 comment 0 1059

UAE releases open source AI large language model Jais with 13 billion parameters

Article Introduction：A UAE team recently unveiled a large Arabic AI model called Jais. The model was developed by a group of engineers, researchers, and a Silicon Valley chip company. According to reports, the Jais large language model has 13 billion parameters and was trained on a large amount of mixed Arabic and English data, part of which comes from computer code. The collaborative project is a joint effort between Silicon Valley-based Cerebras Systems, the Emirates Artificial Intelligence University, and Inception, a subsidiary of Emirates G42 Technology Group that focuses on artificial intelligence. In this project, the model was trained on a Cerebras Systems supercomputer named after Jais

2023-09-11 comment 0 657

Autoregressive properties of language models

Article Introduction：An autoregressive language model is a natural language processing model based on statistical probability. It generates continuous text sequences by using the preceding word sequence to predict the probability distribution of the next word. This kind of model is very useful in natural language processing and is widely used in language generation, machine translation, speech recognition, and other fields. By analyzing historical data, autoregressive language models can capture the regularities and structure of language and generate coherent, semantically accurate text. They can be used not only to generate text but also to predict the next word, providing useful information for subsequent text processing tasks. Autoregressive language models are therefore an important and practical technique in natural language processing. 1. The concept of the autoregressive model. The autoregressive model is a model that uses previous observations to

2024-01-22 comment 0 324
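
The generation loop described in the snippet above (predict a distribution over the next word, pick one, feed it back in) can be sketched as follows. This is a simplified illustration, not any particular model: the probability table is hypothetical and hand-written, each prediction depends only on the last word rather than the full preceding sequence, and the most probable word is always chosen (greedy decoding).

```python
# Hypothetical next-word probabilities, hand-written for illustration.
# "<s>" marks the start of a sentence and "</s>" marks its end.
next_word_table = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.7, "dog": 0.3},
    "a": {"dog": 0.9, "cat": 0.1},
    "cat": {"sleeps": 0.8, "</s>": 0.2},
    "dog": {"barks": 0.6, "</s>": 0.4},
    "sleeps": {"</s>": 1.0},
    "barks": {"</s>": 1.0},
}


def generate(table, start="<s>", end="</s>", max_len=10):
    """Greedy autoregressive generation: each chosen word becomes the next context."""
    words = []
    current = start
    for _ in range(max_len):
        distribution = table.get(current)
        if not distribution:
            break  # no known continuation for this context
        # Pick the most probable next word under the current context.
        current = max(distribution, key=distribution.get)
        if current == end:
            break
        words.append(current)
    return words


print(" ".join(generate(next_word_table)))
```

Sampling from the distribution instead of always taking the maximum would give varied output; greedy selection is used here only to keep the sketch deterministic.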