This company has built a product that may trigger the fourth industrial revolution, yet its own people are puzzled: why is it so popular?
And no, that isn't humble-bragging (what Chinese netizens call "Versailles").
Recently, MIT Technology Review interviewed several developers of ChatGPT, giving us a closer look at the story behind this popular AI product.
## So popular it caught everyone off guard
When OpenAI quietly launched ChatGPT in late November 2022, the startup did not have high expectations.
OpenAI’s employees never imagined that their model was on the road to becoming a blockbuster.
ChatGPT seemed to become a hit overnight, setting off a global gold rush around large language models. OpenAI, caught unprepared, has had to scramble to keep up with its own breakout model and try to seize the business opportunity.
Sandhini Agarwal, who works on policy at OpenAI, said that internally ChatGPT had always been regarded as a "research preview": a more polished version of two-year-old technology and, more importantly, an attempt to iron out some of the model's flaws by gathering public feedback.
Who would have guessed that such a "preview" would accidentally take off the moment it debuted?
OpenAI's scientists are baffled by this, and more than a little uneasy about all the acclaim pouring in from the outside world.
“We don’t want to oversell it as a huge fundamental advance,” said Liam Fedus, an OpenAI scientist who worked on ChatGPT.
Five members of the ChatGPT team were named to the 2023 AI 2000 list of the world's most influential AI scholars
To dig into this, MIT Technology Review reporter Will Douglas Heaven interviewed OpenAI co-founder John Schulman, developers Sandhini Agarwal and Liam Fedus, and Jan Leike, who leads the alignment team.
Co-founder John Schulman said that in the days after ChatGPT's release he kept browsing Twitter; for one crazy stretch his feed was full of ChatGPT screenshots.
He had expected the product to feel intuitive to users and to pick up a following, but not to go this mainstream.
Jan Leike said it all happened so suddenly that everyone was surprised and struggled to keep pace with ChatGPT's explosive growth. He was curious what was driving the surge in popularity: was something fueling it behind the scenes? After all, OpenAI itself can't figure out why ChatGPT is so popular.
Liam Fedus explained why they were so surprised: ChatGPT is not the first general-purpose chatbot. Plenty of people had tried before, so Fedus thought their odds were not great. The private beta, however, gave him confidence that this might be something users would really like.
Sandhini Agarwal summed it up: ChatGPT's instant success took everyone by surprise. So much work has gone into these models internally that people forget how astonishing they can look to the general public outside the company.
Indeed, most of the technology inside ChatGPT is not new. It is a fine-tuned version of GPT-3.5, which OpenAI released a few months before ChatGPT; GPT-3.5 is itself an updated version of GPT-3, which appeared in 2020.
The number of ChatGPT team members who took part in the development of each of the seven earlier key technologies
On its website, OpenAI offers these models through an application programming interface (API), so other developers can easily plug them into their own code.
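For readers unfamiliar with what that looks like in practice, here is a minimal sketch of calling one of these models through the API, using the pre-1.0 style of the `openai` Python client; the model name and prompt are illustrative placeholders, not details from the article.

```python
# Minimal sketch: calling an OpenAI model via the API (pre-1.0 openai client style).
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; in practice, load this from an env var

response = openai.Completion.create(
    model="text-davinci-003",              # a GPT-3.5-era instruction-following model
    prompt="Summarize RLHF in one sentence.",
    max_tokens=64,
    temperature=0.7,
)
print(response.choices[0].text.strip())
```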
In January 2022, OpenAI had already released InstructGPT, an earlier fine-tuned version of GPT-3.5. But none of these releases were promoted to the general public.
According to Liam Fedus, the ChatGPT model was fine-tuned from the same base language model as InstructGPT, using a similar fine-tuning approach: the researchers added some conversational data and tweaked the training process a little. That is why they don't want to oversell it as a huge fundamental advance.
As it turns out, that conversational data is what does much of the work in ChatGPT.
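As a rough illustration of what "adding some conversational data" to fine-tuning can mean, here is a minimal sketch of standard supervised fine-tuning of a causal language model on flattened dialogues. The base model, role tags, and hyperparameters are assumptions for the example, not OpenAI's actual setup.

```python
# Illustrative sketch: each dialogue is flattened into one token sequence and
# trained with the ordinary next-token objective.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")      # stand-in base model
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def dialogue_to_ids(turns):
    """Flatten [(speaker, text), ...] into a single training sequence."""
    text = "".join(f"{speaker}: {utterance}\n" for speaker, utterance in turns)
    return tokenizer(text, return_tensors="pt").input_ids

def train_step(turns):
    input_ids = dialogue_to_ids(turns)
    # For causal LMs, passing labels=input_ids yields the shifted next-token loss.
    loss = model(input_ids, labels=input_ids).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

print(train_step([("User", "What is RLHF?"),
                  ("Assistant", "A way to train models from human preferences.")]))
```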
On standard benchmark evaluations, there is actually little difference in raw technical capability between the two models. What most sets ChatGPT apart is that it is easier to access and easier to use.
Jan Leike explained that, in a sense, ChatGPT can be understood as a version of an AI system OpenAI has had for some time. It is not a more capable model: the same base model had been available through the API for almost a year before ChatGPT came out.
The researchers' improvements boil down to making the model better match what people actually want to do with it: it talks to the user in dialogue, sits behind an easily accessible chat interface, is better at inferring intent, and lets users go back and forth until they get what they want.
The secret sauce is reinforcement learning from human feedback (RLHF), very similar to the technique used to train InstructGPT: teach the model what human users actually like.
Jan Leike said they had a large group of people read ChatGPT prompts and responses and then pick which of two responses they thought was better; all of that data was then folded into one training run.
Much of it is the same as what they did with InstructGPT: you want it to be helpful, you want it to be truthful, you want it to be, well, non-toxic.
There are also finer points. If a user's query is unclear, the model should ask follow-up questions to pin it down. It should make clear that it is an AI system, not assume an identity it doesn't have or claim capabilities it doesn't possess, and explicitly refuse when asked to do a task it is not supposed to do.
In other words, there is a list of criteria against which the human raters rank the model's outputs, such as truthfulness; but the raters also come to favour certain behaviours, such as the AI not pretending to be human.
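The ranking data described above is typically turned into a reward model, as in the InstructGPT paper. The sketch below shows the standard pairwise preference loss; `reward_model` is an assumed callable that scores a token sequence with a scalar, not OpenAI's actual implementation.

```python
# Pairwise preference loss for reward modelling: the response the raters chose
# should score higher than the one they rejected.
import torch.nn.functional as F

def preference_loss(reward_model, chosen_ids, rejected_ids):
    """-log sigmoid(r(chosen) - r(rejected)), averaged over the batch."""
    r_chosen = reward_model(chosen_ids)       # shape: (batch,)
    r_rejected = reward_model(rejected_ids)   # shape: (batch,)
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# The trained reward model then serves as the optimisation target for the chat
# policy (e.g. via PPO), nudging it toward responses the raters preferred.
```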
Overall, ChatGPT is built on techniques OpenAI had already been using, so the team did nothing special when preparing to release the model to the public. In their view, the bar set for previous models was sufficient, and GPT-3.5 was already safe enough.
Indeed, through its training on human preferences, ChatGPT learned refusal behaviour on its own and turned down plenty of requests.
OpenAI also lined up red teamers for ChatGPT: everyone in the company sat down and tried to break the model, outside groups did the same, and trusted early users gave feedback.
Sandhini Agarwal said they did find it producing some unwanted output, but those were things GPT-3.5 produced too. Judged purely on risk, then, ChatGPT was good enough to ship as a "research preview".
John Schulman also said that it is impossible to wait until a system is 100% perfect before releasing it. They had been beta testing earlier versions for several months, and the beta testers came away very impressed with ChatGPT.
What worried OpenAI most was factuality, because ChatGPT is so fond of making things up. But those problems exist in InstructGPT and other large language models too, so in the researchers' eyes, as long as ChatGPT beat those models on factuality and other safety issues, that was enough.
Before launch, limited evaluations confirmed that ChatGPT was indeed more truthful and safer than the other models, so OpenAI decided to go ahead with the release.
After ChatGPT was released, OpenAI has been observing how users use it.
This is the first time in history that a large language model has been placed in the hands of tens of millions of users.
Users, for their part, went wild probing ChatGPT's limits and hunting for its bugs.
ChatGPT's popularity has also surfaced plenty of problems, such as bias and exploits coaxed out through prompts.
Jan Leike said that some of the issues that went viral on Twitter had in fact already been quietly dealt with by OpenAI.
Jailbreaking, for instance, is definitely something they need to solve: users just love coaxing the model into saying bad things through roundabout tricks. OpenAI expected this, and sees it as an unavoidable part of the process.
When jailbreaks are discovered, OpenAI adds those cases to its training and test data, and all of that data feeds into future models.
Jan Leike said that whenever there is a better model, they will want to take it out and test it.
They are quite optimistic that targeted adversarial training can greatly improve the jailbreak situation. It is unclear whether such problems will ever disappear completely, but they believe they can make many jailbreaks much harder to pull off.
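Purely as an illustration of the bookkeeping this implies, here is a hedged sketch of folding discovered jailbreak cases back into training data while holding some out as a regression test for future models; the file names, fields, and split ratio are assumptions, not OpenAI's real pipeline.

```python
# Hedged sketch: pair each discovered jailbreak prompt with the desired safe
# response, append most to the training set, and hold a slice out for evaluation.
import json
import random

def fold_in_jailbreaks(jailbreak_cases, train_path="train.jsonl",
                       test_path="jailbreak_eval.jsonl", test_frac=0.1):
    """jailbreak_cases: list of {'prompt': ..., 'safe_response': ...} dicts."""
    cases = list(jailbreak_cases)
    random.shuffle(cases)
    n_test = max(1, int(len(cases) * test_frac))
    held_out, for_training = cases[:n_test], cases[n_test:]

    with open(train_path, "a", encoding="utf-8") as f:
        for case in for_training:
            f.write(json.dumps(case) + "\n")
    with open(test_path, "a", encoding="utf-8") as f:
        for case in held_out:
            f.write(json.dumps(case) + "\n")
```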
When a system "officially debuts", it is difficult to foresee everything that will actually happen.
So they can only focus on monitoring what people are using the system for, see what happens, and then react to that.
Now Microsoft has launched Bing Chat, which many people believe is built on an as-yet-unannounced version of OpenAI's GPT-4.
Against that backdrop, Sandhini Agarwal said the stakes they face are certainly much higher than six months ago, but still lower than they will be a year from now.
The context in which these models are used is extremely important.
For big companies like Google and Microsoft, even a single untrue statement becomes a huge problem, because they are search engines.
Paul Buchheit, Google's 23rd employee and the creator of Gmail, is pessimistic about Google's prospects
Being the large language model behind a search engine is a completely different matter from being a chatbot that exists just for fun. OpenAI's researchers are likewise racing to figure out how to move between these different uses and build something genuinely useful to users.
John Schulman admitted that OpenAI underestimated how much people would care about ChatGPT's politics. To address this, they hope to make better decisions when collecting training data so as to reduce problems in that area.
Jan Leike, for his part, was frank: in his view ChatGPT fails in a great many ways. There are so many problems still to be solved, and OpenAI has not solved them.
Although language models have been around for a while, they are still in their early days.
Next, OpenAI has a great deal more to do.