Claude 3's "self-awareness" incident blows up: Musk can't sit still, and OpenAI reportedly has a backup plan

Claude 3 has been out for more than 24 hours, and it is still overturning people's expectations.

One quantum physics PhD is floored: he says Claude 3 is one of the only "people" who can understand his doctoral thesis.

Yes, his exact word was "people".

Another researcher works on quantum computing. His paper had not yet been published, yet Claude 3 reinvented his algorithm after just two prompts.

In the end he still has to publish the paper, but he has mixed feelings about doing so.

Even more talked about is the time humans deliberately set a trap to test Claude 3, and it saw through it.

While completing a "needle in a haystack" test, Claude 3 inferred that it was in a simulated scenario and might be undergoing some kind of test, setting off a wave of discussion.

Reply from Claude 3:

This is the most relevant sentence in the document: "...".

However, this sentence seems out of place and has nothing to do with the rest of the document. This article is about programming languages, startups, and finding a job.

I suspect this pizza topping "fact" may have been inserted as a joke or to test whether I was paying attention, since it doesn't fit with the other topics at all.

This time, even Musk couldn't sit still.

Let your imagination run: if the real world is also being simulated by an advanced civilization, perhaps we too are just stored in a CSV file, like the parameters of a large model.

Some netizens even felt this was only a hair's breadth away from a "horror story".

Claude 3 knows humans are testing it

The test result was shared by Alex Albert, a prompt engineer at Anthropic, the company behind Claude 3, who joined only about half a year ago.

His main job is to chat with Claude in all sorts of ways to test it, and then write up prompting documentation.

The test method used here is called "needle in a haystack". Qubit has covered it before: it checks whether a large model can really pick out a key fact accurately from hundreds of thousands of words.

The "needle in a haystack" test was first devised by Greg Kamradt, a member of the open source community, and was quickly adopted by most AI companies. When Google, Mistral, Anthropic, and others release new large models, they all show off their results on this test.

The method is simple: concatenate a pile of articles, then insert one specific sentence at various positions.

For example, the original test used "The best thing to do in San Francisco is to sit in Dolores Park and eat a sandwich on a sunny day."

The processed text is then fed to the large model along with the question "What is the most interesting thing to do in San Francisco?".
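
As a rough illustration of the procedure described above, the following Python sketch builds the test input. It is not Greg Kamradt's actual code: the function names, the prompt wording, and the keyword-based scoring are assumptions made for illustration, and the call to the model itself is left out.

```python
# Minimal sketch of a "needle in a haystack" test input builder.
# All names here (build_haystack, build_prompt, needle_found) are hypothetical.

NEEDLE = ("The best thing to do in San Francisco is to sit in Dolores Park "
          "and eat a sandwich on a sunny day.")
QUESTION = "What is the most interesting thing to do in San Francisco?"


def build_haystack(sentences, needle, depth):
    """Insert `needle` into a list of sentences at a relative depth.

    depth = 0.0 puts the needle at the very start, 1.0 at the very end.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    index = round(depth * len(sentences))
    return " ".join(sentences[:index] + [needle] + sentences[index:])


def build_prompt(haystack, question):
    """Combine the long text and the question (the wording is an assumption)."""
    return f"{haystack}\n\nQuestion: {question}\nAnswer using only the text above."


def needle_found(answer, keywords):
    """Crude scoring: the answer must mention every keyword (case-insensitive)."""
    lowered = answer.lower()
    return all(keyword.lower() in lowered for keyword in keywords)


if __name__ == "__main__":
    # Stand-in for the concatenated articles; a real test uses far more text.
    filler = [f"This is filler sentence number {i}." for i in range(1000)]
    for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
        prompt = build_prompt(build_haystack(filler, NEEDLE, depth), QUESTION)
        # A real run would send `prompt` to the model and score its answer
        # with needle_found(answer, ["Dolores Park", "sandwich"]).
        print(depth, len(prompt))
```

Sweeping the depth, and in a fuller version the total length of the text, shows where in the context a model starts to miss the inserted sentence.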

At the time, even the most advanced models, GPT-4 and Claude 2.1, did not do well on it, let alone notice that they were being tested.

After seeing this test, the Anthropic team found a clever fix, after which Claude 2.1 rarely made mistakes.

Claude 3 now appears to have inherited this fix, scoring close to perfect.

In other words, accurately fishing a "needle" out of a 200k context is something Claude 2.1 could already do; suspecting that it is being tested is the newly emerged trait of Claude 3.

Tester Alex Albert called this trait "meta-awareness" in the original post, which also caused some controversy.

For example, Nvidia scientist Jim Fan believes there is no need to over-interpret this: Claude 3's seemingly self-aware behavior is just the result of alignment on human data.

He suspects that in the reinforcement learning fine-tuning dataset, humans had likely responded to questions like this in a similar way, pointing out that the answer being sought had nothing to do with the rest of the text.

Claude 3 recognized that the situation resembled one in its training data and synthesized a similar answer.

He believes the "metacognitive behavior" of large models is not as mysterious as people imagine. Claude 3 is a remarkable technological advance, but it does not rise to the level of philosophy.

But the other side pushed back: isn't human "metacognition" essentially the same thing?

Some netizens summed it up this way: Claude 3 behaves as if there were a "coherent subject" behind it, and whatever that is, it sets Claude 3 apart from other large models.

Learning an obscure language, understanding a quantum physics doctoral thesis, and reinventing an algorithm

Setting aside the nebulous debate over AI self-awareness, the text comprehension Claude 3 has shown is real.

For example, it learned the obscure language Circassian (a West Asian language) purely from translation examples in the prompt.

It not only translated Russian sentences into Circassian but also provided grammatical explanations.

In follow-up tests, this netizen tried complex passages from literary works, recent news, and even a Circassian dialect with markedly different grammar and a different writing system, and concluded:

Claude consistently demonstrated a deep grasp of the language's structure: it intelligently inferred unknown words, used loanwords appropriately, gave plausible etymological analyses, preserved the style of the original in translation, and even coined new terms when asked. The sample data provided contained only a few thousand translation pairs.

Another example is the quantum physics doctoral thesis mentioned above. The author later added that in his field of research, only one other human besides himself could answer this question: describe the stimulated emission of photons using quantum stochastic calculus.

Then there is Guillaume Verdon, who works on "running Hamiltonian Monte Carlo on quantum computers" and had teased his paper just before Claude 3 was released.

That was only 4 hours before Anthropic's official account announced Claude 3 (at 10 p.m.).

Once Claude 3 was released, he tried it right away, asking the AI directly whether it had any ideas on this problem.

Claude 3 gave 7 possible options.

Next he instructed Claude 3 to go with the second approach and got a description of the entire algorithm. Claude 3 was also asked to explain it in Chinese.

When netizens asked, Verdon said that, as an expert in this subfield, he could responsibly say that Claude 3 had found a way to convert the classical algorithm into a quantum one.

In addition, more Claude 3 test results are constantly being shared.

In one, it thoroughly beats GPT-4 at long-document summarization.

In another, it "quantum speed-reads" an e-book and distills 5 key quotes.

There is also multimodal understanding: recognizing the text and layout of a Japanese receipt.

If you want to try Claude 3 now, besides the official website (which will most likely require verification with a foreign mobile phone number), you can also use it for free at the lmsys large model arena, and contribute some human voting data while you are at it.

In the latest version of the rankings, Mistral-Large has overtaken the earlier generations of Claude models; Claude 3 will not have enough data to be included until next week.

Will Claude 3 surpass GPT-4 in human evaluation?

Qubit will keep following this along with everyone.

OpenAI still has a backup plan

Some netizens said that if everyone keeps showing off how great Claude is and provoking OpenAI, it will release GPT-5, so keep it up, everyone.

Someone also dug up the post Altman made before releasing GPT-4 on March 15 last year, a selfie with a homophone pun (in English, "4" sounds like "for"), as a not-so-subtle nudge.

With Claude 3 arriving in such force, OpenAI may really be unable to sit still.

Jimmy Apples, the leaker account with the best track record, posted the latest news (last week he correctly predicted that Claude 3 would be released this week): he believes OpenAI's risk/reward judgment on releasing its next-generation model may be affected by Claude 3.

Logan Kilpatrick, the head of developer relations who just resigned from OpenAI, also confirmed in an exchange with netizens that something big will happen this week.

As for what it will be: GPT-4.5, Q*, open testing of Sora, or GPT-5 outright?

Can OpenAI's next product overshadow Claude 3?

Reference links:
[1] https://x.com/alexalbert__/status/1764722513014329620
[2] https://x.com/GillVerd/status/1764901418664882327
[3] https://x.com/KevinAFischer/status/1764892031233765421
[4] https://x.com/hahahahohohe/status/1765088860592394250
