In the open: OpenAI releases GPTBot, a web crawler with an 'identity mark'

王林
Release: 2023-08-12 17:21:06

According to a report from this site on August 8, OpenAI released its web crawler, GPTBot, yesterday. OpenAI says that GPTBot collects web page content transparently and with attention to copyright, and that the collected data is used to train the company's AI models.

OpenAI stated that GPTBot identifies itself with a dedicated user-agent (UA) string. The complete UA string is: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot). Any website administrator is free to allow or block the crawler.
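Because the crawler announces itself in the UA string, a site can recognize its requests in server logs or request handlers. The following is a minimal sketch in Python, not an official OpenAI tool: the function name `gptbot_version` and the regular expression are illustrative assumptions, and since a UA header can be spoofed by any client, a match only shows what the client claims to be.

```python
import re
from typing import Optional

# Hypothetical pattern: matches the "GPTBot/<version>" product token
# that appears in the UA string quoted in the article.
GPTBOT_TOKEN = re.compile(r"\bGPTBot/(\d+(?:\.\d+)*)", re.IGNORECASE)


def gptbot_version(user_agent: str) -> Optional[str]:
    """Return the GPTBot version claimed in a User-Agent header, or None."""
    match = GPTBOT_TOKEN.search(user_agent)
    return match.group(1) if match else None


if __name__ == "__main__":
    ua = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
          "compatible; GPTBot/1.0; +https://openai.com/gptbot)")
    print(gptbot_version(ua))
    print(gptbot_version("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```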

In the open: OpenAI releases GPTBot, a web crawler with an 'identity mark'
▲ Image source: OpenAI

OpenAI says that administrators who do not want their site crawled can block GPTBot entirely in the site's robots.txt file, or can limit GPTBot to crawling only specified parts of the site.
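The two robots.txt options described here can be checked locally before deployment. The sketch below uses Python's standard `urllib.robotparser` module to evaluate two sample rule sets. The directory names `/public/` and `/private/`, the domain `example.com`, the name `OtherBot`, and the helper `can_fetch` are placeholders assumed for illustration, and Python's parser is only an approximation of how any particular crawler interprets the rules.

```python
from urllib.robotparser import RobotFileParser

# Option 1: block GPTBot from the whole site.
BLOCK_ALL = """\
User-agent: GPTBot
Disallow: /
"""

# Option 2: allow one directory and block another (placeholder paths).
PARTIAL = """\
User-agent: GPTBot
Allow: /public/
Disallow: /private/
"""


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Evaluate a robots.txt body for one user agent and URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(can_fetch(BLOCK_ALL, "GPTBot", "https://example.com/page"))
    print(can_fetch(BLOCK_ALL, "OtherBot", "https://example.com/page"))
    print(can_fetch(PARTIAL, "GPTBot", "https://example.com/public/a.html"))
    print(can_fetch(PARTIAL, "GPTBot", "https://example.com/private/a.html"))
```

Rules under `User-agent: GPTBot` apply only to that crawler, so other crawlers are unaffected unless the file contains rules for them as well.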

OpenAI has previously been criticized by the industry over alleged privacy violations. The launch of GPTBot can be seen as a response to that criticism, and it may help the industry establish norms for crawlers used in AI training. According to reports, OpenAI recently registered the GPT-5 trademark, and GPTBot is also expected to support the training of GPT-5-related models.

Source: ithome.com