
Baidu Encyclopedia blocks Google, Bing, and other search engines from crawling to keep its content from being used for AI training

2024-08-22


【Pacific Technology News】Baidu Encyclopedia has recently begun blocking most search engines, including Google and Bing, from crawling its pages. The move is aimed at preventing these search engines and other crawlers from collecting its content without authorization to train artificial intelligence models.

According to Baidu Encyclopedia's updated robots.txt file, only a few search engines, such as Baidu Search, Sogou Search, China Search (Chinaso), YYSpider, and EasouSpider, are currently allowed to crawl its content.

Google Search, Bing Search, Microsoft MSN, UC Browser's Yisouspider, and other crawlers not on the whitelist are explicitly prohibited from accessing Baidu Encyclopedia data. 360 Search is not named individually in the ban list, but because Baidu Encyclopedia's policy prohibits every crawler that is not on the whitelist, 360 Search and other unlisted search engines are blocked as well.
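To illustrate how a whitelist-style robots.txt produces this result, here is a minimal sketch using Python's standard urllib.robotparser. The rules below are a simplified, hypothetical example and not Baidu Encyclopedia's actual file; the user-agent tokens (Baiduspider, Googlebot, 360Spider) and the sample URL are assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical, simplified rules (NOT the real Baidu Encyclopedia file):
# one whitelisted crawler, one explicitly banned crawler, and a
# catch-all rule that denies every crawler not named above.
ROBOTS_TXT = """\
User-agent: Baiduspider
Allow: /

User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /
"""


def can_crawl(user_agent: str,
              url: str = "https://baike.baidu.com/item/example") -> bool:
    """Return True if the sample rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # 360Spider has no entry of its own, so the catch-all rule applies.
    for agent in ("Baiduspider", "Googlebot", "360Spider"):
        print(agent, "allowed" if can_crawl(agent) else "blocked")
```

In this sketch the crawler with no entry of its own falls through to the `User-agent: *` rule, which is how a crawler can be blocked without being listed by name.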

Despite these measures, industry insiders pointed out that such methods may stop only most legitimate crawlers, meaning those that honor the site's rules. They cannot completely prevent small crawlers that bypass the restrictions through special means from continuing to collect content for AI training.