
Baidu Encyclopedia blocks Google, Bing, and other search engines to keep its content from being crawled for AI training

2024-08-29


Baidu Encyclopedia has recently begun blocking most search engines, including Google and Bing. The move is presumably intended to stop these search engines and other crawlers from scraping Baidu Encyclopedia content without authorization to train AI.

Baidu Encyclopedia's robots.txt file shows that only a handful of crawlers are currently allowed to access its content: Baidu Search, Sogou Search, ChinaSo, YYSpider, and EasouSpider.

Google Search, Bing Search, Microsoft MSN, and UC Browser's YisouSpider are explicitly named as prohibited, and a catch-all rule prohibits every other search engine crawler from crawling Baidu Encyclopedia content.

Although 360 Search does not appear by name in the blocked list, Baidu Encyclopedia's policy is to deny every crawler that is not on the whitelist, so 360 Search and any other unlisted search engine are blocked as well.
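The whitelist behavior described above can be sketched with Python's standard urllib.robotparser module. The rules in SAMPLE_ROBOTS_TXT are an illustrative assumption modeled on that description, not a copy of Baidu Encyclopedia's actual file, and the helper name allowed_crawlers and the example.com URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only (an assumption, not Baidu Encyclopedia's real file):
# one whitelisted crawler, two explicitly blocked crawlers, and a catch-all
# rule that denies every crawler not named above.
SAMPLE_ROBOTS_TXT = """\
User-agent: Baiduspider
Allow: /

User-agent: Googlebot
Disallow: /

User-agent: bingbot
Disallow: /

User-agent: *
Disallow: /
"""


def allowed_crawlers(robots_txt, user_agents, url):
    """Return a dict mapping each user agent to whether it may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {ua: parser.can_fetch(ua, url) for ua in user_agents}


if __name__ == "__main__":
    # Hypothetical page URL, used only to exercise the rules.
    page = "https://example.com/item/some-entry"
    agents = ["Baiduspider", "Googlebot", "bingbot", "360Spider"]
    for agent, ok in allowed_crawlers(SAMPLE_ROBOTS_TXT, agents, page).items():
        print(f"{agent}: {'allowed' if ok else 'blocked'}")
```

In this sketch, 360Spider has no entry of its own, so it falls through to the `User-agent: *` rule and is blocked, which mirrors how an unlisted crawler ends up denied under a whitelist policy. The parser only reports what the file requests; it is up to each crawler to honor the answer.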

However, this approach only keeps out the honest: robots.txt stops crawlers that choose to obey it, not those that ignore it. Many crawlers will likely keep collecting the content by other means and then use it to train AI.