YisouSpider
What is YisouSpider?
YisouSpider is a web crawler operated by SM.CN (Shenma), a Chinese search engine. This bot was first seen in the wild around 2013 and is classified as a search engine crawler. YisouSpider systematically browses the web to discover and index content for Shenma's search results. It works by following links between pages, analyzing content, and adding the information to Shenma's search index.
The crawler identifies itself in server logs with user agent strings like YisouSpider for its basic version, or with more detailed variants such as Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 YisouSpider/5.0 Safari/537.36 for its more advanced version. Some variants of YisouSpider also appear with mobile user agent strings, suggesting it indexes mobile-specific content as well.
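As a minimal sketch of how these strings could be recognized in application code, the Python snippet below checks a User-Agent header for the YisouSpider token. The function name is_yisouspider is a hypothetical helper, not part of any library, and the check assumes the token always appears somewhere in the header. User agent strings can be spoofed, so a match is only a hint, not proof of origin.

```python
def is_yisouspider(user_agent: str) -> bool:
    """Return True if the User-Agent header contains the YisouSpider token."""
    # A case-insensitive substring match covers both the bare token and the
    # browser-style variant that embeds "YisouSpider/5.0".
    return "yisouspider" in (user_agent or "").lower()


# The two user agent strings quoted in the text above.
basic = "YisouSpider"
detailed = (
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/69.0.3497.81 YisouSpider/5.0 Safari/537.36"
)

print(is_yisouspider(basic), is_yisouspider(detailed))  # True True
```

A substring match is used instead of an exact comparison because the token is embedded in the middle of the longer browser-style string.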
YisouSpider originates from IP addresses primarily based in China, which is consistent with its role supporting a Chinese search engine. You can learn more about Shenma's services at https://zhanzhang.sm.cn/open/info.
Why is YisouSpider crawling my site?
YisouSpider crawls websites to discover and index content for inclusion in Shenma's search results. If you're seeing this crawler in your logs, it means Shenma has found your site and is evaluating its content for potential inclusion in their search index.
The crawler is particularly interested in text content, links, images, and other elements that would be relevant to search queries from Shenma's users. YisouSpider may visit your site regularly, with frequency typically depending on how often your content changes and how valuable Shenma considers your content to their users.
Crawling activity is generally triggered by the discovery of new links pointing to your site, changes in your content, or as part of Shenma's regular re-indexing process. This crawling is a standard part of how search engines operate and is generally considered authorized when done at a reasonable rate.
What is the purpose of YisouSpider?
YisouSpider supports Shenma (SM.CN), a mobile-focused search engine popular in China. The primary purpose of this crawler is to build and maintain a comprehensive index of web content that powers Shenma's search results.
The data collected by YisouSpider is used to determine page rankings, assess content relevance, and provide search results to Shenma's users. For website owners, having content indexed by YisouSpider can potentially increase visibility to Chinese audiences using the Shenma search engine.
Being indexed by multiple search engines, including regional ones like Shenma, can diversify your traffic sources and potentially increase visibility in specific markets. However, website owners should be aware that this crawler originates from China and consider their own policies regarding data accessibility in different regions.
How do I block YisouSpider?
According to available information, YisouSpider does not always respect robots.txt directives, so a robots.txt rule on its own may not be fully effective. It is still worth adding one as a first step:
User-agent: YisouSpider
Disallow: /
This directive tells YisouSpider that it should not access any part of your website. For more selective blocking, you can specify particular directories:
User-agent: YisouSpider
Disallow: /private/
Disallow: /members/
Allow: /
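Before deploying rules like these, it can help to confirm that they say what you intend. The sketch below feeds the selective rules above to Python's standard urllib.robotparser module and asks which paths a compliant crawler would be allowed to fetch. The example.com host, the sample paths, and the helper name allowed are placeholders chosen for illustration. This only checks how a standards-following parser reads the file; it says nothing about how YisouSpider itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The selective rules from the text above.
ROBOTS_TXT = """\
User-agent: YisouSpider
Disallow: /private/
Disallow: /members/
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path: str, agent: str = "YisouSpider") -> bool:
    """Return True if the parsed rules permit `agent` to fetch `path`."""
    # example.com is a placeholder host; only the path matters for matching.
    return parser.can_fetch(agent, "https://example.com" + path)


for path in ("/private/report.html", "/members/", "/blog/post"):
    print(path, allowed(path))
```

With these rules the two restricted directories are reported as blocked for YisouSpider and everything else as allowed, while user agents that are not named in the file are unaffected.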
Since YisouSpider may not consistently honor robots.txt, you might need to implement additional measures. Consider server-side blocking based on the user agent string or IP address ranges. This can be done through .htaccess files on Apache servers or equivalent configuration files on other web servers.
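For sites served by a Python application, the same idea can be sketched at the application level as a small WSGI middleware that answers 403 Forbidden when the user agent contains the YisouSpider token or the client address falls in a blocked range. The names block_yisouspider and demo_app are hypothetical, and the network listed is a reserved documentation range (RFC 5737), not a real YisouSpider range; it would need to be replaced with ranges observed in your own logs.

```python
import ipaddress

BLOCKED_AGENT_TOKEN = "yisouspider"
# Placeholder from the RFC 5737 documentation block, NOT a real crawler
# range. Replace with ranges you have observed in your own logs.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def block_yisouspider(app):
    """Wrap a WSGI app so that matching requests receive 403 Forbidden."""

    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        blocked = BLOCKED_AGENT_TOKEN in agent
        try:
            addr = ipaddress.ip_address(environ.get("REMOTE_ADDR", ""))
            blocked = blocked or any(addr in net for net in BLOCKED_NETWORKS)
        except ValueError:
            pass  # missing or malformed address: rely on the user agent check
        if blocked:
            body = b"Forbidden\n"
            headers = [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ]
            start_response("403 Forbidden", headers)
            return [body]
        return app(environ, start_response)

    return middleware


def demo_app(environ, start_response):
    """Trivial stand-in for the real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


application = block_yisouspider(demo_app)
```

If the site sits behind a reverse proxy, REMOTE_ADDR may hold the proxy's address instead of the client's, in which case the address check would need to read whatever header the proxy is configured to set.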
For WordPress sites, security plugins often include options to block specific bots. For custom websites, you might need to implement filtering at the application level or through a web application firewall.
Be aware that blocking YisouSpider will likely reduce your visibility in Shenma search results, potentially limiting your reach to Chinese users who utilize this search engine. However, if you don't target the Chinese market or if you're experiencing excessive crawling that impacts server performance, blocking may be appropriate.
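To judge whether crawling is actually excessive before blocking, one option is to count YisouSpider requests per hour in your access logs. The sketch below assumes log lines in the common Apache/Nginx combined format, with a timestamp such as [10/Oct/2023:13:55:36 +0000]. The function name yisou_hits_per_hour and the sample lines are invented for illustration and are not real traffic.

```python
import re
from collections import Counter

# Captures the timestamp down to the hour, e.g. "10/Oct/2023:13".
HOUR_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}):\d{2}:\d{2} [+-]\d{4}\]")


def yisou_hits_per_hour(lines):
    """Count log lines mentioning YisouSpider, grouped by hour."""
    counts = Counter()
    for line in lines:
        if "yisouspider" not in line.lower():
            continue
        match = HOUR_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample lines in combined log format (documentation IP ranges).
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "YisouSpider"',
    '203.0.113.5 - - [10/Oct/2023:13:58:02 +0000] "GET /a HTTP/1.1" 200 734 "-" "YisouSpider"',
    '198.51.100.7 - - [10/Oct/2023:13:59:10 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2023:14:01:44 +0000] "GET /b HTTP/1.1" 200 120 "-" "YisouSpider"',
]

for hour, hits in sorted(yisou_hits_per_hour(sample).items()):
    print(hour, hits)
```

In practice the lines would come from iterating over an open log file instead of a hard-coded list, and what counts as excessive depends on your server's capacity.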