YandexScreenshotBot
What is YandexScreenshotBot?
YandexScreenshotBot is a specialized web crawler operated by Yandex, Russia's leading search engine company. It's designed to capture visual snapshots of web pages, which are then used to generate thumbnail previews for Yandex's search results. Unlike general-purpose crawlers that focus on indexing text content, YandexScreenshotBot specifically renders and captures the visual layout of web pages to enhance the search experience.
The bot identifies itself in server logs with a user agent string that looks like this:
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36 (compatible; YandexScreenshotBot/3.0; +http://yandex.com/bots)
This string contains the bot's name, its version, and a link to Yandex's official documentation.
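If you want to spot these visits in your own logs, a minimal Python sketch like the following can pull the bot name and version out of a user agent string. The function name screenshot_bot_version and the regular expression are illustrative assumptions, not anything published by Yandex, and the pattern only covers the format shown above.

```python
import re
from typing import Optional

# Assumed pattern: the bot token followed by a dotted version, as in the
# user agent string shown above. Other formats are not covered.
_BOT_PATTERN = re.compile(r"YandexScreenshotBot/(\d+(?:\.\d+)*)")


def screenshot_bot_version(user_agent: str) -> Optional[str]:
    """Return the version if the user agent names YandexScreenshotBot, else None."""
    match = _BOT_PATTERN.search(user_agent)
    return match.group(1) if match else None


ua = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/41.0.2228.0 Safari/537.36 "
    "(compatible; YandexScreenshotBot/3.0; +http://yandex.com/bots)"
)
print(screenshot_bot_version(ua))  # prints 3.0
```

Keep in mind that any client can send any user agent string, so a match here tells you what the visitor claims to be, not who it actually is.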
YandexScreenshotBot uses a Chromium-based rendering engine to fully execute JavaScript, render CSS, and load images. This ensures that the screenshots accurately represent how your page appears to actual users. Because it renders the entire page as a real browser would, each visit requests more resources from your server than a visit from a text-only crawler.
Why is YandexScreenshotBot crawling my site?
YandexScreenshotBot crawls websites to create visual snapshots that enhance Yandex search results with preview images. If you're seeing this bot in your logs, it means Yandex has discovered your content and is capturing how it visually appears to users.
The frequency of visits from YandexScreenshotBot typically depends on several factors. Popular pages with high traffic volumes or frequent updates are likely to be crawled more often. Websites targeting Russian-speaking audiences generally experience more frequent visits since Yandex is particularly popular in Russia and neighboring countries.
This bot's crawling is considered authorized as part of Yandex's normal search engine operations. The bot follows standard web crawling conventions and provides documentation about its purpose and behavior.
What is the purpose of YandexScreenshotBot?
YandexScreenshotBot serves multiple functions within the Yandex ecosystem. Its primary purpose is to generate thumbnail images that appear alongside search results, providing users with visual previews of websites before they click through. This visual enhancement helps users quickly identify relevant content and improves the overall search experience.
Beyond search result previews, the bot also helps with quality assurance by verifying that advertised landing pages match their displayed content, particularly for Yandex.Direct ads. Additionally, it assists in detecting duplicate content across different domains by comparing visual fingerprints of web pages.
For website owners, YandexScreenshotBot's activities can provide value by improving how your site appears in Yandex search results. Attractive, properly rendered previews may increase click-through rates from Yandex users, particularly those in Russian-speaking markets.
How do I block YandexScreenshotBot?
YandexScreenshotBot generally respects robots.txt directives, but with an important caveat: it may ignore rules defined under the generic User-agent: * directive if the parent HTML page is accessible. To effectively block this bot, you need to target it specifically in your robots.txt file:
User-agent: YandexScreenshotBot
Disallow: /
This directive will instruct YandexScreenshotBot not to crawl any part of your website. If you want to block access to specific sections only, you can replace the "/" with the path to those sections.
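Before deploying the rules, you can sanity-check them locally. The sketch below uses Python's standard urllib.robotparser module to confirm that the rules above block the bot by name while leaving other agents alone. The example.com URLs, the /private/ path, and the is_allowed helper are hypothetical, and Python's parser only models the general robots.txt convention, so it does not prove how Yandex's own crawler will interpret the file.

```python
from urllib.robotparser import RobotFileParser

# The rules from the text above: block the bot everywhere.
BLOCK_ALL = """\
User-agent: YandexScreenshotBot
Disallow: /
"""

# Hypothetical variant: block only one section of the site.
BLOCK_SECTION = """\
User-agent: YandexScreenshotBot
Disallow: /private/
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Pass the bot's name token, not the full user agent string.
print(is_allowed(BLOCK_ALL, "YandexScreenshotBot", "https://example.com/any/page"))
print(is_allowed(BLOCK_ALL, "SomeOtherBot", "https://example.com/any/page"))
print(is_allowed(BLOCK_SECTION, "YandexScreenshotBot", "https://example.com/private/report"))
print(is_allowed(BLOCK_SECTION, "YandexScreenshotBot", "https://example.com/blog/post"))
```

The first and third checks should report that access is denied, while the second and fourth should report that it is allowed.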
For websites handling sensitive personal data, you might want to implement additional protections. Using session validation to restrict access to authenticated users only can prevent the bot from capturing sensitive content. You can also use noarchive meta tags or X-Robots-Tag headers to prevent caching of specific content.
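As one way to apply the header approach, the following Python sketch wraps a WSGI application so that responses under a chosen path prefix carry an X-Robots-Tag: noarchive header. The names add_noarchive_header, demo_app, and headers_for, and the /account/ prefix, are assumptions made for illustration. Your framework or web server may offer a simpler way to set response headers.

```python
from wsgiref.util import setup_testing_defaults


def add_noarchive_header(app, protected_prefix="/account/"):
    """Wrap a WSGI app so responses under protected_prefix get X-Robots-Tag: noarchive."""
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Only tag responses for paths inside the protected section.
            if environ.get("PATH_INFO", "").startswith(protected_prefix):
                headers = list(headers) + [("X-Robots-Tag", "noarchive")]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return wrapped


def demo_app(environ, start_response):
    """Hypothetical stand-in for a real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


wrapped_app = add_noarchive_header(demo_app)


def headers_for(path):
    """Call the wrapped app with a fake request and return its response headers."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["headers"] = dict(headers)

    list(wrapped_app(environ, start_response))
    return captured["headers"]


print(headers_for("/account/settings"))
print(headers_for("/blog/post"))
```

Only the first response should include the X-Robots-Tag header. A header like this is a request to crawlers rather than an access control, so truly sensitive pages still need authentication.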
Keep in mind that blocking YandexScreenshotBot may result in your site appearing without thumbnails in Yandex search results, which could potentially reduce click-through rates from users of that search engine. If you target markets where Yandex is popular, consider the trade-offs before implementing complete blocking.