Mastodon bot
What is Mastodon?
Mastodon is a free, open-source, decentralized social networking platform launched in 2016. Unlike centralized social media platforms, Mastodon operates through a federation of independent servers (called "instances") that can communicate with each other through the ActivityPub protocol. Each instance is operated by different administrators but can interact with users on other instances, creating a connected network known as the "Fediverse."
When Mastodon appears in your server logs, it's functioning as a "fetcher" - a specialized type of web crawler that retrieves content on behalf of the Mastodon platform. It identifies itself with a user agent string formatted as `Mastodon/{version} (+https://{instance-domain}/)`, for example `Mastodon/4.3.2 (+https://mastodon.social/)`. This format identifies both the Mastodon software version and the specific instance making the request.
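As a practical illustration, a pattern like the following can pick these requests out of an access log. This is a minimal Python sketch based on the user agent format described above; the function name and regular expression are our own, not part of Mastodon:

```python
import re

# Pattern for the user agent format described above:
#   Mastodon/{version} (+https://{instance-domain}/)
# e.g. "Mastodon/4.3.2 (+https://mastodon.social/)"
MASTODON_UA = re.compile(
    r"Mastodon/(?P<version>[\d.]+) \(\+https?://(?P<instance>[^/)]+)/?\)"
)

def parse_mastodon_ua(user_agent):
    """Return (version, instance) if the string matches the Mastodon
    fetcher's user agent format, or None otherwise."""
    match = MASTODON_UA.search(user_agent)
    if match:
        return match.group("version"), match.group("instance")
    return None

print(parse_mastodon_ua("Mastodon/4.3.2 (+https://mastodon.social/)"))
# -> ('4.3.2', 'mastodon.social')
```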
Mastodon was created by Eugen Rochko and is maintained by Mastodon gGmbH, a German non-profit organization. The platform's decentralized architecture means there's no single operator of all Mastodon instances - each server has its own administrators, rules, and community focus.
Why is Mastodon crawling my site?
Mastodon crawls websites primarily to generate link previews when users share URLs in their posts. When a Mastodon user posts a link to your website, the Mastodon instance will dispatch its fetcher to gather metadata about that link, including:
- Page title
- Description or excerpt
- Thumbnail images
- Open Graph metadata
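To make this concrete, here is a rough sketch of how a link-preview fetcher might read those tags from a page. It is an illustration in Python using only the standard library, not Mastodon's actual implementation (Mastodon itself is written in Ruby), and the sample page is hypothetical:

```python
from html.parser import HTMLParser

class OpenGraphParser(HTMLParser):
    """Collects the Open Graph meta tags a preview fetcher typically reads."""
    WANTED = {"og:title", "og:description", "og:image"}

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if attrs.get("property") in self.WANTED and "content" in attrs:
            self.tags[attrs["property"]] = attrs["content"]

page = """
<head>
  <title>Example Article</title>
  <meta property="og:title" content="Example Article">
  <meta property="og:description" content="A short summary of the page.">
  <meta property="og:image" content="https://example.com/thumb.jpg">
</head>
"""

parser = OpenGraphParser()
parser.feed(page)
print(parser.tags)
# {'og:title': 'Example Article', 'og:description': 'A short summary of the page.',
#  'og:image': 'https://example.com/thumb.jpg'}
```

Pages that provide these Open Graph tags will generally produce richer previews than pages that rely on the fetcher falling back to the plain page title.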
These visits are triggered on-demand when users share links to your content. The frequency of Mastodon's visits directly correlates with how often users share links to your site across the Mastodon network. Unlike search engine crawlers that systematically index websites, Mastodon's fetcher only visits specific pages that have been shared by users.
This crawling behavior is considered authorized and is a standard practice among social media platforms to enhance the user experience when sharing links.
What is the purpose of Mastodon?
The primary purpose of Mastodon's fetcher is to enhance the user experience by providing rich previews when links are shared within the platform. When a user shares a link in a post (called a "toot"), Mastodon retrieves metadata from that webpage to display a preview with an image, title, and description.
This fetching functionality benefits both Mastodon users and website owners. For users, it provides context about links before clicking them, improving engagement and information sharing. For website owners, these rich previews can increase visibility and click-through rates when their content is shared across the Mastodon network.
The data collected by Mastodon's fetcher is used solely for generating these link previews and is not used for building search indexes, training AI models, or other data aggregation purposes. This focused purpose aligns with Mastodon's broader philosophy of user privacy and control.
How do I block Mastodon?
While Mastodon's fetcher provides value by creating rich previews of your content, you may choose to control its access to your site. Mastodon respects the standard robots.txt protocol, allowing you to manage how it interacts with your website.
To block Mastodon completely from accessing your site, add the following to your robots.txt file:
User-agent: Mastodon
Disallow: /
If you want to allow Mastodon to access only certain parts of your site, you can be more selective:
User-agent: Mastodon
Disallow: /private/
Allow: /public/
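If you want to sanity-check how a robots.txt-honoring crawler would interpret these rules before deploying them, Python's standard-library robotparser can evaluate them against the Mastodon user agent. A small sketch (the example.com URLs are placeholders):

```python
from urllib.robotparser import RobotFileParser

# The selective rules from above, as they would appear in robots.txt.
rules = [
    "User-agent: Mastodon",
    "Disallow: /private/",
    "Allow: /public/",
]

robots = RobotFileParser()
robots.parse(rules)

ua = "Mastodon/4.3.2 (+https://mastodon.social/)"
print(robots.can_fetch(ua, "https://example.com/public/post"))   # True
print(robots.can_fetch(ua, "https://example.com/private/page"))  # False
```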
Remember that blocking Mastodon's fetcher will prevent rich previews when users share links to your site. This might reduce visibility and engagement with your content across the Mastodon network, as users will only see plain URLs without images or descriptive text. Consider whether the benefits of blocking outweigh the potential reduction in social engagement before implementing these restrictions.
For most website owners, allowing Mastodon's fetcher access is beneficial, as it enhances how your content appears when shared across this growing decentralized social network. If you have specific concerns about resource usage, you can use robots.txt to limit access to resource-intensive parts of your site while still allowing previews of your main content.
- Operated by: no single operator - each instance is run by its own administrators
- Type: data fetcher
- AI model training: no (fetched data is used only for link previews)
- Acts on behalf of user: yes (requests are triggered when users share links)
- Obeys directives: yes (respects robots.txt)
- User Agent: Mastodon/{version} (+https://{instance-domain}/)