Twingly Recon

What is Twingly Recon?

Twingly Recon is a web crawler operated by Twingly AB, a Swedish company specializing in blog and social media monitoring services. First appearing in server logs around 2010-2011, Twingly Recon functions as a data collection bot that systematically browses the web to gather information for Twingly’s blog and content discovery services. The crawler is designed to find and index blog posts, news articles, and other web content that Twingly then processes and makes available through their various data products.

When visiting your site, Twingly Recon identifies itself with one of several user-agent strings:

Twingly Recon
Mozilla/5.0 (compatible; Twingly Recon; twingly.com)
Twingly Recon-Klondike/1.0 (+https://developer.twingly.com)

These identifiers let website owners recognize the bot in their server logs and analytics. The crawler operates from various IP addresses, primarily on European servers, many of them in Sweden, where Twingly is headquartered. More information about the crawler is available in Twingly’s developer documentation.
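
To spot the crawler in your own logs, you can match on the text that all of the user-agent strings above have in common. The following is a minimal sketch, not an official detection method: the pattern and the function names `is_twingly_recon` and `count_twingly_hits` are assumptions made for illustration, and the log lines are invented examples.

```python
import re

# Case-insensitive match on the substring shared by the user-agent strings
# listed above. The pattern is an assumption derived from those strings,
# not a specification published by Twingly.
TWINGLY_RECON = re.compile(r"twingly recon", re.IGNORECASE)


def is_twingly_recon(user_agent: str) -> bool:
    """Return True if the text claims to come from Twingly Recon."""
    return bool(TWINGLY_RECON.search(user_agent or ""))


def count_twingly_hits(log_lines) -> int:
    """Count log lines that mention the Twingly Recon user agent."""
    return sum(1 for line in log_lines if is_twingly_recon(line))


if __name__ == "__main__":
    # Invented sample lines in a common access-log shape.
    sample = [
        '192.0.2.10 - - "GET /blog/ HTTP/1.1" 200 "Mozilla/5.0 (compatible; Twingly Recon; twingly.com)"',
        '192.0.2.11 - - "GET / HTTP/1.1" 200 "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    ]
    print(count_twingly_hits(sample))
```

Any client can send any user-agent header, so a match only shows what the client claims to be.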

Why is Twingly Recon crawling my site?

Twingly Recon is likely crawling your site to discover and index content for Twingly’s blog search and monitoring services. If your website contains blogs, news articles, or regularly updated content, Twingly Recon may visit to keep its index current with your latest publications. The frequency of visits typically depends on how often you update your content and its relevance to Twingly’s services.

Twingly Recon looks mainly for new and updated blog posts, news articles, and social media content, so sites that publish frequently or carry substantial influence in their topic areas may see more regular visits. Its crawling is generally regarded as legitimate web activity, since it supports a content discovery and monitoring service.

What is the purpose of Twingly Recon?

The primary purpose of Twingly Recon is to power Twingly’s blog search engine and media monitoring services. Twingly offers products that help businesses, organizations, and researchers track mentions of specific topics, brands, or keywords across blogs and other online publications. To provide these services, Twingly needs to continuously discover and index new content from across the web.

The data collected by Twingly Recon is processed and organized to allow Twingly’s customers to search for relevant content, monitor online conversations, and analyze trends. For website owners, particularly bloggers and content publishers, being indexed by Twingly can provide additional visibility as your content may appear in Twingly’s search results and monitoring feeds used by their customers.

This visibility may drive additional traffic to your site from users of Twingly’s services who discover your content through the platform. The crawler is not designed to analyze your website’s performance or structure; it extracts and indexes the content you publish.

How do I block Twingly Recon?

While Twingly Recon is a legitimate crawler supporting useful services, you may wish to control its access to your site. Available information suggests that Twingly Recon respects the robots.txt standard, although Twingly’s documentation does not explicitly confirm this. To restrict the crawler’s access, add directives to your robots.txt file:

User-agent: Twingly Recon
Disallow: /

This configuration asks Twingly Recon not to crawl any part of your site. To restrict access to certain sections only, list the specific directories instead:

User-agent: Twingly Recon
Disallow: /private/
Disallow: /members/
Allow: /
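
Before deploying either set of rules, you can check how a parser interprets them. The sketch below feeds both configurations to Python's standard-library `urllib.robotparser`. It shows how that one parser reads the rules, not how Twingly Recon itself behaves. The helper name `build_parser` and the `example.com` URLs are assumptions for illustration. Python's parser matches the agent name as a case-insensitive substring, so other parsers may treat the space in "Twingly Recon" differently.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# The two configurations shown above, copied verbatim.
BLOCK_ALL = """\
User-agent: Twingly Recon
Disallow: /
"""

BLOCK_SECTIONS = """\
User-agent: Twingly Recon
Disallow: /private/
Disallow: /members/
Allow: /
"""

AGENT = "Twingly Recon"
block_all = build_parser(BLOCK_ALL)
block_sections = build_parser(BLOCK_SECTIONS)

if __name__ == "__main__":
    # Full block: nothing is fetchable for this agent.
    print(block_all.can_fetch(AGENT, "https://example.com/blog/post"))
    # Partial block: restricted directories are refused, the rest is allowed.
    print(block_sections.can_fetch(AGENT, "https://example.com/private/notes"))
    print(block_sections.can_fetch(AGENT, "https://example.com/blog/post"))
    # Agents not named in the file are unaffected by these rules.
    print(block_all.can_fetch("SomeOtherBot", "https://example.com/blog/post"))
```

With these inputs the four checks come out as refused, refused, allowed, and allowed, in that order.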

If robots.txt directives prove ineffective, or if you want stricter control, you can implement IP-based blocking in your server configuration or firewall. This approach requires regular maintenance, however, because Twingly operates from multiple IP addresses that may change over time.
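
An IP-based check can be sketched with Python's standard `ipaddress` module. This is a minimal illustration of the idea, not a ready-made blocklist. The networks below are reserved documentation ranges used as placeholders, not Twingly addresses, and the names `BLOCKED_NETWORKS` and `is_blocked` are hypothetical. You would replace the ranges with addresses observed in your own logs and keep them up to date.

```python
import ipaddress

# PLACEHOLDER ranges reserved for documentation (RFC 5737 and RFC 3849).
# These are NOT Twingly addresses; substitute ranges seen in your own logs.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Not a valid IP address; leave the decision to other checks.
        return False
    return any(address in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("192.0.2.44", "198.51.100.7", "2001:db8::1"):
        print(ip, is_blocked(ip))
```

Keeping the ranges in one list makes them easy to update as addresses change.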

Blocking Twingly Recon will prevent your content from appearing in Twingly’s blog search and monitoring services. If you value the potential visibility and traffic from these services, consider allowing the crawler while using robots.txt to restrict access to any sensitive or private areas of your site instead of blocking it entirely.
