NewsNow bot

What is NewsNow?

NewsNow is a real-time news aggregation and monitoring service operated by NewsNow Publishing Limited, a UK-based company founded in 1997. NewsNow operates a web crawler that systematically scans publisher websites to collect and index news headlines and articles. The service maintains multiple regional editions, including UK, US, Nigeria, Romania, Italy, Canada, and Australia, making it a global news monitoring platform.

The NewsNow crawler functions as a specialized search spider that scans publisher websites for headlines and article content, automatically categorizing and filing them into appropriate newsfeeds based on subject matter, source categorization, and keyword analysis. In server logs, the crawler typically identifies itself with the user-agent string NewsNow.
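To check whether the crawler has visited your site, you can scan your access logs for that user-agent string. The following is a minimal sketch, assuming your server writes logs in the common "combined" format where the user agent is the last quoted field. The function name, the regular expression, and the sample log lines are illustrative assumptions, not something provided by NewsNow.

```python
import re

# Assumption: combined log format, with the user agent as the last quoted field.
# Adjust the pattern if your server uses a different layout.
LOG_LINE = re.compile(
    r'"(?P<request>[^"]*)" \d{3} \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$'
)


def newsnow_requests(lines):
    """Return the request field of each log entry whose user agent mentions NewsNow."""
    hits = []
    for line in lines:
        match = LOG_LINE.search(line)
        # Case-insensitive substring match, in case the string carries extra tokens.
        if match and "newsnow" in match.group("agent").lower():
            hits.append(match.group("request"))
    return hits


# Hypothetical sample entries for illustration only.
sample = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /news/story-1 HTTP/1.1" 200 5120 "-" "NewsNow"',
    '198.51.100.7 - - [10/Oct/2024:13:55:40 +0000] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

print(newsnow_requests(sample))
```

Keep in mind that a user-agent string can be spoofed, so a match in the logs indicates a request claiming to be NewsNow rather than proof of its origin.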

Unlike general web crawlers, NewsNow's spider is specifically designed to identify and extract news content, with particular attention to headlines, article text, and author information. It employs a de-duplication system that automatically filters out articles with identical headlines to prevent redundant content in its feeds.
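The idea of filtering by identical headlines can be illustrated with a short sketch. This is not NewsNow's actual implementation, which is not public; it only shows the general technique of keeping the first article seen for each headline. The dictionary keys and the whitespace and case normalization are assumptions made for the example.

```python
def dedupe_by_headline(articles):
    """Keep the first article for each headline, dropping later duplicates."""
    seen = set()
    unique = []
    for article in articles:
        # Assumption: headlines that differ only in spacing or letter case
        # are treated as identical. A stricter filter would compare raw strings.
        key = " ".join(article["headline"].split()).casefold()
        if key not in seen:
            seen.add(key)
            unique.append(article)
    return unique


# Hypothetical feed items for illustration only.
feed = [
    {"headline": "Markets rally", "source": "A"},
    {"headline": "markets  rally ", "source": "B"},
    {"headline": "Storm warning", "source": "C"},
]

print([item["source"] for item in dedupe_by_headline(feed)])
```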

Why is NewsNow crawling my site?

If you're seeing NewsNow in your site's access logs, it's likely because the crawler is scanning your site for news content to include in its aggregation service. NewsNow specifically looks for headlines and article content that it can categorize and display in its various topical newsfeeds.

The crawler visits sites that are part of its publisher network or those it has identified as potential news sources. The frequency of visits depends on how often your site publishes new content and its relevance to NewsNow's categories. Sites that publish news frequently may see more regular visits from the crawler.

NewsNow's crawling is generally considered authorized when sites have explicitly joined their publisher network or when the site's robots.txt file permits access. However, NewsNow may initially crawl sites to evaluate their content before formal inclusion in their network.

What is the purpose of NewsNow?

NewsNow serves as a comprehensive news aggregation platform that collects headlines from numerous sources and organizes them into topical newsfeeds. Its primary function is to provide users with a centralized location to access current news from multiple publishers across various topics and regions.

For publishers, NewsNow offers potential value through increased visibility and traffic. By having content featured on NewsNow's platform, publishers can reach a broader audience interested in their specific topic areas. The service essentially acts as a distribution channel that can drive readers to publisher websites.

For users, NewsNow provides a convenient way to monitor news across multiple sources without having to visit each site individually. The service automatically categorizes content, making it easier to follow specific topics or interests.

How do I block NewsNow?

NewsNow respects the robots.txt protocol, making it straightforward to control its access to your site. To explicitly allow NewsNow to crawl your entire site, add the following to your robots.txt file:

User-agent: NewsNow
Disallow:

This configuration allows NewsNow to crawl your entire site. If you want to block NewsNow completely, you can use:

User-agent: NewsNow
Disallow: /

This will instruct the NewsNow crawler not to access any part of your site. You can also restrict access to specific directories or files by specifying the paths after the Disallow directive.
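Before deploying a rule set, you can check how a standards-following parser interprets it. The sketch below uses Python's standard `urllib.robotparser` module to evaluate the rules discussed above against sample URLs. The helper name `can_fetch`, the `/private/` directory, and the `example.com` URLs are assumptions for illustration, and the result reflects Python's parser rather than a guarantee about NewsNow's own behavior.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# An empty Disallow value permits everything.
allow_all = "User-agent: NewsNow\nDisallow:\n"
# Disallow: / blocks the whole site for this user agent.
block_all = "User-agent: NewsNow\nDisallow: /\n"
# Hypothetical directory-level restriction.
block_dir = "User-agent: NewsNow\nDisallow: /private/\n"

story = "https://example.com/news/story"
draft = "https://example.com/private/draft"

print(can_fetch(allow_all, "NewsNow", story))
print(can_fetch(block_all, "NewsNow", story))
print(can_fetch(block_dir, "NewsNow", draft))
print(can_fetch(block_dir, "NewsNow", story))
```

Rules under `User-agent: NewsNow` apply only to that crawler, so other bots remain unaffected unless they have their own entries or a wildcard entry exists.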

If you're a publisher interested in having your content appear on NewsNow, you should be aware that blocking the crawler will prevent your headlines from appearing on their platform, potentially reducing your visibility and traffic from their service. Conversely, if you don't want your content aggregated by NewsNow, blocking the crawler ensures your content won't appear in their newsfeeds.

NewsNow recommends that publishers wanting to maximize exposure on its platform add specific HTML comments to their templates to help its systems identify article content, for example by placing <!-- Article Start --> and <!-- Article End --> markers around the article body.
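If you add these markers, it can be useful to confirm that your rendered pages actually contain them in matching pairs. The following is a minimal sketch of such a check. It is not NewsNow's parser; the function name, the tolerance for extra whitespace and letter case inside the comments, and the sample page are assumptions for illustration.

```python
import re

# Assumption: whitespace and letter case inside the comments may vary slightly.
ARTICLE_BLOCK = re.compile(
    r"<!--\s*Article Start\s*-->(.*?)<!--\s*Article End\s*-->",
    re.DOTALL | re.IGNORECASE,
)


def extract_article_blocks(html):
    """Return the text found between each Article Start / Article End pair."""
    return [block.strip() for block in ARTICLE_BLOCK.findall(html)]


# Hypothetical rendered template for illustration only.
page = """
<html><body>
<nav>Home | News</nav>
<!-- Article Start -->
<h1>Council approves new budget</h1>
<p>The vote passed on Tuesday.</p>
<!-- Article End -->
<footer>Contact us</footer>
</body></html>
"""

blocks = extract_article_blocks(page)
print(len(blocks))
print(blocks[0])
```

An empty result for a page that should contain an article suggests the markers are missing, misspelled, or unpaired in the template.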


Operated by

Data fetcher

AI model training

Not used to train AI or LLMs

Acts on behalf of user

No, operates independently of any user action

Obeys directives

Yes, obeys robots.txt rules

User Agent

NewsNow