omgili bot
What is omgili?
Omgili (pronounced "oh-em-gee-lee") is a web crawler operated by Webz.io (formerly known as Webhose.io). It functions as a specialized search engine and data collection tool focused primarily on indexing discussion forums, message boards, and other user-generated content platforms. This crawler is designed to separate ordinary web pages from information-rich discussions, helping users find what people are saying about specific topics across the internet.
The crawler identifies itself in server logs with the user agent string omgili/0.5 +http://omgili.com or similar variants such as omgilibot/0.3 +http://omgili.com/Crawler.html. It's classified as a web crawler or bot that performs automated HTTP requests to index content.
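To spot these requests in your own logs, a short script can match the two user agent tokens quoted above. This is a minimal sketch, assuming the user agent appears somewhere in each log line; the function names and the sample log lines are hypothetical, and other Omgili variants may exist that this pattern does not cover.

```python
import re

# Matches the product tokens quoted above, e.g. "omgili/0.5" and "omgilibot/0.3".
# Assumption: every variant starts with "omgili" or "omgilibot" followed by "/<version>".
OMGILI_UA = re.compile(r"\bomgili(?:bot)?/\d+(?:\.\d+)*", re.IGNORECASE)


def is_omgili(text: str) -> bool:
    """Return True if the text contains an Omgili-style user agent token."""
    return OMGILI_UA.search(text) is not None


def count_omgili_requests(log_lines) -> int:
    """Count the log lines that mention an Omgili user agent."""
    return sum(1 for line in log_lines if is_omgili(line))


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        '203.0.113.5 - - "GET /forum/1 HTTP/1.1" 200 "-" "omgili/0.5 +http://omgili.com"',
        '203.0.113.9 - - "GET / HTTP/1.1" 200 "-" "Mozilla/5.0"',
    ]
    print(count_omgili_requests(sample))
```

The pattern requires a version number after the slash, so a plain link to omgili.com in a referrer field is not counted as a crawler hit.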
Omgili's crawler is particularly sophisticated in how it processes forum content. Rather than treating forum pages as simple web documents, its algorithm analyzes them as structured conversations with titles, topics, and replies. This allows Omgili to better understand the context of discussions and extract more meaningful data.
Why is omgili crawling my site?
If you're seeing Omgili in your server logs, it's likely because your site contains discussion forums, comment sections, or other forms of user-generated content that Omgili considers valuable for its index. The crawler is particularly interested in:
- Discussion forums and message boards
- Q&A platforms and community discussions
- Content with conversational structures (questions, answers, debates)
- Multilingual content (Omgili indexes content in multiple languages)
Omgili only crawls content that is publicly accessible to guest users. It doesn't register as a user or attempt to access private or restricted areas of your site unless you've specifically granted it access based on its user agent.
The frequency of Omgili's visits depends on how active your forums are and how valuable Omgili's algorithms consider your content. Sites with higher information density and regularly updating discussions may be crawled more frequently.
What is the purpose of omgili?
Omgili serves as a specialized search engine for discussions and conversations across the web. Unlike traditional search engines that prioritize articles and edited web pages, Omgili focuses primarily on indexing user discussions to help people find answers to questions that have already been asked and answered.
The data collected by Omgili is used to:
- Power its own search platform, allowing users to search specifically for discussions rather than general web content
- Provide data to Webz.io's clients, including companies that may use this data to train AI models
- Offer customized search solutions for forum owners who want to improve their site's search capabilities
For website owners, particularly forum administrators, Omgili can provide value by driving traffic to your discussions. Forum owners can also integrate Omgili's search functionality into their sites, potentially saving server resources while providing users with more advanced search capabilities.
How do I block omgili?
Omgili respects the standard robots.txt protocol, making it relatively straightforward to control its access to your site. If you wish to block Omgili from crawling your entire site, add the following to your robots.txt file:
User-agent: omgili
Disallow: /
To block Omgili from specific sections of your site:
User-agent: omgili
Disallow: /private-forums/
Disallow: /members-only/
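Before deploying rules like these, you can check them locally. The following is a minimal sketch using Python's standard urllib.robotparser; the helper name omgili_allowed and the example.com URLs are hypothetical, and the result shows how Python's parser reads the rules, not how Omgili itself interprets them.

```python
from urllib.robotparser import RobotFileParser

# The section-level rules shown above.
ROBOTS_TXT = """\
User-agent: omgili
Disallow: /private-forums/
Disallow: /members-only/
"""


def omgili_allowed(robots_txt: str, url: str, agent: str = "omgili") -> bool:
    """Return True if the given rules allow `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    for url in (
        "https://example.com/private-forums/thread-1",
        "https://example.com/public/thread-1",
    ):
        print(url, omgili_allowed(ROBOTS_TXT, url))
```

Python's parser matches user agent names by substring, so in this check the omgili rule also applies to the omgilibot variant. Whether the crawler applies the same matching is not stated here, so adding a separate omgilibot group may be a safer assumption.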
If you want Omgili to crawl your site but don't want it to create cached previews of your content, you can add a meta tag to your pages:
<meta name="omgilibot" content="noarchive">
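To confirm that a rendered page actually carries this tag, you can scan its HTML. This is a minimal sketch using Python's standard html.parser; the class and function names are hypothetical, and it only checks that the tag is present, not whether Omgili honors it.

```python
from html.parser import HTMLParser


class OmgiliMetaFinder(HTMLParser):
    """Collects directives from <meta name="omgilibot" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for valueless attributes.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "omgilibot":
            return
        # Assumption: multiple directives, if present, are comma-separated.
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noarchive(html: str) -> bool:
    """Return True if the page declares noarchive for omgilibot."""
    finder = OmgiliMetaFinder()
    finder.feed(html)
    return "noarchive" in finder.directives


if __name__ == "__main__":
    page = '<html><head><meta name="omgilibot" content="noarchive"></head></html>'
    print(has_noarchive(page))
```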
Blocking Omgili may reduce external traffic to your forums that would otherwise come from its search results. However, if you're concerned about bandwidth usage or how your content might be used in AI training datasets, controlling access might be beneficial.
Forum owners who want Omgili to index their content can submit their forum URL through the Webz.io website, though Webz.io doesn't guarantee that every submitted forum will be added to its index.
Operated by
Data collector
AI model training
Acts on behalf of user
Obeys directives
User Agent
omgili/0.5 +http://omgili.com