Historical data discrepancy in AI Assistants activity

Vladislav Karotki

Hey,

We have noticed a sharp drop in AI Assistants activity over the past 8 weeks: from an average of 20-25k visits per week to 3-5k per week. We checked the stats in Cloudflare, which are stable, with no week-over-week spikes or drops. GA4 also shows growth in referral traffic from ChatGPT and other LLMs.

However, we now see different historical data in the AI Assistants tab. Weeks that had 3-5k visits now suddenly show about 12-20k visits (same date range selected, compared against the oldest data we saved). Could someone explain this discrepancy, and whether we can trust the AI Assistants stats going forward? I’m asking because we base our entire strategy and its measurement on UseHall data.

Also, Meta-ExternalAgent now appears among the other AI Assistants. Isn’t that a training bot?

1 comment

Kai Forsyth

Hi Vlad – thanks for reaching out with these questions.

Our AI site analytics feature works by collecting, reporting, and displaying the data that is sent to our API. We can only visualize what we receive – if certain data isn't being sent to our API (whether due to integration configuration or filtering), it won't appear in your reports. We're confident in the accuracy of our calculations for the data we do receive, and we have extensive tests in place to ensure this.


The sudden changes you're seeing could stem from a few sources. If you're using our Cloudflare workers to forward origin requests, it's possible the worker configuration changed or is only firing on a subset of requests, which would result in lower numbers being reported.


Another thing to check is the "Logs" section to see exactly what's being sent to us. These requests include not just page visits, but also resources like .js files, images, and other assets – all of which we report if the data reaches our API. Changes in what types of requests are being forwarded (whether more or fewer resources are included) could directly cause the increases or decreases you're seeing in the reports.
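One way to check whether asset requests explain a jump or drop is to split the logged requests into pages and static resources and count them per week. The snippet below is only a rough sketch: the `timestamp` and `url` field names and the extension list are placeholders for illustration, not an actual export format, so adapt them to whatever your log data looks like.

```python
from collections import Counter
from datetime import datetime
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Extensions treated as static assets (an assumption; adjust for your site).
ASSET_EXTENSIONS = {
    ".js", ".css", ".png", ".jpg", ".jpeg", ".gif",
    ".svg", ".webp", ".ico", ".woff", ".woff2",
}


def request_kind(url: str) -> str:
    """Return "asset" for static resources and "page" for everything else."""
    # Only the URL path is inspected, so query strings do not affect the result.
    suffix = PurePosixPath(urlsplit(url).path).suffix.lower()
    return "asset" if suffix in ASSET_EXTENSIONS else "page"


def weekly_counts(entries):
    """Count requests per (ISO year, ISO week, kind)."""
    counts = Counter()
    for entry in entries:
        ts = datetime.fromisoformat(entry["timestamp"])
        year, week, _ = ts.isocalendar()
        counts[(year, week, request_kind(entry["url"]))] += 1
    return counts


# Hypothetical log entries, for illustration only.
sample = [
    {"timestamp": "2025-01-06T10:00:00", "url": "https://example.com/pricing"},
    {"timestamp": "2025-01-06T10:00:01", "url": "https://example.com/static/app.js?v=3"},
    {"timestamp": "2025-01-14T09:30:00", "url": "https://example.com/blog/post"},
]

for key, n in sorted(weekly_counts(sample).items()):
    print(key, n)
```

If the "page" counts stay flat from week to week while the "asset" counts swing, the change is most likely in which requests are being forwarded rather than in actual assistant traffic.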


Similarly, Cloudflare's analytics may be counting different request types than what's being sent to our API, which could explain why their numbers remain stable while ours fluctuate. We're also not entirely sure what Cloudflare classifies as "requests" in their analytics, so there may be differences in how they categorize and count traffic.


Regarding the comparison with GA4, it's worth noting that our integration works at the server level and relies on the referrer information that you send to us, while GA4 operates client-side and has access to additional browser-level signals that inform its analytics. Because of these different approaches, the two platforms are measuring slightly different things.


On Meta-ExternalAgent – we've made a change to recategorize this as a training bot, and we've also added Meta's newly defined bot, Meta-ExternalFetcher, to ensure proper categorization going forward.
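For anyone who wants to apply the same split to their own saved data, a user-agent check along these lines can separate the two Meta bots. This is a minimal sketch, not our internal bot list: the category labels are made up for illustration, and treating Meta-ExternalFetcher as a user-initiated fetcher is an assumption based on Meta's public crawler description.

```python
# Hypothetical token-to-category mapping, for illustration only.
BOT_CATEGORIES = {
    "meta-externalagent": "training",
    "meta-externalfetcher": "user_fetch",  # assumed category, see note above
}


def categorize(user_agent: str) -> str:
    """Return the category for a known bot token, or "other" if none matches."""
    ua = user_agent.lower()
    for token, category in BOT_CATEGORIES.items():
        if token in ua:
            return category
    return "other"


# Simplified example user-agent strings.
print(categorize("meta-externalagent/1.1"))
print(categorize("meta-externalfetcher/1.1"))
print(categorize("Mozilla/5.0 (X11; Linux x86_64)"))
```

Matching is case-insensitive and substring-based, so it keeps working when the user-agent string carries a version number or a trailing URL.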

You can be confident that the data you're sending to us is being accurately visualized, but whether that's the right data to be sending in the first place is worth investigating on your end, as this is unfortunately outside of our control.