The number seemed high because I anecdotally know that WordPress's share of the CMS market is around 75%, but we're only showing 3.2M WordPress origins in the CMS report.
So if there are 8.8M sites that use a CMS, that puts WordPress's market share at 36%, which is way too low.
The issue seems to be that the WordPress count is taken after joining with the CrUX dataset, and many sites have fallen out of CrUX.
Modifying your query to count WordPress sites in November:
The result is 5785472, which gets us much closer to the expected market share: 65%.
So there are about 2.5M WordPress sites that we're counting in the category total but not in the technology total.
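The arithmetic behind these figures can be checked with a short sketch. The numbers are the ones quoted in this thread; the CMS total and the joined WordPress count are rounded, so the percentages are approximate, and the variable names are only illustrative.

```python
# Figures quoted in the thread; the first two are rounded approximations.
cms_total = 8_800_000           # sites that use any CMS
wp_after_crux_join = 3_200_000  # WordPress origins shown in the CMS report
wp_crawl_only = 5_785_472       # WordPress sites counted without the CrUX join

share_joined = wp_after_crux_join / cms_total
share_crawl = wp_crawl_only / cms_total
missing = wp_crawl_only - wp_after_crux_join

print(f"share after CrUX join: {share_joined:.1%}")
print(f"share from crawl only: {share_crawl:.1%}")
print(f"origins lost in join:  {missing:,}")
```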
Open to suggestions on how to fix this. One idea is to remove the CrUX join (or do some sort of outer join) when calculating origin counts.
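To illustrate why the join type matters for origin counts, here is a minimal Python sketch on toy data. The origins, the `crawl` and `crux_origins` structures, and both helper functions are hypothetical and stand in for the real BigQuery tables and query, which are not shown in this thread.

```python
# Hypothetical toy data: origins detected in the crawl, keyed by technology.
crawl = {
    "WordPress": {"https://a.example", "https://b.example", "https://c.example"},
    "Drupal": {"https://d.example"},
}
# Hypothetical set of origins present in the CrUX month being joined.
crux_origins = {"https://a.example", "https://d.example"}


def count_inner(crawl, crux_origins):
    """Origins per technology that survive an inner join with CrUX."""
    return {tech: len(origins & crux_origins) for tech, origins in crawl.items()}


def count_left(crawl, crux_origins):
    """Origins per technology with the crawl as the left side of the join.

    Every crawled origin is kept; the number that also has a CrUX match
    is reported separately.
    """
    return {
        tech: {"origins": len(origins), "with_crux": len(origins & crux_origins)}
        for tech, origins in crawl.items()
    }


print(count_inner(crawl, crux_origins))
print(count_left(crawl, crux_origins))
```

With the inner join, origins that have fallen out of CrUX disappear from the technology count. With the crawl as the left side they stay in the count, and the CrUX-matched subset remains available for the performance metrics.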
Yeah, we subtly changed the name from "CWV Tech Report" to "HTTP Archive Tech Report" so that we could lean more heavily on the adoption side, so joining forces makes a lot of sense.
I see 2 issues here:
1. We use different URL sets in the report: the November crawl is based on the October CrUX origins, but we JOIN it with the November CrUX data. The result is either complete or timely, not both (0.6M discrepancy).
2. We are not using the `tablet` and `NULL` clients from CrUX, which leaves more origins unmatched (1.9M). No `geo` and `rank` are available for aggregation on those origins.
A promising analysis logic
Calculate adoption with crawl data, as it's the original source.
This will let us calculate adoption over the most complete set of origins, including those that appear only under CrUX's `tablet` and `NULL` clients.
This covers only the global numbers, though: the `geo` dimension is part of CrUX and thus unavailable in the crawl data. We could still use an INNER JOIN there.
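The split described above (global adoption from the crawl alone, per-geo adoption through an inner join with CrUX) might look roughly like the following. This is only a sketch under assumptions: the row shapes, origins, and function names are hypothetical and do not reflect the actual table schemas.

```python
from collections import defaultdict

# Hypothetical crawl rows: (origin, technology).
crawl_rows = [
    ("https://a.example", "WordPress"),
    ("https://b.example", "WordPress"),
    ("https://c.example", "Drupal"),
]
# Hypothetical CrUX rows: (origin, geo). b.example has no CrUX record.
crux_rows = [
    ("https://a.example", "US"),
    ("https://a.example", "DE"),
    ("https://c.example", "US"),
]


def global_adoption(crawl_rows):
    """Distinct origins per technology, using the crawl as the only source."""
    origins = defaultdict(set)
    for origin, tech in crawl_rows:
        origins[tech].add(origin)
    return {tech: len(found) for tech, found in origins.items()}


def geo_adoption(crawl_rows, crux_rows):
    """Distinct origins per (technology, geo); an inner join with CrUX."""
    geos_by_origin = defaultdict(set)
    for origin, geo in crux_rows:
        geos_by_origin[origin].add(geo)
    origins = defaultdict(set)
    for origin, tech in crawl_rows:
        # Origins without a CrUX record contribute no geo rows.
        for geo in geos_by_origin.get(origin, ()):
            origins[(tech, geo)].add(origin)
    return {key: len(found) for key, found in origins.items()}


print(global_adoption(crawl_rows))
print(geo_adoption(crawl_rows, crux_rows))
```

In this toy data, `https://b.example` counts toward the global WordPress total but toward none of the geo totals, which is the trade-off described above.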
Quoting @rviscomi :