This is an epic parent task meant to capture needs and use cases for future prioritization.
Pageviews and other datasets are derived from logs of all webrequests served by WMF.
This has downsides:
- Webrequest logs are huge: filtering these consumes a lot of resources.
- The definition of a pageview is centralized and pattern-based, so it is difficult to evolve the definition as new needs arise.
Instead of searching for variously shaped needles in a huge haystack, we should provide standardized instrumentation tooling that allows products to produce an event when they consider a page to be 'viewed'.
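To make the idea concrete, here is a minimal sketch of what product-side instrumentation could look like. Everything in it is an assumption for illustration: the `PageViewEvent` fields, the `InMemorySink` class, the `record_page_view` helper, and the `hypothetical.page_view` stream name do not correspond to any existing WMF schema or API.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class PageViewEvent:
    """Hypothetical event shape; not an existing WMF schema."""
    page_title: str
    product: str
    dt: str  # ISO 8601 timestamp of the view


class InMemorySink:
    """Stand-in for a real event intake; it only collects events in a list."""

    def __init__(self) -> None:
        self.events = []

    def submit(self, stream: str, event: PageViewEvent) -> None:
        self.events.append((stream, asdict(event)))


def record_page_view(sink: InMemorySink, page_title: str, product: str,
                     now: Optional[datetime] = None) -> PageViewEvent:
    """Called by the product at the moment it considers the page 'viewed'."""
    when = now or datetime.now(timezone.utc)
    event = PageViewEvent(page_title=page_title, product=product,
                          dt=when.isoformat())
    # The stream name is a placeholder, not a real stream.
    sink.submit("hypothetical.page_view", event)
    return event


# The product decides what counts as a view; no log filtering is involved.
sink = InMemorySink()
record_page_view(sink, "Special:AllEvents", "campaigns")
```

In a design along these lines, the decision about what counts as a view would live with each product, and downstream consumers would read a comparatively small event stream rather than filtering all webrequest logs.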
See also
Please edit this list to keep track of use cases and relevant tasks as they arise:
- T368303: REQUEST: Add Special:AllEvents to allowlist for campaigns-product pageview tracking
- T240676: Develop a consistent rule for which special pages count as pageviews
- T304362: Pageview definition relies on X-Analytics to determine special pages
- T113817: Add request_id to webrequest logs as well as other event records ingested into Hadoop
- T310732: "Source of truth" dataset for pageviews
- T336361: [Analytics] Identify access from mobile vs. desktop devices
- T346463: Identify and label prefetch proxy data in our traffic
- T366004: Add page-title to the x_analytics header
- T325544: Update refinery-source PageviewDefinition to better handle `Special:` pages
- T184793: [EPIC] Instrument page interactions
- T186728: Record and aggregate page previews