Page Views

Tracking how people interact with your site

Page Views Data Requirements

Overview

Page views are the foundation of attribution analysis. They represent individual website visits and page interactions that capture the customer journey from first touch to conversion. Roadway supports three methods for providing page-view data: Segment, Google Analytics 4 (GA4), and custom data sources.

A page view represents a single page visit by a visitor, containing attribution parameters and referral information necessary for attribution modeling.

Conceptual Requirements

Roadway expects page-view data to conform to the following schema, regardless of your chosen data source method:

| Column Name | Data Type | Description |
| --- | --- | --- |
| visitor_id | varchar | Unique identifier for the visitor (anonymous or identified). This is typically the anonymous_id from Segment, user_pseudo_id from GA4, or a similar visitor identifier from your tracking system. Required. |
| page_viewed_at | timestamp | UTC timestamp of when the page view occurred. Required. |
| user_id | varchar | Unique identifier for authenticated users. This should be NULL for anonymous visitors and populated when a visitor becomes an identified user. Optional but recommended for complete attribution. |
| landing_page_url | varchar | The full URL of the page that was visited, including query parameters. Required. |
| referring_page_url | varchar | The full URL of the page that referred the visitor to this page. This can be NULL for direct traffic or when referrer information is not available. Optional. |
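Because landing_page_url keeps its query string, attribution parameters can be read straight from it. The sketch below assumes those parameters follow the common utm_* naming convention, which this page does not specify; the function name extract_utm_params and the example URL are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def extract_utm_params(landing_page_url: str) -> dict[str, str]:
    """Return the utm_* query parameters of a URL, keeping the first value of each."""
    query = urlsplit(landing_page_url).query
    params = parse_qs(query)
    # Assumption: attribution parameters use the utm_ prefix.
    return {key: values[0] for key, values in params.items() if key.startswith("utm_")}


example = extract_utm_params(
    "https://example.com/pricing?utm_source=google&utm_medium=cpc&ref=abc"
)
print(example)
```

A URL with its query string stripped would return an empty dictionary here, which is why the schema asks for the full URL.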


Data Source Options

Option 1: Segment

If you already load page-view data into your warehouse via Segment, Roadway can use the Segment schema to transparently process page views.

Source Table: Your complete Segment pages table. See Segment's pages table documentation for the complete schema.
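To see how a Segment pages row could line up with the schema above, here is a rough sketch. It assumes the standard Segment column names anonymous_id, user_id, timestamp, url, and referrer; Roadway's actual processing is not described on this page, and the function name segment_page_to_page_view is hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def segment_page_to_page_view(row: dict[str, Any]) -> dict[str, Optional[Any]]:
    """Map one Segment pages row (as a dict) onto the page-view columns."""
    return {
        "visitor_id": row["anonymous_id"],
        "page_viewed_at": row["timestamp"],  # assumed to already be in UTC
        "user_id": row.get("user_id"),  # stays None for anonymous visitors
        "landing_page_url": row["url"],
        # Treat an empty referrer string as missing (direct traffic).
        "referring_page_url": row.get("referrer") or None,
    }


sample_row = {
    "anonymous_id": "a1b2c3",
    "user_id": None,
    "timestamp": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    "url": "https://example.com/pricing?utm_source=newsletter",
    "referrer": "",
}
print(segment_page_to_page_view(sample_row))
```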

Option 2: Google Analytics 4 (GA4)

Alternatively, Roadway can transparently process GA4 tracking data. Use this method if you're already landing GA4 data in your warehouse.

Source: GA4 page_view events collected for an analytics property, exposed via GA4 Export tables.

Often, GA4 data is managed via a GA4 Export for BigQuery, but there are other methods for managing GA4 data.
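As an illustration of how a GA4 page_view event relates to the schema above, the sketch below converts one exported event into a page-view row. It assumes the BigQuery export layout: event_timestamp in microseconds since the Unix epoch, and page_location and page_referrer stored as string values inside event_params. The function names are hypothetical, and this is not a description of Roadway's own processing.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def get_string_param(event: dict[str, Any], key: str) -> Optional[str]:
    """Look up a string-valued entry in the event_params list of a GA4 event."""
    for param in event.get("event_params", []):
        if param.get("key") == key:
            return param.get("value", {}).get("string_value")
    return None


def ga4_event_to_page_view(event: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Map a GA4 page_view event onto the page-view columns; skip other events."""
    if event.get("event_name") != "page_view":
        return None
    return {
        "visitor_id": event["user_pseudo_id"],
        # Integer microseconds are added to the epoch to avoid float rounding.
        "page_viewed_at": EPOCH + timedelta(microseconds=event["event_timestamp"]),
        "user_id": event.get("user_id"),
        "landing_page_url": get_string_param(event, "page_location"),
        "referring_page_url": get_string_param(event, "page_referrer"),
    }
```

Events without a page_referrer parameter yield a NULL referring_page_url, which the schema allows for direct traffic.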

Option 3: Custom Data Source

In rare cases, customers may set up their own tracking systems, or they may have custom data modeling downstream of exports from tracking solutions like Segment or GA4.

Use this method for cases in which you have custom page-view tracking data or downstream custom models already landed in your warehouse.

Required Table Schema:

create table <your_schema>.page_views (
    visitor_id varchar not null,       -- your visitor identifier
    page_viewed_at timestamp not null, -- utc timestamp of page view
    user_id varchar,                   -- user identifier (nullable)
    landing_page_url varchar not null, -- full page url with parameters
    referring_page_url varchar         -- referring page url (nullable)
);

Data Validation

  • Non-null visitor_id: Every page view must have a visitor identifier

  • Non-null page_viewed_at: Every page view must have a timestamp

  • Non-null landing_page_url: Every page view must have a URL

  • Unique page views: The combination of visitor_id + page_viewed_at should be unique

  • Valid timestamps: page_viewed_at should be a valid UTC timestamp
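The checks above can be sketched as a small pre-load validator over rows held as dictionaries. This is only an illustration: the function name validate_page_views is hypothetical, and it assumes that a "valid UTC timestamp" means a timezone-aware datetime with a zero offset.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

REQUIRED_COLUMNS = ("visitor_id", "page_viewed_at", "landing_page_url")


def validate_page_views(rows):
    """Return a list of problem descriptions; an empty list means every check passed."""
    problems = []
    for i, row in enumerate(rows):
        # Non-null checks for the three required columns.
        for column in REQUIRED_COLUMNS:
            if row.get(column) is None:
                problems.append(f"row {i}: {column} is null")
        # Assumption: valid means a timezone-aware datetime at UTC offset zero.
        viewed_at = row.get("page_viewed_at")
        if viewed_at is not None:
            is_utc = isinstance(viewed_at, datetime) and viewed_at.utcoffset() == timedelta(0)
            if not is_utc:
                problems.append(f"row {i}: page_viewed_at is not a UTC datetime")
    # Uniqueness of visitor_id + page_viewed_at, skipping rows with null keys.
    keys = Counter(
        (row["visitor_id"], row["page_viewed_at"])
        for row in rows
        if row.get("visitor_id") is not None and row.get("page_viewed_at") is not None
    )
    for (visitor_id, viewed_at), count in keys.items():
        if count > 1:
            problems.append(f"duplicate page view: {visitor_id} at {viewed_at} ({count} rows)")
    return problems


good_row = {
    "visitor_id": "v1",
    "page_viewed_at": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    "user_id": None,
    "landing_page_url": "https://example.com/",
    "referring_page_url": None,
}
print(validate_page_views([good_row]))
```

In a warehouse the same checks would more likely run as queries against the table; the dictionary form here just keeps the example self-contained.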

Data Integrity Expectations

  • Page view volume: Expect 10 to 1,000 times as many page views as users, depending on your product

  • Anonymous vs. identified traffic: A significant portion of page views will have user_id as NULL (anonymous visitors)

  • URL quality: URLs should be well-formed
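A quick way to compare a dataset against these expectations is to compute a few summary figures. The sketch below is one possible approach, not a Roadway feature: the function names are hypothetical, "users" is taken to mean distinct non-null user_id values, and "well-formed" is assumed to mean an http or https URL with a host.

```python
from urllib.parse import urlsplit


def is_well_formed_url(url) -> bool:
    """Assumption: well-formed means an http(s) scheme plus a non-empty host."""
    try:
        parts = urlsplit(url)
    except (ValueError, TypeError, AttributeError):
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def summarize_page_views(rows):
    """Compute volume, anonymity, and URL-quality figures for a list of row dicts."""
    total = len(rows)
    users = {row["user_id"] for row in rows if row.get("user_id") is not None}
    anonymous = sum(1 for row in rows if row.get("user_id") is None)
    malformed = sum(1 for row in rows if not is_well_formed_url(row.get("landing_page_url")))
    return {
        # Compare against the expected range of page views per user.
        "page_views_per_user": total / len(users) if users else None,
        "anonymous_share": anonymous / total if total else None,
        "malformed_urls": malformed,
    }


sample_rows = [
    {"user_id": "u1", "landing_page_url": "https://example.com/"},
    {"user_id": None, "landing_page_url": "https://example.com/pricing"},
    {"user_id": None, "landing_page_url": "https://example.com/blog"},
    {"user_id": None, "landing_page_url": "not a url"},
]
print(summarize_page_views(sample_rows))
```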


Coming Soon

If your traffic- or events-analytics stack is based on the following technologies:

  • Amplitude

  • Mixpanel

  • Rudderstack

Please reach out; we'll be happy to quickly accommodate your needs.
