Strategic Communications and Marketing Brand Guidelines

Setting Up Google Analytics to Identify AI-Generated Traffic 

Importance

Search behavior is changing. As artificial intelligence-powered platforms like ChatGPT, Perplexity and Google's AI Overviews answer more questions directly in search results, fewer users are clicking through to websites. Identifying traffic from these LLMs helps us understand how our content is surfaced — and how we can optimize it to continue reaching our audiences.

Target Audience 

  • Content Creators/Writers
  • Marketing
  • Web Dev/IT

Definitions and Brand Guidance

LLM (Large Language Model)
A type of artificial intelligence trained to understand and generate text; common sources include ChatGPT, Perplexity and Claude.
Session source/medium
A GA4 dimension that shows where traffic originated.
Regex (Regular Expression)
A search pattern used to match multiple AI domains.
Explorations (GA4)
A flexible reporting workspace for custom visualizations and advanced data analysis.

Follow brand.illinois.edu for naming conventions and design compliance in any dashboards or reporting visuals created.

Instructions

Set Up a Custom Exploration Report

  1. Go to Explore in GA4 and click + Blank.
  2. Name your report: LLM Traffic Analysis.
  3. Add these dimensions:
    • session source / medium
    • page path + query string
    • page referrer
  4. Add these metrics:
    • sessions
    • engaged sessions
    • engagement rate
    • conversions (or your key events)
  5. Apply a Regex Filter:
    • Filter by session source / medium using the matches regex option.
    • Paste in this recommended base pattern, which detects traffic from AI platforms:
.*(aitastic\.app|bnngpt\.com|chat-gpt\.org|chatgpt\.com|claude\.ai|copilot\.microsoft\.com|copy\.ai|edgepilot|edgeservices|gemini\.google\.com|iask\.ai|neeva|nimble\.ai|openai\.com|perplexity|writesonic\.com).*
    • This pattern covers the top AI referrers. It is written in lowercase, which is how source domains normally appear in GA4. Review it at least quarterly and add new tools as they appear.
  6. Break Down the Results:
    • Use rows to group the data meaningfully:
      • First row: session source / medium
      • Optional: add page referrer or page path + query string to see which content is being surfaced
  7. Save the Report:
    • Click "Save and Add to Library" so others on your team can use it.
    • Add it to a custom dashboard for recurring review.
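
Before pasting the pattern into GA4, it can help to check it against a few source / medium values offline. The following is a minimal sketch in Python, assuming the pattern from the steps above is copied unchanged; the sample values and the helper name is_llm_source are hypothetical. GA4 uses its own regex engine, so treat this as a sanity check of the pattern itself, not a guarantee of identical behavior in GA4.

```python
import re

# The base pattern from the steps above, split across lines for readability.
LLM_PATTERN = re.compile(
    r".*(aitastic\.app|bnngpt\.com|chat-gpt\.org|chatgpt\.com|claude\.ai"
    r"|copilot\.microsoft\.com|copy\.ai|edgepilot|edgeservices"
    r"|gemini\.google\.com|iask\.ai|neeva|nimble\.ai|openai\.com"
    r"|perplexity|writesonic\.com).*"
)


def is_llm_source(source_medium: str) -> bool:
    """Return True when the whole source / medium value matches the pattern."""
    return LLM_PATTERN.fullmatch(source_medium) is not None


# Hypothetical sample values, written the way session source / medium is displayed.
samples = [
    "chatgpt.com / referral",
    "perplexity.ai / referral",
    "copilot.microsoft.com / referral",
    "google / organic",
    "(direct) / (none)",
]

for value in samples:
    label = "AI" if is_llm_source(value) else "other"
    print(f"{label:<6}{value}")
```

The leading and trailing .* let the pattern match the full source / medium string, not just the domain on its own.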

Tips and Tricks

  • Add this report to your GA4 Library or Dashboard for easy access. 
  • Schedule quarterly reviews to keep your regex up to date. 
  • Compare AI-sourced content performance with organic search performance to optimize strategic pages. 
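
The comparison in the last tip can be sketched outside GA4 once the exploration data is exported. The example below is a rough illustration, assuming rows of source / medium, sessions and engaged sessions; the Row class, the helper names and all of the numbers are hypothetical. It relies on GA4's definition of engagement rate as engaged sessions divided by sessions.

```python
import re
from dataclasses import dataclass

# Same base pattern as in the instructions, repeated so this sketch is self-contained.
LLM_PATTERN = re.compile(
    r".*(aitastic\.app|bnngpt\.com|chat-gpt\.org|chatgpt\.com|claude\.ai"
    r"|copilot\.microsoft\.com|copy\.ai|edgepilot|edgeservices"
    r"|gemini\.google\.com|iask\.ai|neeva|nimble\.ai|openai\.com"
    r"|perplexity|writesonic\.com).*"
)


@dataclass
class Row:
    source_medium: str
    sessions: int
    engaged_sessions: int


def engagement_rate(rows: list[Row]) -> float:
    """Engaged sessions divided by sessions, or 0.0 when there are no sessions."""
    sessions = sum(r.sessions for r in rows)
    engaged = sum(r.engaged_sessions for r in rows)
    return engaged / sessions if sessions else 0.0


def split_ai_and_organic(rows: list[Row]) -> tuple[list[Row], list[Row]]:
    """Separate rows matching the LLM pattern from rows with an organic medium."""
    ai = [r for r in rows if LLM_PATTERN.fullmatch(r.source_medium)]
    organic = [r for r in rows if r.source_medium.endswith("/ organic")]
    return ai, organic


# Hypothetical export; real figures come from your own exploration.
rows = [
    Row("chatgpt.com / referral", 120, 90),
    Row("perplexity.ai / referral", 80, 50),
    Row("google / organic", 1000, 600),
    Row("bing / organic", 200, 100),
]

ai_rows, organic_rows = split_ai_and_organic(rows)
print(f"AI engagement rate:      {engagement_rate(ai_rows):.1%}")
print(f"Organic engagement rate: {engagement_rate(organic_rows):.1%}")
```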

FAQ

Q: Why does this traffic show up as "direct" sometimes? 
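A: If the AI platform doesn't pass a referrer header, GA4 has no source to attribute and categorizes the visit as "direct". The regex filter only identifies visits that arrive with a known LLM domain as the source, so some AI-driven traffic will go uncounted.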

Q: Can I see what content the AI is referencing? 
A: Yes. Use page path + query string to identify which pages are being surfaced most by LLMs.
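
As a rough illustration, the ranking can also be done on exported data. This sketch assumes rows that have already been filtered to LLM sources in the exploration; the page paths, session counts and the helper name top_pages are all hypothetical.

```python
from collections import Counter

# Hypothetical export: (page path + query string, sessions) for LLM-sourced traffic.
rows = [
    ("/admissions/apply", 42),
    ("/research/ai-lab?ref=chat", 17),
    ("/admissions/apply", 8),
    ("/about", 5),
]


def top_pages(rows, limit=3):
    """Sum sessions per page and return the most-visited pages first."""
    totals = Counter()
    for path, sessions in rows:
        totals[path] += sessions
    return totals.most_common(limit)


for path, sessions in top_pages(rows):
    print(f"{sessions:>5}  {path}")
```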

Contact

Strategic Communications and Marketing Brand Guidelines

507 E. Green Street
MC-426
Champaign, IL 61820

Email: branding@illinois.edu

Phone: 217-333-5010
