
Advanced Bot Mitigation: Detecting Headless Chrome and Automated Browsers in 2025

December 13, 2025 · 6 min read

In 2025, simple property checks like navigator.webdriver are obsolete. Tools like Puppeteer Extra and Playwright's stealth plugins successfully scrub standard automation markers. To detect modern headless instances—including Chrome's --headless=new mode—security engineers must rely on inconsistent browser environments, network-layer fingerprinting, and IP reputation.

1. Runtime Consistency Checks

Automated browsers attempt to mimic human environments but often fail to replicate the complex interplay between hardware and software APIs. The goal is not to find a single flag, but to identify inconsistencies.

The Permissions API Gap

Headless environments often expose default permission states that differ from a standard headed browser. The Notifications API is a classic example: the static Notification.permission value and the result of a Permissions API query can disagree in headless configurations, a state no real user profile produces.

// JavaScript Detection Logic
async function detectPermissionInconsistency() {
  if (navigator.webdriver) return true; // Low-hanging fruit

  try {
    const permissionStatus = await navigator.permissions.query({ name: 'notifications' });

    // In a headed browser these two values agree. Classic headless Chrome
    // reports Notification.permission as 'denied' while the Permissions API
    // still reports 'prompt', a contradiction a normal user profile never shows.
    if (Notification.permission === 'denied' && permissionStatus.state === 'prompt') {
      return true; // Inconsistent state typical of automated configurations
    }
  } catch (e) {
    return false;
  }
  return false;
}

Hardware Concurrency and Memory

Bots hosted on AWS Lambda or budget VPS instances often expose low resource limits that do not match the "high-end" User-Agent string they claim to be.

const isBotConfig = () => {
  const cores = navigator.hardwareConcurrency || 0;
  // navigator.deviceMemory is Chromium-only; treat it as absent elsewhere
  // rather than flagging every Firefox or Safari user as a bot.
  const memory = 'deviceMemory' in navigator ? navigator.deviceMemory : null;

  // A "MacBook Pro" User-Agent with 1 CPU core and 2GB RAM is suspicious.
  return cores < 2 || (memory !== null && memory < 2);
};

2. Graphics Rendering Fingerprinting

Headless Chrome often relies on software rendering (SwiftShader) or drivers typical of Linux server environments (Mesa's llvmpipe), rather than consumer-grade GPUs.

By rendering a hidden WebGL canvas, you can extract the renderer string. If the renderer contains "SwiftShader" or "llvmpipe", the traffic is likely originating from a server environment rather than a consumer device.

function getWebGLRenderer() {
  const canvas = document.createElement('canvas');
  const gl = canvas.getContext('webgl') || canvas.getContext('experimental-webgl');
  if (!gl) return 'Unknown';
  
  const debugInfo = gl.getExtension('WEBGL_debug_renderer_info');
  if (!debugInfo) return 'Unknown';

  return gl.getParameter(debugInfo.UNMASKED_RENDERER_WEBGL);
}

// Usage
const renderer = getWebGLRenderer();
if (renderer.includes("SwiftShader") || renderer.includes("llvmpipe")) {
  console.log("Headless environment detected via WebGL.");
}

3. Network Layer: TLS Fingerprinting (JA4)

Client-side JS checks can be bypassed if the attacker renders nothing at all or runs a heavily customized browser build. The initial TLS handshake, however, reveals the underlying TLS library before any JavaScript executes.

Standard Chrome browsers negotiate TLS differently than Python requests, Node.js axios, or even raw Puppeteer instances. By analyzing the JA4 fingerprint (transport protocol (TCP or QUIC), TLS version, cipher suites, extensions, and ALPN), you can identify tools masquerading as browsers.

  • Real Chrome: Specific cipher order, specific GREASE values.
  • Automation: Often missing specific extensions or utilizing a different order typical of the underlying OpenSSL library on Linux.
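The grouping above can be sketched as a toy fingerprint builder. This is an illustrative, JA4-inspired sketch, not the official JA4 algorithm; real JA4 also encodes SNI presence, applies specific truncation and sorting rules, and hashes fields per the published spec. The function name and field encoding here are assumptions for demonstration.

```python
import hashlib

def ja4_like_fingerprint(transport, tls_version, ciphers, extensions, alpn):
    """Assemble a simplified, JA4-inspired fingerprint string."""
    # Header part: transport ('t' for TCP, 'q' for QUIC), TLS version,
    # counts of ciphers and extensions, and the first ALPN value.
    header = (
        f"{transport}{tls_version}"
        f"{len(ciphers):02d}{len(extensions):02d}{alpn}"
    )
    # Body parts: truncated SHA-256 digests over the *sorted* cipher and
    # extension lists, so a client shuffling its list order still
    # produces the same fingerprint.
    cipher_hash = hashlib.sha256(
        ",".join(sorted(ciphers)).encode()
    ).hexdigest()[:12]
    ext_hash = hashlib.sha256(
        ",".join(sorted(extensions)).encode()
    ).hexdigest()[:12]
    return f"{header}_{cipher_hash}_{ext_hash}"
```

In practice you would compare the computed string against an allowlist of fingerprints observed from genuine Chrome releases, and flag anything that claims a Chrome User-Agent but presents a different handshake shape.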

4. The Infrastructure Layer: IP Intelligence

The most expensive resource for a bot operator is a clean residential IP address. While browser fingerprints can be spoofed, the origin network is harder to mask without significant cost.

Bots rarely run on residential ISPs (AT&T, Comcast, Verizon) directly. They run on datacenter ranges (AWS, GCP, DigitalOcean) or VPN/Proxy endpoints.

Integration Strategy:

  1. Extract client IP.
  2. Query IPASIS API.
  3. Drop traffic if is_crawler is true or if connection_type is hosting.

import requests

def validate_ip(ip_address):
    # Example IPASIS lookup (substitute YOUR_API_KEY with a real key)
    response = requests.get(
        f"https://api.ipasis.com/v1/{ip_address}?key=YOUR_API_KEY",
        timeout=5,  # never let a reputation lookup block the request path
    )
    response.raise_for_status()
    data = response.json()

    # Security Policy
    if data.get('is_crawler') or data.get('is_proxy'):
        return False

    # Reject Datacenter/Hosting traffic for standard consumer endpoints
    if data.get('connection_type') == 'hosting':
        return False

    return True

FAQ

Q: Can't bots just use residential proxies to bypass IP detection? A: They can, but it increases the cost of the attack significantly. Combined with TLS fingerprinting and JS consistency checks, the cost of operation usually outweighs the value of the scrapable data.

Q: Does --headless=new in Chrome bypass legacy detection? A: Largely, yes. The new headless mode shares the full browser codebase, eliminating many of the rendering and API gaps that legacy detection relied on. This makes behavioral and network/IP analysis essential.

Q: Should I block all headless browsers? A: Not necessarily. If you rely on SEO, you must allow specific verified bots (Googlebot, Bingbot). Validate these via reverse DNS and IP ownership rather than User-Agent strings.
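The verification step in the last answer can be sketched with the standard library: forward-confirmed reverse DNS. Resolve the IP to a hostname, check the domain, then resolve the hostname back to confirm it maps to the same IP. The suffix list below is an assumption covering Google and Bing; consult each crawler operator's documentation for the authoritative domains.

```python
import socket

def verify_search_bot(ip_address,
                      allowed_suffixes=(".googlebot.com", ".google.com",
                                        ".search.msn.com")):
    """Forward-confirmed reverse DNS check for a claimed crawler IP."""
    try:
        # Step 1: reverse lookup. The PTR record must sit inside the
        # crawler operator's domain.
        hostname, _, _ = socket.gethostbyaddr(ip_address)
        if not hostname.endswith(allowed_suffixes):
            return False
        # Step 2: forward lookup. The hostname must resolve back to the
        # original IP, otherwise the PTR record could simply be spoofed.
        _, _, addresses = socket.gethostbyname_ex(hostname)
        return ip_address in addresses
    except OSError:
        # Lookup failed: treat the crawler as unverified.
        return False
```

Running this check once per new IP and caching the verdict keeps the DNS overhead negligible while letting legitimate, verified crawlers through.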


Secure Your Perimeter with IPASIS

Browser fingerprinting is a cat-and-mouse game. IP Intelligence is your stable line of defense. Stop bots at the network edge before they even execute JavaScript.

Get your free API key at IPASIS.com and start filtering hosting, proxy, and bot traffic today.
