
Input Complexity

How to measure keyboard autocorrect rate and word complexity score for age assurance.

Overview

Input complexity carries the second-highest weight (28%) in the Signal Fusion engine. It measures two aspects of how a user types: how often they rely on autocorrect and the sophistication of their vocabulary. Both are strong age indicators that are computed entirely on-device — no raw text is ever transmitted to A3 or your backend.

Fields Reference

Parameter	Type	Required	Description
`keyboard_autocorrect_rate`	number (0–1)	Required	Ratio of autocorrected input events to total input events on text fields. Required when input_complexity is present.
`average_word_complexity_score`	number (0–1)	Required	Average word complexity score computed from word-length distributions. Required when input_complexity is present.

Keyboard Autocorrect Rate

What it measures: The proportion of text input events that were autocorrected by the OS keyboard. Children rely on autocorrect at significantly higher rates (> 0.40) compared to adults (< 0.10).

Signal strength: A rate above 0.40 is a strong child indicator. Below 0.10 suggests adult-level typing proficiency.
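To sanity-check collected values during integration, the thresholds above can be expressed as a small helper. This is a minimal sketch: the function name and the three labels are assumptions made for illustration, and this page does not describe how the Signal Fusion engine actually weighs the rate.

```typescript
// Labels are illustrative only; they are not part of the A3 API.
type AutocorrectBand = 'child-like' | 'ambiguous' | 'adult-like';

// Thresholds quoted in the text above.
const CHILD_THRESHOLD = 0.4;
const ADULT_THRESHOLD = 0.1;

// Hypothetical helper for local debugging of collected rates.
function interpretAutocorrectRate(rate: number): AutocorrectBand {
  if (isNaN(rate) || rate < 0 || rate > 1) {
    throw new RangeError('keyboard_autocorrect_rate must be between 0 and 1');
  }
  if (rate > CHILD_THRESHOLD) return 'child-like';
  if (rate < ADULT_THRESHOLD) return 'adult-like';
  return 'ambiguous'; // between the two thresholds
}
```

Values between 0.10 and 0.40 fall in neither band, so the sketch reports them as ambiguous rather than guessing.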

How to Collect

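Monitor input events on text fields and detect autocorrect replacements via the inputType property. The insertReplacementText input type indicates an autocorrect or auto-suggestion acceptance. The hook below also counts insertCompositionText, which browsers report while text is being composed (for example through an IME), so treat that branch as a broad approximation.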

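import { useEffect, useRef, useCallback } from 'react';

// <input> types that accept free text; '' covers inputs with no type set.
const TEXT_TYPES = ['text', 'search', 'email', 'url', 'tel', ''];

export function useAutocorrectRate() {
  const total = useRef(0); // all input events seen on text fields
  const corrected = useRef(0); // events counted as autocorrect activity

  useEffect(() => {
    function handleInput(e: Event) {
      const input = e as InputEvent;
      const target = e.target as HTMLInputElement | HTMLTextAreaElement;
      if (!['INPUT', 'TEXTAREA'].includes(target.tagName)) return;
      // Filter by type on <input> only: a <textarea> reports type 'textarea'
      // and would otherwise never be counted.
      if (target.tagName === 'INPUT' && !TEXT_TYPES.includes(target.type)) return;

      total.current++;
      if (
        input.inputType === 'insertReplacementText' ||
        // Also reported during IME composition, which raises the rate.
        input.inputType === 'insertCompositionText'
      ) {
        corrected.current++;
      }
    }

    // Capture phase on <body> so every text field on the page is observed.
    document.body.addEventListener('input', handleInput, true);
    return () => document.body.removeEventListener('input', handleInput, true);
  }, []);

  const getScore = useCallback(() => {
    if (total.current === 0) return 0; // no input observed yet
    return corrected.current / total.current;
  }, []);

  return getScore;
}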

// Usage: const getAutocorrectRate = useAutocorrectRate();
// const keyboard_autocorrect_rate = getAutocorrectRate();

The insertReplacementText input type is supported in Safari, Chrome, and Firefox. On browsers that never report this input type, the autocorrect rate reads as 0, which scores as adult-like. This is acceptable since desktop browsers (where autocorrect events are rare) are used predominantly by adults.
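The counting logic in the hook can also be exercised without a browser. The sketch below assumes you have a recorded list of inputType strings (for example from a test fixture); the helper name autocorrectRateFrom is hypothetical and only mirrors the arithmetic of getScore().

```typescript
// Input types the hook above treats as autocorrect activity.
const CORRECTION_TYPES = ['insertReplacementText', 'insertCompositionText'];

// Hypothetical pure helper: same ratio as getScore(), without DOM or React.
function autocorrectRateFrom(inputTypes: string[]): number {
  if (inputTypes.length === 0) return 0; // no events: mirrors the hook's default
  const corrected = inputTypes.filter(
    (t) => CORRECTION_TYPES.indexOf(t) !== -1
  ).length;
  return corrected / inputTypes.length;
}

// Ten events, one of them a replacement: 1 / 10 = 0.1
const exampleRate = autocorrectRateFrom([
  'insertText', 'insertText', 'insertText', 'insertText', 'insertText',
  'insertReplacementText',
  'insertText', 'deleteContentBackward', 'insertText', 'insertText',
]);
```

Note that deletions count toward the total here, just as they do in the hook, because every input event on a text field increments the denominator.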

Edge Cases

Physical keyboards produce almost no autocorrect events regardless of age. This naturally scores as adult-like, which is reasonable since physical keyboard usage skews older.
Auto-suggest selection (tapping a word from the suggestion bar) may or may not fire insertReplacementText depending on the browser. This is fine — the signal is statistical, not binary.
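Non-Latin scripts may have different autocorrect patterns. The rate is always a 0–1 ratio, so values stay on the same scale across languages, but text typed through an IME is reported as insertCompositionText and is counted as corrected by the hook above, which can push the rate up regardless of age.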

Word Complexity Score

What it measures: The sophistication of vocabulary used in text inputs, computed as a normalized word-length distribution score. Children typically use shorter, simpler words (score < 0.25), while adults use longer, more complex vocabulary (score > 0.65).

Signal strength: Below 0.25 is a strong child indicator. Above 0.65 suggests adult-level vocabulary.

How to Collect

Observe completed words in text inputs and compute a complexity score based on word length distribution. The algorithm is intentionally simple — it measures average word length normalized against an English baseline. No raw text leaves the device.

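import { useEffect, useRef, useCallback } from 'react';

// English baseline: average word lengths that map to scores 0 and 1.
const MIN_AVG = 3.0;
const MAX_AVG = 7.0;

export function useWordComplexity() {
  const wordLengths = useRef<number[]>([]); // lengths only, never the words

  useEffect(() => {
    function handleInput(e: Event) {
      const target = e.target as HTMLInputElement | HTMLTextAreaElement;
      if (!['INPUT', 'TEXTAREA'].includes(target.tagName)) return;

      // Ignore deletions so backspacing over a space does not re-count a word.
      const inputType = (e as InputEvent).inputType || '';
      if (inputType.startsWith('delete')) return;

      // Record a word once, when a single whitespace character follows it.
      // Reading the second-to-last word on every event would count the same
      // word again for each later keystroke. Edits made mid-text can still
      // re-count the final word; the signal is statistical.
      const match = /(\S+)\s$/.exec(target.value);
      const cleaned = match?.[1]?.replace(/[^a-zA-Z]/g, '');
      if (cleaned && cleaned.length >= 2) {
        wordLengths.current.push(cleaned.length);
      }
    }

    // Capture phase on <body> so every text field on the page is observed.
    document.body.addEventListener('input', handleInput, true);
    return () => document.body.removeEventListener('input', handleInput, true);
  }, []);

  const getScore = useCallback(() => {
    const wl = wordLengths.current;
    if (wl.length < 5) return 0.5; // neutral if too few words

    // Linear map of the average length from [MIN_AVG, MAX_AVG] to [0, 1].
    const avg = wl.reduce((a, b) => a + b, 0) / wl.length;
    return Math.max(0, Math.min(1, (avg - MIN_AVG) / (MAX_AVG - MIN_AVG)));
  }, []);

  return getScore;
}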

// Usage: const getComplexity = useWordComplexity();
// const average_word_complexity_score = getComplexity();

Privacy first: The word complexity score is a single number (0–1) derived from word lengths. The raw text content, individual words, and character sequences are never transmitted. Compute the score on-device and send only the aggregate.
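To make the normalization concrete, here is the same formula as a pure function with a few worked inputs. It is a sketch for illustration: scoreFromWordLengths is a hypothetical name, and the arrays are made-up word lengths rather than measured data.

```typescript
const MIN_AVG_LEN = 3.0; // same English baseline as MIN_AVG in the hook
const MAX_AVG_LEN = 7.0; // same English baseline as MAX_AVG in the hook

// Hypothetical pure helper mirroring getScore() from the hook.
function scoreFromWordLengths(lengths: number[]): number {
  if (lengths.length < 5) return 0.5; // too few words: neutral
  const avg = lengths.reduce((a, b) => a + b, 0) / lengths.length;
  const normalized = (avg - MIN_AVG_LEN) / (MAX_AVG_LEN - MIN_AVG_LEN);
  return Math.max(0, Math.min(1, normalized)); // clamp to [0, 1]
}

// avg = 20 / 5 = 4  ->  (4 - 3) / 4 = 0.25
const shortWords = scoreFromWordLengths([3, 4, 4, 5, 4]);

// avg = 30 / 5 = 6  ->  (6 - 3) / 4 = 0.75
const longWords = scoreFromWordLengths([8, 5, 7, 6, 4]);

// avg = 12 / 5 = 2.4  ->  negative before clamping, so 0
const veryShortWords = scoreFromWordLengths([2, 3, 2, 3, 2]);

// Only two words recorded  ->  neutral 0.5
const tooFewWords = scoreFromWordLengths([4, 5]);
```

Only the resulting number would be sent; the length array itself stays on the device.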

Edge Cases

Emoji-heavy input: emojis are stripped by the replace(/[^a-zA-Z]/g, '') filter. If the user types mostly emojis, few words will be recorded and the score falls back to neutral (0.5).
Short-form content (usernames, search queries): these produce very short words regardless of age. If your app's primary input is short-form, this signal may be less reliable. Consider giving more weight to behavioral metrics in your integration.
Non-English text: the baseline (3.0–7.0 average word length) is calibrated for English. Other languages have different distributions (e.g., German words average longer), so absolute values may shift even though the score still differentiates users within a language. The replace(/[^a-zA-Z]/g, '') filter keeps only unaccented Latin letters: accented characters are dropped from the count, and text in non-Latin scripts such as Chinese records no words at all, so the score falls back to neutral (0.5).
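If your audience writes in languages other than English, one possible adaptation is to count letters in any script and make the baseline configurable. The sketch below is an assumption-laden illustration, not part of the A3 integration: the Baseline type, letterCount, and scoreWithBaseline are hypothetical names, and only the English baseline comes from this page. Baselines for other languages would need your own calibration, and scripts written without spaces would still need a different word-splitting step.

```typescript
interface Baseline {
  minAvg: number; // average word length that maps to score 0
  maxAvg: number; // average word length that maps to score 1
}

// English baseline quoted on this page.
const ENGLISH_BASELINE: Baseline = { minAvg: 3.0, maxAvg: 7.0 };

// \p{L} matches a letter in any script; it requires the "u" flag.
const NON_LETTERS = new RegExp('[^\\p{L}]', 'gu');

// Number of UTF-16 code units left after removing everything but letters.
function letterCount(word: string): number {
  return word.replace(NON_LETTERS, '').length;
}

function scoreWithBaseline(
  lengths: number[],
  baseline: Baseline = ENGLISH_BASELINE
): number {
  if (baseline.maxAvg <= baseline.minAvg) {
    throw new RangeError('maxAvg must be greater than minAvg');
  }
  if (lengths.length < 5) return 0.5; // neutral if too few words
  const avg = lengths.reduce((a, b) => a + b, 0) / lengths.length;
  const normalized = (avg - baseline.minAvg) / (baseline.maxAvg - baseline.minAvg);
  return Math.max(0, Math.min(1, normalized));
}
```

With this variant an accented word keeps its full letter count, while punctuation and emoji are still discarded.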

Putting It All Together

# Complete input_complexity example payload:
curl -X POST https://api.a3api.io/v1/assurance/assess-age \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "os_signal": "18-plus",
  "user_country_code": "US",
  "input_complexity": {
    "keyboard_autocorrect_rate": 0.08,
    "average_word_complexity_score": 0.62
  }
}'
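The same request body can be assembled in TypeScript before it is sent. This is a minimal sketch: the field names and example values come from the curl payload above, while buildInputComplexity and assertUnitInterval are hypothetical helpers that enforce the documented 0–1 range on the client side.

```typescript
interface InputComplexity {
  keyboard_autocorrect_rate: number;
  average_word_complexity_score: number;
}

// Rejects anything outside the documented 0–1 range.
function assertUnitInterval(field: string, value: number): void {
  if (typeof value !== 'number' || isNaN(value) || value < 0 || value > 1) {
    throw new RangeError(field + ' must be a number between 0 and 1');
  }
}

// Both fields are required whenever input_complexity is present.
function buildInputComplexity(rate: number, score: number): InputComplexity {
  assertUnitInterval('keyboard_autocorrect_rate', rate);
  assertUnitInterval('average_word_complexity_score', score);
  return {
    keyboard_autocorrect_rate: rate,
    average_word_complexity_score: score,
  };
}

// Serialized body matching the curl example above.
const requestBody = JSON.stringify({
  os_signal: '18-plus',
  user_country_code: 'US',
  input_complexity: buildInputComplexity(0.08, 0.62),
});
```

In an app, the two arguments would typically be the values returned by getAutocorrectRate() and getComplexity() at submission time.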

Next Steps

Behavioral Metrics — the highest-impact category (43%)
Device Context — enables hardware normalization
Contextual Signals — IP type and timezone detection
