Input Complexity
How to measure keyboard autocorrect rate and word complexity score for age assurance.
Overview
Input complexity carries the second-highest weight (28%) in the Signal Fusion engine. It measures two aspects of how a user types: how often they rely on autocorrect and the sophistication of their vocabulary. Both are strong age indicators that are computed entirely on-device — no raw text is ever transmitted to A3 or your backend.
Fields Reference
| Parameter | Type | Required | Description |
|---|---|---|---|
| keyboard_autocorrect_rate | number (0–1) | Required | Ratio of autocorrect-modified keystrokes to total keystrokes. Required when input_complexity is present. |
| average_word_complexity_score | number (0–1) | Required | Average word complexity score computed from word-length distributions. Required when input_complexity is present. |
Keyboard Autocorrect Rate
What it measures: The proportion of text input events that were autocorrected by the OS keyboard. Children rely on autocorrect at significantly higher rates (> 0.40) compared to adults (< 0.10).
Signal strength: A rate above 0.40 is a strong child indicator. Below 0.10 suggests adult-level typing proficiency.
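For local debugging it can help to see which band a computed rate falls into before sending it. The sketch below only restates the two thresholds given above; the function name and the band labels are illustrative assumptions, not part of the A3 API, and the actual scoring is done by the Signal Fusion engine.

```typescript
// Thresholds from the text above; band names are illustrative only.
type AutocorrectBand = 'child-like' | 'inconclusive' | 'adult-like';

const CHILD_THRESHOLD = 0.4; // above this: strong child indicator
const ADULT_THRESHOLD = 0.1; // below this: adult-level typing proficiency

export function classifyAutocorrectRate(rate: number): AutocorrectBand {
  // The comparison also rejects NaN, since NaN fails both bounds checks.
  if (!(rate >= 0 && rate <= 1)) {
    throw new RangeError(`keyboard_autocorrect_rate must be in [0, 1], got ${rate}`);
  }
  if (rate > CHILD_THRESHOLD) return 'child-like';
  if (rate < ADULT_THRESHOLD) return 'adult-like';
  return 'inconclusive';
}
```

Values between the two thresholds are labeled inconclusive here because the text does not say how they are interpreted.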
How to Collect
Monitor input events on text fields and detect autocorrect replacements via
the inputType property. The insertReplacementText input type indicates an
autocorrect or auto-suggestion acceptance.
import { useEffect, useRef, useCallback } from 'react';

// Text-like <input> types to observe; password and other non-text types are skipped.
const TEXT_TYPES = ['text', 'search', 'email', 'url', 'tel', ''];

export function useAutocorrectRate() {
  const total = useRef(0);
  const corrected = useRef(0);

  useEffect(() => {
    function handleInput(e: Event) {
      const input = e as InputEvent;
      const target = e.target as HTMLInputElement | HTMLTextAreaElement;
      if (!['INPUT', 'TEXTAREA'].includes(target.tagName)) return;
      // Apply the type filter to <input> only: a <textarea> reports type "textarea".
      if (target.tagName === 'INPUT' && target.type && !TEXT_TYPES.includes(target.type)) return;
      total.current++;
      // Note: insertCompositionText also fires during IME composition, so it
      // can overcount on keyboards that compose ordinary typing.
      if (
        input.inputType === 'insertReplacementText' ||
        input.inputType === 'insertCompositionText'
      ) {
        corrected.current++;
      }
    }
    // Capture phase, so inputs anywhere in the page are observed.
    document.body.addEventListener('input', handleInput, true);
    return () => document.body.removeEventListener('input', handleInput, true);
  }, []);

  const getScore = useCallback(() => {
    if (total.current === 0) return 0; // no events observed yet
    return corrected.current / total.current;
  }, []);

  return getScore;
}

// Usage: const getAutocorrectRate = useAutocorrectRate();
// const keyboard_autocorrect_rate = getAutocorrectRate();

The insertReplacementText input type is supported in Safari, Chrome, and Firefox. On browsers that don't fire this event, the autocorrect rate will read as 0, which scores as adult-like. This is acceptable since desktop browsers (where autocorrect events are rare) are used predominantly by adults.
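The counting logic can also be kept separate from React and the DOM so it can be unit-tested with plain strings. The class below is a hypothetical helper, not part of any A3 SDK. By default it counts only insertReplacementText, the input type the text above names as the autocorrect indicator; pass a different list to the constructor if your integration counts other types as well.

```typescript
// Hypothetical helper: autocorrect counting without React or the DOM.
export class AutocorrectCounter {
  private total = 0;
  private corrected = 0;

  // replacementTypes: inputType values treated as autocorrect replacements.
  constructor(
    private readonly replacementTypes: string[] = ['insertReplacementText'],
  ) {}

  /** Record one input event by its inputType string. */
  record(inputType: string): void {
    this.total++;
    if (this.replacementTypes.indexOf(inputType) !== -1) {
      this.corrected++;
    }
  }

  /** Ratio of replacement events to all events; 0 when nothing was recorded. */
  rate(): number {
    return this.total === 0 ? 0 : this.corrected / this.total;
  }
}

// Usage: three ordinary keystrokes and one replacement give a rate of 0.25.
const counter = new AutocorrectCounter();
['insertText', 'insertText', 'insertText', 'insertReplacementText']
  .forEach((t) => counter.record(t));
const sampleRate = counter.rate();
```

An input event handler would then reduce to a single `counter.record(input.inputType)` call after the target checks.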
Edge Cases
- Physical keyboards produce almost no autocorrect events regardless of age. This naturally scores as adult-like, which is reasonable since physical keyboard usage skews older.
- Auto-suggest selection (tapping a word from the suggestion bar) may or may not fire insertReplacementText depending on the browser. This is fine: the signal is statistical, not binary.
- Non-Latin scripts may have different autocorrect patterns. The 0–1 ratio normalizes across languages.
Word Complexity Score
What it measures: The sophistication of vocabulary used in text inputs, computed as a normalized word-length distribution score. Children typically use shorter, simpler words (score < 0.25), while adults use longer, more complex vocabulary (score > 0.65).
Signal strength: Below 0.25 is a strong child indicator. Above 0.65 suggests adult-level vocabulary.
How to Collect
Observe completed words in text inputs and compute a complexity score based on word length distribution. The algorithm is intentionally simple — it measures average word length normalized against an English baseline. No raw text leaves the device.
import { useEffect, useRef, useCallback } from 'react';

// English baseline: average word lengths of 3.0 and 7.0 map to scores 0 and 1.
const MIN_AVG = 3.0;
const MAX_AVG = 7.0;

export function useWordComplexity() {
  const wordLengths = useRef<number[]>([]);

  useEffect(() => {
    function handleInput(e: Event) {
      const target = e.target as HTMLInputElement | HTMLTextAreaElement;
      if (!['INPUT', 'TEXTAREA'].includes(target.tagName)) return;
      // Record a word once, at the moment it is completed: an insertion that
      // leaves exactly one whitespace character after it. Without this check
      // the same word would be pushed again on every later keystroke.
      if ((e as InputEvent).inputType?.startsWith('delete')) return;
      if (!/\S\s$/.test(target.value)) return;
      const words = target.value.trim().split(/\s+/);
      const lastCompleted = words[words.length - 1];
      // Keep letters only; only lengths are stored, never the words themselves.
      const cleaned = lastCompleted?.replace(/[^a-zA-Z]/g, '');
      if (cleaned && cleaned.length >= 2) {
        wordLengths.current.push(cleaned.length);
      }
    }
    document.body.addEventListener('input', handleInput, true);
    return () => document.body.removeEventListener('input', handleInput, true);
  }, []);

  const getScore = useCallback(() => {
    const wl = wordLengths.current;
    if (wl.length < 5) return 0.5; // neutral if too few words
    const avg = wl.reduce((a, b) => a + b, 0) / wl.length;
    // Normalize against the baseline and clamp to [0, 1].
    return Math.max(0, Math.min(1, (avg - MIN_AVG) / (MAX_AVG - MIN_AVG)));
  }, []);

  return getScore;
}

// Usage: const getComplexity = useWordComplexity();
// const average_word_complexity_score = getComplexity();

Privacy first: The word complexity score is a single number (0–1) derived from word lengths. The raw text content, individual words, and character sequences are never transmitted. Compute the score on-device and send only the aggregate.
Edge Cases
- Emoji-heavy input: emojis are stripped by the replace(/[^a-zA-Z]/g, '') filter. If the user types mostly emojis, few words will be recorded and the score falls back to neutral (0.5).
- Short-form content (usernames, search queries): these produce very short words regardless of age. If your app's primary input is short-form, this signal may be less reliable. Consider giving more weight to behavioral metrics in your integration.
- Non-English text: the baseline (3.0–7.0 average word length) is calibrated for English. Other languages have different distributions (e.g., German words average longer, Chinese "words" average shorter). The score still provides relative differentiation within a language, but absolute values may shift.
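To make the normalization and the baseline dependence concrete, here is a minimal sketch of the same formula as a pure function with a configurable baseline. The function name, the Baseline type, and the second baseline in the usage lines are assumptions for illustration; only the English values (3.0 and 7.0) and the five-word minimum come from the hook above, and the alternative baseline is not calibrated for any real language.

```typescript
interface Baseline {
  min: number; // average word length that maps to score 0
  max: number; // average word length that maps to score 1
}

// English baseline used by the hook above (MIN_AVG / MAX_AVG).
const ENGLISH_BASELINE: Baseline = { min: 3.0, max: 7.0 };

export function wordComplexityScore(
  wordLengths: number[],
  baseline: Baseline = ENGLISH_BASELINE,
  minWords = 5,
): number {
  // Too few samples: return the neutral score, as the hook does.
  if (wordLengths.length < minWords) return 0.5;
  const avg = wordLengths.reduce((a, b) => a + b, 0) / wordLengths.length;
  const raw = (avg - baseline.min) / (baseline.max - baseline.min);
  return Math.max(0, Math.min(1, raw)); // clamp to [0, 1]
}

// Worked example: average length 6 against the English baseline
// gives (6 - 3) / (7 - 3) = 0.75.
const englishScore = wordComplexityScore([6, 6, 6, 6, 6]);

// The same lengths against a hypothetical, uncalibrated baseline of 4 to 8
// give (6 - 4) / (8 - 4) = 0.5, showing how absolute values shift.
const shiftedScore = wordComplexityScore([6, 6, 6, 6, 6], { min: 4, max: 8 });
```

The same input scores 0.75 against one baseline and 0.5 against the other, which is why absolute thresholds tuned for English may not carry over to other languages.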
Putting It All Together
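For client code, the request in the curl example that follows can be assembled in TypeScript. This is a sketch under assumptions: the function and interface names are hypothetical, the field names and endpoint are copied from the curl example, and omitting input_complexity when no scores are available is inferred from the note that its fields are "required when input_complexity is present".

```typescript
interface InputComplexity {
  keyboard_autocorrect_rate: number;
  average_word_complexity_score: number;
}

interface AssessAgeBody {
  os_signal: string;
  user_country_code: string;
  input_complexity?: InputComplexity;
}

// Keep scores inside the documented 0 to 1 range.
const clamp01 = (x: number): number => Math.max(0, Math.min(1, x));

export function buildAssessAgeRequest(
  apiKey: string,
  base: { os_signal: string; user_country_code: string },
  scores?: { autocorrectRate: number; wordComplexity: number },
) {
  const body: AssessAgeBody = {
    os_signal: base.os_signal,
    user_country_code: base.user_country_code,
  };
  // Both fields are sent together, or the object is left out entirely.
  if (scores) {
    body.input_complexity = {
      keyboard_autocorrect_rate: clamp01(scores.autocorrectRate),
      average_word_complexity_score: clamp01(scores.wordComplexity),
    };
  }
  return {
    url: 'https://api.a3api.io/v1/assurance/assess-age',
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}

// Usage (in an async context):
//   const { url, init } = buildAssessAgeRequest(key, base, scores);
//   const response = await fetch(url, init);
```

Returning the URL and options rather than calling fetch directly keeps the builder testable without network access.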
# Complete input_complexity example payload:
curl -X POST https://api.a3api.io/v1/assurance/assess-age \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "os_signal": "18-plus",
    "user_country_code": "US",
    "input_complexity": {
      "keyboard_autocorrect_rate": 0.08,
      "average_word_complexity_score": 0.62
    }
  }'

Next Steps
- Behavioral Metrics — the highest-impact category (43%)
- Device Context — enables hardware normalization
- Contextual Signals — IP type and timezone detection