This page documents how Humaneer evaluates AI detector results. Detector outputs change as vendors update models, thresholds, and interfaces, so every result should be read as a dated snapshot, not a permanent guarantee.
Document the sample
Record source model, prompt, topic, length, and whether a human edited the text.
Record detector context
Capture detector name, date, visible score, label, and any threshold shown in the UI.
Publish limitations
State what the result does and does not prove before using it in a product claim.
Benchmark Methodology
A detector benchmark is only useful when the sample set is repeatable and the pass criteria are clear. Our minimum publishable test includes:
- At least 25 AI-generated samples across different topics and writing formats.
- At least 25 human-written control samples where authorship is known.
- Source model names, prompt families, word counts, detector names, and test date.
- Before and after detector outputs for any Humaneer transformation claim.
- A pass/fail threshold based on the detector label or the detector's own published threshold when visible.
For commercial comparison pages, every competitor should be tested on the same input set and evaluated on detection result, meaning preservation, readability, price, and privacy policy.
Evidence Standard for Public Claims
Strong claim: includes dated screenshots or exported results, detector versions when available, sample counts, exact pass criteria, and a link to this methodology.
Weak claim: says a tool "beats" or "passes" a detector without showing sample size, test date, score threshold, or repeatable inputs. These claims should be rewritten or linked to evidence before being treated as proof.
Known Limitations
AI detectors are probabilistic classifiers. A low AI score is not proof of human authorship, and a high AI score is not proof that a person cheated. Short text, formal academic prose, non-native English writing, technical writing, and heavily edited drafts can all produce inconsistent results.
Humaneer should be used to make writing clearer and more natural. Users remain responsible for following school, employer, publisher, and platform rules.
Current Claim Ledger
| Claim type | Status | Required support |
|---|---|---|
| Detector pass rates | Needs dated evidence per detector | Screenshots or exported results, sample set, threshold, and test date |
| Competitor comparisons | Needs same-input testing | Identical inputs across tools plus pricing and privacy policy review |
| False positive rates | Needs control set | Human-written controls with known authorship and detector result records |
| SEO and AI answer visibility | Measured externally | Google Search Console, Bing Webmaster Tools, and analytics referral data |
Related pages: detector proof, detector accuracy comparison, and AI humanizer alternatives.
External Standards and Crawler References
© 2026 Humaneer. All rights reserved.