Service
Verbatim Generation Analysis
We analyze AI models for verbatim reproduction of specified content.[1]
Systematic Analysis of Verbatim Generation
"Regurgitation," the verbatim reproduction of training data, can be decisive in copyright claims when tied to the removal of copyright management information and to market substitution analyses.[2][5][7][8]
Targeted Prompting
We design scripted prompt suites seeded with shingled fragments from the claimant corpus to elicit verbatim or near-verbatim outputs.[3]
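As a rough illustration of what "shingled fragments" can mean in practice, the sketch below cuts a passage into overlapping word windows and wraps each one in a prompt. The window size, stride, and prompt template are illustrative assumptions, not the parameters of any particular prompt suite.

```python
from typing import List


def shingles(text: str, k: int = 8, stride: int = 4) -> List[str]:
    """Return overlapping k-word windows ("shingles") taken every `stride` words.

    k and stride are illustrative defaults, not values from the source.
    """
    words = text.split()
    if len(words) <= k:
        # Passage shorter than one window: use it whole (or nothing if empty).
        return [" ".join(words)] if words else []
    return [" ".join(words[i:i + k]) for i in range(0, len(words) - k + 1, stride)]


def build_prompts(
    text: str,
    k: int = 8,
    stride: int = 4,
    template: str = "Continue this passage exactly: {fragment}",  # hypothetical template
) -> List[str]:
    """Wrap every shingle of `text` in a continuation-style prompt."""
    return [template.format(fragment=s) for s in shingles(text, k, stride)]
```

A smaller stride yields more prompts per passage and denser coverage of the source text, at the cost of a longer run.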
Large-Scale Analysis
Our platform executes 1,000+ prompts per run, logging tokens, log-probs, and sampling seeds to maintain repeatability and admissibility.[4]
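One way to picture the logging step is a JSON Lines file with one record per prompt, each carrying its tokens, log-probs, seed, and a content hash. The sketch below assumes a hypothetical `generate(prompt, seed)` callable that returns tokens and log-probs; the record fields and file layout are likewise assumptions, not a description of the platform's actual schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class RunRecord:
    """One logged generation (hypothetical schema)."""
    prompt_id: str
    prompt: str
    seed: int
    tokens: List[str]
    logprobs: List[float]


def record_digest(rec: RunRecord) -> str:
    """SHA-256 over a canonical JSON form, so identical records hash identically."""
    payload = json.dumps(asdict(rec), sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_suite(
    prompts: Sequence[str],
    generate: Callable[[str, int], Tuple[List[str], List[float]]],  # assumed model wrapper
    seed: int,
    path: str,
) -> List[RunRecord]:
    """Run every prompt with a fixed seed and append one JSON line per record."""
    records = []
    with open(path, "w", encoding="utf-8") as fh:
        for i, prompt in enumerate(prompts):
            tokens, logprobs = generate(prompt, seed)
            rec = RunRecord(f"p{i:05d}", prompt, seed, tokens, logprobs)
            line = dict(asdict(rec), sha256=record_digest(rec))
            fh.write(json.dumps(line, sort_keys=True, ensure_ascii=False) + "\n")
            records.append(rec)
    return records
```

If the model is deterministic for a given seed, rerunning the suite should reproduce the same digests, which gives a simple repeatability check.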
Similarity Scoring
We run multi-metric similarity scoring (cosine, BLEU, Levenshtein) with confidence intervals to isolate statistically meaningful copying.
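To make the scoring idea concrete, the sketch below computes one of the named metrics, a normalized Levenshtein similarity, and puts a percentile-bootstrap confidence interval around the mean score. The normalization, the bootstrap method, and all parameter defaults are assumptions chosen for illustration; cosine and BLEU scoring are omitted.

```python
import random
from typing import Sequence, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via two-row DP."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # delete from a
                           cur[j - 1] + 1,            # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """1.0 for identical strings, 0.0 for maximally different ones."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def bootstrap_ci(
    scores: Sequence[float],
    n_resamples: int = 2000,  # illustrative default
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile-bootstrap confidence interval for the mean of `scores`."""
    if not scores:
        raise ValueError("scores must be non-empty")
    rng = random.Random(seed)
    n = len(scores)
    means = sorted(sum(rng.choices(scores, k=n)) / n for _ in range(n_resamples))
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lo, hi
```

An interval that sits well above the similarity expected from unrelated text is one way to argue that the overlap is not chance; what baseline to compare against is a separate design decision.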
Continuous Monitoring
We offer ongoing monitoring services, updating prompt packs alongside model revisions to verify whether remedial steps actually stop regurgitation.
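A before-and-after comparison of this kind might look like the sketch below, which takes per-prompt similarity scores for two model revisions and reports the share of prompts above a threshold plus the prompts that still trigger reproduction. The 0.9 threshold and the dictionary layout are hypothetical choices for the example.

```python
from typing import Dict, List


def hit_rate(scores: Dict[str, float], threshold: float = 0.9) -> float:
    """Fraction of prompts whose similarity score meets the threshold."""
    if not scores:
        return 0.0
    return sum(s >= threshold for s in scores.values()) / len(scores)


def still_regurgitating(
    before: Dict[str, float],
    after: Dict[str, float],
    threshold: float = 0.9,  # hypothetical cut-off for "verbatim"
) -> List[str]:
    """Prompt IDs that met the threshold on the old revision and still do on the new one."""
    return sorted(
        pid for pid, score in after.items()
        if score >= threshold and before.get(pid, 0.0) >= threshold
    )


# Example: p1 was fixed by the revision, p2 was not.
before = {"p1": 0.95, "p2": 0.97, "p3": 0.20}
after = {"p1": 0.30, "p2": 0.96, "p3": 0.10}
remaining = still_regurgitating(before, after)
```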
The Deliverable: A Technical Report
You receive a technical report detailing instances of verbatim generation, the prompts that triggered it, and a statistical analysis of the findings. This report provides citable findings for legal teams and creators.[6]
Source Notes
1. S-Square Research prompting study (ver. 2025.06), summarized in "Verbatim Evidence Yield."
2. Carlini et al., "Extracting Training Data from Large Language Models," USENIX Security Symposium (2021).
3. S-Square Research, "Prompt Engineering Comparative Study."
4. Carlini et al., "Quantifying Memorization Across Neural Language Models," ICLR (2023).
5. U.S. Copyright Office, "Title 17, Chapter 12: Copyright Protection and Management Systems."
6. American Bar Association, "Generative Artificial Intelligence: Copyright Considerations."
7. U.S. Copyright Office, "AI-generated material registration guidance."
8. Authors Guild v. Google, 804 F.3d 202 (2d Cir. 2015). "Opinion."