Service

Verbatim Generation Analysis

We analyze AI models for verbatim reproduction of specified content. [1]

Systematic Analysis of Verbatim Generation

"Regurgitation," the verbatim reproduction of training data, can be decisive in copyright claims when it is tied to evidence that copyright management information was removed and to market substitution analysis. [2][5][7][8]

Targeted Prompting

We design scripted prompt suites seeded with shingled fragments from the claimant corpus to elicit verbatim or near-verbatim outputs. [3]
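To illustrate what "shingled fragments" means in practice, here is a minimal sketch that cuts a text into overlapping word windows and wraps each one in a prompt. The function names, the window size `k`, the `stride`, and the prompt template are all illustrative assumptions, not a description of our production prompt suites.

```python
from typing import List


def shingles(text: str, k: int = 8, stride: int = 4) -> List[str]:
    """Return overlapping k-word windows ("shingles") taken from text.

    k and stride are hypothetical defaults chosen for illustration.
    """
    words = text.split()
    if not words:
        return []
    if len(words) < k:
        # Text shorter than one window: return it as a single fragment.
        return [" ".join(words)]
    return [" ".join(words[i:i + k]) for i in range(0, len(words) - k + 1, stride)]


def build_prompts(
    text: str,
    k: int = 8,
    stride: int = 4,
    template: str = "Continue this passage exactly: {fragment}",  # assumed wording
) -> List[str]:
    """Wrap each shingle in a prompt template to form a scripted prompt suite."""
    return [template.format(fragment=s) for s in shingles(text, k, stride)]


if __name__ == "__main__":
    for prompt in build_prompts("one two three four five six", k=3, stride=2):
        print(prompt)
```

A smaller stride produces more overlap between fragments and therefore more prompts per source passage; the right trade-off depends on the corpus and the model under test.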

Large-Scale Analysis

Our platform executes 1,000+ prompts per run, logging tokens, log-probs, and sampling seeds to maintain repeatability and admissibility. [4]
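The logging step can be sketched as follows. This is a simplified illustration under stated assumptions: `fake_model` is a deterministic stand-in for a real model call, and the `RunRecord` fields and SHA-256 digest are hypothetical choices rather than our platform's actual schema.

```python
import hashlib
import json
import random
from dataclasses import asdict, dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class RunRecord:
    """One logged generation (hypothetical schema)."""
    prompt: str
    seed: int
    tokens: List[str]
    logprobs: List[float]
    digest: str = ""


def fake_model(prompt: str, seed: int) -> Tuple[List[str], List[float]]:
    """Stand-in for a real model call; output depends only on the seed."""
    rng = random.Random(seed)
    vocab = ["the", "quick", "brown", "fox"]
    tokens = [rng.choice(vocab) for _ in range(5)]
    logprobs = [round(-rng.random(), 4) for _ in tokens]
    return tokens, logprobs


def run_prompt(
    prompt: str,
    seed: int,
    model: Callable[[str, int], Tuple[List[str], List[float]]] = fake_model,
) -> RunRecord:
    """Run one prompt and record everything needed to repeat the run."""
    tokens, logprobs = model(prompt, seed)
    record = RunRecord(prompt, seed, tokens, logprobs)
    # The digest is computed over the record while its digest field is still empty.
    payload = json.dumps(asdict(record), sort_keys=True).encode("utf-8")
    record.digest = hashlib.sha256(payload).hexdigest()
    return record


def to_jsonl(records: Iterable[RunRecord]) -> str:
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)


if __name__ == "__main__":
    print(to_jsonl(run_prompt("Continue this passage exactly: ...", s) for s in range(3)))
```

Re-running the same prompt with the same seed against a deterministic model should reproduce the same record and digest, which is what makes a logged result checkable later.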

Similarity Scoring

We run multi-metric similarity scoring (cosine, BLEU, Levenshtein) with confidence intervals to isolate statistically meaningful copying.
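As a rough illustration of one of these metrics, the sketch below computes a normalized Levenshtein similarity between a model output and a reference passage, then estimates a percentile bootstrap confidence interval for the mean score across prompts. The function names, the 2,000 resamples, and the 95% level are assumptions made for this example; cosine and BLEU scoring are not shown.

```python
import random
from typing import Sequence, Tuple


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def bootstrap_ci(
    scores: Sequence[float],
    n_resamples: int = 2000,  # assumed default
    alpha: float = 0.05,      # 95% interval
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the mean of scores."""
    if not scores:
        raise ValueError("scores must not be empty")
    rng = random.Random(seed)
    n = len(scores)
    means = sorted(sum(rng.choices(scores, k=n)) / n for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    outputs = ["the quick brown fox", "a quick brown fox", "something else entirely"]
    reference = "the quick brown fox"
    scores = [similarity(o, reference) for o in outputs]
    print(scores, bootstrap_ci(scores))
```

An interval that sits well above the similarity expected from unrelated text is one way to argue that the observed overlap is not chance; where to set that baseline is a separate methodological question.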

Continuous Monitoring

We offer ongoing monitoring services, updating prompt packs alongside model revisions to verify whether remedial steps actually stop regurgitation.
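One simple way to frame the before-and-after check is to compare, per model revision, the share of prompts whose output similarity exceeds a threshold. The sketch below assumes similarity scores in the range 0 to 1 have already been computed for each revision; the 0.9 threshold and the revision labels are illustrative only.

```python
from typing import Dict, Sequence


def regurgitation_rate(scores: Sequence[float], threshold: float = 0.9) -> float:
    """Fraction of prompts whose similarity score meets the threshold."""
    if not scores:
        return 0.0
    return sum(s >= threshold for s in scores) / len(scores)


def compare_revisions(
    scores_by_revision: Dict[str, Sequence[float]],
    threshold: float = 0.9,  # assumed cut-off for "near-verbatim"
) -> Dict[str, float]:
    """Map each model revision label to its regurgitation rate."""
    return {
        revision: regurgitation_rate(scores, threshold)
        for revision, scores in scores_by_revision.items()
    }


if __name__ == "__main__":
    rates = compare_revisions({
        "before-fix": [0.95, 0.99, 0.40, 0.93],  # hypothetical scores
        "after-fix": [0.30, 0.92, 0.25, 0.10],
    })
    for revision, rate in rates.items():
        print(f"{revision}: {rate:.0%} of prompts at or above threshold")
```

A falling rate across revisions suggests the remediation is working on the tested prompts; it does not by itself show that regurgitation has been eliminated for untested content.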

The Deliverable: A Technical Report

You receive a technical report detailing instances of verbatim generation, the prompts that triggered it, and a statistical analysis of the findings. This report provides citable findings for legal teams and creators. [6]

Source Notes

  1. S-Square Research prompting study (ver. 2025.06) summarized in "Verbatim Evidence Yield."
  2. Carlini et al., "Extracting Training Data from Large Language Models," IEEE S&P (2021).
  3. S-Square Research, "Prompt Engineering Comparative Study."
  4. Carlini et al., "Quantifying Memorization Across Neural Language Models," ICLR (2023).
  5. U.S. Copyright Office, "Title 17, Chapter 12 — Copyright Protection and Management Systems."
  6. American Bar Association, "Generative Artificial Intelligence: Copyright Considerations."
  7. U.S. Copyright Office, "AI-generated material registration guidance."
  8. Authors Guild v. Google, 804 F.3d 202 (2d Cir. 2015).