Handwriting no model has seen
Old Norse, mixed Devanagari, eighteenth-century legal hands. Off-the-shelf OCR collapses on lines a reviewer reads in seconds.
Libraries, archives and genealogy publishers turn to us for the corpora that defeat commercial OCR — multilingual hands, faded ink, marginalia. Domain-trained models read what they can; reviewers train the model on what they can’t.
The model is the easy part. The reviewer loop is the deliverable.
Six structural problems we hear repeatedly from publishers, libraries and research programmes — and the honest answer to each.
Old Norse, mixed Devanagari, eighteenth-century legal hands. Off-the-shelf OCR collapses on lines a reviewer reads in seconds.
Even domain-trained, models top out near 80% on hard hands. The 95% number is earned through reviewer corrections, not assumed.
Transcriptions face scholarly review. Rights chains and embargoes have to hold through the publishing pipeline.
Migration without breaking inbound citations or losing the catalogue's existing identifiers.
The catalogue ships on a schedule. SLAs by script, language and quality band — not aspirational.
The reviewer is an archivist or paleographer. Tooling speaks their vocabulary; their corrections retrain the model.
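The migration concern above, keeping inbound citations alive, can be checked mechanically before cut-over. The following is a minimal sketch under stated assumptions, not a description of the platform's own tooling: the function name, the redirect map and the sample identifiers are all hypothetical.

```python
def unresolved_identifiers(legacy_ids, redirect_map, new_catalogue_ids):
    """Return legacy identifiers that would become dead citations after migration.

    legacy_ids:        identifiers external scholars may already cite (hypothetical sample data)
    redirect_map:      legacy identifier -> new identifier, for records that were renamed
    new_catalogue_ids: set of identifiers that resolve in the migrated catalogue
    """
    dead = []
    for legacy in legacy_ids:
        # Identifiers without a redirect are assumed to be carried over unchanged.
        target = redirect_map.get(legacy, legacy)
        if target not in new_catalogue_ids:
            dead.append(legacy)
    return dead


legacy = ["reg/1801/0001", "reg/1801/0002", "reg/1801/0003"]
redirects = {"reg/1801/0002": "rec-2"}
migrated = {"reg/1801/0001", "rec-2"}
dead_links = unresolved_identifiers(legacy, redirects, migrated)
```

An empty result is the condition to aim for before the old catalogue is retired; any identifier returned needs either a redirect or a restored record.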
Three workstreams that run together. The output is one audited corpus — not a stack of pilots.
A reviewer-grade indexing platform built around the archivist's workflow — not the engineer's.
Domain-trained HWR gets to ~80% on hard hands. The 95% number comes from the active-learning loop: every reviewer correction is training data, retraining on a weekly cadence, with eval gates between runs.
Outputs flow into your library system and repository — without breaking the citations external scholars already use.
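The eval gate between weekly retraining runs can be pictured as a simple comparison on a fixed, held-out set of reviewer-verified lines. This is an illustrative sketch only: the metric (character accuracy from edit distance), the `EvalResult` type, the `passes_gate` rule and the sample scores are assumptions, not the platform's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Score of one model on a held-out set of reviewed lines (hypothetical type)."""
    model_id: str
    char_accuracy: float  # fraction in [0, 1]


def char_accuracy(predicted: str, reference: str) -> float:
    """Return 1 - (Levenshtein distance / reference length), floored at 0."""
    if not reference:
        return 1.0 if not predicted else 0.0
    # prev[j] holds the edit distance between the predicted prefix so far
    # and the first j characters of the reference.
    prev = list(range(len(reference) + 1))
    for i, p in enumerate(predicted, start=1):
        curr = [i]
        for j, r in enumerate(reference, start=1):
            cost = 0 if p == r else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return max(0.0, 1.0 - prev[-1] / len(reference))


def passes_gate(candidate: EvalResult, current: EvalResult, min_gain: float = 0.0) -> bool:
    """Promote the retrained model only if it does not regress on the held-out set."""
    return candidate.char_accuracy >= current.char_accuracy + min_gain


# Hypothetical scores for the deployed model and this week's retrained candidate.
deployed = EvalResult("week-12", 0.80)
retrained = EvalResult("week-13", 0.82)
promote = passes_gate(retrained, deployed)
```

Keeping the held-out set fixed between runs is what makes week-to-week scores comparable; reviewer corrections feed the training set, not the gate.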
The Actigen module for libraries, archives and publishers. Reviewer-led, retraining weekly, shipping audit-grade output.
A complete archives-and-publishing pipeline. Configurable per script, period and schema. Ships with the reviewer console, model registry, audit pack and catalogue integrations already wired in.
Six briefs we are running this year — each anchored in a specific corpus and reviewer audience.
Parish registers, civil registries, church books — converted into searchable, citation-grade records.
Hansards, debates, bills and committee proceedings — searchable across script and language reforms.
Variant collation, marginalia, footnote linking — structured for digital editions with TEI export.
Article-level segmentation, byline extraction, topic indexing — published into reader and search products.
Codices, charters, illuminated manuscripts. Conservation-aware capture, paleographer-led labelling.
Continuous indexing for subscription publishers — with throughput SLAs and quality-band reporting.
Three outcomes consistently reported. Per-engagement targets agreed in writing during Discover.
Once the model clears its eval gate, reviewer hours shift from transcription to verification — and corrections retrain the model.
Unit cost drops as the active-learning loop closes. Most publishers see a meaningful reduction within the first quarter.
Additional corpora — new scripts, new periods — onboard in weeks. The infrastructure is reusable; only the domain model is new.
Numbers that map to how libraries, publishers and research programmes report up — boards, funders, subscribers.
Four engagements where the corpus shipped, the lineage held, and the citations survived scholarly review.
No commercial model was trained on the handwriting. We engaged a Norwegian genealogist to label, fine-tuned a domain HWR model, and wrapped it in a custom indexing platform with reviewer-grade HITL. The accuracy threshold was cleared, not approached.
A productised heritage operating model for the world's most significant botanical archives. High-resolution capture, transcription of historical scripts, metadata indexing, and global scientific accessibility — end-to-end. Hand-signed Darwin sheets included.
Actigen Archive transformed thousands of legacy parts catalogues into a unified, searchable corpus — collapsing lookup time and enabling new commercial channels.
Actigen Research applied domain-tuned handwritten-record AI to centuries of maritime documents — extracting names, ports, cargoes and dates with auditable lineage.
Actigen 2.0 is the agentic decision-loop framework underneath every engagement. Configurable per industry; consistent in lineage, governance and audit posture. Below: the full platform, with the publishing module highlighted.
Libraries, archives, publishers. HWR, IIIF, MARC / EAD / METS, paleography console.
/ A·f BFSI. KYC, claims triage, contract abstraction with audit-pack lineage.
/ A·h Healthcare. Clinical-note summarisation, IDMP, FHIR-mapped knowledge assistants.
/ A·e Oil & gas, utilities. CSRD / SEC Climate drafting, asset-document agents, HSE.
/ A·m Engineering knowledge, supplier IP, IATF audit drafting and CAD-aware retrieval.
/ A·u NAAC narratives, ABET evidence packs, credential registries and reviewer copilots.
/ A·r Academic research operations: TEI, scholarly editions, critical apparatus, IIIF.
/ eP Legislatures, registries, parliaments. Multi-script archives and lawmaking copilots.
"The same loop runs every brief. Only the domain model and the schema mapping change."
Externally verifiable credentials and the standards every output is engineered to satisfy. The audit pack travels with the system.
External audits aligned to ISO 27001 and 27701. Privacy extends to reviewer roles, embargoes and rights-bearing material.
Copyright, embargoes and accessibility — handled in the platform, not in policy documents.
The standards your catalogue, repository and subscribers expect — implemented natively.
“The model alone wouldn't have got us there. What worked was the loop: our reviewers correcting, the system retraining, accuracy climbing release after release. By the third quarter we were publishing on schedule.”
Editorial Director
A genealogy publisher · Norwegian HWR programme
Send a folio. Receive a measured pilot proposal within the working week.