The mismatch problem nobody talks about
A PDF drawing's title block says "Rev C." The same file's metadata in Autodesk Forma says "Rev B." The drawing number printed in the title block doesn't match the filename convention. The project name in the cartouche is misspelled. These mismatches are invisible until someone opens every PDF and manually compares — which almost never happens.
The consequences are real: audit failures during information exchanges, incorrect documents issued for construction, rework when site teams build from superseded drawings, and hours of remediation when discrepancies are finally discovered at handover.
Audit failures
Mismatched metadata flagged during ISO 19650 audits and information exchanges.
Wrong documents on site
Teams build from superseded drawings they thought were current.
Costly rework
Discovering mismatches at handover means going back through hundreds of files.
No systematic checks
Nobody opens every PDF to compare. It's too slow to do manually.
Why title blocks and metadata drift apart
The root cause is simple: different people update different things at different times. The design team updates the PDF drawing in their CAD application. The document controller updates the file's metadata attributes in Forma. There is no synchronisation mechanism between the two — the text printed inside the PDF and the attributes stored in Forma are completely independent data stores.
| Scenario | Title Block Says | Forma Metadata Says | What Goes Wrong |
|---|---|---|---|
| Revision lag | Rev C | Rev B | Document controller didn't update the Revision attribute after the drafter reissued the PDF. |
| Drawing number mismatch | STR-DWG-0042 | STR-DWG-042.pdf | Leading zeros differ between the CAD template and the file naming convention. Search results miss the file. |
| Status desync | S2 (Suitable for Information) | S3 (Suitable for Review) | Suitability was updated in Forma but the title block still shows the old code. Wrong distribution list used. |
These mismatches are not bugs in Forma. The platform was never designed to read PDF content and compare it to file attributes. That cross-validation layer has to come from outside.
What the manual check looks like
Without automation, checking title blocks against metadata is a painstaking exercise. Here is the typical workflow:
Open the PDF in a viewer. Navigate to the title block (usually bottom-right corner of the last page).
Read the drawing number, revision, status, date, originator, and any other fields your standard requires.
Switch to Forma. Open the file's properties panel. Compare each field against what the title block shows.
Log any discrepancy in a spreadsheet with the file name, which fields mismatch, and what the correct value should be.
Repeat for every file. Close the PDF, open the next one.
The time cost adds up fast
At roughly 5 minutes per file — open, read, compare, log — checking 200 drawings takes over 16 hours. That is two full working days per review cycle, and most projects have multiple review cycles before handover.
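The arithmetic behind that estimate is easy to rerun with your own numbers. The sketch below uses the figures quoted above; the 8-hour working day is an assumption, not something every project shares.

```python
# Rough manual-check cost, using the figures quoted in the text.
MINUTES_PER_FILE = 5        # open, read, compare, log
DRAWINGS = 200
HOURS_PER_WORKING_DAY = 8   # assumption: one working day = 8 hours

total_hours = MINUTES_PER_FILE * DRAWINGS / 60
working_days = total_hours / HOURS_PER_WORKING_DAY

print(f"{total_hours:.1f} hours, about {working_days:.1f} working days per review cycle")
# prints "16.7 hours, about 2.1 working days per review cycle"
```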
How PDF zone extraction works
The key insight is that title block fields occupy predictable, fixed positions on a drawing sheet. The revision box is always in the same place. The drawing number is always in the same place. If you can tell the system where to look, it can read what's there.
Foreman's zone templates let you define rectangular regions on a PDF page, each targeting a specific title block field. The system extracts text from those zones using two methods:
Native text extraction
For PDFs with embedded text layers (most CAD-exported PDFs), Foreman uses PdfPig to read text directly from the defined zone coordinates. Fast, accurate, and no external dependencies.
OCR for scanned drawings
For scanned or rasterised PDFs without a text layer, Foreman falls back to Tesseract OCR with support for 15 languages. It reads the pixels within the zone and returns the recognised text.
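The two extraction paths can be pictured as "try the text layer first, fall back to OCR if the zone comes back empty". The Python sketch below is illustrative only and is not Foreman's implementation (which the text says uses PdfPig and Tesseract). The `Word` type, `words_in_rect`, `extract_zone`, and the `ocr_fallback` callable are all hypothetical names; the word list stands in for whatever a PDF text extractor would return.

```python
from dataclasses import dataclass
from typing import Callable

# (left, top, right, bottom) in page units; top-left origin is an assumption.
Rect = tuple[float, float, float, float]


@dataclass(frozen=True)
class Word:
    """A word from the PDF text layer, located by its centre point (hypothetical type)."""
    text: str
    x: float
    y: float


def words_in_rect(words: list[Word], rect: Rect) -> str:
    """Join the words whose centre falls inside the zone rectangle."""
    left, top, right, bottom = rect
    inside = [w for w in words if left <= w.x <= right and top <= w.y <= bottom]
    # Simplistic reading order: top to bottom, then left to right.
    inside.sort(key=lambda w: (w.y, w.x))
    return " ".join(w.text for w in inside)


def extract_zone(words: list[Word], rect: Rect,
                 ocr_fallback: Callable[[Rect], str]) -> tuple[str, str]:
    """Return (text, method). OCR is only called when the text layer yields nothing."""
    text = words_in_rect(words, rect)
    if text:
        return text, "native"
    return ocr_fallback(rect).strip(), "ocr"


def fake_ocr(rect: Rect) -> str:
    """Stand-in for a real OCR call, so the sketch runs without external tools."""
    return " Rev C\n"


revision_rect = (2150.0, 1600.0, 2260.0, 1650.0)
text_layer = [
    Word("C", 2200.0, 1620.0),
    Word("Rev", 2170.0, 1620.0),
    Word("STR-DWG-0042", 2000.0, 1500.0),  # outside the zone, ignored
]

native_result = extract_zone(text_layer, revision_rect, fake_ocr)
scanned_result = extract_zone([], revision_rect, fake_ocr)
```

Returning the method alongside the text is a design choice for the sketch: OCR output is usually less reliable than an embedded text layer, so a reviewer may want to know which path produced a value.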
How zones are defined
You draw zones visually on a reference PDF using Foreman's zone editor — a pdf.js canvas with an SVG overlay. Each zone gets a name (e.g., "Drawing Number", "Revision", "Status") and normalised coordinates (0-1 range) so they scale to any page size.
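To make the normalised coordinates concrete, here is a minimal Python sketch of a zone stored in the 0-1 range and converted to an absolute rectangle for a given page. The `Zone` class, its field names, and the top-left origin are assumptions for illustration; they are not Foreman's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A named rectangle in normalised page coordinates (hypothetical model)."""
    name: str
    x: float       # left edge as a fraction of page width, 0-1
    y: float       # top edge as a fraction of page height, 0-1
    width: float   # fraction of page width
    height: float  # fraction of page height

    def to_absolute(self, page_width: float,
                    page_height: float) -> tuple[float, float, float, float]:
        """Scale to (left, top, right, bottom) in the page's own units."""
        return (
            self.x * page_width,
            self.y * page_height,
            (self.x + self.width) * page_width,
            (self.y + self.height) * page_height,
        )


revision = Zone("Revision", x=0.90, y=0.95, width=0.05, height=0.03)

# The same zone definition applied to two page sizes given in PDF points.
on_large_page = revision.to_absolute(2384, 1684)
on_small_page = revision.to_absolute(1191, 842)

for label, rect in (("large", on_large_page), ("small", on_small_page)):
    print(label, [f"{v:.1f}" for v in rect])
```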
Cross-validation with Content Match rules
Zone extraction gives you the raw text from each title block field. Content Match rules let you define what that text should say — and flag violations when it doesn't match.
Here is how you set up a cross-validation rule in Foreman, step by step:
Create a zone template
Upload a reference PDF of your standard drawing sheet. Draw zones around each title block field you want to check — drawing number, revision, status, originator, date, etc. Name each zone clearly.
Create a rule set with Content Match rules
Add a new rule set (e.g., "Title Block Validation"). Inside it, create Content Match rules that reference your zones. Each rule specifies a zone name and the expected value pattern.
Define the match pattern
Use regex patterns or attribute references. For example, the "Revision" zone must contain the value from the file's Revision custom attribute, or the "Drawing Number" zone must match the pattern extracted from the filename.
Run the check
Select your folders, assign the rule set and zone template, and run. Foreman downloads each PDF, extracts text from the defined zones, and evaluates every Content Match rule. Results show pass/fail per file per rule.
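The two kinds of match pattern described in the steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not Foreman's rule engine: `matches_attribute` and `matches_filename` are hypothetical helpers, and the whitespace and case normalisation is an assumption about what a sensible comparison would ignore.

```python
import re
from pathlib import PurePath


def matches_attribute(zone_text: str, attribute_value: str) -> bool:
    """Attribute reference: zone text must equal the file attribute,
    ignoring case and runs of whitespace (an assumed normalisation)."""
    def norm(value: str) -> str:
        return " ".join(value.split()).casefold()
    return norm(zone_text) == norm(attribute_value)


def matches_filename(zone_text: str, filename: str) -> bool:
    """Regex derived from the filename: the zone must contain exactly
    the filename without its extension."""
    stem = PurePath(filename).stem
    pattern = re.compile(re.escape(stem), re.IGNORECASE)
    return pattern.fullmatch(zone_text.strip()) is not None


# Examples taken from the scenarios in this article.
revision_ok = matches_attribute("Rev C", "Rev B")                   # False: revision lag
number_ok = matches_filename("STR-DWG-0042", "STR-DWG-042.pdf")     # False: leading zeros
number_ok_fixed = matches_filename("STR-DWG-0042", "STR-DWG-0042.pdf")  # True
```

`re.escape` matters here because drawing numbers often contain characters such as `.` or `+` that would otherwise be read as regex syntax.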
| Rule Example | Zone | Expected Value | What It Catches |
|---|---|---|---|
| Revision must match metadata | Revision | File's Revision attribute value | Title block shows Rev C, Forma says Rev B |
| Drawing number matches filename | Drawing Number | Regex derived from filename | Leading zeros, missing prefixes, wrong project codes |
| Status code is consistent | Status | File's Document Status attribute | Title block says S2, metadata says S3 |
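When a drawing number rule fails, it helps to know which kind of mismatch it is. The following Python sketch shows one way to tell leading-zero differences apart from missing or changed segments. It assumes hyphen-separated drawing numbers, and `classify_mismatch` is a hypothetical helper, not a Foreman feature.

```python
def classify_mismatch(title_block: str, filename_stem: str) -> str:
    """Compare two hyphen-separated drawing numbers segment by segment."""
    left_parts = title_block.split("-")
    right_parts = filename_stem.split("-")

    # A missing prefix or suffix changes the number of segments.
    if len(left_parts) != len(right_parts):
        return "segment count differs"

    for left, right in zip(left_parts, right_parts):
        if left == right:
            continue
        # Same number, different padding: 0042 vs 042.
        if left.isdigit() and right.isdigit() and int(left) == int(right):
            return "leading zeros differ"
        return f"segment differs: {left!r} vs {right!r}"

    return "match"


print(classify_mismatch("STR-DWG-0042", "STR-DWG-042"))   # leading zeros differ
print(classify_mismatch("STR-DWG-0042", "DWG-0042"))      # segment count differs
print(classify_mismatch("STR-DWG-0042", "ARC-DWG-0042"))  # segment differs: 'STR' vs 'ARC'
```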
Handling different sheet sizes with size variants
A single project often produces drawings on A0, A1, and A3 sheets. The title block sits in the same corner on each size, but it typically keeps the same physical dimensions, so the share of the page it occupies, and with it the coordinates of every zone, differs from one sheet size to the next. Without size-awareness, a zone template built for A1 sheets will extract garbage from an A3 drawing.
Foreman solves this with a parent-child template hierarchy:
Parent template
Create your zone template on an A1 reference sheet. Define all the zones once — drawing number, revision, status, originator, etc.
Size variant (larger)
Upload an A0 reference PDF as a variant. Foreman uses anchor-based repositioning to map zones to the new dimensions automatically.
Size variant (smaller)
Upload an A3 reference PDF as another variant. Same zones, different coordinates — all managed under one template.
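The article does not describe how anchor-based repositioning works internally, so the Python sketch below is only one plausible reading. It assumes the title block keeps a fixed physical size and is anchored to the bottom-right corner of the sheet, so each zone can be stored as an offset from that corner. `AnchoredZone` and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnchoredZone:
    """A zone located relative to the page's bottom-right corner, in PDF points.
    Assumption: the title block has the same physical size on every sheet."""
    name: str
    right_offset: float   # gap between zone's right edge and the page's right edge
    bottom_offset: float  # gap between zone's bottom edge and the page's bottom edge
    width: float
    height: float

    def rect_on(self, page_width: float,
                page_height: float) -> tuple[float, float, float, float]:
        """Return (left, top, right, bottom) with a top-left page origin."""
        right = page_width - self.right_offset
        bottom = page_height - self.bottom_offset
        return (right - self.width, bottom - self.height, right, bottom)


revision = AnchoredZone("Revision", right_offset=30, bottom_offset=30,
                        width=60, height=20)

# Landscape sheet sizes in PDF points (rounded).
a1_rect = revision.rect_on(2384, 1684)
a3_rect = revision.rect_on(1191, 842)

# The same zone lands at a different fraction of the page width on each sheet,
# which is why one set of normalised coordinates cannot serve both sizes.
a1_left_fraction = a1_rect[0] / 2384
a3_left_fraction = a3_rect[0] / 1191
```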
Smart template matching
When Foreman runs a check, it detects each PDF's page dimensions and automatically selects the best-matching template variant. If page dimensions overlap between sizes, it also scores zone candidates by content to pick the most accurate match. You don't need to sort files by sheet size before running a check.
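A simplified version of the dimension-matching step might look like the Python sketch below. It picks the variant whose page size is closest to the PDF's and ignores orientation. The content-based scoring mentioned above is not covered, and `VARIANTS` and `pick_variant` are hypothetical names rather than Foreman's API.

```python
import math

# ISO sheet sizes in PDF points (1/72 inch), rounded, as (long side, short side).
VARIANTS: dict[str, tuple[float, float]] = {
    "A0": (3370, 2384),
    "A1": (2384, 1684),
    "A3": (1191, 842),
}


def pick_variant(page_width: float, page_height: float,
                 variants: dict[str, tuple[float, float]] = VARIANTS) -> str:
    """Return the name of the variant whose dimensions are nearest the page's."""
    long_side = max(page_width, page_height)
    short_side = min(page_width, page_height)

    def distance(size: tuple[float, float]) -> float:
        return math.hypot(max(size) - long_side, min(size) - short_side)

    return min(variants, key=lambda name: distance(variants[name]))


print(pick_variant(2384, 1684))  # A1
print(pick_variant(842, 1191))   # A3 (portrait page, orientation ignored)
print(pick_variant(3300, 2400))  # A0 (slightly off-size export)
```

Nearest-match rather than exact-match is deliberate in this sketch, since exported PDFs are often a point or two off the nominal sheet size.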
Validate extracted values against approved lists
Once you're extracting values from title blocks, a natural next question is: "Is this value actually valid?" For example, the suitability code zone says "S2" — is that on the approved list? The drawing number starts with "PROJ-ACE" — is ACE a registered originator?
Foreman's List Validation rule type lets you check extracted PDF zone values against reusable validation lists. Combined with Content Match and Content Convention rules, you get a complete title block QA pipeline: extract, validate format, cross-check against metadata, and verify against approved code lists. Read more: How to Validate ISO 19650 Naming Segments and How to Check Your MIDP/TIDP Register.
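As a rough picture of what a list validation does, here is a Python sketch that checks an extracted suitability code and an originator segment against approved lists. The lists are illustrative placeholders, not an official register, and `is_approved` and `originator_of` are hypothetical helpers. The position of the originator segment is also an assumption based on the "PROJ-ACE" example above.

```python
# Illustrative lists only; a real project would load its own approved codes.
APPROVED_SUITABILITY = frozenset({"S0", "S1", "S2", "S3", "S4"})
REGISTERED_ORIGINATORS = frozenset({"ACE", "XYZ"})  # hypothetical originator codes


def is_approved(value: str, approved: frozenset[str]) -> bool:
    """True if the extracted value, trimmed and upper-cased, is on the list."""
    return value.strip().upper() in approved


def originator_of(drawing_number: str, segment_index: int = 1) -> str:
    """Pull the originator segment from a hyphen-separated drawing number.
    Assumption: the originator is the second segment, as in 'PROJ-ACE-...'."""
    return drawing_number.split("-")[segment_index]


status_valid = is_approved(" s2 ", APPROVED_SUITABILITY)
originator_valid = is_approved(originator_of("PROJ-ACE-0042"), REGISTERED_ORIGINATORS)
unknown_originator = is_approved(originator_of("PROJ-QRS-0042"), REGISTERED_ORIGINATORS)
```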
Key takeaway
Title block mismatches are one of the most frequently cited findings in document control audits. They persist because the manual check — open every PDF, read the title block, compare to metadata — is too slow to do systematically. Automated zone extraction and Content Match rules turn a 16-hour manual exercise into a scheduled background job that runs every night. Mismatches are flagged before anyone issues a drawing for construction, not discovered months later at handover.