QA/QC · 9 min read · March 2026

How to Check PDF Title Blocks Against File Metadata in Autodesk Forma

Title block mismatches are one of the most common audit findings on construction projects. Here's how to catch them automatically.

The mismatch problem nobody talks about

A PDF drawing's title block says "Rev C." The same file's metadata in Autodesk Forma says "Rev B." The drawing number printed in the title block doesn't match the filename convention. The project name in the cartouche is misspelled. These mismatches are invisible until someone opens every PDF and manually compares — which almost never happens.

The consequences are real: audit failures during information exchanges, incorrect documents issued for construction, rework when site teams build from superseded drawings, and hours of remediation when discrepancies are finally discovered at handover.

Audit failures

Mismatched metadata flagged during ISO 19650 audits and information exchanges.

Wrong documents on site

Teams build from superseded drawings they thought were current.

Costly rework

Discovering mismatches at handover means going back through hundreds of files.

No systematic checks

Nobody opens every PDF to compare. It's too slow to do manually.

Why title blocks and metadata drift apart

The root cause is simple: different people update different things at different times. The design team updates the PDF drawing in their CAD application. The document controller updates the file's metadata attributes in Forma. There is no synchronisation mechanism between the two — the text printed inside the PDF and the attributes stored in Forma are completely independent data stores.

Scenario | Title Block Says | Forma Metadata Says | What Goes Wrong
Revision lag | Rev C | Rev B | Document controller didn't update the Revision attribute after the drafter reissued the PDF.
Drawing number mismatch | STR-DWG-0042 | STR-DWG-042.pdf | Leading zeros differ between the CAD template and the file naming convention. Search results miss the file.
Status desync | S2 (Suitable for Information) | S3 (Suitable for Review) | Suitability was updated in Forma but the title block still shows the old code. Wrong distribution list used.

These mismatches are not bugs in Forma. The platform was never designed to read PDF content and compare it to file attributes. That cross-validation layer has to come from outside.

What the manual check looks like

Without automation, checking title blocks against metadata is a painstaking exercise. Here is the typical workflow:

1. Open the PDF in a viewer. Navigate to the title block (usually bottom-right corner of the last page).

2. Read the drawing number, revision, status, date, originator, and any other fields your standard requires.

3. Switch to Forma. Open the file's properties panel. Compare each field against what the title block shows.

4. Log any discrepancy in a spreadsheet with the file name, which fields mismatch, and what the correct value should be.

5. Repeat for every file. Close the PDF, open the next one.

The time cost adds up fast

At roughly 5 minutes per file — open, read, compare, log — checking 200 drawings takes over 16 hours. That is two full working days per review cycle, and most projects have multiple review cycles before handover.
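The estimate is easy to rerun with your own numbers. A minimal sketch, assuming an 8-hour working day (the drawing count and per-file time are the figures quoted above; the function name is illustrative):

```python
def manual_review_hours(file_count: int, minutes_per_file: float) -> float:
    """Total hours needed to check every file by hand."""
    return file_count * minutes_per_file / 60


hours = manual_review_hours(file_count=200, minutes_per_file=5)
working_days = hours / 8  # assumption: one working day is 8 hours

print(f"{hours:.1f} hours, about {working_days:.1f} working days")
# prints "16.7 hours, about 2.1 working days"
```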

How PDF zone extraction works

The key insight is that title block fields occupy predictable, fixed positions on a drawing sheet. On any given sheet template, the revision box is always in the same place, and so is the drawing number. If you can tell the system where to look, it can read what's there.

Foreman's zone templates let you define rectangular regions on a PDF page, each targeting a specific title block field. The system extracts text from those zones using two methods:

Native text extraction

For PDFs with embedded text layers (most CAD-exported PDFs), Foreman uses PdfPig to read text directly from the defined zone coordinates. Fast, accurate, and no external dependencies.

OCR for scanned drawings

For scanned or rasterised PDFs without a text layer, Foreman falls back to Tesseract OCR with support for 15 languages. It reads the pixels within the zone and returns the recognised text.
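The two methods combine into a simple "text layer first, OCR second" decision. The sketch below illustrates that idea only. It is not Foreman's code and does not use the PdfPig or Tesseract APIs: the `Word` and `Zone` types are hypothetical, the word positions are assumed to come from whatever text extractor you use, and the `ocr` callable is a stand-in for a real OCR call.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Word:
    """A word from the PDF text layer, located by its centre point (0-1 range)."""
    text: str
    x: float  # fraction of page width, measured from the left edge
    y: float  # fraction of page height, measured from the top edge


@dataclass(frozen=True)
class Zone:
    """A named rectangle in normalised page coordinates (hypothetical type)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, word: Word) -> bool:
        return self.left <= word.x <= self.right and self.top <= word.y <= self.bottom


def native_zone_text(words: Iterable[Word], zone: Zone) -> str:
    """Join words whose centre lies inside the zone, sorted by y, then x.

    Assumes words on the same text line share the same y value.
    """
    inside = sorted((w for w in words if zone.contains(w)), key=lambda w: (w.y, w.x))
    return " ".join(w.text for w in inside)


def extract_zone_text(words: Iterable[Word], zone: Zone, ocr: Callable[[Zone], str]) -> str:
    """Prefer the text layer; call OCR only when the zone has no embedded text."""
    text = native_zone_text(words, zone)
    return text if text else ocr(zone).strip()


def fake_ocr(zone: Zone) -> str:
    return "Rev B\n"  # stand-in for running OCR on the zone's pixels


revision_zone = Zone("Revision", left=0.90, top=0.93, right=0.99, bottom=0.97)
cad_export = [Word("STR-DWG-0042", 0.85, 0.90), Word("C", 0.96, 0.95), Word("Rev", 0.93, 0.95)]

print(extract_zone_text(cad_export, revision_zone, fake_ocr))  # text layer: Rev C
print(extract_zone_text([], revision_zone, fake_ocr))          # no text layer: Rev B
```

Passing the OCR step in as a callable keeps the slow path out of the way for CAD-exported PDFs, which is the common case described above.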

How zones are defined

You draw zones visually on a reference PDF using Foreman's zone editor — a pdf.js canvas with an SVG overlay. Each zone gets a name (e.g., "Drawing Number", "Revision", "Status") and normalised coordinates (0-1 range) so they scale to any page size.
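To make the normalised coordinates concrete, here is a small sketch that converts a 0-1 zone into page units. The `NormalisedZone` type, the zone values, and the function name are illustrative assumptions; only the 0-1 convention comes from the description above. The page sizes are the standard ISO A1 and A3 dimensions in millimetres.

```python
from typing import NamedTuple


class NormalisedZone(NamedTuple):
    name: str
    left: float
    top: float
    right: float
    bottom: float


def to_page_units(zone: NormalisedZone, page_width: float, page_height: float):
    """Scale a 0-1 zone to a (left, top, right, bottom) box in page units."""
    for value in zone[1:]:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{zone.name}: coordinate {value} is outside the 0-1 range")
    return (
        zone.left * page_width,
        zone.top * page_height,
        zone.right * page_width,
        zone.bottom * page_height,
    )


revision = NormalisedZone("Revision", left=0.90, top=0.94, right=0.98, bottom=0.98)

A1_LANDSCAPE_MM = (841, 594)
A3_LANDSCAPE_MM = (420, 297)

print(to_page_units(revision, *A1_LANDSCAPE_MM))
print(to_page_units(revision, *A3_LANDSCAPE_MM))
```

The same zone definition yields a different box on each sheet, always covering the same fraction of the page.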

Cross-validation with Content Match rules

Zone extraction gives you the raw text from each title block field. Content Match rules let you define what that text should say — and flag violations when it doesn't match.

Here is how you set up a cross-validation rule in Foreman, step by step:

1. Create a zone template

Upload a reference PDF of your standard drawing sheet. Draw zones around each title block field you want to check: drawing number, revision, status, originator, date, etc. Name each zone clearly.

2. Create a rule set with Content Match rules

Add a new rule set (e.g., "Title Block Validation"). Inside it, create Content Match rules that reference your zones. Each rule specifies a zone name and the expected value pattern.

3. Define the match pattern

Use regex patterns or attribute references. For example, the "Revision" zone must contain the value from the file's Revision custom attribute, or the "Drawing Number" zone must match the pattern extracted from the filename.

4. Run the check

Select your folders, assign the rule set and zone template, and run. Foreman downloads each PDF, extracts text from the defined zones, and evaluates every Content Match rule. Results show pass/fail per file per rule.

Rule Example | Zone | Expected Value | What It Catches
Revision must match metadata | Revision | File's Revision attribute value | Title block shows Rev C, Forma says Rev B
Drawing number matches filename | Drawing Number | Regex derived from filename | Leading zeros, missing prefixes, wrong project codes
Status code is consistent | Status | File's Document Status attribute | Title block says S2, metadata says S3
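The rules in the table boil down to two comparisons: zone text against an attribute value, and zone text against a pattern built from the filename. The sketch below is one way to express them, not Foreman's rule engine. The function names, the `RuleResult` type, and the whitespace and case normalisation are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from pathlib import PurePath


@dataclass(frozen=True)
class RuleResult:
    rule: str
    passed: bool
    detail: str


def normalise(value: str) -> str:
    """Collapse whitespace and ignore case so 'rev  c' equals 'Rev C'."""
    return " ".join(value.split()).casefold()


def match_attribute(rule: str, zone_text: str, attribute_value: str) -> RuleResult:
    """Pass when the zone text equals the metadata attribute after normalisation."""
    passed = normalise(zone_text) == normalise(attribute_value)
    return RuleResult(rule, passed, f"title block={zone_text!r}, metadata={attribute_value!r}")


def match_filename(rule: str, zone_text: str, filename: str) -> RuleResult:
    """Pass when the zone text is exactly the filename without its extension."""
    stem = PurePath(filename).stem
    pattern = re.compile(re.escape(stem), re.IGNORECASE)
    passed = pattern.fullmatch(zone_text.strip()) is not None
    return RuleResult(rule, passed, f"title block={zone_text!r}, filename stem={stem!r}")


results = [
    match_attribute("Revision must match metadata", "Rev C", "Rev B"),
    match_filename("Drawing number matches filename", "STR-DWG-0042", "STR-DWG-042.pdf"),
    match_attribute("Status code is consistent", "S2", "s2 "),
]

for result in results:
    print("PASS" if result.passed else "FAIL", "|", result.rule, "|", result.detail)
```

The first two rules fail on the mismatches described earlier (revision lag and differing leading zeros), while the third passes because only case and spacing differ.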

Handling different sheet sizes with size variants

A single project often produces drawings on A0, A1, and A3 sheets. The title block sits in the same corner on each size, but it typically keeps similar physical dimensions, so it covers a different fraction of the page and its coordinates differ. Without size-awareness, a zone template built for A1 sheets will extract garbage from an A3 drawing.

Foreman solves this with a parent-child template hierarchy:

A1

Parent template

Create your zone template on an A1 reference sheet. Define all the zones once — drawing number, revision, status, originator, etc.

A0

Size variant (larger)

Upload an A0 reference PDF as a variant. Foreman uses anchor-based repositioning to map zones to the new dimensions automatically.

A3

Size variant (smaller)

Upload an A3 reference PDF as another variant. Same zones, different coordinates — all managed under one template.
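How Foreman's anchor-based repositioning works internally is not described here. One plausible reading, shown purely as an assumption, is that each zone is stored as a physical offset from a page corner and then converted back to 0-1 coordinates for each sheet size. The `AnchoredZone` type, its millimetre values, and the bottom-right anchor are all hypothetical.

```python
from typing import NamedTuple


class AnchoredZone(NamedTuple):
    name: str
    from_right_mm: float   # gap between the zone's right edge and the page's right edge
    from_bottom_mm: float  # gap between the zone's bottom edge and the page's bottom edge
    width_mm: float
    height_mm: float


def reposition(zone: AnchoredZone, page_width_mm: float, page_height_mm: float):
    """Return (left, top, right, bottom) as 0-1 fractions of the given page."""
    right = page_width_mm - zone.from_right_mm
    bottom = page_height_mm - zone.from_bottom_mm
    left = right - zone.width_mm
    top = bottom - zone.height_mm
    if left < 0 or top < 0:
        raise ValueError(f"{zone.name} does not fit on a {page_width_mm} x {page_height_mm} mm page")
    return (
        left / page_width_mm,
        top / page_height_mm,
        right / page_width_mm,
        bottom / page_height_mm,
    )


revision = AnchoredZone("Revision", from_right_mm=10, from_bottom_mm=10, width_mm=30, height_mm=8)

a1 = reposition(revision, 841, 594)  # A1 landscape
a3 = reposition(revision, 420, 297)  # A3 landscape

print("A1:", [round(v, 3) for v in a1])
print("A3:", [round(v, 3) for v in a3])
```

Under this assumption, the same physical box lands on different fractions of the page for A1 and A3, which is why each size needs its own variant.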

Smart template matching

When Foreman runs a check, it detects each PDF's page dimensions and automatically selects the best-matching template variant. If page dimensions overlap between sizes, it also scores zone candidates by content to pick the most accurate match. You don't need to sort files by sheet size before running a check.
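The dimension-matching step can be sketched as a nearest-size lookup with a tolerance. This is an illustration under stated assumptions rather than Foreman's matching logic: the `Variant` type, the 5 mm tolerance, and the orientation handling are hypothetical, and the content-based scoring used for overlapping sizes is not modelled.

```python
from typing import NamedTuple, Optional, Sequence


class Variant(NamedTuple):
    name: str
    width_mm: float
    height_mm: float


def pick_variant(
    variants: Sequence[Variant],
    page_width_mm: float,
    page_height_mm: float,
    tolerance_mm: float = 5.0,  # assumption: allowable deviation per side
) -> Optional[Variant]:
    """Return the variant whose sheet size is closest to the page, or None."""
    # Compare short side to short side so portrait and landscape pages both match.
    page_short, page_long = sorted((page_width_mm, page_height_mm))
    candidates = []
    for variant in variants:
        short, long_ = sorted((variant.width_mm, variant.height_mm))
        error = max(abs(page_short - short), abs(page_long - long_))
        if error <= tolerance_mm:
            candidates.append((error, variant))
    if not candidates:
        return None
    return min(candidates, key=lambda item: item[0])[1]


variants = [
    Variant("A0", 1189, 841),
    Variant("A1", 841, 594),
    Variant("A3", 420, 297),
]

print(pick_variant(variants, 594.2, 840.9))  # a portrait A1 page selects the A1 variant
print(pick_variant(variants, 210, 297))      # an A4 page has no matching variant: None
```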

Validate extracted values against approved lists

Once you're extracting values from title blocks, a natural next question is: "Is this value actually valid?" For example, the suitability code zone says "S2" — is that on the approved list? The drawing number starts with "PROJ-ACE" — is ACE a registered originator?

Foreman's List Validation rule type lets you check extracted PDF zone values against reusable validation lists. Combined with Content Match and Content Convention rules, you get a complete title block QA pipeline: extract, validate format, cross-check against metadata, and verify against approved code lists. Read more: How to Validate ISO 19650 Naming Segments and How to Check Your MIDP/TIDP Register.
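At its core, checking a value against an approved list is a membership test after light normalisation. A minimal sketch, assuming hyphen-separated drawing numbers with the originator in the second segment; the lists below and the example drawing number are illustrative placeholders, not a real project register:

```python
from typing import Iterable


def normalise_code(value: str) -> str:
    """Trim whitespace and ignore case so ' s2 ' is treated as 'S2'."""
    return value.strip().upper()


def is_approved(value: str, approved: Iterable[str]) -> bool:
    """True when the value appears in the approved list after normalisation."""
    return normalise_code(value) in {normalise_code(item) for item in approved}


def segment(drawing_number: str, index: int) -> str:
    """Return the hyphen-separated segment at `index` (0-based), or '' if missing."""
    parts = drawing_number.strip().split("-")
    return parts[index] if 0 <= index < len(parts) else ""


# Illustrative lists only; a real project would load its own approved codes.
SUITABILITY_CODES = ["S0", "S1", "S2", "S3", "S4"]
ORIGINATORS = ["ACE"]

print(is_approved(" s2 ", SUITABILITY_CODES))                     # True
print(is_approved("S9", SUITABILITY_CODES))                       # False
print(is_approved(segment("PROJ-ACE-0042", 1), ORIGINATORS))      # True
print(is_approved(segment("PROJ-XYZ-0042", 1), ORIGINATORS))      # False
```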

Key takeaway

Title block mismatches are one of the most frequently cited findings in document control audits. They persist because the manual check — open every PDF, read the title block, compare to metadata — is too slow to do systematically. Automated zone extraction and Content Match rules turn a 16-hour manual exercise into a scheduled background job that runs every night. Mismatches are flagged before anyone issues a drawing for construction, not discovered months later at handover.

Stop checking title blocks manually

Automated PDF cross-validation is included with the 14-day free trial. No credit card required.
