Dynamic OCR: advanced document data extraction

Dynamic OCR is our most advanced technique to extract data points from documents. It can dynamically find fields that move or change size from one document to the next.

When do you need Dynamic OCR?

Use Dynamic OCR when you have documents with a similar layout but with fields that can either move or change size.

Field position is variable

Tables and optional fields are the main causes of layout shifts that move fields around in a document.

After table fields

Tables with a variable number of rows often lead to a shift of the layout below them.

After optional fields

Optional information such as an "address 2" line in an address or a customization option like the size or the color of an item can also lead to a shift of all information below it.

Field size is variable

Often, your fields are constrained to a fixed area on the document. But sometimes they are not, as with multi-line comments or multi-row tables.

Variable number of table rows

Because the number of rows in tables can change, the size of the table field will change too. You need to be able to tell the tool where to stop the table.

Variable number of lines

Fields that capture free-form text like comments or notes can span a different number of lines. Your data extraction tool needs to be able to understand where the field stops.

How does Dynamic OCR work?

Dynamic OCR introduces the concept of labels. A label is a piece of text on the document that will serve as an anchor for positioning your field. Instead of having your field at a fixed position on the page like in Zonal OCR, Parseur will first locate the label and then will use the label's position to locate the field relative to it.
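To make the idea concrete, here is a minimal Python sketch of label-relative positioning. It is an illustration only, not Parseur's implementation: the `Word` and `RelativeField` types, the `extract_relative` function and all coordinates are hypothetical, and OCR output is simplified to words identified by their top-left corner.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Word:
    """One OCR'd word and the top-left corner of its box (hypothetical model)."""
    text: str
    x: float
    y: float


@dataclass(frozen=True)
class RelativeField:
    """A field box defined as an offset from a label, not as a fixed page position."""
    label_text: str
    dx: float      # horizontal offset from the label's left edge
    dy: float      # vertical offset from the label's top edge
    width: float
    height: float


def extract_relative(words: List[Word], field: RelativeField) -> Optional[str]:
    """Locate the label first, then read the words inside the box placed relative to it."""
    label = next((w for w in words if w.text == field.label_text), None)
    if label is None:
        return None  # label not on the page: the field cannot be positioned
    left, top = label.x + field.dx, label.y + field.dy
    inside = [
        w for w in words
        if w is not label
        and left <= w.x < left + field.width
        and top <= w.y < top + field.height
    ]
    inside.sort(key=lambda w: (w.y, w.x))  # reading order: top to bottom, left to right
    return " ".join(w.text for w in inside)


# The same field definition works on both pages even though "Total:" moved down
# by 40 units on page_b (for example because a table above it gained a row).
total_field = RelativeField("Total:", dx=90, dy=0, width=100, height=10)
page_a = [Word("Total:", 50, 300), Word("120.00", 140, 300)]
page_b = [Word("Widget", 140, 300), Word("Total:", 50, 340), Word("95.50", 140, 340)]

print(extract_relative(page_a, total_field))  # 120.00
print(extract_relative(page_b, total_field))  # 95.50
```

A fixed-position box tuned on `page_a` would have captured "Widget" on `page_b`. Anchoring the box to the label avoids that.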

1. Create a Label

Draw a box over a piece of text you want to use as an anchor and click Create Label.

2. Create a field and make it relative to the label

Draw a box over the data you want to capture, name the field, and select the label you created in the field options. This tells Parseur that the field position is not fixed but relative to the label position.

3. Create a closing label (optional)

If your field can change in size, you can also create a second label below the field and use it as a closing label. Parseur will use your first label to find the start of the field and the second label to determine the end.

Repeat

As simple as that! Repeat the operation for every field and you're done. Different fields can share the same labels. Labels, like fields, can be declared optional or mandatory.
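The closing label from step 3 can be sketched in a similar way. The snippet below is a simplified illustration, not Parseur's implementation: it assumes the document has already been reduced to a list of text lines, and the `extract_between` function and the sample labels are hypothetical. Returning `None` when either label is missing is a choice made for this sketch.

```python
from typing import List, Optional


def extract_between(
    lines: List[str], start_label: str, closing_label: str
) -> Optional[List[str]]:
    """Return the lines found after start_label and before closing_label."""
    start = next((i for i, line in enumerate(lines) if start_label in line), None)
    if start is None:
        return None  # opening label not found
    end = next(
        (i for i in range(start + 1, len(lines)) if closing_label in lines[i]),
        None,
    )
    if end is None:
        return None  # closing label not found: the end of the field is unknown
    return lines[start + 1:end]


# The same two labels capture a one-line comment and a three-line comment.
short_doc = ["Comments", "Fragile.", "Signature"]
long_doc = ["Comments", "Fragile.", "Leave at door.", "Call first.", "Signature"]

print(extract_between(short_doc, "Comments", "Signature"))
print(extract_between(long_doc, "Comments", "Signature"))
```

The field grows or shrinks with the content because its end is found on each document instead of being fixed in the template.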

Differences between Dynamic OCR and AI OCR

You may have come across the term AI OCR, or OCR powered by Machine Learning (ML). Those engines use neural networks to locate fields and extract data. Parseur uses Machine Learning and NLP models only during the first OCR stage, when decoding characters on a document. After that, we don't use ML anymore, as we find that the downsides of AI OCR outweigh its upsides. Here is why.

Dynamic OCR

Immediate setup

Ready to parse your data right after the first sample is uploaded and the template is created.

Parse any content

Works on any type of document content, industry or language. Our customers around the world use Dynamic OCR on a wide range of documents in many different languages.

Blazing fast

Once the initial OCR is done, documents are usually processed in less than 1 second!

Instant data parsing and export

Extracted data is usually of high quality and can be sent directly to other systems downstream for further processing. Some customers require us to provide data seconds after it is received, without human intervention.

Easy to maintain and troubleshoot

When a document isn't processed, the tool tells you exactly which label or field wasn't extracted and why, so you can fix the template quickly.

⚠️ Not likely to automatically parse a brand new layout

This is arguably the only downside of Dynamic OCR. If new documents have a layout the tool hasn't seen before, it will likely not be able to parse them and will ask you to create a new template.
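As an illustration of the kind of diagnostic described under "Easy to maintain and troubleshoot", the following Python sketch reports which mandatory labels could not be found in a document and skips optional ones silently. It is a sketch under stated assumptions, not Parseur's implementation: the `LabelRule` type, the `check_labels` function and the message wording are hypothetical, and label matching is reduced to a plain substring search.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LabelRule:
    """A label the template expects to find (hypothetical model)."""
    name: str
    text: str
    mandatory: bool = True


def check_labels(document_text: str, rules: List[LabelRule]) -> List[str]:
    """Return one human-readable problem per mandatory label that is missing."""
    problems = []
    for rule in rules:
        found = rule.text in document_text
        if not found and rule.mandatory:
            problems.append(
                f"mandatory label '{rule.name}' not found: "
                f"no text matching '{rule.text}'"
            )
    return problems


rules = [
    LabelRule("total", "Total:"),
    LabelRule("address2", "Address 2", mandatory=False),  # optional: may be absent
]

# A document with an unfamiliar layout: the mandatory "Total:" label is missing.
for problem in check_labels("Invoice #42\nAmount due 10", rules):
    print(problem)
```

Because each rule is explicit, a failure points to a specific label, which is what makes a template-based approach straightforward to debug.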

AI OCR

⚠️ Requires training

Requires dozens or hundreds of sample documents along with manual inputs before yielding reliable results.

⚠️ Restricted to specific verticals

Pre-trained AIs are limited to specific verticals (like invoices or receipts) and languages. If your documents don't fit a pre-trained model, you will need to upload many annotated samples before getting reliable results.

⚠️ Average performance

Machine Learning algorithms are power-hungry and rely on complex computational models, so documents usually take longer to process.

⚠️ Manual review required

Because AIs are probabilistic, AI-based tools often recommend that you add a manual data review step (also known as a human-in-the-loop process). Not only does this make the process slower, but it also leads to much higher operating costs.

⚠️ Black box

Machine Learning models are a black box. When it works, it's great. When it fails, there is not much you can do besides manually annotating the document, retraining the model and hoping for the best.

Can parse brand new layouts

AI OCR can theoretically adapt to brand new layouts. That being said, data extraction accuracy will depend on the particular AI model and may still require manual inputs.

The best Intelligent Document Processing software

Dynamic OCR, along with the rest of our data extraction features, makes Parseur the most versatile data extraction platform for documents.

Best-in-class OCR software

Parseur OCR accuracy is the best on the market. It supports most languages, including handwritten text, and is blazingly fast.

Powerful template engine

Extract data from various layouts by creating multiple templates and using automatic layout detection.

Zonal OCR

With Zonal OCR, extract text from fields that are at a fixed position on every similar document.

All-in-one data extraction software. Start using Parseur today.

Automate text extraction from emails, PDFs and spreadsheets.
Save hundreds of hours of manual work.
Embrace work automation.
