January update: export last document, pick label from the bottom and more

by Sylvestre Dupont

It is hopefully not too late, so Happy New Year!

It's been quite some time since our last update. Rest assured we did not stay idle. Since releasing our major PDF parsing upgrade, we've worked non-stop tuning and improving our new engine to cater for more and more use cases.

We're starting 2023 with a few small improvements that we hope will make your parsing workflow better.

New: Export last document data only

A few customers asked for a simple way to download the data of the last document they had parsed. A typical use case: you receive a daily report containing updates. As soon as a new report arrives, the previous one becomes obsolete, so you only want the data from the freshest document.

We added a "Last document only" option to the Download and Google Sheets exports for this purpose.

The new "Last document only" option is available in the Export section of your mailbox
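
To make the behavior concrete, here is a minimal sketch of what "last document only" means for exported rows. It is not Parseur's implementation: the row layout and the `document_id` and `received_at` field names are assumptions made up for this illustration.

```python
from datetime import datetime

def last_document_only(rows):
    """Keep only the rows that belong to the most recently received document.

    `rows` is a list of dicts. The "document_id" and "received_at" keys are
    hypothetical names chosen for this sketch, not Parseur's export schema.
    """
    if not rows:
        return []
    # Find the row with the latest timestamp, then keep every row
    # that shares its document id (a document can yield several rows).
    latest = max(rows, key=lambda row: row["received_at"])
    return [row for row in rows if row["document_id"] == latest["document_id"]]

rows = [
    {"document_id": "doc-1", "received_at": datetime(2023, 1, 9), "total": "120.00"},
    {"document_id": "doc-2", "received_at": datetime(2023, 1, 10), "total": "135.50"},
    {"document_id": "doc-2", "received_at": datetime(2023, 1, 10), "total": "12.00"},
]
latest_rows = last_document_only(rows)
```

In this sketch, the rows from yesterday's report (`doc-1`) are dropped and only the rows from the newest report remain.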

New: Find labels starting from the bottom of the document

Labels power our flagship Dynamic OCR feature, which lets you extract data fields that move horizontally or vertically in documents.

When creating a label in an OCR Template, Parseur will automatically compute the occurrence and total number of occurrences of that label in the document. Parseur will then use this information to compute the position of the label in case there is more than one occurrence.

Label occurrence is calculated from the top of the document by default. Sometimes, however, you want to tell Parseur that the label should be located from the bottom of the document instead. For example, you may want to always take the last occurrence of "Total" in a document even if the total number of occurrences varies from one document to the next.

We added the option to count occurrences from the bottom instead of the top on the label option screen.

New way of positioning label occurrences from the bottom of the document

In this example, we set the label as the first occurrence of all "Total:" labels counted from the bottom of the document, effectively asking Parseur to always take the last one.
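
The idea of counting occurrences from either end can be sketched in a few lines. This is only an illustration of the counting logic, not how Parseur locates labels internally: the `find_label_line` helper, its parameters, and the representation of a document as a list of text lines are all assumptions.

```python
def find_label_line(lines, label, occurrence=1, from_bottom=False):
    """Return the index of the line holding the requested occurrence of `label`.

    Hypothetical helper for illustration only. `occurrence` is 1-based and is
    counted from the top by default, or from the bottom when `from_bottom`
    is True. Returns None when there are not enough occurrences.
    """
    matches = [i for i, line in enumerate(lines) if label in line]
    if occurrence < 1 or occurrence > len(matches):
        return None
    # Negative indexing counts from the end of the list of matches.
    return matches[-occurrence] if from_bottom else matches[occurrence - 1]

page = [
    "Item A   10.00",
    "Total: 10.00",
    "Item B    5.00",
    "Total: 15.00",
    "Thank you",
]
first_from_top = find_label_line(page, "Total:", occurrence=1)
first_from_bottom = find_label_line(page, "Total:", occurrence=1, from_bottom=True)
```

Here `first_from_top` points at the first "Total:" line, while `first_from_bottom` points at the last one, however many "Total:" lines appear above it.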

Other improvements and bug fixes

  • We made many updates behind the scenes to correctly handle the strangest types of PDFs (PDFs come in all shapes and flavors)
  • The field usage page in your mailbox now includes fields used in OCR templates as well

That's all for this month! As usual, please don't hesitate to share your use cases and feature requests on the chat or on our feedback page directly.
