Extract metadata from emails and documents with Metadata fields

How to use Metadata Fields to automatically add document metadata to your parsed result

This article explains how to include metadata such as the received time, sender, and subject in your parsed data, and lists the metadata available in Parseur.

Using Metadata fields to extract metadata

Parseur calls the fields that carry document metadata "Metadata fields", as opposed to the "Custom fields" that you define when creating templates.

You can add Metadata fields in two ways:

Option 1: Through the Fields menu section

  1. Open your Parseur mailbox

  2. Click on the Fields section on the left-hand side menu

  3. This section lists all available Metadata fields alongside your Custom fields

  4. Click on the Metadata fields you need. You can also hover over a field to get more information about it.

Option 2: Through the template editor

  1. Open your Parseur mailbox

  2. Edit any template

  3. Click on the Metadata tab, at the right of the document

  4. Click on the fields you need.

Note 1: Changes to your Metadata field selection only apply to newly parsed documents. To see those fields in documents that were already parsed, you need to reprocess them. To do so, head over to the document queue and use one of the reprocess buttons to re-run the parsing. You can reprocess a single document or all documents at once.

Note 2: The Metadata field selection is global to a mailbox. Even if you select Metadata fields from the template editor (Option 2), they will appear in every parsed document.

List of available metadata fields in Parseur

Parseur can parse different types of document metadata.

Date and Time metadata

  • Received: date and time when Parseur received the document

  • ReceivedDate: date when Parseur received the document

  • ReceivedTime: time of the day when Parseur received the document

Note: These fields are formatted according to your Date and Time formatting preferences. Head over to your User Preferences to review and update them.
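
Because the format of these fields follows your preferences, any code that consumes them has to parse them with a matching format. Here is a minimal Python sketch, assuming hypothetical preferences of ISO dates (YYYY-MM-DD) and 24-hour times (HH:MM:SS). The function name and the default format strings are illustrative assumptions, not Parseur settings; adjust them to whatever your own preferences produce.

```python
from datetime import datetime


def parse_received(
    received_date: str,
    received_time: str,
    date_format: str = "%Y-%m-%d",  # assumption: ISO date preference
    time_format: str = "%H:%M:%S",  # assumption: 24-hour time preference
) -> datetime:
    """Combine ReceivedDate and ReceivedTime strings into one datetime.

    The format strings must mirror the Date and Time formatting
    preferences configured for your account.
    """
    return datetime.strptime(
        f"{received_date} {received_time}",
        f"{date_format} {time_format}",
    )
```

If you change your formatting preferences later, the format strings passed to this helper need to change with them, otherwise strptime raises a ValueError.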

Email address metadata

  • Sender: the email address that sent the email to Parseur. This is usually the same address as the OriginalRecipient address, unless your mailbox receives emails from different aliases (or is a catch-all). The name associated with this email, if any, is put into the SenderName field below.

  • SenderName: the name of the person who sent the email to Parseur. It's extracted from the From field in the original email, discarding the associated email address that can be found in the Sender field above.

  • Recipient: the email address that receives the email. It is your Parseur mailbox address (in the form [email protected])

  • To: the "To" field of the email. The "To" field can contain several email addresses.

  • CC: the "CC" field of the email. The "CC" field can contain several email addresses.

  • BCC: the "BCC" field of the email. You can only see this field if you're the one being BCCed (you cannot see if other people are BCCed, by design)

  • ReplyTo: the email address to reply to (if set)

  • RecipientSuffix: the recipient suffix (or alias suffix) that you used. Say you have created a mailbox named [email protected]. You can send emails to [email protected] or [email protected] and all emails will land in the same mailbox. When you use such aliases, the RecipientSuffix field contains what is after the + (for example test123 and id456 in the examples given before). This is particularly useful if you forward emails from different sources and want to know which source sent what email.

  • OriginalRecipient: the email address that received the email before it was forwarded to Parseur. Note: this only works after you set up automatic forwarding of your emails (otherwise it is equal to the Recipient)
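
To make the RecipientSuffix rule above concrete, here is a small Python sketch of the "what is after the +" logic. It only illustrates the rule as described in this article and is not Parseur's implementation. The function name and the example.com addresses in the usage note are hypothetical.

```python
from typing import Optional


def recipient_suffix(address: str) -> Optional[str]:
    """Return the text after the first '+' in the local part, or None."""
    # Split on the last '@' so that only the local part is inspected.
    local_part, _, _domain = address.rpartition("@")
    if not local_part:
        # No '@' found (or empty local part): nothing to extract.
        return None
    _base, plus, suffix = local_part.partition("+")
    # Treat a missing '+' and an empty suffix the same way.
    return suffix if plus and suffix else None
```

For a hypothetical address such as invoices+test123@example.com this returns test123, and for invoices@example.com it returns None. In practice you would read the RecipientSuffix field directly; the sketch is mainly useful for checking which suffix a given alias will produce.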

Document Content metadata

  • Subject: the title of the document. Depending on the type of document, this is either: the subject of the email, the filename of the attachment or the URL of the linked web page

  • HtmlDocument: the full content of the document including HTML formatting

  • TextDocument: the full content of the document in plain text (excluding any HTML formatting)

  • OriginalDocument: the name, content type, file size and download URL of the original document.

  • LastReply: the content of the last reply in the email chain (in plain text). Note: this field is limited to English text replies without forward heads, and is currently tested on the following email platforms: Yahoo, iCloud, Gmail, Outlook.com, iOS Mail, Apple Mail, Microsoft Outlook (Windows & Mac), and Mozilla Thunderbird. Parseur makes a “best attempt” to parse all inbound replies. Parseur cannot use the HTML part of a reply to populate this field: it is only populated when the reply contains a plain text part.

  • Attachments: a list of all documents attached to the email along with the URLs to retrieve them.

  • Headers: an object containing all the email headers' names and values. This is raw technical data from the underlying SMTP email protocol. Here you'll find data such as In-Reply-To or Message-ID and many others. Use this field if you can't find the metadata you are looking for in another metadata field.

These fields are useful if you set up a trigger for when a document cannot be parsed. This way, you not only get a real-time notification when a document fails to parse, but you can also check the title and content of the document without having to log in to Parseur.
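
As an illustration of working with the Headers field described above, here is a Python sketch that looks up a header by name. It assumes the Headers object reaches your code as a mapping from header name to value; that shape, the function names, and the sample values in the usage note are assumptions for illustration. The lookup ignores case because email header names are case-insensitive.

```python
from typing import Mapping, Optional


def get_header(headers: Mapping[str, str], name: str) -> Optional[str]:
    """Return the value of the first header matching `name`, ignoring case."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None


def is_reply(headers: Mapping[str, str]) -> bool:
    """Treat an email as a reply when it carries an In-Reply-To header."""
    return get_header(headers, "In-Reply-To") is not None
```

For example, with a hypothetical mapping {"Message-ID": "<abc@example.com>"}, get_header(headers, "message-id") returns "<abc@example.com>" and is_reply(headers) returns False.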

Parseur-specific metadata

  • DocumentID: a unique ID that identifies the document in Parseur

  • ParentID: ID of the parent of the document (if any). For example, if you send an email with attachments, the attachment ParentID will be the email DocumentID.

  • DocumentURL: a link to the document in the Parseur app. Useful if you have an integration where you want to be able to quickly open the app and check the document. This link redirects to the Parseur app, so you must be authenticated with Parseur to access it.

  • PublicDocumentURL: a public link to the document. Be careful when sharing this link, as anyone who has it can access your document without any authentication.

  • Template: the name of the Parseur template that was used to parse the document.
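
The DocumentID and ParentID relationship described above can be used to regroup attachments with the email they came from. Here is a minimal Python sketch, assuming each parsed result reaches your code as a dictionary keyed by field name with both Metadata fields enabled. The dictionary shape, the function name, and the sample IDs in the usage note are assumptions for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping


def group_by_parent(results: Iterable[Mapping[str, Any]]) -> Dict[str, List[str]]:
    """Map each parent DocumentID to the DocumentIDs of its children."""
    children: Dict[str, List[str]] = defaultdict(list)
    for result in results:
        parent_id = result.get("ParentID")
        if parent_id:
            # Only documents that have a parent (e.g. attachments) are grouped.
            children[parent_id].append(result["DocumentID"])
    return dict(children)
```

Given hypothetical results for an email "doc-1" and two attachments "doc-2" and "doc-3" whose ParentID is "doc-1", the function returns {"doc-1": ["doc-2", "doc-3"]}.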
