ExtractEmailHeaders

Description:

Using the FlowFile content as the source of data, extracts headers from an RFC-compliant email file and adds the relevant attributes to the FlowFile. This processor does not perform extensive RFC validation, but it still requires minimal compliance with RFC 2822.

Tags:

split, email

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values.

Display Name | API Name | Default Value | Allowable Values | Description
Additional Header List | CAPTURED_HEADERS | x-mailer | | Colon-separated list of additional headers to be extracted from the FlowFile content. NOTE: the header key is case-insensitive and will be matched as lower-case. Values will respect email contents.
Email Address Parsing | STRICT_ADDRESS_PARSING | Strict Address Parsing | Strict Address Parsing, Non-Strict Address Parsing | If "strict", strict address format parsing rules are applied to mailbox and mailbox list fields, such as the "to" and "from" headers, and FlowFiles with poorly formed addresses are routed to the failure relationship, like messages that fail RFC-compliant format validation. If "non-strict", the processor extracts the contents of mailbox list headers as comma-separated values without attempting to parse each value as a well-formed Internet mailbox address. This property is optional and defaults to Strict Address Parsing.

Allowable values for Email Address Parsing:
  • Strict Address Parsing: Strict email address format will be enforced. FlowFiles will be transferred to the failure relationship if the email address is invalid.
  • Non-Strict Address Parsing: Accepts emails even if the address is poorly formed and does not strictly comply with RFC validation.
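
The difference between the two parsing modes can be illustrated with a minimal Python sketch. This is not the processor's implementation: the function name `parse_mailboxes`, the simple `local@domain` check, and the choice to return only the bare address are assumptions made for illustration, and the real strict rules are more detailed than this check.

```python
from email.utils import getaddresses


def parse_mailboxes(header_value, strict=True):
    """Return the mailboxes found in a To/From/Cc/Bcc header value.

    Hypothetical helper: it only approximates the two modes described above.
    """
    if not strict:
        # Non-strict: keep the comma-separated values as they appear,
        # without checking that each one is a well-formed mailbox.
        return [part.strip() for part in header_value.split(",") if part.strip()]

    mailboxes = []
    for _display_name, addr in getaddresses([header_value]):
        local, at_sign, domain = addr.partition("@")
        # Simplified stand-in for strict validation (an assumption):
        # require a non-empty local part and domain around one "@".
        if not (local and at_sign and domain):
            # In the processor, this case corresponds to routing to "failure".
            raise ValueError(f"poorly formed address in header: {header_value!r}")
        mailboxes.append(addr)
    return mailboxes
```

In this sketch, a header such as `alice@example.com, not an address` raises an error in strict mode, while non-strict mode returns both comma-separated values unchanged.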

Relationships:

Name | Description
success | Extraction was successful
failure | FlowFiles that could not be parsed as an RFC 2822 compliant message

Reads Attributes:

None specified.

Writes Attributes:

Name | Description
email.headers.bcc.* | Each individual BCC recipient (if available)
email.headers.cc.* | Each individual CC recipient (if available)
email.headers.from.* | Each individual mailbox contained in the From of the Email (array as per RFC-2822)
email.headers.message-id | The value of the Message-ID header (if available)
email.headers.received_date | The Received-Date of the message (if available)
email.headers.sent_date | Date the message was sent
email.headers.subject | Subject of the message (if available)
email.headers.to.* | Each individual TO recipient (if available)
email.attachment_count | Number of attachments of the message
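
To make the attribute naming concrete, here is a rough Python sketch that builds a comparable attribute map from a raw message using the standard `email` module. It is an approximation, not the processor's code: the function name `extract_header_attributes`, the zero-based index used for the `.*` suffix, and the `email.headers.<lower-case name>` key for additional headers are assumptions. `email.headers.received_date` and `email.attachment_count` are omitted for brevity.

```python
import email
from email.utils import getaddresses


def extract_header_attributes(raw_message, captured_headers="x-mailer"):
    """Map email headers to attribute names like those listed above.

    Hypothetical helper; `captured_headers` mimics the colon-separated
    Additional Header List property.
    """
    msg = email.message_from_string(raw_message)
    attrs = {}

    # One attribute per mailbox; the zero-based index is an assumption.
    for field in ("bcc", "cc", "from", "to"):
        values = msg.get_all(field, [])
        for index, (_name, addr) in enumerate(getaddresses(values)):
            attrs[f"email.headers.{field}.{index}"] = addr

    # Single-valued headers, written only when present.
    single_valued = (
        ("Message-ID", "email.headers.message-id"),
        ("Date", "email.headers.sent_date"),
        ("Subject", "email.headers.subject"),
    )
    for header, key in single_valued:
        if msg[header] is not None:
            attrs[key] = msg[header]

    # Additional headers: keys are matched as lower-case, values kept as-is.
    for name in captured_headers.split(":"):
        key = name.strip().lower()
        if key and msg[key] is not None:
            attrs[f"email.headers.{key}"] = msg[key]

    return attrs
```

Passing `captured_headers="x-mailer:x-priority"`, for example, would also capture an `X-Priority` header if the message contains one, because header lookup is case-insensitive.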

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

None specified.