FuzzyHashContent

Description:

Calculates a fuzzy (locality-sensitive) hash value for the content of a FlowFile and puts that hash value on the FlowFile as an attribute whose name is determined by the <Hash Attribute Name> property.

Note: This processor offers only non-cryptographic hash algorithms and should not be seen as a replacement for the HashContent processor.

Note: The underlying library loads the entire streamed content into memory and evaluates results there. It is therefore important to consider the expected profile of the content evaluated by this processor, and the hardware supporting it, especially when working with large files.

Tags:

hashing, fuzzy-hashing, cyber-security

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values.

Name | Default Value | Allowable Values | Description
Hash Attribute Name | fuzzyhash.value | | The name of the FlowFile attribute that should hold the fuzzy hash value
Hashing Algorithm | | ssdeep, tlsh | The hashing algorithm utilised
  • ssdeep: Uses ssdeep / SpamSum 'context triggered piecewise hash'.
  • tlsh: Uses TLSH (Trend 'Locality Sensitive Hash'). Note: FlowFile content must be at least 512 characters long.
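
The minimum-length note for tlsh can be checked before content reaches the processor. The following Python sketch is illustrative only and is not part of NiFi: the function name meets_minimum_length is hypothetical, and it assumes the 512-character minimum can be approximated by counting bytes.

```python
# Minimum content length for tlsh, taken from the property description above.
TLSH_MIN_LENGTH = 512

SUPPORTED_ALGORITHMS = ("ssdeep", "tlsh")


def meets_minimum_length(algorithm: str, content: bytes) -> bool:
    """Return True if content is long enough for the chosen algorithm.

    Assumption: length is counted in bytes, whereas the documentation
    says "characters". Adjust if the content is multi-byte text.
    """
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm!r}")
    if algorithm == "tlsh":
        return len(content) >= TLSH_MIN_LENGTH
    # This page documents no minimum length for ssdeep.
    return True
```

A flow could use a check of this kind to route short content away from the processor when tlsh is selected.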

Relationships:

Name | Description
success | Any FlowFile that is successfully hashed will be sent to this Relationship.
failure | Any FlowFile that cannot be successfully hashed will be sent to this Relationship.

Reads Attributes:

None specified.

Writes Attributes:

Name | Description
<Hash Attribute Name> | This processor adds an attribute whose value is the result of hashing the existing FlowFile content. The name of this attribute is specified by the <Hash Attribute Name> property.
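
The attribute write and the routing to a relationship can be modelled in a few lines. This Python sketch is a simplified model under stated assumptions, not the processor's implementation: apply_fuzzy_hash is a hypothetical name, FlowFile attributes are represented as a plain dictionary, and hash_fn stands in for whichever fuzzy hash library is used. It also assumes that content which cannot be hashed goes to failure with its attributes unchanged.

```python
from typing import Callable, Dict, Tuple


def apply_fuzzy_hash(
    content: bytes,
    attributes: Dict[str, str],
    hash_fn: Callable[[bytes], str],
    hash_attribute_name: str = "fuzzyhash.value",  # default from the Properties table
) -> Tuple[str, Dict[str, str]]:
    """Return (relationship, attributes) for one piece of content.

    hash_fn is a placeholder for a real ssdeep or tlsh implementation.
    """
    try:
        value = hash_fn(content)
    except Exception:
        # Assumption: hashing errors route to "failure", attributes untouched.
        return "failure", dict(attributes)
    updated = dict(attributes)
    updated[hash_attribute_name] = value
    return "success", updated
```

Called with any function that maps bytes to a string, the model returns "success" together with a copy of the attributes that includes the new hash attribute. If hash_fn raises, it returns "failure" and an unmodified copy.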

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

See Also:

CompareFuzzyHash, HashContent