CompareFuzzyHash

Description:

Compares an attribute containing a Fuzzy Hash against a file containing a list of fuzzy hashes, appending an attribute to the FlowFile in case of a successful match.

Tags:

hashing, fuzzy-hashing, cyber-security

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values.

Name | Default Value | Allowable Values | Description
Hash List source file | | | Path to the file containing hashes to be validated against
Hashing Algorithm | | ssdeep, tlsh | The hashing algorithm utilised
  • ssdeep: Uses ssdeep / SpamSum 'context triggered piecewise hash'.
  • tlsh: Uses TLSH (Trend 'Locality Sensitive Hash'). Note: FlowFile Content must be at least 512 characters long.
Hash Attribute Name | fuzzyhash.value | | The name of the FlowFile Attribute that should hold the Fuzzy Hash Value
Match threshold | | | The similarity score must exceed or be equal to this value in order for a match to be considered true. Refer to Additional Information for differences between TLSH and SSDEEP scores and how they relate to this property.
Matching mode | single | single, multi-match | Defines if the Processor should try to match as many entries as possible (multi-match) or if it should stop after the first match (single)
  • single: Send the FlowFile to found after the first match above threshold.
  • multi-match: Iterate the full list of hashes before deciding to send the FlowFile to found or not-found.
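
The interaction between Match threshold and Matching mode can be illustrated with a minimal sketch. This is not the Processor's code: find_matches and toy_score are hypothetical names, the scorer is a stand-in for a real ssdeep or TLSH comparison, and the sketch assumes that a higher score means greater similarity (TLSH scores may behave differently, see Additional Information).

```python
from typing import Callable, Iterable, List, Tuple


def find_matches(
    candidate: str,
    known_hashes: Iterable[str],
    score: Callable[[str, str], float],
    threshold: float,
    mode: str = "single",
) -> List[Tuple[str, float]]:
    """Return (hash, similarity) pairs whose similarity is >= threshold.

    Hypothetical helper: assumes higher scores mean more similar.
    """
    if mode not in ("single", "multi-match"):
        raise ValueError(f"unknown matching mode: {mode}")
    matches: List[Tuple[str, float]] = []
    for known in known_hashes:
        similarity = score(candidate, known)
        if similarity >= threshold:
            matches.append((known, similarity))
            if mode == "single":
                break  # single: stop after the first match
    return matches


def toy_score(a: str, b: str) -> float:
    """Placeholder scorer (NOT ssdeep or TLSH): percent of equal positions."""
    if not a or not b:
        return 0.0
    same = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * same / max(len(a), len(b))
```

With the placeholder scorer, single mode returns at most one entry, while multi-match keeps scanning the list and may return several.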

Relationships:

Name | Description
failure | Any FlowFile that cannot be matched (e.g. because it lacks the attribute) will be sent to this Relationship.
not-found | Any FlowFile that cannot be matched to an existing hash will be sent to this Relationship.
found | Any FlowFile that is successfully matched to an existing hash will be sent to this Relationship.
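
The routing described by these three Relationships might be summarised as follows. This is a sketch only: choose_relationship is a hypothetical helper, and it assumes that a missing or empty hash attribute is the only failure condition, which the Processor may not limit itself to.

```python
from typing import Mapping


def choose_relationship(
    attributes: Mapping[str, str],
    hash_attribute_name: str,
    match_count: int,
) -> str:
    """Pick a relationship name from the FlowFile attributes and match count."""
    # Assumption: a missing or empty hash attribute means nothing can be compared.
    if not attributes.get(hash_attribute_name):
        return "failure"
    return "found" if match_count > 0 else "not-found"
```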

Reads Attributes:

None specified.

Writes Attributes:

Name | Description
XXXX.N.match | The match that resembles the attribute specified by the <Hash Attribute Name> property. Note that 'XXXX' gets replaced with the <Hash Attribute Name>.
XXXX.N.similarity | The similarity score between this FlowFile and its match of the same number N. Note that 'XXXX' gets replaced with the <Hash Attribute Name>.
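
As an illustration of the naming pattern, the sketch below builds the attribute names for a list of matches. build_match_attributes is a hypothetical helper, and the starting value of N is an assumption (0 here, adjustable through first_index), since this page does not state where numbering begins.

```python
from typing import Dict, Sequence, Tuple


def build_match_attributes(
    hash_attribute_name: str,
    matches: Sequence[Tuple[str, float]],
    first_index: int = 0,  # assumption: the page does not say where N starts
) -> Dict[str, str]:
    """Map each (hash, similarity) pair to XXXX.N.match / XXXX.N.similarity."""
    attrs: Dict[str, str] = {}
    for n, (matched_hash, similarity) in enumerate(matches, start=first_index):
        attrs[f"{hash_attribute_name}.{n}.match"] = matched_hash
        attrs[f"{hash_attribute_name}.{n}.similarity"] = str(similarity)
    return attrs
```

With the default Hash Attribute Name, a single match would produce names of the form fuzzyhash.value.N.match and fuzzyhash.value.N.similarity.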

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

See Also:

FuzzyHashContent