SplitJson

Description:

Splits a JSON file into multiple, separate FlowFiles, one for each element of the array specified by a JsonPath expression. Each generated FlowFile contains one element of the specified array and is transferred to the 'split' relationship, while the original file is transferred to the 'original' relationship. If the specified JsonPath is not found or does not evaluate to an array, the original file is routed to 'failure' and no files are generated.
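The routing described above can be illustrated with a minimal Python sketch. It is not the processor's implementation: the function name `split_json` is hypothetical, and the `keys` list is a simplified stand-in for a real JsonPath expression, used only because the Python standard library has no JsonPath evaluator.

```python
import json

def split_json(content, keys):
    """Route a JSON document roughly as the description outlines.

    `keys` is a hypothetical simplification of a JsonPath expression:
    a list of object keys to follow from the root, e.g. ["store", "books"].
    """
    routed = {"split": [], "original": [], "failure": []}
    try:
        node = json.loads(content)
        for key in keys:
            node = node[key]  # KeyError/TypeError when the path is missing
    except (ValueError, KeyError, TypeError):
        # Invalid JSON or a path that does not exist: only 'failure' receives the file.
        routed["failure"].append(content)
        return routed
    if not isinstance(node, list):
        # The path exists but does not point at an array.
        routed["failure"].append(content)
        return routed
    # One fragment per array element, plus the untouched original.
    routed["split"] = [json.dumps(element) for element in node]
    routed["original"].append(content)
    return routed

doc = '{"store": {"books": [{"id": 1}, {"id": 2}]}}'
print(split_json(doc, ["store", "books"])["split"])  # ['{"id": 1}', '{"id": 2}']
print(split_json(doc, ["store"])["failure"] == [doc])  # True: not an array
```

The sketch returns all three relationships in one dictionary so that the "no files are generated on failure" rule is visible: on any failure path, 'split' and 'original' stay empty.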

Tags:

json, split, jsonpath

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values.

Display Name | API Name | Default Value | Allowable Values | Description
JsonPath Expression | JsonPath Expression | | | A JsonPath expression that indicates the array element to split into JSON/scalar fragments.
Null Value Representation | Null Value Representation | empty string | empty string; the string 'null' | Indicates the desired representation of JSON Path expressions resulting in a null value.
Max String Length | Max String Length | 20 MB | | The maximum allowed length of a string value when parsing the JSON document.
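As a rough illustration of the Null Value Representation property, the following Python sketch shows how a null array element might be rendered under each allowable value. The function name `render_fragment` is hypothetical, and the sketch assumes the setting is applied per array element; it models only null handling, not how the processor serializes other scalars.

```python
import json

# Maps each allowable value from the table to the text written for a null result.
NULL_REPRESENTATIONS = {
    "empty string": "",
    "the string 'null'": "null",
}

def render_fragment(element, null_value_representation="empty string"):
    """Return the fragment content for one array element (hypothetical helper)."""
    if element is None:
        return NULL_REPRESENTATIONS[null_value_representation]
    return json.dumps(element)

elements = json.loads('[{"id": 1}, null]')
print([render_fragment(e) for e in elements])
# ['{"id": 1}', '']
print([render_fragment(e, "the string 'null'") for e in elements])
# ['{"id": 1}', 'null']
```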

Relationships:

Name | Description
failure | If a FlowFile fails processing for any reason (for example, the FlowFile is not valid JSON or the specified path does not exist), it will be routed to this relationship.
original | The original FlowFile that was split into segments. If the FlowFile fails processing, nothing will be sent to this relationship.
split | All segments of the original FlowFile will be routed to this relationship.

Reads Attributes:

None specified.

Writes Attributes:

Name | Description
fragment.identifier | All split FlowFiles produced from the same parent FlowFile will have the same randomly generated UUID added for this attribute.
fragment.index | A one-up number that indicates the ordering of the split FlowFiles that were created from a single parent FlowFile.
fragment.count | The number of split FlowFiles generated from the parent FlowFile.
segment.original.filename | The filename of the parent FlowFile.
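The attributes above could be assigned along the following lines. This is a sketch under stated assumptions, not the processor's code: the function name `fragment_attributes` is hypothetical, and the starting value of `fragment.index` is exposed as a parameter because the table only says the number is "one-up" without stating where it begins.

```python
import uuid

def fragment_attributes(parent_filename, fragment_total, first_index=0):
    """Build one attribute map per split FlowFile (hypothetical helper).

    `first_index` is an assumption; the documentation does not state
    whether numbering starts at 0 or 1.
    """
    # One random UUID shared by every fragment of the same parent.
    identifier = str(uuid.uuid4())
    return [
        {
            "fragment.identifier": identifier,
            "fragment.index": str(first_index + i),
            "fragment.count": str(fragment_total),
            "segment.original.filename": parent_filename,
        }
        for i in range(fragment_total)
    ]

for attributes in fragment_attributes("orders.json", 3):
    print(attributes["fragment.index"], attributes["fragment.count"])
```

Because every fragment carries the shared identifier, its position, and the total count, a downstream step can tell when it has seen all fragments of a given parent and restore their order.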

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

Resource | Description
MEMORY | The entirety of the FlowFile's content (as a JsonNode object) is read into memory, in addition to all of the generated FlowFiles representing the split JSON. If many splits are generated due to the size of the JSON, or how the JSON is configured to be split, a two-phase approach may be necessary to avoid excessive use of memory.
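The text does not define the two-phase approach. One possible reading, assuming the document nests a large inner array inside a smaller outer array, is to split the outer array first and then split each resulting fragment again. The Python sketch below illustrates that idea only; the function names and the `batches`/`records` keys are hypothetical.

```python
import json

def split_array(content, key):
    """Split the array stored under `key` into serialized fragments."""
    return [json.dumps(element) for element in json.loads(content)[key]]

def two_phase_split(content, outer_key, inner_key):
    """Yield inner elements while holding only one outer fragment's splits at a time."""
    for batch in split_array(content, outer_key):  # phase 1: few, large fragments
        yield from split_array(batch, inner_key)   # phase 2: many, small fragments

doc = json.dumps({
    "batches": [
        {"records": [{"id": 1}, {"id": 2}]},
        {"records": [{"id": 3}]},
    ]
})
print(list(two_phase_split(doc, "batches", "records")))
# ['{"id": 1}', '{"id": 2}', '{"id": 3}']
```

In this arrangement the second phase only ever materializes the fragments of a single batch, rather than every fragment of the whole document at once.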