ExtractRecordSchema

Description:

Extracts the record schema from the FlowFile using the supplied Record Reader and writes it to the `avro.schema` attribute.

Tags:

record, generic, schema, json, csv, avro, freeform, text, xml

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values.

Display Name | API Name | Default Value | Allowable Values | Description
Record Reader | record-reader | | Controller Service API: RecordReaderFactory. Implementations: GrokReader, JsonTreeReader, WindowsEventLogReader, ReaderLookup, ParquetReader, CSVReader, Syslog5424Reader, ExcelReader, CEFReader, XMLReader, ScriptedReader, SyslogReader, JsonPathReader, AvroReader, YamlTreeReader | Specifies the Controller Service to use for reading incoming data
Schema Cache Size | cache-size | 10 | | Specifies the number of schemas to cache. This value should reflect the expected number of distinct schemas in the incoming FlowFiles, so that cached schemas can be retrieved efficiently, which improves processor performance.
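
The effect of the cache size can be illustrated with a small least-recently-used cache. This is only a sketch: the class `SchemaCache`, the helper `replay`, and the choice of an LRU eviction policy are assumptions made for illustration and are not taken from the processor's implementation.

```python
from collections import OrderedDict


class SchemaCache:
    """Tiny LRU cache; illustrative only, not the processor's actual cache."""

    def __init__(self, max_size=10):  # 10 mirrors the documented default
        self.max_size = max_size
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, compute):
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        value = compute(key)
        self._entries[key] = value
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict least recently used
        return value


def replay(schema_ids, max_size):
    """Return (hits, misses) after looking up each schema id in order."""
    cache = SchemaCache(max_size)
    for schema_id in schema_ids:
        cache.get(schema_id, lambda k: "schema-for-" + k)
    return cache.hits, cache.misses
```

In this sketch, cycling through three distinct schema ids with `max_size=3` misses only on the first three lookups, while `max_size=2` misses on every lookup. That is the reason to size the cache to the number of distinct schemas expected.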

Relationships:

Name | Description
success | FlowFiles whose record schemas are successfully extracted will be routed to this relationship
failure | If a FlowFile's record schema cannot be extracted from the configured input format, the FlowFile will be routed to this relationship
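
The routing described above can be sketched as a function that returns a relationship name together with the attributes to write. The names `extract_schema`, `read_schema`, and `toy_json_reader` are hypothetical, and the toy reader (which labels every field as a string) only stands in for a real Record Reader; the processor's actual logic may differ.

```python
import json


def extract_schema(content, read_schema):
    """Return (relationship, attributes) for one FlowFile's content.

    read_schema is a hypothetical callable standing in for the configured
    Record Reader; it returns a schema as a dict or raises on bad input.
    """
    try:
        schema = read_schema(content)
    except Exception as exc:
        # Extraction failed: route to failure and keep the reader's message.
        return "failure", {"record.error.message": str(exc)}
    return "success", {"avro.schema": json.dumps(schema)}


def toy_json_reader(content):
    """Hypothetical reader: treats every field of one JSON object as a string."""
    record = json.loads(content)
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object")
    fields = [{"name": name, "type": "string"} for name in record]
    return {"type": "record", "name": "example", "fields": fields}
```

Calling `extract_schema('{"id": "1"}', toy_json_reader)` yields the `success` relationship with an `avro.schema` attribute, while input that the reader cannot parse yields `failure` with `record.error.message` set.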

Reads Attributes:

None specified.

Writes Attributes:

Name | Description
record.error.message | On failure, this attribute contains the error message encountered by the Reader.
avro.schema | This attribute provides the schema extracted from the input FlowFile using the provided RecordReader.
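
A downstream consumer might read the `avro.schema` attribute as follows. This is a minimal sketch that assumes the attribute value is an Avro record schema serialized as JSON text; the function `field_names` and the `sample` value are hypothetical and not produced by the processor.

```python
import json


def field_names(avro_schema_text):
    """List top-level field names from an Avro record schema given as JSON text."""
    schema = json.loads(avro_schema_text)
    if not isinstance(schema, dict) or schema.get("type") != "record":
        raise ValueError("not an Avro record schema")
    return [field["name"] for field in schema.get("fields", [])]


# Hypothetical attribute value, shaped like an Avro record schema.
sample = (
    '{"type": "record", "name": "example", "fields": ['
    '{"name": "id", "type": "long"}, '
    '{"name": "label", "type": ["null", "string"]}]}'
)
```

With the sample value above, `field_names(sample)` returns the two field names in declaration order.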

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

None specified.