ConvertExcelToCSVProcessor

Description:

Consumes a Microsoft Excel document and converts each worksheet to CSV. Each sheet in the incoming Excel document generates a new FlowFile that is output from this processor. The content of each output FlowFile is formatted as a CSV file in which each row of the Excel sheet is written as a new line. This processor can currently process only .xlsx (XSSF 2007 OOXML file format) Excel documents, not older .xls (HSSF '97(-2007) file format) documents. The processor also expects cell content that is already valid for CSV: it does not escape cells containing newlines or additional commas.
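
The effect of the missing escaping can be illustrated with a small Python sketch. This is not the processor's implementation: naive_csv_line is a hypothetical helper that joins cell values with commas (an assumption about what unescaped output looks like), and the standard-library csv module is used only to show escaped output for comparison.

```python
import csv
import io


def naive_csv_line(cells):
    """Join cell values with commas and no quoting (hypothetical helper)."""
    return ",".join(str(cell) for cell in cells)


def escaped_csv_line(cells):
    """Write the same cells with the csv module, which quotes special characters."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="").writerow(cells)
    return buffer.getvalue()


# A two-cell row whose first cell contains a comma.
row = ["Smith, John", 42]

naive = naive_csv_line(row)
escaped = escaped_csv_line(row)

# Parsing the unescaped line yields three fields instead of two,
# while the escaped line round-trips to the original two cells.
naive_fields = next(csv.reader([naive]))
escaped_fields = next(csv.reader([escaped]))
```

If the source spreadsheet may contain commas or newlines inside cells, cleaning those cells before conversion avoids shifted columns in the output.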

Tags:

excel, csv, poi

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Name | Default Value | Allowable Values | Description
Sheets to Extract | | | Comma-separated list of Excel sheet names to extract from the Excel document. If this property is left blank, all sheets are extracted. The names are case-insensitive. Any sheet not named in this value is ignored. Supports Expression Language: true
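
The selection rule described for Sheets to Extract can be sketched as follows. This is an illustration in Python rather than the processor's code: select_sheets and its arguments are hypothetical names, and ignoring whitespace around each listed name is an assumption not stated above.

```python
def select_sheets(available_sheets, sheets_to_extract):
    """Return the sheet names that would be extracted (hypothetical helper).

    available_sheets: sheet names found in the workbook, in order.
    sheets_to_extract: the property value, a comma-separated string or blank.
    """
    if not sheets_to_extract or not sheets_to_extract.strip():
        # Blank property: every sheet is extracted.
        return list(available_sheets)

    # Assumption: whitespace around each listed name is ignored.
    wanted = {
        name.strip().lower()
        for name in sheets_to_extract.split(",")
        if name.strip()
    }

    # Names are compared case-insensitively; unlisted sheets are ignored.
    return [sheet for sheet in available_sheets if sheet.lower() in wanted]


sheets = ["Summary", "Q1 Sales", "Notes"]
all_sheets = select_sheets(sheets, "")
some_sheets = select_sheets(sheets, "summary, q1 sales")
```

Here all_sheets contains every sheet name, while some_sheets contains only "Summary" and "Q1 Sales" even though the property value was written in lower case.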

Relationships:

Name | Description
success | Excel data converted to CSV
failure | Failed to parse the Excel document
original | Original Excel document received by this processor

Reads Attributes:

None specified.

Writes Attributes:

Name | Description
sheetname | The name of the Excel sheet that this particular row of data came from in the Excel document
numrows | The number of rows in this Excel sheet
sourcefilename | The name of the Excel document file that this data originated from
convertexceltocsvprocessor.error | Error message encountered while processing an individual Excel sheet. This attribute is populated only if an error occurred while processing that sheet. Reporting the error at the sheet level helps the end user understand which syntax errors in the Excel document caused the failure.

State management:

This component does not store state.

Restricted:

This component is not restricted.