Flat File Connector

Version 20.0.7396


Flat File Connector



Flat File Connectors convert flat file formats into XML and vice versa.

Overview

Each Flat File Connector is configured with a specific flat file format to convert to XML or to generate out of XML. The Flat File Connector has two primary modes:

  • Position Delimited
  • Character Delimited

For Position Delimited flat files, the connector is configured with an arbitrary set of field names, indices (i.e. positions), and lengths indicating where data appears in each line of the flat file.

For Character Delimited flat files, the connector is configured with the character that separates field values in the flat file.

More details on configuring the flat file format can be found in the Defining the Flat File Format section.

The Flat File Connector supports defining multiple different types of lines in the flat file. For example, a flat file may have a ‘header’ line representing Purchase Order data, and multiple ‘item’ lines representing the Line Items in the order.

The key to defining multiple line types is to specify the Control Field; the control field value is what determines the type of a specific line in the flat file data (for example, a header line might have a control field value of ‘HEAD’ while an item line has a control field value of ‘ITEM’). More details on configuring multiple line types can be found in the Multiple Line Types section.

After the flat file format is configured, the connector will convert files that match this format into XML. The resulting XML structure is explained in the XML Format section. The Flat File Connector can also translates XML that fits this structure into a flat file according to the defined flat file format.

Connector Configuration

Settings Tab

Configuration

Settings related to the core configuration of the connector.

  • File Type Position Delimited - the fields in flat file appear at a specific position within each line.
    Character Delimited - the fields in the flat file are separated by a specific character, for example comma-separated values.
  • Delimiter If File Type is set to Character Delimited, this is the character that separates individual fields in the flat file.

Control Field

Settings related to the Control Field, which determines the different types of lines defined in the flat file format.

  • Multi-line Mode Whether the flat file format includes multiple types of lines. More information can be found in the Multiple Line Types section.
  • Start Index If File Type is Position Delimited and Multi-line Mode is enabled, this value is the index in the line where the Control Field begins. For example, if the first field in a line defines the type of line (i.e. the first field is the control field), then the Start Index would be 0.
  • Field IndexIf File Type is Character Delimited and Multi-line Mode is enabled, this value is the index of the Control Field in the line (indices start at 0). For example, if if the second field of a line defines the type of line (i.e. the second field is the control field), then the Field Index would be 1.
  • Column Headers Present If Multi-line Mode is disabled, this determines whether the first line of a flat file should be interpreted as column headers (i.e. the names of each field rather than actual data). Similarly, enabling this setting causes the connector to generate a header row when converting from XML to flat file.
  • Use Auto Generated Column Names If File Type is Character Delimited and Multi-line Mode is disabled, this determines whether the connector will generate generic field names. Keep this setting disabled to manually specify the field names in the Line Type section.

Line Types

This section allows for defining the field names and positions (if Line Type is Position Delimited) in the flat file format. More details on defining the flat file format can be found in the Defining the Flat File Format section.

Multiple line types can be defined if Multi-line Mode is enabled by using the Add Line Type button. Each line type has a Control Field Value, which is used to identify the line type. For example, a header line may have the control field value of ‘HEAD’ while an item line has the control field value of ‘ITEM’.

Advanced

  • Padding Character When creating a flat file and the field value does not fill the entire field length, this character will be used to fill the rest of the field.
  • Invalid XML Name Prefix Some field names are not valid names for XML elements (for example, fields that begin with a number like ‘123ABC’), so a prefix must be applied to generate XML from the flat file. Similarly, when translating from XML to flat file, the connector will look for this prefix and remove it.
  • Pad on EOL By default, the connector will throw an error if it encounters an unexpected end-of-line. When enabled, this setting tells the connector to pad out the remainder of the line instead of throwing an error.
  • Log Messages Whether the log entry for a processed file will include a copy of the file itself.
  • Save to Sent Folder Whether files processed by the connector should be copied to the Sent folder for the connector.

Local Folders

Settings that determine the folder on disk that files will be processed from, and where they will be placed after processing.

  • Input Folder (Send) The connector can process files placed in this folder. If Send Automation is enabled, the connector will automatically poll this location for files to process.
  • Output Folder (Receive) After the connector finishes processing a file, the result will be placed in this folder. If the connector is connected to another connector in the flow, files will not remain here and will instead be passed along to the Input/Send folder for the connected connector.
  • Processed Folder (Sent) After processing a file, the connector will place a copy of the processed file in this folder if Save to Sent Folder is enabled. This copy of the file will not be passed along to the next connector in the flow.

Automation Settings

  • Send Whether files arriving at the connector will automatically be processed.

Defining the Flat File Format

The first step in configuring a Flat File Connector is defining the format of the flat file. This section describes formats that have a single line type, or in other words, each line in the flat file will have the same set of fields. For flat files with multiple different line types, please see the Multiple Line Types section.

Character Delimited Single-line Format

For Character Delimited flat files, defining the format is simple: specify the character that separates different fields in the flat file via the Delimiter property.

The Column headers present field indicates whether the first line of the flat file is a header line; in other words, it contains the names of fields rather than actual data. If these column headers are present, the connector will use the header names as the names of XML elements in the resulting translated XML. Similarly, the connector will use the XML element names to generate a header row when translating from XML to flat file.

If column headers are not present, the connector supports manually specifying the names of each field by adding the fields in the Line Type section. These field names are applied in index order, meaning that the first entry in Line Type will be the name of the first field in the flat file line, and so on.

The connector can also auto-generate generic field names by enabling Use auto generated field names.

Position Delimited Single-line Format

For Position Delimited flat files, defining the format requires specifying the position of each field in the format. The Line Type section in the connector settings provides an interface for adding an arbitrary number of fields that are present in each line of the flat file. Each field must be identified by a name and the position in the file it appears.

The Column headers present field indicates whether the first line of the flat file is a header line; in other words, it contains the names of fields rather than actual data. The field names must still be configured in the Line Type section, and this setting simply helps ensure that a header row is not interpreted as real data.

Multiple Line Types

If the flat file format contains multiple types of lines, the Multi-line Mode property should be enabled. The field in the flat file that identifies the line type is known as the Control Field.

Character Delimited Multi-line Format

When the File Type is Character Delimited, the Field Index setting determines where the Control Field appears in each line of the flat file. This index starts at 0, meaning that if the Control Field is the 5th value in the line, the Field Index should be 4.

For each possible value that might appear in the Control Field, define a new line type by clicking the Add Line Type button. The value that identifies the line type should be set in the Control Field Value for that line.

Once each possible line type has been added and identified via a specific Control Field Value, the fields that will appear in each line type should be specified in index order.

Position Delimited Multi-line Format

When the File Type is Position Delimited, the Start Index setting determines the position where the Control Field appears (begins) in each line of the flat file. This index starts at 0, meaning that if the Control Field begins at the 15th character in the line, the Field Index should be 14.

For each possible value that might appear in the Control Field, define a new line type by clicking the Add Line Type button. The value that identifies the line type should be set in the Control Field Value for that line.

Once each possible line type has been added and identified via a specific Control Field Value, the fields that will appear in each line type should be specified along with the position in the line that they appear (begin).

Multi-line Example

For example, say that a flat file contains two types of lines, a shipment line type and a package line type. The shipment line contains information on the date, time, and addresses of shipment, and the package line type contains information on the items being shipped.

The shipment line type may have a Control Field Value of ‘SHIP’, and the package line type may have a Control Field Value of ‘PCKG’. The first field in each line is either ‘SHIP’ or ‘PCKG’ to indicate what type of line it is.

To handle this case, the Multi-line mode should be enabled, and the Field Index (or Start Index) should be set to 0 to indicate that the Control Field is the first field in the line. Then, there should be two line types configured in the Line Types section; one with a Control Field Value of ‘SHIP’, which contains each field in a shipment line (e.g. ShipDate, DeliveryDate, ShipToAddress, etc), and one with a Control Field Value of ‘PCGK’, which contains each field in a package line (e.g. ItemName, ItemWeight, etc).

XML Format

After a flat file is converted to XML, the result has the following XML structure:

  • An Items element at the root of the document
  • Each line in the flat file has an element named the same as the Control Field Value for that line (or ‘row’ if no Control Field Value is defined)
  • Each field in the row is a child element of the Control Field Value element

For example, if a flat file has ‘SHIP’ and ‘PCKG’ lines, then the XML output would look similar to this:

<Items>
  <SHIP>
    <ShipmentId>12B992</ShipmentId>
    <Date>20200228</Date>
    <ShipTo>14 Wallaby Way</ShipTo>
  </SHIP>
  <PCKG>
    <ShipmentId>12B992</ShipmentId>
    <ItemName>Goggles</ItemName>
    <ItemWeight>3.98</ItemWeight>
  </PCKG>
  <PCKG>
    <ShipmentId>12B992</ShipmentId>
    <ItemName>Fins</ItemName>
    <ItemWeight>1.07</ItemWeight>
  </PCKG>
</Items>

To convert an XML file into a flat file, the XML input must match the above structure (this includes the restriction that the field names must match the defined fields in the connector configuration).

When converting XML into a flat file, the connector will create a new line in the resulting flat file for each row element in the input XML. For each child of the row element, the connector will match the child to a Field Name in the connector configuration, and place the value for that element at the appropriate Field Index.