Flat File Connector

Version 22.0.8473


Flat File Connector


Flat File Connectors convert flat file formats into XML and vice versa.

Overview

Each Flat File Connector is configured with a specific flat file format to convert to XML or to generate out of XML. The Flat File Connector has two primary modes:

  • Position Delimited
  • Character Delimited

For Position Delimited flat files, the connector is configured with an arbitrary set of field names, indices (i.e. positions), and lengths indicating where data appears in each line of the flat file.

For Character Delimited flat files, the connector is configured with the character that separates field values in the flat file.

More details on configuring the flat file format can be found in the Defining the Flat File Format section.

The Flat File Connector supports defining multiple different types of lines in the flat file. For example, a flat file may have a ‘header’ line representing Purchase Order data, and multiple ‘item’ lines representing the Line Items in the order.

The key to defining multiple line types is to specify the Control Field; the control field value is what determines the type of a specific line in the flat file data (for example, a header line might have a control field value of ‘HEAD’ while an item line has a control field value of ‘ITEM’). More details on configuring multiple line types can be found in the Multiple Line Types section.

After the flat file format is configured, the connector will convert files that match this format into XML. The resulting XML structure is explained in the XML Format section. The Flat File Connector can also translates XML that fits this structure into a flat file according to the defined flat file format.

Some flat files have implied hierarchical relationships between different line types. For information on preserving these hierarchical relationships when converting the flat file to XML, please see the Multi-Line Hierarchy section.

Connector Configuration

Settings Tab

Configuration

Settings related to the core configuration of the connector.

  • Connector Id The static name of the connector. All connector-specific files are held in a folder by the same name within the Data Directory.
  • Connector Description An optional field to provide free-form description of the connector and its role in the flow.
  • File Type Position Delimited - the fields in flat file appear at a specific position within each line.
    Character Delimited - the fields in the flat file are separated by a specific character defined by the Delimiter field.
  • Delimiter If File Type is set to Character Delimited, this is the character that separates individual fields in the flat file.

Control Field: Position Delimited

Settings related to the Control Field, which determines the different types of lines defined in the flat file format, when the connector is set to Character Delimited.

  • Multi-line Mode Whether the flat file format includes multiple types of lines. More information can be found in the Multiple Line Types section.
  • Start Index If Multi-line Mode is enabled, this value is the index in the line where the Control Field begins. For example, if the first field in a line defines the type of line (i.e. the first field is the control field), then the Start Index would be 0.
  • Control Field Length (Optional) If Multi-line Mode is enabled, this value defines the number of characters to read from the Start Index to get the value of the Control Field.
  • Column Headers Present If Multi-line Mode is disabled, this determines whether the first line of a flat file should be interpreted as column headers (i.e. the names of each field rather than actual data). Enabling this setting causes the connector to generate a header row when converting from XML to flat file.

Control Field: Character Delimited

Settings related to the Control Field, which determines the different types of lines defined in the flat file format, when the connector is set to Position Delimited.

  • Multi-line Mode Whether the flat file format includes multiple types of lines. More information can be found in the Multiple Line Types section.
  • Field Index If Multi-line Mode is enabled, this value is the index of the Control Field in the line (indices start at 0). For example, if the second field of a line defines the type of line (i.e. the second field is the control field), then the Field Index would be 1.
  • Generate Field/Line Type Names If Multi-line Mode is enabled, this setting provides the option to not specify the names and indexes of the fields and line types within the flat file. When this setting is enabled, the connector will auto-generate XML element names for fields and line types that are not explicitly defined in the Line Types section below.
  • Column Headers Present If Multi-line Mode is disabled, this determines whether the first line of a flat file should be interpreted as column headers (i.e. the names of each field rather than actual data). Enabling this setting causes the connector to generate a header row when converting from XML to flat file.
  • Generate Field Names If Multi-line Mode is disabled, this determines whether the connector will generate generic field names. Disable this setting if you want to manually specify the field names in the Line Type section.

Line Type

This section allows for defining the field names and positions (if Line Type is Position Delimited) in the flat file format. More details on defining the flat file format can be found in the Defining the Flat File Format section.

If Multi-line Mode is enabled, clicking the Add Line Type button allows you to define multiple line types. Each line type has a Control Field Value, which is used to identify the line type. For example, a header line may have the control field value of HEAD, while an item line has the control field value of ITEM.

If File Type is set to Character Delimited, Multi-line Mode is enabled, and Generate Field/Line Type Names is enabled, it is not necessary to provide names and indexes for all of the fields or line types present in the flat file. In this case, the connector will auto-generate XML element names for any unspecified fields or line types.

Advanced

  • Padding Character When creating a flat file and the field value does not fill the entire field length, this character will be used to fill the rest of the field.
  • Invalid XML Name Prefix Some field names are not valid names for XML elements (for example, fields that begin with a number like ‘123ABC’), so a prefix must be applied to generate XML from the flat file. Similarly, when translating from XML to flat file, the connector will look for this prefix and remove it.
  • Line Separator Specifies the control character that the connector uses for line endings when writing files. Options are LF (default) and CRLF.
  • Local File Scheme A filemask for determining local file names as they are downloaded by the connector. Hover over the field for a full list of available macros.
  • Nest Line Types This setting is only relevant when converting flat files to XML, and when there are multiple line types in the flat file. If Mutli-line Mode is enabled, enabling this setting instructs the connector to add hierarchy to the resulting XML based on the control fields of the flat file lines (i.e. the fields that identify the type of line). More information on this automatic hierarchy can be found in Multi-Line Hierarchy section.
  • Pad on EOL By default, the connector will throw an error if it encounters an unexpected end-of-line. When enabled, this setting tells the connector to pad out the remainder of the line instead of throwing an error.
  • Force Cell Delimiter If File Type is set to Character Delimited, enabling this setting instructs the connector to wrap each cell value in cell delimiters ("). Disabling this setting instructs the connector to only wrap the cell’s value in cell delimiters if the specified character delimiter is present in the cell’s value.
  • Processing Delay The amount of time (in seconds) that the connector will wait to process files in the Input folder.

Message

Settings that determine how the connector will search for messages and handle them after processing.

  • Save to Sent Folder A toggle that instructs the connector to keep a copy of sent messages in the Sent folder.
  • Sent Folder Scheme Instructs the connector to group files in the Sent folder according to the selected interval. For example, the Weekly option instructs the connector to create a new subfolder each week and store all sent files for the week in that folder. The blank setting tells the connector to save all files directly in the Sent folder. For connectors that process many transactions, using subfolders can help keep files organized and improve performance.

Logging

Settings that govern the creation and storage of logs.

  • Log Level Specifies the type of information to log in the connector’s Logs directory:
    • None — Does not create any logs.
    • Error — Creates logs only when the connector encounters an error.
    • Warning — Creates logs only when the connector issues a warning.
    • Info — Logs general information about the workflow, including any errors and warnings (if applicable).
    • Debug — Logs detailed debugging information for both successful and failed workflows.
    • Trace — Logs detailed trace information for both successful and failed workflows.

    Please note that Debug and Trace may log sensitive information including message contents and SSL certificates. Although connection properties (such as passwords) are masked, please review logs of this level for sensitive information before sharing them outside of your organization.

  • Log Subfolder Scheme Instructs the connector to group files in the Logs folder according to the selected interval. For example, the Weekly option instructs the connector to create a new subfolder each week and store all logs for the week in that folder. The blank setting tells the connector to save all logs directly in the Logs folder. For connectors that process many transactions, using subfolders can help keep logs organized and improve performance.
  • Log Messages A toggle that instructs the connector to save a copy of the most recent message in the Logs directory. Note that the connector only keeps one message per subfolder, and the connector overrides the previously-saved message when it runs again.

Miscellaneous

Settings for specific use cases.

  • Other Settings Allows configuration of hidden connector settings in a semicolon-separated list, like setting1=value1;setting2=value2. Normal connector use cases and functionality should not require use of these settings.

Automation Tab

  • Send Whether files arriving at the connector will automatically be processed.

Performance

Settings related to the allocation of resources to the connector.

  • Max Workers The maximum number of worker threads that will be consumed from the threadpool to process files on this connector. If set, overrides the default setting from the Profile tab.
  • Max Files The maximum number of files that will be processed by the connector each time worker threads are assigned to the connector. If set, overrides the default setting from the Profile tab.

Defining the Flat File Format

The first step in configuring a Flat File Connector is defining the format of the flat file. This section describes formats that have a single line type, or in other words, each line in the flat file will have the same set of fields. For flat files with multiple different line types, please see the Multiple Line Types section.

Character Delimited Single-line Format

For Character Delimited flat files, defining the format is simple: specify the character that separates different fields in the flat file via the Delimiter property.

The Column headers present field indicates whether the first line of the flat file is a header line; in other words, it contains the names of fields rather than actual data. If these column headers are present, the connector will use the header names as the names of XML elements in the resulting translated XML. Similarly, the connector will use the XML element names to generate a header row when translating from XML to flat file.

If column headers are not present, the connector supports manually specifying the names of each field by adding the fields in the Line Type section. These field names are applied in index order, meaning that the first entry in Line Type will be the name of the first field in the flat file line, and so on.

The connector can also auto-generate generic field names by enabling Use auto generated field names.

Position Delimited Single-line Format

For Position Delimited flat files, defining the format requires specifying the position of each field in the format. The Line Type section in the connector settings provides an interface for adding an arbitrary number of fields that are present in each line of the flat file. Each field must be identified by a name and the position in the file it appears.

The Column headers present field indicates whether the first line of the flat file is a header line; in other words, it contains the names of fields rather than actual data. The field names must still be configured in the Line Type section, and this setting simply helps ensure that a header row is not interpreted as real data.

Multiple Line Types

If the flat file format contains multiple types of lines, the Multi-line Mode property should be enabled. The field in the flat file that identifies the line type is known as the Control Field.

Character Delimited Multi-line Format

When the File Type is Character Delimited, the Field Index setting determines where the Control Field appears in each line of the flat file. This index starts at 0, meaning that if the Control Field is the 5th value in the line, the Field Index should be 4.

For each possible value that might appear in the Control Field, define a new line type by clicking the Add Line Type button within the Line Types section of the connector settings. The value that identifies the line type should be set in the Control Field Value for that line.

Once each possible line type has been added and identified via a specific Control Field Value, the fields that will appear in each line type should be specified in index order.

If Generate Field/Line Type Names is enabled, it is not necessary to provide names and indexes for all of the fields or line types present in the flat file (only the Control Field is required). In this case, the connector will auto-generate XML element names for any unspecified fields or line types.

Position Delimited Multi-line Format

When the File Type is Position Delimited, the Start Index setting determines the position where the Control Field appears (begins) in each line of the flat file. This index starts at 0, meaning that if the Control Field begins at the 15th character in the line, the Field Index should be 14.

For each possible value that might appear in the Control Field, define a new line type by clicking the Add Line Type button. The value that identifies the line type should be set in the Control Field Value for that line.

Once each possible line type has been added and identified via a specific Control Field Value, the fields that will appear in each line type should be specified along with the position in the line that they appear (begin).

Multi-line Example

For example, say that a flat file contains two types of lines, a shipment line type and a package line type. The shipment line contains information on the date, time, and addresses of shipment, and the package line type contains information on the items being shipped.

The shipment line type may have a Control Field Value of ‘SHIP’, and the package line type may have a Control Field Value of ‘PCKG’. The first field in each line is either ‘SHIP’ or ‘PCKG’ to indicate what type of line it is.

To handle this case, the Multi-line mode should be enabled, and the Field Index (or Start Index) should be set to 0 to indicate that the Control Field is the first field in the line. Then, there should be two line types configured in the Line Types section; one with a Control Field Value of ‘SHIP’, which contains each field in a shipment line (e.g. ShipDate, DeliveryDate, ShipToAddress, etc), and one with a Control Field Value of ‘PCGK’, which contains each field in a package line (e.g. ItemName, ItemWeight, etc).

XML Format

After a flat file is converted to XML, the result has the following XML structure:

  • An Items element at the root of the document
  • Each line in the flat file has an element named the same as the Control Field Value for that line (or ‘row’ if no Control Field Value is defined)
  • Each field in the row is a child element of the Control Field Value element

For example, if a flat file has ‘SHIP’ and ‘PCKG’ lines, then the XML output would look similar to this:

<Items>
  <SHIP>
    <ShipmentId>12B992</ShipmentId>
    <Date>20200228</Date>
    <ShipTo>14 Wallaby Way</ShipTo>
  </SHIP>
  <PCKG>
    <ShipmentId>12B992</ShipmentId>
    <ItemName>Goggles</ItemName>
    <ItemWeight>3.98</ItemWeight>
  </PCKG>
  <PCKG>
    <ShipmentId>12B992</ShipmentId>
    <ItemName>Fins</ItemName>
    <ItemWeight>1.07</ItemWeight>
  </PCKG>
</Items>

To convert an XML file into a flat file, the XML input must match the above structure (this includes the restriction that the field names must match the defined fields in the connector configuration).

When converting XML into a flat file, the connector will create a new line in the resulting flat file for each row element in the input XML. For each child of the row element, the connector will match the child to a Field Name in the connector configuration, and place the value for that element at the appropriate Field Index.

Multi-Line Hierarchy

Flat files with multiple line types often have an implied hierarchical relationship between the line types. For example, one line might contain data for a Purchase Order (e.g. customer name, order date, etc), while the following lines represent each line item in that order (e.g. item name, item quantity, etc). The line items “belong” to the order, creating a hierarchical relationship.

These relationships are sometimes referred to as “Master-Detail” relationships; meaning the first line type (Purchase Order) is the “master,” and the following lines (line items) are “details” about that master line. In XML, this is often represented as a “Parent-Child” relationship.

The Flat File Connector can preserve these hierarchical relationships when translating flat files to XML. If the Nest Line Types setting in the Advanced tab is enabled, the connector will automatically indent “detail” line types so that they are children of “master” line types.

The following is a simple example of translating a hierarchical flat file with Nest Line Types set to true. The section after the example will explain the logic the connector uses to determine the hierarchical relationships.

Multi-Line Hierarchy Example

Take as example input the following flat file:

M,CustomerA,05022020,true
D,itemA,4,2.09
D,itemB,9,15.23
M,CustomerB,05032020,false
D,itemC,1,5.99
D,itemE,1,3.99

There are two line types in this flat file, M and D:

  • M, or Master, contains information for an order: customer name, order date, and a “high value customer” flag.
  • D, or Detail, contains line item information for the preceeding order: item name, item quantity, item price.

Since the D line type “belongs” to the M line type in a hierarchical relationship, it may be desirable to preserve this hierarchy when translating the flat file to XML. To see what this looks like, take the following example output when the Nest Line Types setting is enabled and the above flat file is translated to XML:

<Items>
  <M>
    <CustomerName>CustomerA</CustomerName>
    <OrderDate>05022020</OrderDate>
    <HighValueCustomer>true</HighValueCustomer>
    <D>
      <ItemName>itemA</ItemName>
      <ItemQuantity>4</ItemQuantity>
      <ItemPrice>2.09</ItemPrice>
    </D>
    <D>
      <ItemName>itemB</ItemName>
      <ItemQuantity>9</ItemQuantity>
      <ItemPrice>15.23</ItemPrice>
    </D>
  </M>
  <M>
    <CustomerName>CustomerB</CustomerName>
    <OrderDate>05032020</OrderDate>
    <HighValueCustomer>false</HighValueCustomer>
    <D>
      <ItemName>itemC</ItemName>
      <ItemQuantity>1</ItemQuantity>
      <ItemPrice>5.99</ItemPrice>
    </D>
    <D>
      <ItemName>itemE</ItemName>
      <ItemQuantity>1</ItemQuantity>
      <ItemPrice>3.99</ItemPrice>
    </D>
  </M>
</Items>

Notice that the D records are children of the M records, which accurately reflects the hierarchy present in the source data.

Multi-Line Hierarchy Logic

The Flat File Connector uses the order of lines in the source flat file to determine the hierarchy between line types. When Nest Line Types is enabled, the connector follows the following logic:

  • The first line type the connector encounters is always treated as the top level of hierarchy (i.e. it is not indented in the resulting XML)
  • After the first line type, when the connector encounters a new line type it assumes this line type is one level of hierarchy below the previous line type (i.e. it “belongs” to the previous line and is therefore indented in the resulting XML)
  • When the connector encounters a line type it has encountered before, it returns to that line type’s level of hierarchy, closing all XML elements at hierarchy levels equal-to or below this level.

Thus, the source flat file must have lines in order according to the desired hierarchy.