The CSV Connector can convert Comma-Separated Values (CSV) files into XML and also generate CSV files from XML.
XML is the primary format that CData Arc uses to manipulate data within a flow. Thus it is useful to convert CSV files into XML as a staging step for further processing within the flow, or to convert XML to a CSV file after the XML has been manipulated. Both of these operations can be accomplished with the CSV connector.
This section contains all of the configurable connector properties.
Settings related to the core operation of the connector.
- Connector Id The static name of the connector. All connector-specific files are held in a folder by the same name within the Data Directory.
- Connector Description An optional field to provide free-form description of the connector and its role in the flow.
- First line is header information Whether the CSV file contains a row of headers providing names or context to the values in the file.
- Record Name The name of elements representing a row in the CSV file when converting to XML. Please see Converting CSV to XML for more details.
Settings not included in the previous categories.
- Log Subfolder Scheme Instructs the connector to group files in the Logs folder according to the selected interval. For example, the Weekly option instructs the connector to create a new subfolder each week and store all logs for the week in that folder. The blank setting tells the connector to save all logs directly in the Logs folder. For connectors that process many transactions, using subfolders can help keep logs organized and improve performance.
- Log Messages Whether the log entry for a processed file will include a copy of the file itself.
- Save to Sent Folder Whether files processed by the connector should be copied to the Sent folder for the connector.
Settings for specific use cases.
- Other Settings Allows configuration of hidden connector settings in a semicolon-separated list, like
setting1=value1;setting2=value2. Normal connector use cases and functionality should not require use of these settings.
Settings related to the automatic processing of files by the connector.
- Send Whether messages arriving at the connector will automatically be processed.
Converting CSV to XML
When a CSV file is transformed into XML, the resulting XML has the following structure:
Each row (record) in the original file becomes a child of the root element ‘Items.’ The name of all record elements is determined by the Record Name setting option. Each record element then has child elements corresponding to the values in each row of the input file.
Some CSV files include a line of header information that provides context to the values in the file.
When the First line is header information setting is enabled, this header line will be parsed, and the parsed headers will be used as the element names for the value elements (the children of the record elements).
Otherwise, the value elements will be given generic names such as field_0, field_1, etc.
Converting XML to CSV
To convert XML to a CSV file, the input XML must have a ‘flat’ structure. This means that, disregarding the root element, the depth of the XML structure is 2. For example:
The XML will be interpreted as follows: Children of the root element will be treated as records (rows) within the resulting file, and children of each record element will be treated as the values in each row. If the First line is header information option is enabled, a header row will be inserted into the resulting CSV file with the names of each value element to provide context to the values. In the above example, this header row would consist of title, year, and runtime.
CSV Transformation: Using the XML Map Connector
Many data transformation flows use the CSV Connector in conjunction with the XML Map Connector.
Often, data enters an Arc flow in CSV format and should exit the flow in some other format (e.g. a database insert, an EDI file, an insert into a CRM or ERP data source, etc), or vice versa. Arc uses a single streamlined approach to these data transformation requirements:
- Model the input format as XML
- Model the output format as XML
- Use the XML Map Connector to map between the input XML and the output XML
Thus the CSV Connector is commonly adjacent to an XML Map Connector in the flow:
When CSV files are the input to the flow, the CSV Connector converts a CSV file to XML and then passes that XML off to the XML Map Connector to be transformed; when CSV files are the output from the flow, the CSV Connector receives XML from the XML Map Connector and converts it into a CSV file.
The CSV Connector includes an Upload Test File feature to simplify the process of mapping the XML that represents a CSV file.
Upload Test File
An XML Map Connector requires a sample XML structure for both the Source/Input for the mapping and the Destination/Output for the mapping. The Upload Test File feature makes it easy to use the CSV Connector to generate a Source or Destination XML template.
In the Input tab of the CSV Connector, the More dropdown button includes the Upload Test File option. Select this option and browse to a sample CSV file on disk to instruct the connector to generate an internal XML model of this sample file.
Then, when an XML Map Connector is connected to this CSV Connector in the flow (and the flow changes are saved), the XML Map Connector can detect this internal XML model and use it as a Source File (if the CSV Connector is before the XML Map Connector in the flow) or a Destination File (if the CSV Connector is after the XML Map Connector in the flow).
Note that the structure of this Test File should be representative of all future files. In other words, all of the CSV files processed by the CSV Connector (and then the XML Map Connector) should have the same columns as the selected Test File. It may be necessary to set up multiple CSV Connectors and multiple XML Map Connectors to handle distinct CSV structures.
In addition to the Operations provided with CData Arc, connectors may provide operations that extend functionality into ArcScript. Operations specific to the functionality of the CSV Connector are listed below.
Loops over every record in a specified CSV file or string.
- file: The path to the CSV file.
- data: If the CSV data exists as a string rather than stored in a CSV file use this parameter instead of file.
- columns: The comma-separate list of columns to include in the output (if unspecified, all columns will be included).
- requireheader: By default, the first row of data is interpreted as column headers; pass false to this parameter to use generic column names (e.g. c1, c2, c3).
Any script within a csvListRecords operation will execute multiple times: once for each record/row found in the input CSV file/data. Within the operation, individual CSV values are accessible using the csv formatter. This formatter takes a column name as a parameter, and outputs the value in that column for the current record.
For example, imagine the CSV input data contains a set of items purchased in an order, and the name of the item is held in the ‘ItemName’ column. The following script generates XML containing each ‘ItemName’ value in an <Item> element:
<ItemList> <arc:call op="csvListRecords?file=myFile.csv"> <Item>[csv('ItemName')]</Item> </arc:call> </ItemList>