Excel Connector
Version 24.2.9039
The Excel connector converts data between Excel sheets (.xlsx) and XML files.
Overview
Excel connectors can be configured in two modes: Table and Template. These modes are described below.
Table Mode
The default mode for Excel connectors is Table. This mode only supports converting files from XLSX to XML. To convert from XML to XLSX, use Template mode.
In Table mode, the connector treats input XLSX files exactly like CSV files. The resulting output XML file has the following structure, where each row (record) in the original file becomes a child of the root element Items:
<Items>
  <Record>
    <field_0></field_0>
    <field_1></field_1>
    <field_2></field_2>
  </Record>
</Items>
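The row-to-record mapping described above can be sketched in Python. This is only an illustration of the output shape, not the connector's implementation: the function name rows_to_items_xml and the sample rows are hypothetical, and the sheet is represented as a plain list of rows rather than read from an XLSX file.

```python
import xml.etree.ElementTree as ET

def rows_to_items_xml(rows):
    """Build an <Items> document with one <Record> per row and generic field_N names."""
    root = ET.Element("Items")
    for row in rows:
        record = ET.SubElement(root, "Record")
        for index, value in enumerate(row):
            # Each cell becomes a child element named after its column position.
            field = ET.SubElement(record, f"field_{index}")
            field.text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")

# Hypothetical sheet contents: two rows, three columns.
sample_rows = [["PO0012345", "Teddy", 23.98], ["PO0012346", "Alex", 8.99]]
table_xml = rows_to_items_xml(sample_rows)
print(table_xml)
```

With Column headers checked, the element names would come from the first row of the sheet instead of field_0, field_1, and so on.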
Template Mode
In Template mode, the connector uses a template file to perform conversions. This template file matches the output file format:
- When converting from XML to Excel, the template file is an Excel file.
- When converting from Excel to XML, the template file is an XML file.
These template files use ArcScript to dynamically populate the output files with data from the input files. For more information, see the Templates section.
Connector Configuration
This section contains all of the configurable connector properties.
Settings Tab
Configuration
Settings related to the core configuration of the connector.
- Connector Id The static, unique identifier for the connector.
- Connector Type Displays the connector name and a description of what it does.
- Connector Description An optional field to provide a free-form description of the connector and its role in the flow.
- Translation Mode Specify which method (Table or Template) to use to translate the input file to the output format. See the Overview for more information.
- Column headers (Table mode) Check this to have the translation use the values in the first row of the XLSX file as the element names of the child elements. If unchecked, the value elements are given generic names such as field_0, field_1, etc.
- Template File (Template mode) The file that functions as the output template. Data is dynamically added to the file based on the scripting in the template. See the Templates section for details.
Advanced Settings
Settings not included in the previous categories.
- Start Row (Table mode) The first row in the sheet to use for the output table.
- Start Column (Table mode) The first column in the sheet to use for the output table.
- End Column (Table mode) The last column in the sheet to use for the output table. If not set, the column preceding the first empty cell on the first row is detected as the end of the sheet.
- DateTime Format (Table mode) The format to use for DateTime values in the Excel document.
- Number Format (Table mode) The format to use for numbers in the Excel document. Set this if you want all numbers in the document to be handled in the same way.
- Local File Scheme A scheme for assigning filenames to messages that are output by the connector. You can use macros in your filenames dynamically to include information such as identifiers and timestamps. For more information, see Macros.
Message
Message settings determine how the connector searches for messages and manages them after processing. You can save messages to your Sent folder or you can group them based on a Sent folder scheme, as described below.
- Save to Sent Folder Check this to copy files processed by the connector to the Sent folder for the connector.
- Sent Folder Scheme Instructs the connector to group files in the Sent folder according to the selected interval. For example, the Weekly option instructs the connector to create a new subfolder each week and store all sent files for the week in that folder. The blank setting instructs the connector to save all files directly in the Sent folder. For connectors that process many transactions, using subfolders can help keep files organized and improve performance.
Logging
- Log Level The verbosity of logs generated by the connector. When you request support, set this to Debug.
- Log Subfolder Scheme Instructs the connector to group files in the Logs folder according to the selected interval. For example, the Weekly option instructs the connector to create a new subfolder each week and store all logs for the week in that folder. The blank setting tells the connector to save all logs directly in the Logs folder. For connectors that process many transactions, using subfolders helps keep logs organized and improves performance.
- Log Messages Check this to have the log entry for a processed file include a copy of the file itself. If you disable this, you might not be able to download a copy of the file from the Input or Output tabs.
Miscellaneous
Miscellaneous settings are for specific use cases.
- Other Settings Enables you to configure hidden connector settings in a semicolon-separated list (for example, setting1=value1;setting2=value2). Normal connector use cases and functionality should not require the use of these settings.
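To make the semicolon-separated format concrete, here is a minimal Python sketch that splits such a list into name/value pairs. The function parse_other_settings is hypothetical, and the handling of whitespace and empty segments is an assumption; the connector's own parsing rules (for example, how it treats escaping) are not described here.

```python
def parse_other_settings(text):
    """Split 'name=value;name=value' into a dict, ignoring empty segments."""
    settings = {}
    for segment in text.split(";"):
        segment = segment.strip()
        if not segment:
            continue  # tolerate a trailing semicolon (assumed behavior)
        name, _, value = segment.partition("=")
        settings[name.strip()] = value.strip()
    return settings

parsed_settings = parse_other_settings("setting1=value1;setting2=value2")
print(parsed_settings)
```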
Automation Tab
Settings related to the automatic processing of files by the connector.
Automation Settings
- Send Whether messages arriving at the connector are automatically processed.
Performance
Settings related to the allocation of resources to the connector.
- Max Workers The maximum number of worker threads consumed from the threadpool to process files on this connector. If set, this overrides the default setting on the Settings > Automation page.
- Max Files The maximum number of files sent by each thread assigned to the connector. If set, this overrides the default setting on the Settings > Automation page.
Alerts Tab
Settings related to configuring alerts and Service Level Agreements (SLAs).
Connector Email Settings
Before you can execute SLAs, you need to set up email alerts for notifications. Clicking Configure Alerts opens a new browser window to the Settings page where you can set up system-wide alerts. See Alerts for more information.
Service Level Agreement (SLA) Settings
SLAs enable you to configure the volume you expect connectors in your flow to send or receive, and to set the time frame in which you expect that volume to be met. CData Arc sends emails to warn the user when an SLA is not met, and marks the SLA as At Risk, which means that if the SLA is not met soon, it will be marked as Violated. This gives the user an opportunity to step in and determine the reasons the SLA is not being met, and to take appropriate actions. If the SLA is still not met at the end of the at-risk time period, the SLA is marked as violated, and the user is notified again.
To define an SLA, click Add Expected Volume Criteria.
- If your connector has separate send and receive actions, use the radio buttons to specify which direction the SLA pertains to.
- Set Expect at least to the minimum number of transactions (the volume) you expect to be processed, then use the Every fields to specify the time frame.
- By default, the SLA is in effect every day. To change that, uncheck Everyday then check the boxes for the days of the week you want.
- Use And set status to ‘At Risk’ to indicate when the SLA should be marked as at risk.
- By default, notifications are not sent until an SLA is in violation. To change that, check Send an ‘At Risk’ notification.
The following example shows an SLA configured for a connector that expects to receive 1000 files every day Monday-Friday. An at-risk notification is sent 1 hour before the end of the time period if the 1000 files have not been received.
Templates
XML to Excel
The following snippet from an XML file contains a list of elements under the /Items/Orders path.
<Items>
  <Orders>
    <OrderNo>PO0012345</OrderNo>
    <Customer>Teddy</Customer>
    <Date>04/19/2023</Date>
    <SubTotal>23.98</SubTotal>
    <Items>
      <Name>Teddy Bear</Name>
      <Cost>14.99</Cost>
      <Desc>Brown</Desc>
    </Items>
    <Items>
      <Name>Truck</Name>
      <Cost>8.99</Cost>
      <Desc>Red</Desc>
    </Items>
  </Orders>
</Items>
To convert this XML file to XLSX, you must create an Excel template. The image below shows an example Excel template that uses xmlDOMSearch and xpath to loop through the elements of the XML file:
Every Excel template must contain the following elements, which are shown in the example above:
- Static column headers. In the example above, this is Order Detail.
- Scripting in Excel notes. This scripting should use the xmlDOMSearch operation to loop through the input XML at a specified XPath. In the example above, this is contained in the arc:call operation.
- Scripting in Excel cells. This scripting should use the xpath formatters to read values from the XML at a given xpath. This xpath is relative to the xpath specified in the xmlDOMSearch operation.
The details of the xmlDOMSearch operation and xpath formatters in this example are described below.
xmlDOMSearch
The xmlDOMSearch operation requires two parameters:
URI parameter
The URI is the resource path to the XML file to parse. The [filepath] attribute resolves to the input XML file to the connector, and the URI should almost always be set to this value. In the above example, the filepath is URL-encoded with the urlencode formatter to ensure that special characters in the filepath do not prevent the connector from reading the file: [filepath | urlencode]
xpath parameter
The xpath is the XML path to loop over in the document. The operation loops for each occurrence of the specified xpath. For example, with the xpath /Items/Orders, each Orders element that is a child of the root Items element causes a new set of output in the resulting Excel file.
xpath formatters
The Excel notes with the xmlDOMSearch operation surround a block of cells that reference the input XML via the xpath formatter. The xpath formatter reads values from the input XML at the specified xpath.
Note: this xpath is relative to the path provided as a parameter to the xmlDOMSearch operation.
In the above example, the first cell is populated with: [xpath("OrderNo")]. Since this xpath is relative, the cell is populated with the value from the following path in the input XML: /Items/Orders/OrderNo.
If the xmlDOMSearch operation loops more than once—that is, if more than one instance of the operation’s xpath parameter is found—then the block of cells between the Excel notes is repeated. The new cells are added vertically, so in the above example the second OrderNo cell would be directly below the first Notes cell.
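The loop-and-relative-path behavior described above can be mirrored in Python with the standard library. This is an analogy, not ArcScript: findall plays the role of xmlDOMSearch looping over /Items/Orders, and findtext plays the role of the xpath formatter. The second Orders element is hypothetical data added to show the repetition.

```python
import xml.etree.ElementTree as ET

INPUT_XML = """<Items>
  <Orders>
    <OrderNo>PO0012345</OrderNo>
    <Customer>Teddy</Customer>
  </Orders>
  <Orders>
    <OrderNo>PO0012346</OrderNo>
    <Customer>Alex</Customer>
  </Orders>
</Items>"""

root = ET.fromstring(INPUT_XML)  # root is the <Items> element

order_rows = []
# One iteration per /Items/Orders element, like one repeated block of cells.
for order in root.findall("Orders"):
    # Paths are relative to the current Orders element, like [xpath("OrderNo")].
    order_rows.append((order.findtext("OrderNo"), order.findtext("Customer")))

for row in order_rows:
    print(row)
```

Each tuple corresponds to one repetition of the cell block, stacked vertically in the output sheet.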
Excel to XML
To convert from XLSX to an XML file, you must create a custom XML template using ArcScript commands. Since XML does not support the same scripting functionality as Excel files, the structure and formatting of a template for converting from XLSX to XML differ from those of a template for converting from XML to Excel.
Note: Due to the wide range of potential input and output structures, there is no universal solution for every conversion need. The example template shown below is just one of many potential solutions. You need to create a template based on your own needs.
This example template uses a combination of standard XML syntax and ArcScript to read, process, and convert data from an input Excel file.
The arc:set lines below specify the location of the file and the Excel version used to create the file.
<!-- Read Excel -->
<arc:set attr="xml.file" value="[FilePath]" />
<arc:set attr="xml.version" value="2007" />
The sections below specify the conversion parameters and map the headers into standard XML formatting.
The parameter [_value | toalphanum | replace(' ','')] indicates that the arc:set command should parse header names from the Excel file, convert them to alphanumeric format, eliminate blank spaces, and assign them as headers in the XML file.
<!-- Always use first sheet-->
<arc:call op="excelListSheets" in="xml" out="sheet">
  <arc:set attr="xml.sheet" value="[sheet.sheet]" />
  <arc:break />
</arc:call>
<!-- read the headers into an array-->
<arc:set attr="xml.map:headers" value="A1:*1" />
<arc:call op="excelGet" in="xml" out="header">
  <arc:enum attr="header.headers#">
    <arc:set attr="data.headernames#" value="[_value | toalphanum | replace(' ','')]" />
  </arc:enum>
</arc:call>
<arc:set attr="_log.info" value="[data.headernames#1]" />
<arc:enum range="A..Z">
  <arc:set attr="tmp.cells#" value="[_value]" />
</arc:enum>
The sections below process the remaining data and build the data columns according to the specified structure.
<!-- read the remaining cells -->
<arc:enum attr="data.headernames#">
  <arc:set attr="xml.map:[_value]" value="[tmp.cells#[_index]]2:[tmp.cells#[_index]]*" />
</arc:enum>
<!-- build the columns -->
<arc:call op="excelGet" in="xml" out="cols">
  <arc:map from="cols" to="data" map="*=*" />
</arc:call>
<Items>
  <arc:enum attr="data.[data.headernames#1]#"><arc:set attr="tmp.colIndex" value="[_index]" />
    <Record>
      <arc:enum attr="data.headernames#">
        <[_value]>[data.[_value]#[tmp.colIndex] | def]</[_value]>
      </arc:enum>
    </Record>
  </arc:enum>
</Items>
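For readers who find the template logic easier to follow outside ArcScript, the same steps (read the header row, normalize the names, emit one Record per remaining row) can be sketched in Python. This is an approximation under stated assumptions: normalize_header only guesses at what toalphanum followed by replace(' ','') produces, the sheet is a hypothetical list of rows, and empty cells are written as empty elements to loosely mirror the def formatter.

```python
import re
import xml.etree.ElementTree as ET

def normalize_header(name):
    # Assumed approximation of [_value | toalphanum | replace(' ','')]:
    # drop every character that is not a letter or digit, including spaces.
    return re.sub(r"[^A-Za-z0-9]", "", name)

def sheet_to_xml(rows):
    """Use the first row as element names and emit one <Record> per later row."""
    headers = [normalize_header(str(cell)) for cell in rows[0]]
    root = ET.Element("Items")
    for row in rows[1:]:
        record = ET.SubElement(root, "Record")
        for header, value in zip(headers, row):
            ET.SubElement(record, header).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")

# Hypothetical sheet: a header row followed by two data rows.
sample_sheet = [
    ["Order No", "Customer Name"],
    ["PO0012345", "Teddy"],
    ["PO0012346", None],
]
template_xml = sheet_to_xml(sample_sheet)
print(template_xml)
```

Normalizing the header names matters because the raw header text is used as an XML element name, and element names cannot contain spaces.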
Macros
Using macros in file naming strategies can enhance organizational efficiency and contextual understanding of data. By incorporating macros into filenames, you can dynamically include relevant information such as identifiers, timestamps, and header information, providing valuable context to each file. This helps ensure that filenames reflect details important to your organization.
CData Arc supports these macros, which all use the following syntax: %Macro%.
Macro | Description |
---|---|
ConnectorID | Evaluates to the ConnectorID of the connector. |
Ext | Evaluates to the file extension of the file currently being processed by the connector. |
Filename | Evaluates to the filename (extension included) of the file currently being processed by the connector. |
FilenameNoExt | Evaluates to the filename (without the extension) of the file currently being processed by the connector. |
MessageId | Evaluates to the MessageId of the message being output by the connector. |
RegexFilename:pattern | Applies a RegEx pattern to the filename of the file currently being processed by the connector. |
Header:headername | Evaluates to the value of a targeted header (headername) on the current message being processed by the connector. |
LongDate | Evaluates to the current datetime of the system in long-handed format (for example, Wednesday, January 24, 2024). |
ShortDate | Evaluates to the current datetime of the system in a yyyy-MM-dd format (for example, 2024-01-24). |
DateFormat:format | Evaluates to the current datetime of the system in the specified format (format). See Sample Date Formats for the available datetime formats. |
Vault:vaultitem | Evaluates to the value of the specified vault item. |
Examples
Some macros, such as %Ext% and %ShortDate%, do not require an argument, but others do. All macros that take an argument use the following syntax: %Macro:argument%
Here are some examples of the macros that take an argument:
- %Header:headername%: Where headername is the name of a header on a message.
  - %Header:mycustomheader% resolves to the value of the mycustomheader header set on the input message.
  - %Header:ponum% resolves to the value of the ponum header set on the input message.
- %RegexFilename:pattern%: Where pattern is a regex pattern. For example, %RegexFilename:^([\w][A-Za-z]+)% matches and resolves to the first word in the filename and is case insensitive (test_file.xml resolves to test).
- %Vault:vaultitem%: Where vaultitem is the name of an item in the vault. For example, %Vault:companyname% resolves to the value of the companyname item stored in the vault.
- %DateFormat:format%: Where format is an accepted date format (see Sample Date Formats for details). For example, %DateFormat:yyyy-MM-dd-HH-mm-ss-fff% resolves to the date and timestamp on the file.
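The RegexFilename example can be checked outside Arc with a short Python sketch. The helper regex_filename is hypothetical, and Python's re module stands in for whatever regex engine Arc uses, so edge cases may differ; returning an empty string when nothing matches is also an assumption.

```python
import re

def regex_filename(pattern, filename):
    """Return the first capture group of pattern in filename, or the whole match."""
    match = re.search(pattern, filename)
    if match is None:
        return ""  # assumed fallback; Arc's behavior on no match is not documented here
    return match.group(1) if match.groups() else match.group(0)

resolved = regex_filename(r"^([\w][A-Za-z]+)", "test_file.xml")
print(resolved)  # prints "test"
```

The match stops at the underscore because the character class [A-Za-z] does not include it.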
You can also create more sophisticated macros, as shown in the following examples:
- Combining multiple macros in one filename: %DateFormat:yyyy-MM-dd-HH-mm-ss-fff%%EXT%
- Including text outside of the macro: MyFile_%DateFormat:yyyy-MM-dd-HH-mm-ss-fff%
- Including text within the macro: %DateFormat:'DateProcessed-'yyyy-MM-dd_'TimeProcessed-'HH-mm-ss%
Excel Operations
In addition to the Operations provided with Arc, connectors can provide operations that extend functionality into ArcScript.
These connector operations can be called just like any other ArcScript operation, except for two details:
- They must be called through the connector.rsc endpoint.
- They must include an auth token.
For example, calling a connector operation using both of these rules might look something like this:
<arc:set attr="in.myInput" value="myvalue" />
<arc:call op="connector.rsc/opName" authtoken="admin:1j9P8v8b9K0x6g5R5t7k" in="in" out="out">
  <!-- handle output from the op here -->
</arc:call>
Operations specific to the functionality of the Excel connector are listed below.
excelClose
Closes an Excel connection.
Optional Parameters
- handle: The handle for the Excel file.
Output Attributes
- success: True if the connection is closed successfully.
excelGet
Queries the specified Excel worksheet.
Required Parameters
- sheet: The name of the Excel worksheet.
Optional Parameters
- version: The version of Excel you are using. The default is AUTO. You only need to choose another version if you are using a legacy Excel version.
- file: The path to the Excel workbook.
- handle: The handle for the Excel file.
- map:*: This set of inputs contains a mapping of the attribute name and the name of the cell whose value is to be retrieved from the spreadsheet. For example, the attribute name map:MyValue which has a value of C1 pushes an attribute named MyValue with the value found in the cell at C1 in the sheet. You can specify a range of cell names to retrieve a range of cell values.
Output Attributes
- *: Depends on the content of the sheet and the query specified. If column headers are present they are used to name the output attributes.
excelListSheets
Lists the worksheets in a specified Excel workbook.
Optional Parameters
- version: The version of Excel you are using. The default is AUTO. You only need to choose another version if you are using a legacy Excel version.
- file: The path to the Excel workbook.
- handle: The handle for the Excel file.
Output Attributes
- isHidden: Returns true if the sheet is hidden from view in Excel.
- sheet: The name of the Excel worksheet. Note that the name ends with $. This is not required when specifying worksheet names to other operations.
excelOpen
Opens an existing Excel workbook.
Required Parameters
- file: The path to the Excel workbook.
Optional Parameters
- version: The version of Excel you are using. The default is AUTO. You only need to choose another version if you are using a legacy Excel version.
Output Attributes
- handle: The handle which is used to execute other operations.