CData Arc uses the concept of “messages” to describe data passing through automated workflows. Messages consist primarily of file payload data (the data CData Arc is processing) and metadata (information that CData Arc uses to track the flow of data through the application).
Basic Message Structure
Basic messages have two parts: (1) a set of name-value pairs that serve as headers, and (2) the message payload. Together, these parts are stored in a ‘.eml’ file that can be opened in any standard text editor or email client.
Message headers help CData Arc track the progress of data through the application. Headers include a unique Message ID (which helps CData Arc know the full lifecycle of a message, even if the filename is changed), timestamps from when connectors processed the message, any errors that might have occurred during processing, and other metadata.
Headers are listed in headerName: headerValue syntax at the top of the message, one per line. Clicking on a message within CData Arc (e.g. within the Input or Output tab of a connector panel) shows the headers associated with the message and allows the file content of the message to be downloaded.
Headers are also used for miscellaneous values that are helpful for CData Arc to know within a flow. For example, when downloading files from a subfolder on a remote FTP server, CData Arc uses a header to track the folder path of the downloaded file in case this folder path needs to be recreated on the local system.
The message payload is the actual file data being processed by the application. This is the data that is received or downloaded from a remote source, manipulated by transformation connectors, and so on. While the message headers are primarily used by CData Arc for tracking messages, the message payload contains the data that users primarily care about.
Message payloads are separated from message headers by two line breaks (that is, a blank line). Opening a CData Arc message in a text editor or email client will display the message payload in plain text.
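The header and payload layout described above can be illustrated with a short Python sketch. This is not CData Arc code: it assumes the ‘.eml’ file can be read by Python's standard email module, and the header names (Message-Id, Filename) and values in the sample text are hypothetical placeholders rather than confirmed CData Arc header names.

```python
from email import message_from_string

# Hypothetical message text: the header names and values are illustrative
# placeholders, not confirmed CData Arc headers.
raw = (
    "Message-Id: example-0001\n"
    "Filename: invoice.xml\n"
    "\n"  # the blank line separates the headers from the payload
    "<Invoice><Total>42</Total></Invoice>\n"
)

def split_message(text):
    """Return (headers, payload) for a basic, single-payload message."""
    msg = message_from_string(text)
    headers = dict(msg.items())   # name-value pairs from the top of the file
    payload = msg.get_payload()   # everything after the blank line
    return headers, payload

headers, payload = split_message(raw)
for name, value in headers.items():
    print(f"{name}: {value}")
print(payload)
```

Reading the message this way keeps the tracking metadata and the file data separate, which mirrors how the two parts are described above.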
Messages Within a Flow
While CData Arc internally uses message headers for tracking and understanding the data processed by the application, it hides these details from users unless the message is inspected. For example, CData Arc uses an internal Message ID to identify a message, but will display a public filename in Input/Output tabs, transaction logs, etc.
To examine the headers on a message, simply click on the displayed filename to view further message information.
Message headers are added to files as soon as the first connector processes the file. Message headers are stripped from the message at the end of the flow (after the last connector in the flow has processed the message). In other words, message headers and the ‘.eml’ format are only relevant while file data is in the midst of being processed in a CData Arc flow.
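As a rough illustration of this lifecycle, the following Python sketch wraps a file's content in headers at the start of a flow and strips them again at the end. It is only a conceptual model under stated assumptions: the functions wrap and unwrap, the header names, and the use of a UUID as the Message ID are all hypothetical and do not describe how CData Arc implements this internally.

```python
import uuid

def wrap(payload: str, filename: str) -> str:
    """Model of the first connector: prepend headers to the file data."""
    headers = {
        "Message-Id": str(uuid.uuid4()),  # hypothetical ID format
        "Filename": filename,             # hypothetical header name
    }
    head = "\n".join(f"{name}: {value}" for name, value in headers.items())
    # A blank line (two line breaks) separates headers from the payload.
    return head + "\n\n" + payload

def unwrap(message: str) -> str:
    """Model of the end of the flow: drop the headers, keep the file data."""
    _, _, payload = message.partition("\n\n")
    return payload

original = "id,total\n1,42\n"
in_flow = wrap(original, "orders.csv")
at_end = unwrap(in_flow)
print(at_end == original)
```

The file that leaves the flow is identical to the payload that entered it; only the in-flow representation carries the extra metadata.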
Whenever a message is processed by a connector, CData Arc generates a transaction log for that processing. These logs can be accessed via the Transaction Log in the Status page, or by clicking on a message filename within a connector panel and selecting Download Logs.
Batch Groups and Batch Messages
Messages can contain multiple payloads, in which case they are considered “batch groups.” Each individual payload within a batch group is considered a “batch message.”
Batch groups are MIME-format files where each MIME part is a separate batch message. The batch group maintains metadata about the batch using the same header scheme that basic messages use. These headers track the processing of the batch as a whole, rather than individual parts of the batch.
Each batch message within a batch group is a MIME part containing the file payload data processed by the application. Each batch message contains metadata associated with the payload (but not the batch group) within the MIME part. Thus, batch messages have multiple sets of metadata: one set of headers for tracking the batch group as a whole, and a separate set of headers for each individual MIME part (each batch message) within the batch.
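The two levels of metadata can be seen by reading a MIME multipart file with Python's standard email module. The sketch below is illustrative only: the sample text, the boundary string, and the header names (Batch-Id, Filename) are hypothetical and are not taken from an actual CData Arc batch group.

```python
from email import message_from_string

# Hypothetical batch group: the header names and boundary are placeholders.
raw_batch = (
    "Batch-Id: batch-0001\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="BOUNDARY"\n'
    "\n"
    "--BOUNDARY\n"
    "Filename: a.txt\n"
    "\n"
    "first payload\n"
    "--BOUNDARY\n"
    "Filename: b.txt\n"
    "\n"
    "second payload\n"
    "--BOUNDARY--\n"
)

def read_batch(text):
    """Return (group_headers, [(part_headers, part_payload), ...])."""
    group = message_from_string(text)
    group_headers = dict(group.items())  # metadata for the batch as a whole
    parts = []
    if group.is_multipart():
        for part in group.get_payload():  # one Message object per MIME part
            parts.append((dict(part.items()), part.get_payload()))
    return group_headers, parts

group_headers, parts = read_batch(raw_batch)
print("batch headers:", group_headers)
for part_headers, part_payload in parts:
    print("part headers:", part_headers, "payload:", part_payload.strip())
```

Here the headers at the top of the file describe the batch group, while each MIME part carries its own headers and its own payload, matching the structure described above.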