CData Python Connector for XML

Build 23.0.8839

SELECT Execution

When a SELECT query is issued, the connector executes the GET method of the schema, which invokes the connector's built-in operations to process XML. In the GET method you have control over the request for data. The following procedures show several ways to use this control: searching the remote data server-side with SELECT WHERE, limiting the results returned by the server with LIMIT, and implementing paging.

Query Processing

By default, the connector will process the query client-side, in memory, so the XPath and URI connection properties are all you need to set to execute any SELECT statement.

The connector can also offload supported queries to the server while processing the rest of the query client side. For example, a server-side search filters the data so that a smaller dataset can be processed client-side. See the documentation for the SupportEnhancedSQL property for more information on the Query Processing feature.

Execute Selects to XML

The following steps result in a script that allows you to execute SQL-92 queries. The queries are processed client side. Before invoking a data processing operation, you will need to provide the URI and, optionally, the XPath. Use the api:set keyword to declare these attributes.

  1. Set the URI attribute to a local file or an HTTP-accessible address.

    <api:set attr="uri" value="NorthwindOData.xml" />

  2. If needed, set the XPath attribute to the XPath of the data that constitutes an individual row. By default, the connector will scan the document to detect the rows (see Parsing Hierarchical Data).

    <api:set attr="XPath" value="/feed/entry/" />

  3. Invoke the operation in the GET method. Inside the script block, use the api:call keyword to invoke the operation, specifying the operation with the op parameter. Inside the call, use the api:push keyword to push the results of processing to the schema's output.

    <api:script method="GET">
      <api:set attr="method" value="GET"/>
      <api:set attr="uri" value="[uri]?$format=atom"/>
      <api:call op="xmlproviderGet">
        <api:push />
      </api:call>
    </api:script>
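
To make the role of the XPath attribute more concrete, the following Python sketch imitates, in a very simplified way, how a row XPath such as /feed/entry/ splits a document into rows. It uses only the standard library and is not the connector's implementation: the sample document, the rows_from_xml helper, and the flat mapping of child elements to columns are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; real feeds are larger and usually namespaced.
SAMPLE = """
<feed>
  <entry><id>1</id><name>Chai</name></entry>
  <entry><id>2</id><name>Chang</name></entry>
</feed>
"""

def rows_from_xml(text, row_xpath):
    """Return one dict per element matched by row_xpath (illustrative only)."""
    root = ET.fromstring(text.strip())
    # ElementTree evaluates paths relative to the root element, so drop the
    # leading root step and any empty steps caused by leading/trailing slashes.
    steps = [s for s in row_xpath.split("/") if s]
    if steps and steps[0] == root.tag:
        steps = steps[1:]
    relative = "/".join(steps) or "."
    # Assumption: each direct child of a row element becomes one column.
    return [{child.tag: child.text for child in row}
            for row in root.findall(relative)]

rows = rows_from_xml(SAMPLE, "/feed/entry/")
print(rows)
```

Each element matched by the row XPath yields one row, which is why the XPath should point at the repeating element rather than at its container.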

You can extend the resulting script to add support for processing requests server side.

Process SELECT WHERE on the Server

The following sections show how to translate a SELECT WHERE statement into a search request to XML APIs. The procedure uses the following statement:

SELECT * 
FROM <table> 
WHERE modifiedAt < '2017-10-10' AND modifiedAt > '2017-09-01'

If this filter is supported on the server via query parameters, you can use the other:filter property of the api:info column definition to specify the desired mapping. For the above query, we will use this property to map the modifiedAt < '<date>' filter to the query parameter that returns results that were modified before a given date, and the modifiedAt > '<date>' filter to the query parameter that returns results that were modified after a given date. other:filter is specified as follows:

  • other:filter is a semicolon-separated list of <parameter name>:<operator list> entries. <parameter name> is the name of the query parameter and <operator list> is a comma-separated list of operators used for the mapping. Valid operators are <, <=, =, >, >=, and LIKE.

To perform this mapping, we would use the following markup for the modifiedAt column definition:

    <attr name="modifiedAt" xs:type="datetime" readonly="false" other:xPath="content/properties/modifiedAt" other:filter="modifiedBefore:<;modifiedSince:>" />

This query results in the following request:

[url]?modifiedBefore=2017-10-10&modifiedSince=2017-09-01
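
The mapping described above can be pictured with a short Python sketch. It is only an illustration of the other:filter format, not the connector's parser: the helper names parse_filter_spec and to_query_string are hypothetical, and the WHERE conditions are passed in as already-extracted (operator, value) pairs.

```python
from urllib.parse import urlencode

def parse_filter_spec(spec):
    """Parse 'param:op,op;param:op' into a dict of {operator: parameter name}."""
    mapping = {}
    for entry in spec.split(";"):
        param, _, ops = entry.partition(":")
        for op in ops.split(","):
            mapping[op.strip()] = param.strip()
    return mapping

def to_query_string(spec, conditions):
    """Map (operator, value) conditions on one column to query parameters."""
    mapping = parse_filter_spec(spec)
    # Conditions whose operator has no mapping are skipped here; they would
    # have to be evaluated client-side instead.
    params = [(mapping[op], value) for op, value in conditions if op in mapping]
    return urlencode(params)

query = to_query_string(
    "modifiedBefore:<;modifiedSince:>",
    [("<", "2017-10-10"), (">", "2017-09-01")],
)
print(query)
```

With the filter specification and the two conditions from the example statement, the sketch produces the same query string as the request shown above.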

If your API filter is not passed in a query parameter, you will need to pass it in the script. For example, consider an API that filters by name by querying the /persons/{name}/data endpoint.

SELECT *
FROM Persons 
WHERE (Name = 'Fran Wilson') 

In the GET method of the schema, use the attributes of the _input item, one of the Items in API Script, to access the search criteria and build the HTTP data retrieval request. The corresponding script below builds the request. The api:check element is useful for checking the existence of an attribute before attempting to access its value. A variety of Value Formatters are available to do transformations, like URL-encoding a string.

<api:script method="GET">
  <api:check attr="_input.Name">
    <api:set attr="uri" value="[uri]/[_input.Name|urlencode]/data"/>
  </api:check>
</api:script>
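
For readers more familiar with Python than with API Script, the same check-then-build logic can be sketched as follows. This is an analogy only: build_person_uri and the example.com base address are hypothetical, and urllib.parse.quote stands in for the urlencode value formatter, whose exact encoding rules (for example, how spaces are encoded) are not specified in this section.

```python
from urllib.parse import quote

def build_person_uri(base_uri, inputs):
    """Return the per-person endpoint when a Name filter is present."""
    name = inputs.get("Name")
    if name is None:
        # No Name criterion: leave the URI unchanged, like a skipped api:check body.
        return base_uri
    # safe="" also encodes "/" so the name cannot add extra path segments.
    return f"{base_uri}/{quote(name, safe='')}/data"

uri = build_person_uri("http://example.com/persons", {"Name": "Fran Wilson"})
print(uri)
```

Encoding the value before placing it in the path matters because names can contain spaces or other characters that are not valid in a URL path segment.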

Searching with Pseudo Columns

You can specify a pseudo column in the WHERE clause to build search criteria using inputs other than the columns returned in the results.

For example, in the Weather Underground API, you can return a forecast for a specified location. The Location itself is not part of the forecast data; it is specified in the request URI, as in the request below.

http://api.wunderground.com/api/{MyAPIKey}/hourly/q/{MyLocation}.xml 

Below is an example forecast query expressed in SQL:

SELECT * FROM Hourly WHERE Location = '27516'

Follow the steps below to add a Location pseudo column and implement the preceding query:

  1. Add a Location input parameter to the column definitions in the api:info block. (The Location will be required; the connector should return an error if a Location is not specified.)
    <api:info>
      ...
      <input name="Location" required="true"/>
    </api:info>
  2. Reference the _input item's Location attribute to build the URI:
    <api:set attr="uri" value="http://api.wunderground.com/api/[_connection.APIKey]/hourly/q/[_input.Location].xml"/>
  3. Invoke the operation to make the request and process the response:
    <api:script method="GET">
      <api:push op="xmlproviderGet"/>
    </api:script>

Implement Paging

To support automatic paging, add the 'Rows@Next' input to the list of columns in the api:info block.


<input name="rows@next" desc="Identifier for the next page of results. Do not set this value manually." />

Note that making this an input parameter instead of an attr parameter prevents it from showing up in column listings. You will also need to set the 'EnablePaging' attribute to TRUE to turn off the connector's internal paging mechanism.

<api:set attr="EnablePaging" value="TRUE" />

The connector supports four types of paging implementations automatically. In the first, the URL of the next page is returned in the response. In the second, you specify the current record offset in a query parameter. In the third, you send the current page number in a query parameter. In the fourth, you send a page token in a query parameter of the next request. If your API uses one of these paging patterns, follow the examples below to implement paging:

Next Page URL

When the service returns the URL for the next page in the response body or header, set the 'pageurlpath' attribute to the location of this data. If a value is present at this location, it will be used to set the URL of the next request.

  • If the next page URL is passed in the response body, set 'pageurlpath' to the XPath of the element.
    <api:set attr="pageurlpath" value="/data/nextPage" />
  • If the next page URL is passed in the response header with a 'Link' header, you can prefix 'pageurlpath' with header: to denote its location. If the returned 'Link' header value contains a partial URL path, you can set the 'BaseURI' attribute to the base URL that should be prepended to the URL path returned in the 'Link' header.
    <api:set attr="pageurlpath" value="header:Link" />
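
The Link header case can be illustrated with a small Python sketch. It assumes the common header format in which the next page is marked with rel="next" (as described in RFC 8288) and that a partial path is resolved against a base URL. How the connector itself selects the link is not stated in this section, so treat next_page_url, the sample header, and api.example.com as hypothetical.

```python
import re
from urllib.parse import urljoin

def next_page_url(link_header, base_uri=None):
    """Extract the rel="next" target from a Link header value, if any."""
    if not link_header:
        return None
    # Simplification: splitting on "," breaks for URLs that contain commas.
    for part in link_header.split(","):
        match = re.match(r'\s*<([^>]*)>\s*;(.*)', part)
        if match and re.search(r'rel\s*=\s*"?next\b', match.group(2)):
            target = match.group(1)
            # A partial path is resolved against the base URI when one is given.
            return urljoin(base_uri, target) if base_uri else target
    return None

header = '</items?page=2>; rel="next", </items?page=9>; rel="last"'
print(next_page_url(header, "https://api.example.com"))
```

When no link is marked as the next page, the function returns None, which corresponds to the situation where paging stops because no value is present at the configured location.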

Paging Token

Sometimes a token is returned in the response body that should be passed to a paging parameter in the subsequent request. In this case, you can set the 'pagetokenparam' and 'pagetokenpath' attributes. 'pagetokenpath' should be set to the XPath of the element that contains the token. Some services also send a variable that denotes whether or not there are more pages. If that is the case, you can also set the 'hasmorepath' attribute to the XPath of that variable.

  • If the page token is passed in a query parameter, set 'pagetokenparam' to the name of the parameter.
    <api:set attr="pagetokenpath" value="/data/token" />
    <api:set attr="hasmorepath" value="/data/has_more" />
    <api:set attr="pagetokenparam" value="nextpagetoken" />
    If has_more is true, this passes the token at /data/token to the next query: ?nextpagetoken=<token>
  • Sometimes the token needs to be passed in the request body. You can do that by setting 'pagetokenparam' to its XPath.
    <api:set attr="pagetokenparam" value="/request/nextpagetoken" />
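
The query-parameter variant of token paging amounts to the loop sketched below in Python. The in-memory PAGES dictionary and the fetch function are hypothetical stand-ins for HTTP requests, and the loop is only a model of the behavior described above, not the connector's code.

```python
import xml.etree.ElementTree as ET

# Hypothetical responses keyed by the token that requests them (None = first page).
PAGES = {
    None: "<data><row>1</row><row>2</row><token>abc</token><has_more>true</has_more></data>",
    "abc": "<data><row>3</row><token></token><has_more>false</has_more></data>",
}

def fetch(token):
    """Stand-in for an HTTP GET to [uri]?nextpagetoken=<token>."""
    return ET.fromstring(PAGES[token])

def read_all():
    rows, token, requests = [], None, []
    while True:
        # Record the query string that would be sent for this request.
        requests.append("?nextpagetoken=" + token if token else "")
        page = fetch(token)
        rows.extend(r.text for r in page.findall("row"))
        has_more = page.findtext("has_more") == "true"  # hasmorepath = /data/has_more
        token = page.findtext("token")                   # pagetokenpath = /data/token
        if not has_more or not token:
            return rows, requests

rows, requests = read_all()
print(rows, requests)
```

The first request carries no token; each later request carries the token read from the previous response until has_more is no longer true.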

Record Offset

If the service provides a record offset query parameter to control paging, you can implement this by setting the name of the offset query parameter, the name of the page size query parameter, and the page size to be passed. If the service has no parameter to control the page size, you do not need to set pagesizeparam. In this case, you must set pagesize to the service's default page size.

<api:set attr="pageoffsetparam" value="offset" />
<api:set attr="pagesizeparam" value="limit" />
<api:set attr="pagesize" value="100" />

Page Number

Similar to record offset, if the service provides a query parameter that sets the page number, you can implement this by setting the name of the page number query parameter, the name of the page size query parameter, and the page size to be passed. If the service has no parameter to control the page size, you do not need to set pagesizeparam. In this case, you must set pagesize to the service's default page size.

<api:set attr="pagenumberparam" value="page" />
<api:set attr="pagesizeparam" value="pagesize" />
<api:set attr="pagesize" value="100" />
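
The difference between record-offset and page-number paging shows up in the query strings of successive requests. The Python sketch below generates them for the parameter names used in the two examples. The starting values (offset 0, page 1) are assumptions, since APIs differ on where counting starts, and both helper functions are hypothetical.

```python
from urllib.parse import urlencode

def offset_params(request_index, page_size=100):
    """Query string for the n-th request (0-based) under record-offset paging."""
    # The offset advances by one page size per request.
    return urlencode({"offset": request_index * page_size, "limit": page_size})

def page_number_params(request_index, page_size=100, first_page=1):
    """Query string for the n-th request (0-based) under page-number paging."""
    # The page number advances by one per request.
    return urlencode({"page": first_page + request_index, "pagesize": page_size})

for i in range(3):
    print(offset_params(i), "|", page_number_params(i))
```

In both schemes the page size must match what the service actually returns per page; otherwise records are skipped or repeated.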

Other Paging Types

If your API does not follow any of these paging patterns, you will need a custom paging implementation. To do this, set the 'Rows@Next' attribute in the output to whatever information from the current page is needed to request the next page. When the 'Rows@Next' value is set in the output, the connector finishes returning results for the current page and then automatically calls the method again with the 'Rows@Next' value in the input. You can use the value of this input to modify the request on the next pass to get the next page of data.

For example, your API may return the next page's URL in the response. You can obtain this value by providing the XPath to the URL:

<api:set attr="elementmappath#" value="/next_page" />
<api:set attr="elementmapname#" value="rows@next" />

You can then modify the URL where the request is made, provided the value is set. Use the api:check element to first check if the Rows@Next input has a value. The Rows@Next input can be accessed as an attribute of the _input item:

<api:check attr="_input.rows@next">
  <api:set attr="uri" value="[_input.rows@next]" />
  <api:else>
    <api:set attr="uri" value="<first page's URL>" />
  </api:else>
</api:check>
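
The repeated invocation driven by Rows@Next can be modeled in Python as below. This is a conceptual sketch under stated assumptions, not the connector's internals: FAKE_API, FIRST_PAGE, get_method, and run_query are hypothetical names, and the dictionary lookups replace real HTTP requests.

```python
# Hypothetical API: each URL maps to its rows and the URL of the next page.
FIRST_PAGE = "http://example.com/items?page=1"
FAKE_API = {
    FIRST_PAGE: {"rows": [1, 2], "next_page": "http://example.com/items?page=2"},
    "http://example.com/items?page=2": {"rows": [3], "next_page": None},
}

def get_method(inputs):
    """Stand-in for the schema's GET method: returns (rows, output attributes)."""
    # Mirrors the api:check / api:else branch: use rows@next when it has a value.
    uri = inputs.get("rows@next") or FIRST_PAGE
    page = FAKE_API[uri]
    output = {}
    if page["next_page"]:
        output["rows@next"] = page["next_page"]
    return page["rows"], output

def run_query():
    """Call the method again for as long as the output carries rows@next."""
    rows, inputs, calls = [], {}, 0
    while True:
        page_rows, output = get_method(inputs)
        calls += 1
        rows.extend(page_rows)
        if "rows@next" not in output:
            return rows, calls
        inputs = {"rows@next": output["rows@next"]}

rows, calls = run_query()
print(rows, calls)
```

The loop ends as soon as a page produces no Rows@Next value, so the script must leave that value unset on the last page.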

Process Other SELECT Statements Server Side

You can build any HTTP request in the GET method. Use the _query item to access other components of the SELECT query. In the GET method, the _query item has the following attributes that describe the query that was issued to the connector:

  • query: The SQL query. For example:
    SELECT Id, Name FROM Accounts WHERE City LIKE '%New%' AND COUNTRY = 'US' GROUP BY CreatedDate ORDER BY Name LIMIT 10,50;
  • selectcolumns: A comma-separated list containing the columns specified in the SELECT statement. For example, the Id and Name columns in the example.
  • table: The table name specified in the SELECT statement. For example, Accounts in the example.
  • criteria: The WHERE clause of the statement. For example:
    City LIKE '%New%' AND COUNTRY = 'US'
  • orderby: The columns specified in the ORDER BY clause. For example, Name in the example.
  • groupby: The GROUP BY clause in the SELECT statement. For example, CreatedDate in the example.
  • limit: The limit specified in the LIMIT or TOP clauses of the SELECT statement. For example, 50 in the example.
  • offset: The offset specified in the LIMIT or TOP clauses of the SELECT statement. For example, 10 in the example.
  • isjoin: Whether the query is a join.
  • jointable: The table to be joined.
  • isschemaonly: Whether the query retrieves only schema information.

Process LIMIT on the Server

If your API supports it, you can implement the LIMIT clause to restrict the number of results that need to be retrieved from the server.

Reference the value of the _query item's limit attribute when you build the API request. In the OData API, a limit can be specified with the $top query string parameter, shown below.

http://services.odata.org/V3/Northwind/Northwind.svc/Customers?$top=10

Below is the corresponding script:

<api:check attr="_query.limit">
  <api:set attr="uri" value="http://services.odata.org/V3/Northwind/Northwind.svc/Customers?$top=[_query.limit]" />
</api:check>
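
The same conditional logic, expressed in Python for comparison: the request only gains a $top parameter when the query specified a limit. The query dictionary below is a hypothetical stand-in for the _query item, and build_uri is an illustrative helper rather than part of any connector API.

```python
BASE = "http://services.odata.org/V3/Northwind/Northwind.svc/Customers"

def build_uri(query):
    """Append $top only when the query carries a limit (like api:check)."""
    limit = query.get("limit")
    if limit:
        return f"{BASE}?$top={limit}"
    # No LIMIT clause: request the full collection.
    return BASE

print(build_uri({"limit": "10"}))
print(build_uri({}))
```

Checking for the limit first avoids sending an empty $top value when the SELECT statement has no LIMIT clause.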

Keyword Reference

See the API Script Reference for more information on the keywords used in this section.

Copyright (c) 2024 CData Software, Inc. - All rights reserved.