The MongoDB Connector is used to read data from a MongoDB Server and make it available in the Data Virtuality Server in the relational form. In this way, users can query a MongoDB Server via SQL query language and also write complex queries that are directed to multiple different Data Sources, e.g. a single SQL query can join tables from MySQL and combine them with collections (that are represented as tables) coming from MongoDB.
Connector Configuration
Connecting to MongoDB
Before issuing the SQL queries to the MongoDB Server, we need to represent Mongo's Metadata in relational format, i.e. represent non-relation collections as relation tables etc. This is done by the connecting to MongoDB data source using the createConnection
/createDatasource
commands.
The complete list of connection properties can be found in Connection Configuration Interface.
The MongoDB specific connection properties options
and readMode
, translator property queryOption
model properties (all) are explained in the sections below.
Example
CALL
"SYSADMIN.createConnection"
(
"name"
=>
'mongodb'
,
"jbossCLITemplateName"
=>
'mongodb'
,
"connectionOrResourceAdapterProperties"
=>
'server=localhost
,port=27019
,database=test
,options="connectionsPerHost=51,description=test,alwaysUseMBeans=true"'
,
"encryptedProperties"
=>
null
) ;;
CALL
"SYSADMIN.createDataSource"
(
"name"
=>
'mongodb'
,
"translator"
=>
'mongodb'
,
"modelProperties"
=>
'importer.useFullSchemaName=false
,importer.TableTypes="TABLE,VIEW"
,mongodb.maxColumnsNumber=200
,mongodb.maxRowsNumber=100
,mongodb.maxVarcharColumnWidth=256
,mongodb.maxRecursionDepth=2
,mongodb.fieldDelimiterChar="_"'
,
"translatorProperties"
=>
'queryOption="queryoption_notimeout"'
,
"encryptedModelProperties"
=>
null
,
"encryptedTranslatorProperties"
=>
null
) ;;
MongoDB Driver Options
MongoDB driver options may be provided as a list of comma-separated properties via the options
connection property.
Supported MongoDB Driver Options
Option name | Option type | Default value |
---|---|---|
| string | |
| int |
|
| int |
|
| int |
|
| int |
|
| int |
|
| int |
|
| int |
|
| int |
|
| int |
|
| boolean |
|
| boolean |
|
| boolean |
|
| boolean |
|
| int |
|
| int |
|
| int |
|
| int |
|
| int |
|
| string | |
| boolean |
|
Important
The list of all possible options are supported dynamically by MongoDB Connector and can be extended in future versions of MongoDB driver automatically.
MongoDB readMode
The MongoDB readMode may be provided via the readMode
connection property.
The optional readMode
parameter specifies how MongoDB connection routes read operations to the members of a replica set (a cluster of MongoDB servers that implements master-slave replication and automated failover; MongoDB’s recommended replication strategy). By default, a connection directs its read operations to the primary member (the primary member is the current master instance, which receives all write operations) in a replica set. Because write operations are issued to the single primary, reading from the primary returns the latest version of a document. For a connection that does not require fully up-to-date data, you can improve read throughput or reduce latency by distributing some or all reads to secondary members of the replica set.
MongoDB driver supports five read modes:
Mode | Description |
---|---|
primary | Default mode. All operations read from the current replica set primary |
primaryPreferred | In most situations, operations read from the primary but if it is unavailable, operations read from secondary members |
secondary | All operations read from the secondary members of the replica set |
secondaryPreferred | In most situations, operations read from secondary members but if no secondary members are available, operations read from the primary |
nearest | Operations read from the member of the replica set with the least network latency, irrespective of the member’s type |
All read preference modes except primary may return stale data because secondaries replicate operations from the primary with some delay.
MongoDB Query Options
The MongoDB specific query options can be set via MongoDB queryOption
translator property.
The following options are supported:
QUERYOPTION_TAILABLE
QUERYOPTION_SLAVEOK
QUERYOPTION_OPLOGREPLAY
QUERYOPTION_NOTIMEOUT
QUERYOPTION_AWAITDATA
QUERYOPTION_EXHAUST
QUERYOPTION_PARTIAL
For details about these options, please see MongoDB documentation.
MongoDB Model Properties
MongoDB model properties may be provided via modelProperies in the createDatasource
statement.
Importing Collection Fields
mongodb.maxColumnsNumber
,mongodb.maxRowsNumber
To represent a non-relation MongoDB collection as a relational table, we need to represent the semi-structured collection rows as structured relational rows. To do this completely, we need to check all the fields in all rows of a MongoDB's collection and to add them as columns to a relation table representing that collection. Since extracting MongoDB's collection metadata in this way could be a very costly operation on big collections, Data Virtuality Server iterates only through the maximal number of rows (defined in mongodb.maxRowsNumber
) and adds at most maxColumnsNumber
of columns to the resulting relational table.
In addition, the same-named collection field can have a different format in different rows. The Data Virtuality Server first takes to declare the column type as it has found it first in the field. If it finds out that the field type has changed, it changes the resulting column type to the "bigger" type if needed.
In general, that a type widening goes from a Numeric type to String to CLOB. Default: maxColumnsNumber
= 200
and maxRowsNumber
= 100
. *
mongodb.maxVarcharColumnWidth
States, which column length should be used for varchar columns, e.g. sets varchar (mongodb.maxVarcharColumnWidth
). Default value: 256
.
mongodb.overrideStructureByExample
On connecting or refreshing of MongoDB data source, the Data Virtuality Server always reads the collections to determine their structure. Sometimes the collections are extremely large, but still need to be read completely, because the latest documents have the latest version of the schema. This operation can take a large amount of time and decrease it we introduced the mongodb.overrideStructureByExample
model property. This property can hold a potentially very large text, which represents a strict mode JSON document containing one or multiple comma-separated examples of the structure of MongoDB collections to be overridden. These documents will be read instead of the documents from the real collections to determine their structure.
Creation and usage of the current property explained in detail in the Usage section of this page below.
Nested Objects
mongodb.maxRecursionDepth
Nested Objects in a MongoDB's collection row can be represented either as a JSON string contained in the table's column or them as additional fields in the resulting table. This parameter controls, how far inside the particular fields should be analyzed to represent their sub-fields as separate columns in the resulting table. Default value: 0
(output the complete JSON string with no additional columns)
mongodb.fieldDelimiterChar
Additional columns generated for the nested objects have the name of their parent objects as a prefix inside. This parameter controls the character, that is used to separate this prefix from the field name itself. Default: '_'.
Usage
The MongoDB's collections are represented as relational tables in the schema named like the name given in createDatasource()
.
SELECT Query
To query these tables, a normal SQL select query like the following can be used:
SELECT * FROM mongodb.table1
Stored Procedures
Alternatively, you can use the stored procedure named query to pass a native query inside. This stored procedure has the following parameters:
Parameter | Description |
---|---|
collection | Name of the collection to query |
query | Query in the native JSON format |
columns | Output columns in the JSON format like |
result | CLOB containing the results |
Example
CALL
"mongodb"
.query(
'table1'
,
'{"city":"berlin"}'
,NULL)
XML Functions
In order to parse the JSON string values in a field, you can use the following XML Functions:
jsonToXml
xpathValue
xmlserialize
Examples
1. jsonToXml
SELECT CAST(jsontoxml(
'root'
,
"zipcodes"
) AS string) FROM
"mongodb"
.
"zip_history_log"
;
2. xmltable
andjsonToXml
SELECT * FROM (CALL mongo.query(
'foo'
,
'{ $query : {},$orderby : {} }'
,
'{_id:1,desription:1}'
)) w, xmltable(
'/root'
passing jsontoxml(
'root'
,w.result)) t
2. xpathValue
SELECT xpathvalue( jsontoxml(
'test'
, a.result),
'/test/test/[2]/firstname'
), xmlserialize( jsontoxml(
'test'
, a.result ) ), result FROM ( call
"mongodb"
.query(
'table1'
,
'{"city":"berlin"}'
, NULL) ) AS a;
SELECT CAST(xpathvalue(jsontoxml(
'root'
,
"zipcodes"
),
'/root/root/[1]'
) AS string) FROM
"mongodb"
.
"zip_history_log"
;
Creation and Usage of the JSON Examples with the mongodb.overrideStructureByExample Property
In order to create JSON example for CollectionA
in database test
containing such documents:
{
"_id"
: ObjectId(
"57ea8fc806f2c1c32bcc4a11"
),
"col1"
: 10 } {
"_id"
: ObjectId(
"57ea8fd106f2c1c32bcc4a12"
),
"col1"
: NumberLong(200) }
We can perform the following steps:
Document with the desired structure for the whole collection should be obtained using the mongoexport CLI tool or constructed manually using strict mode JSON syntax. For CollectionA, the second document can be used, as a data type of "col1" is wider than the first one. To do it with mongoexport, one of the following commands can be used:
mongoexport -
-db
test -
-collection
CollectionA -
-query
"{'col1': {$type: 'long'}}"
mongoexport -
-db
test -
-collection
CollectionA -
-query
"{'col1': 200}"
The result will be this JSON:
{
"_id"
:{
"$oid"
:
"57ea8fd106f2c1c32bcc4a12"
},
"col1"
:{
"$numberLong"
:
"200"
}}
Collection name should be added as a key to the JSON document obtained on the previous step:
{
"CollectionA"
: {
"_id"
:{
"$oid"
:
"57ea8fd106f2c1c32bcc4a12"
},
"col1"
:{
"$numberLong"
:
"200"
}} }
All double quotes should be escaped with the backslash symbol:
{ \"CollectionA\": {\"_id\":{\"$oid\":\"57ea8fd106f2c1c32bcc4a12\"},\"col1\":{\"$numberLong\":\"200\"}} }
Resulting document now can be set as a value to the
mongodb.overrideStructureByExample
property:mongodb.overrideStructureByExample=
" { \"CollectionA\": {\"_id\":{\"$oid\":\"57ea8fd106f2c1c32bcc4a12\"},\"col1\":{\"$numberLong\":\"200\"}} } "
Multiple examples should be separated by a comma:
mongodb.overrideStructureByExample=
" { \"CollectionA\": {\"_id\":{\"$oid\":\"57ea8fd106f2c1c32bcc4a12\"},\"col1\":{\"$numberLong\":\"200\"}}, \"CollectionB\": {\"_id\":{\"$oid\":\"57dfffcf190889d653a5374b\"},\"col2\":{\"$date\":\"2012-12-19T06:01:17.171Z\"}} } "
The example above can be used to add the data source as follows:
CALL SYSADMIN.createConnection
(
"name"
=>
'mongodb'
,
"jbossCLITemplateName"
=>
'mongodb'
,
"connectionOrResourceAdapterProperties"
=>
'server=localhost,port=27017,database=test'
) ;;
CALL SYSADMIN.createDatasource
(
"name"
=>
'mongodb'
,
"translator"
=>
'mongodb'
,
"modelProperties"
=>
'mongodb.overrideStructureByExample="
{
\"CollectionA\":
{\"_id\":{\"$oid\":\"57ea8fd106f2c1c32bcc4a12\"},\"col1\":{\"$numberLong\":\"200\"}},
\"CollectionB\":
{\"_id\":{\"$oid\":\"57dfffcf190889d653a5374b\"},\"col2\":{\"$date\":\"2012-12-19T06:01:17.171Z\"}}
}
"'
,
"translatorProperties"
=>
''
) ;;