Custom Schema Definitions
You can extend the table schemas created with Automatic Schema Discovery by saving them into schema files. The schema files have a simple format that makes the schemas easy to edit.
Generating Schema Files
Set GenerateSchemaFiles to "OnStart" to persist schemas for all tables when you connect. You can also generate table schemas as needed: set GenerateSchemaFiles to "OnUse" and execute a SELECT query against the table.
For example, consider a schema for the restaurants data set. This is a sample data set provided by MongoDB. To download the data set, follow the Getting Started with MongoDB guide.
Below is an example document from the collection:
{
  "address": {
    "building": "461",
    "coord": [ -74.138492, 40.631136 ],
    "street": "Port Richmond Ave",
    "zipcode": "10302"
  },
  "borough": "Staten Island",
  "cuisine": "Other",
  "name": "Indian Oven",
  "restaurant_id": "50018994"
}
Importing the MongoDB Restaurant Data Set
You can use the mongoimport utility to import the data set:
mongoimport --db test --collection restaurants --drop --file dataset.json
Customizing a Schema
When GenerateSchemaFiles is set, the component saves schemas into the folder specified by the Location property. You can then change column behavior in the resulting schema.
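As a rough illustration, the two properties mentioned so far could be combined into a connection string as follows. This is a minimal sketch: the semicolon-separated Name=Value format, the helper name build_connection_string, and the folder path are assumptions made for the example, and a real connection also needs server and authentication properties that are not shown here.

```python
def build_connection_string(properties):
    """Join properties as semicolon-separated Name=Value pairs (assumed format)."""
    return ";".join(f"{name}={value}" for name, value in properties.items()) + ";"


# GenerateSchemaFiles and Location are the properties described in the text.
# The folder path is hypothetical; point it at any writable directory.
connection_string = build_connection_string({
    "GenerateSchemaFiles": "OnUse",
    "Location": "/tmp/mongodb_schemas",
})

print(connection_string)
```

With "OnUse", a schema file should appear in the Location folder only after a SELECT query touches the corresponding table.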
The following schema uses the other:bsonpath property to specify the path within each document from which a column's value is read. With this model you can flatten arbitrary levels of hierarchy.
The collection attribute specifies the collection to parse, which gives you the flexibility to use multiple schemas for the same collection. If collection is not specified, the filename determines the collection that is parsed.
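The fallback rule for the collection name can be sketched as below. This only restates the rule from the paragraph above and is not the component's actual implementation; the function name collection_for_schema and the .rsd file names are hypothetical.

```python
from pathlib import PurePath


def collection_for_schema(schema_file, collection_attr=None):
    """Return the collection a schema file would parse.

    An explicit collection attribute wins; otherwise the file name
    (without its extension) is used, as described in the text.
    """
    if collection_attr:
        return collection_attr
    return PurePath(schema_file).stem


# Two schemas over the same collection (file names are hypothetical).
print(collection_for_schema("StaticRestaurants.rsd", "restaurants"))
print(collection_for_schema("restaurants.rsd"))
```

Both calls resolve to the restaurants collection, which is how several differently named schema files can describe the same data.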
Below are the column definitions and the collection attribute, which names the collection from which the column values are extracted. You can find the complete schema in Custom Schema Example.
<rsb:script xmlns:rsb="http://www.rssbus.com/ns/rsbscript/2">
  <rsb:info title="StaticRestaurants" description="Custom Schema for the MongoDB restaurants data set.">
    <!-- Column definitions -->
    <attr name="borough" xs:type="string" other:bsonpath="$.borough" />
    <attr name="cuisine" xs:type="string" other:bsonpath="$.cuisine" />
    <attr name="building" xs:type="string" other:bsonpath="$.address.building" />
    <attr name="street" xs:type="string" other:bsonpath="$.address.street" />
    <!-- In the sample document, coord holds [longitude, latitude] -->
    <attr name="latitude" xs:type="double" other:bsonpath="$.address.coord.1" />
    <attr name="longitude" xs:type="double" other:bsonpath="$.address.coord.0" />
  </rsb:info>
  <rsb:set attr="collection" value="restaurants"/>
</rsb:script>
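To make the flattening concrete, the following Python sketch resolves a few of the schema's paths against the sample document shown earlier. It is an illustration of how a dotted path walks nested objects and arrays, not the component's implementation; the helper names resolve_path and flatten are hypothetical, and returning None for a missing step is an assumption made for the example.

```python
import json

# Sample document from the restaurants collection (copied from the text above).
DOCUMENT = json.loads("""
{ "address": { "building": "461", "coord": [ -74.138492, 40.631136 ],
               "street": "Port Richmond Ave", "zipcode": "10302" },
  "borough": "Staten Island", "cuisine": "Other",
  "name": "Indian Oven", "restaurant_id": "50018994" }
""")


def resolve_path(document, path):
    """Walk a dotted path such as "$.address.coord.0" through dicts and lists.

    Hypothetical helper: returns None when any step is missing.
    """
    if not path.startswith("$."):
        raise ValueError(f"expected a path starting with '$.': {path!r}")
    current = document
    for step in path[2:].split("."):
        if isinstance(current, list):
            # Numeric steps index into arrays.
            if not step.isdigit() or int(step) >= len(current):
                return None
            current = current[int(step)]
        elif isinstance(current, dict):
            if step not in current:
                return None
            current = current[step]
        else:
            return None
    return current


def flatten(document, column_paths):
    """Build one flat row (column name -> value) from a nested document."""
    return {name: resolve_path(document, path)
            for name, path in column_paths.items()}


# Column name -> path, mirroring some of the column definitions in the schema.
COLUMN_PATHS = {
    "borough": "$.borough",
    "cuisine": "$.cuisine",
    "building": "$.address.building",
    "street": "$.address.street",
}

row = flatten(DOCUMENT, COLUMN_PATHS)
print(row)
print(resolve_path(DOCUMENT, "$.address.coord.0"))
```

Each column ends up as a plain scalar value regardless of how deeply it was nested in the document, which is the effect the other:bsonpath property is described as providing.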