The CData Python Connector for HDFS models HDFS objects as relational tables and views. HDFS objects have relationships to other objects; in the tables, these relationships are expressed through foreign keys. The following sections show the available API objects and provide more information on executing SQL against the HDFS APIs.
Schemas for most database objects are defined in simple, text-based configuration files.
- The provider models HDFS entities such as files and permissions as relational views, allowing you to write SQL to query HDFS data.
- Stored procedures allow you to execute operations against HDFS.
- Live connectivity to these objects means any changes to your HDFS account are immediately reflected when using the provider.
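As a rough illustration of querying a view with SQL, the following minimal sketch runs a SELECT through a generic Python DB-API 2.0 connection. Several things here are assumptions rather than documented facts: the connection properties (`Host`, `Port`, `Path`) and their values, the `cdata.hdfs` import shown in the comment, the `Files` view columns (`FullPath`, `Length`), and the `?` parameter style. An in-memory SQLite database stands in for the connector so the sketch runs without it installed.

```python
import sqlite3


def build_connection_string(props):
    """Join connection properties into a 'Key=Value;' string."""
    return "".join(f"{key}={value};" for key, value in props.items())


def fetch_rows(conn, sql, params=()):
    """Run a query on any DB-API 2.0 connection and return all rows."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        return cur.fetchall()
    finally:
        cur.close()


# Hypothetical property names and values; check your own configuration.
conn_str = build_connection_string(
    {"Host": "namenode.example.com", "Port": 50070, "Path": "/user/data"}
)

# With the connector installed, the connection would be opened roughly
# like this (assumed module name and call):
#   import cdata.hdfs as mod
#   conn = mod.connect(conn_str)

# Stand-in: an in-memory SQLite table with hypothetical column names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Files (FullPath TEXT, Length INTEGER)")
conn.executemany(
    "INSERT INTO Files VALUES (?, ?)",
    [("/user/data/a.csv", 120), ("/user/data/b.csv", 4096)],
)

rows = fetch_rows(
    conn, "SELECT FullPath, Length FROM Files WHERE Length > ?", (1000,)
)
print(rows)
```

Because `fetch_rows` relies only on the standard cursor interface, the same helper should work unchanged once the stand-in connection is replaced by a real one, provided the parameter placeholder style matches.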
The provider offloads as much of the SELECT statement processing as possible to the HDFS APIs and then processes the rest of the query in memory. See SupportEnhancedSQL for more information on how the provider circumvents API limitations with in-memory client-side processing.
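The split between server-side and in-memory processing can be pictured with a small conceptual sketch. This is not the provider's implementation: the function names, the sample rows, and the choice of which operation is pushed down (an equality filter) and which runs client-side (a GROUP BY with SUM) are all assumptions made for illustration.

```python
from collections import defaultdict


def remote_filter(rows, owner):
    """Stand-in for the part of a query the API can evaluate itself."""
    return [r for r in rows if r["owner"] == owner]


def client_side_group_sum(rows, group_key, value_key):
    """Stand-in for the remainder evaluated in memory: GROUP BY + SUM."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[group_key]] += r[value_key]
    return dict(totals)


# Hypothetical file listing; field names are illustrative only.
files = [
    {"owner": "hdfs", "type": "FILE", "length": 100},
    {"owner": "hdfs", "type": "FILE", "length": 50},
    {"owner": "hdfs", "type": "DIRECTORY", "length": 0},
    {"owner": "spark", "type": "FILE", "length": 70},
]

# Step 1: the filter is assumed to be handled by the remote API.
subset = remote_filter(files, "hdfs")

# Step 2: the aggregation is applied to the returned rows in memory.
totals = client_side_group_sum(subset, "type", "length")
print(totals)
```

The practical consequence is that a query whose filters can be pushed down transfers fewer rows before the in-memory step runs.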