HDFS
Version 24.3.9120
You can use the HDFS connector from the CData Sync application to capture data from HDFS and move it to any supported destination. To do so, you need to add the connector, authenticate to HDFS, and complete your connection.
Add the HDFS Connector
To enable Sync to use data from HDFS, you must first add the connector, as follows:
- Open the Connections page of the Sync dashboard.
- Click Add Connection to open the Select Connectors page.
- Click the Sources tab and locate the HDFS row.
- Click the Configure Connection icon at the end of that row to open the New Connection page. If the Configure Connection icon is not available, click the Download Connector icon to install the HDFS connector. For more information about installing new connectors, see Connections.
Authenticate to HDFS
After you add the connector, you need to set the required properties.
- Connection Name - Enter a connection name of your choice.
- Host - Enter the host name of your HDFS server.
- Port - Enter the port number for your HDFS installation. The default port value is 50070.
CData Sync supports several ways of authenticating to HDFS. The sections below describe the properties that each authentication method requires.
None
To connect without authentication, select None for Auth Scheme. No additional properties are required.
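Before entering Host and Port in Sync, it can help to confirm that the values point at a reachable HDFS endpoint. The sketch below assumes that the server exposes the standard Hadoop WebHDFS REST API on the configured port (50070 is the NameNode HTTP port in Hadoop 2.x); this page does not state which protocol the connector uses. The function name `webhdfs_url` and the host `namenode.example.com` are placeholders for illustration only.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, port: int = 50070, path: str = "/", op: str = "LISTSTATUS") -> str:
    """Build a WebHDFS REST URL for a NameNode host, port, and HDFS path.

    Assumption: the cluster serves WebHDFS over plain HTTP on this port.
    """
    # WebHDFS paths are absolute, so normalize to exactly one leading slash.
    hdfs_path = "/" + path.lstrip("/")
    # Percent-encode the path but keep the "/" separators intact.
    encoded_path = quote(hdfs_path, safe="/")
    return f"http://{host}:{port}/webhdfs/v1{encoded_path}?{urlencode({'op': op})}"


if __name__ == "__main__":
    # "namenode.example.com" is a placeholder, not a real server.
    print(webhdfs_url("namenode.example.com", 50070, "/user/sync"))
```

Opening the printed URL in a browser or with an HTTP client on a cluster that allows unauthenticated access should return a JSON directory listing. If it does not, the host, port, or path is worth rechecking before creating the connection.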
Negotiate
To connect with Kerberos credentials, specify the following properties:
- Auth Scheme - Select Negotiate.
- User - Enter the username that you use to authenticate to your HDFS server.
- Password - Enter the password that you use to authenticate to your HDFS server.
- Kerberos KDC - Enter the Kerberos Key Distribution Center (KDC) service that you use to authenticate.
- Kerberos Realm - Enter the Kerberos realm that you use to authenticate.
- Kerberos SPN - Enter the service principal name (SPN) for the Kerberos domain controller.
- Kerberos Keytab File (optional) - Enter the path to the keytab file that contains the pairs of Kerberos principals and encrypted keys.
- Kerberos Ticket Cache (optional) - Enter the full path to an MIT Kerberos credential cache file. Sync uses the specified cache file to obtain the Kerberos ticket that is required to connect to HDFS.
Token
To connect with an access token, specify the following properties:
- Auth Scheme - Select Token.
- Access Token - Enter your HDFS access token.
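The per-scheme requirements above can be summarized as a small lookup, which may be useful as a checklist when collecting values before opening the New Connection page. This is a sketch only: the dictionary keys mirror the UI labels on this page and are not Sync API or connection-string names, and the function `missing_properties` is hypothetical. It treats every property not marked optional as required, which is a reading of the lists above rather than documented behavior.

```python
# Properties that every HDFS connection needs, per the lists above.
COMMON_REQUIRED = ["Connection Name", "Host", "Port"]

# Additional required properties for each Auth Scheme.
# Keys mirror the UI labels; they are illustrative, not Sync API names.
REQUIRED_BY_SCHEME = {
    "None": [],
    "Negotiate": ["User", "Password", "Kerberos KDC", "Kerberos Realm", "Kerberos SPN"],
    "Token": ["Access Token"],
}


def missing_properties(settings: dict) -> list:
    """Return the required property names that are absent or blank in settings."""
    scheme = settings.get("Auth Scheme")
    if scheme not in REQUIRED_BY_SCHEME:
        raise ValueError(f"Unknown Auth Scheme: {scheme!r}")
    required = COMMON_REQUIRED + REQUIRED_BY_SCHEME[scheme]
    missing = []
    for name in required:
        value = settings.get(name)
        # A property counts as missing when it is None or only whitespace.
        if value is None or not str(value).strip():
            missing.append(name)
    return missing
```

For example, a settings dictionary with Auth Scheme set to Token but no Access Token entry yields a one-item list naming the missing property, while a complete dictionary yields an empty list.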
Complete Your Connection
To complete your connection:
- (Optional) For Path, enter the HDFS path for your working directory.
- Define advanced connection settings on the Advanced tab. (In most cases, though, you should not need these settings.)
- Click Create & Test to create your connection.
More Information
For more information about interactions between CData Sync and HDFS, see HDFS Connector for CData Sync.