HDFS

Version 24.3.9120



You can use the HDFS connector from the CData Sync application to capture data from HDFS and move it to any supported destination. To do so, you need to add the connector, authenticate to HDFS, and complete your connection.

Add the HDFS Connector

To enable Sync to use data from HDFS, you must first add the connector, as follows:

  1. Open the Connections page of the Sync dashboard.

  2. Click Add Connection to open the Select Connectors page.

  3. Click the Sources tab and locate the HDFS row.

  4. Click the Configure Connection icon at the end of that row to open the New Connection page. If the Configure Connection icon is not available, click the Download Connector icon to install the HDFS connector. For more information about installing new connectors, see Connections.

Authenticate to HDFS

After you add the connector, you need to set the required properties.

  • Connection Name - Enter a connection name of your choice.

  • Host - Enter the host name of your HDFS server.

  • Port - Enter the port number for your HDFS installation. The default port value is 50070.
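To see how the Host and Port values fit together, the following is a minimal sketch that builds a directory-listing URL for the WebHDFS REST API. It assumes that your cluster exposes WebHDFS on the NameNode HTTP port; the function name webhdfs_url and the host namenode.example.com are hypothetical and are not part of Sync.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port=50070, path="/", op="LISTSTATUS"):
    """Build a WebHDFS REST URL from a host, a port, and an HDFS path.

    Hypothetical helper for illustration only. It assumes plain HTTP and
    the standard /webhdfs/v1 prefix of the WebHDFS REST API.
    """
    # HDFS paths are absolute, so make sure the path starts with a slash.
    if not path.startswith("/"):
        path = "/" + path
    # Percent-encode characters that are not safe in a URL path.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode({'op': op})}"


if __name__ == "__main__":
    # Placeholder host; 50070 is the default port mentioned above.
    print(webhdfs_url("namenode.example.com", path="/data/landing"))
```

Opening the resulting URL in a browser or with an HTTP client is one way to confirm that the host and port are reachable before you create the connection.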

CData Sync supports several methods of authenticating to HDFS. The following sections describe the properties that each authentication method requires.

None

To connect without authentication, select None for Auth Scheme. No additional properties are required.

Negotiate

To connect with Kerberos credentials, specify the following properties:

  • Auth Scheme - Select Negotiate.

  • User - Enter the username that you use to authenticate to your HDFS server.

  • Password - Enter the password that you use to authenticate to your HDFS server.

  • Kerberos KDC - Enter the Kerberos Key Distribution Center (KDC) service that you use to authenticate.

  • Kerberos Realm - Enter the Kerberos realm that you use to authenticate.

  • Kerberos SPN - Enter the service principal name (SPN) for the Kerberos domain controller.

  • Kerberos Keytab File (optional) - Enter the path to the keytab file that contains your pairs of Kerberos principals and encrypted keys.

  • Kerberos Ticket Cache (optional) - Enter the full path to an MIT Kerberos credential cache file. Sync uses the specified cache file to obtain the Kerberos ticket that is required to connect to HDFS.
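Before you enter the Kerberos properties, it can help to collect them in one place and check that none of the required values is missing. The following is a minimal sketch of such a check. The helper names and all example values are hypothetical, the setting names simply mirror the labels listed above, and the service/host@REALM shape is the general Kerberos principal format rather than a value that Sync prescribes. Confirm the exact SPN with your Kerberos administrator.

```python
# Properties listed above without the "(optional)" marker.
REQUIRED = ("User", "Password", "Kerberos KDC", "Kerberos Realm", "Kerberos SPN")


def format_spn(service, host, realm):
    """Return a principal in the general Kerberos form service/host@REALM."""
    return f"{service}/{host}@{realm}"


def missing_kerberos_settings(settings):
    """Return the names of required settings that are absent or empty."""
    return [name for name in REQUIRED if not settings.get(name)]


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    settings = {
        "User": "etl_user",
        "Password": "",
        "Kerberos KDC": "kdc.example.com",
        "Kerberos Realm": "EXAMPLE.COM",
        "Kerberos SPN": format_spn("HTTP", "namenode.example.com", "EXAMPLE.COM"),
    }
    print(missing_kerberos_settings(settings))
```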

Token

To connect with an access token, specify the following properties:

  • Auth Scheme - Select Token.

  • Access Token - Enter your HDFS access token.
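For context, the WebHDFS REST API accepts a delegation token as a delegation query parameter. The following is a minimal sketch of a request URL that carries such a token. Whether the Access Token property in Sync corresponds to a WebHDFS delegation token is an assumption here, and the function name, host, and token value are hypothetical.

```python
from urllib.parse import urlencode


def webhdfs_url_with_token(host, token, port=50070, path="/", op="LISTSTATUS"):
    """Build a WebHDFS URL that passes a delegation token as a query parameter.

    Hypothetical helper for illustration only. It assumes plain HTTP and an
    absolute HDFS path that needs no percent-encoding.
    """
    # urlencode escapes the token so that it is safe inside a query string.
    query = urlencode({"op": op, "delegation": token})
    return f"http://{host}:{port}/webhdfs/v1{path}?{query}"


if __name__ == "__main__":
    # Placeholder host and token.
    print(webhdfs_url_with_token("namenode.example.com", "abc123"))
```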

Complete Your Connection

To complete your connection:

  1. (Optional) For Path, enter the HDFS path for your working directory.

  2. Define advanced connection settings on the Advanced tab. (In most cases, though, you should not need these settings.)

  3. Click Create & Test to create your connection.

More Information

For more information about interactions between CData Sync and HDFS, see HDFS Connector for CData Sync.