HDFS

Version 23.4.8843




You can use the HDFS connector from the CData Sync application to capture data from HDFS and move it to any supported destination. To do so, you need to add the connector, authenticate to the connector, and complete your connection.

Establish a Connection

To allow Sync to use data from HDFS, you must first establish a connection to HDFS. Follow these steps to connect HDFS to your Sync account:

  1. Open the Connections page of the Sync dashboard.

  2. Click Add Connection to open the Select Connectors page.

  3. Click the Sources tab and locate the HDFS row.

  4. Click the Configure Connection icon at the end of that row. If you do not see the Configure Connection icon, you need to add the connector according to the instructions in Connections.

  5. Enter connection settings on the Settings tab:

    • Connection Name - Enter a connection name of your choice.

    • Host - Enter the host name of your HDFS server.

    • Port - Enter the port number for your HDFS installation. The default port value is 50070.

    • Auth Scheme - Select the authentication scheme. The default setting is Negotiate. For this setting, specify the following values:

      • User - Enter the username that you use to authenticate to your HDFS server.

      • Password - Enter the password that you use to authenticate to your HDFS server.

      • Kerberos KDC - Enter the Kerberos Key Distribution Center (KDC) service that you use to authenticate.

      • Kerberos Realm - Enter the Kerberos realm that you use to authenticate.

      • Kerberos SPN - Enter the service principal name (SPN) for the Kerberos Domain Controller. The SPN is composed of the service and host of the HDFS Kerberos principal (for example, ServiceName/MyHost@DomainName).

      • Kerberos Keytab File - Enter the path to your keytab file.

      • Kerberos Ticket Cache - Enter the full path to an MIT Kerberos credential cache file. Sync uses the specified cache file to obtain the Kerberos ticket that is required to connect to HDFS.

      • Path - Enter the HDFS path for your working directory.

  6. Click Create & Test to create the connection.

  7. Define advanced connection settings on the Advanced tab. (In most cases, though, you should not need these settings.)
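Before you enter the values from step 5, it can help to sanity-check them outside of Sync. The following Python sketch is not part of CData Sync and makes two assumptions: that the SPN follows the service/host@realm shape shown in the example above (with the realm treated as optional), and that the HDFS server exposes the standard WebHDFS REST endpoint on the port you enter. The helper names parse_spn and webhdfs_list_url, and the host namenode.example.com, are hypothetical.

```python
import re
from urllib.parse import quote

# Assumed SPN shape: service/host, optionally followed by @realm.
SPN_PATTERN = re.compile(
    r"^(?P<service>[^/@\s]+)/(?P<host>[^/@\s]+)(?:@(?P<realm>[^/@\s]+))?$"
)


def parse_spn(spn):
    """Split an SPN into service, host, and realm; raise ValueError if malformed."""
    match = SPN_PATTERN.match(spn)
    if match is None:
        raise ValueError(f"SPN does not look like service/host@realm: {spn!r}")
    return match.groupdict()


def webhdfs_list_url(host, port=50070, path="/"):
    """Build a WebHDFS LISTSTATUS URL for the given host, port, and HDFS path.

    Assumption: the server answers WebHDFS requests on this port.
    """
    if not 1 <= int(port) <= 65535:
        raise ValueError(f"Port out of range: {port}")
    # Normalize the path so it always has exactly one leading slash.
    clean_path = "/" + path.strip("/")
    return f"http://{host}:{port}/webhdfs/v1{quote(clean_path)}?op=LISTSTATUS"


if __name__ == "__main__":
    print(parse_spn("ServiceName/MyHost@DomainName"))
    print(webhdfs_list_url("namenode.example.com", path="/user/sync"))
```

Opening the generated URL in a browser or with an HTTP client shows whether the host, port, and path are reachable. On a Kerberos-secured cluster the request is expected to be rejected until you authenticate, so treat a rejection as a sign that the host and port are correct rather than as a failure.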

More Information

For more information about interactions between CData Sync and HDFS, see HDFS Connector for CData Sync.