This section shows how to use the connector to authenticate to Spark SQL using Kerberos.
Authenticating with Kerberos
To authenticate to Spark SQL using Kerberos, set the following properties:
- KerberosKDC: Set this to the host name or IP Address of your Kerberos KDC machine.
- KerberosSPN: Set this to the service and host of the Spark SQL Kerberos Principal. This will be the value prior to the '@' symbol (for instance, ) of the (for instance, ).
Retrieve the Kerberos Ticket
You can use one of the following options to retrieve the required Kerberos ticket.
MIT Kerberos Credential Cache File
This option enables you to use the MIT Kerberos Ticket Manager or kinit command to get tickets. Note that you won't need to set the User or Password connection properties with this option.
- Ensure that you have an environment variable created called KRB5CCNAME.
- Set the KRB5CCNAME environment variable to a path pointing to your credential cache file (for instance, C:\krb_cache\krb5cc_0 or /tmp/krb5cc_0). This file will be created when generating your ticket with MIT Kerberos Ticket Manager.
- To obtain a ticket, open the MIT Kerberos Ticket Manager application, click Get Ticket, enter your principal name and password, then click OK. If successful, ticket information will appear in Kerberos Ticket Manager and will now be stored in the credential cache file.
- Now that the credential cache file has been created, the connector will use the cache file to obtain the kerberos ticket to connect to Spark SQL.
As an alternative to setting the KRB5CCNAME environment variable, you can directly set the file path using the KerberosTicketCache property. When set, the connector will use the specified cache file to obtain the kerberos ticket to connect to Spark SQL.
If the KRB5CCNAME environment variable has not been set, you can retrieve a Kerberos ticket using a Keytab File. To do this, set the User property to the desired username and set the KerberosKeytabFile property to a file path pointing to the keytab file associated with the user.
User and Password
If both the KRB5CCNAME environment variable and the KerberosKeytabFile property have not been set, you can retrieve a ticket using a User and Password combination. To do this, set the User and Password properties to the user/password combo that you use to authenticate with Spark SQL.
More complex Kerberos environments may require cross-realm authentication where multiple realms and KDC servers are used (e.g. where one realm/KDC is used for user authentication and another realm/KDC used for obtaining the service ticket).
In such an environment, the KerberosRealm and KerberosKDC properties can be set to the values required for user authentication. The KerberosServiceRealm and KerberosServiceKDC properties can be set to the values required to obtain the service ticket.