Among the agonies of these after days is that chief of torments — inarticulateness. What I learned and saw in those hours of impious exploration can never be told — for want of symbols or suggestions in any language.
The Shunned House, HP Lovecraft, 1924.
krb5.conf and system property
You can do two things when setting up the JVM binding to the
1. Change the realm with System Property
This system property sets the realm for the Kerberos binding. It lets you use a realm other than the default one in the krb5.conf file.
-Djava.security.krb5.realm=PRODUCTION
System.setProperty("java.security.krb5.realm", "DEVELOPMENT");
The JVM property MUST be set before UGI is initialized.
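A minimal sketch of doing this in code, before anything touches UGI. The class name, method name and KDC hostname are hypothetical. One detail the text above does not mention: the JDK documentation says that `java.security.krb5.realm` and `java.security.krb5.kdc` must be set together, so the sketch sets both.

```java
import java.util.Objects;

final class KerberosRealmOverride {
    static final String REALM_PROPERTY = "java.security.krb5.realm";
    static final String KDC_PROPERTY = "java.security.krb5.kdc";

    /**
     * Sets the realm and KDC system properties together.
     * Returns the previous realm value, which may be null.
     */
    static String overrideRealm(String realm, String kdcHost) {
        Objects.requireNonNull(realm, "realm");
        Objects.requireNonNull(kdcHost, "kdcHost");
        String previous = System.getProperty(REALM_PROPERTY);
        System.setProperty(REALM_PROPERTY, realm);
        System.setProperty(KDC_PROPERTY, kdcHost);
        return previous;
    }

    public static void main(String[] args) {
        // Hypothetical values: replace with your own realm and KDC host.
        overrideRealm("DEVELOPMENT", "kdc.dev.example.com");
        // Only after this point should any code initialize UGI.
        System.out.println("realm = " + System.getProperty(REALM_PROPERTY));
    }
}
```

Call it as the first thing in `main()`; once the Kerberos classes have read their configuration, changing the property is unlikely to help.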
2. Switch to an alternate
The JVM Kerberos operations are configured via the
krb5.conf file specified in the JVM option
java.security.krb5.conf, which can be set on the JVM command line or inside the JVM.
The JVM property MUST be set before UGI is initialized.
- use double backslash to escape paths on Windows platforms, e.g.
- Different JVMs (e.g. IBM JVM) want different fields in their
krb5.conf file. How can you tell? Kerberos will fail with a message
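A small sketch of switching to an alternate krb5.conf from inside the JVM, including the Windows escaping point from the notes above. The class name and the path are hypothetical; the only real name here is the `java.security.krb5.conf` property.

```java
final class Krb5ConfSwitch {
    static final String KRB5_CONF_PROPERTY = "java.security.krb5.conf";

    /** Points the JVM at an alternate krb5.conf and returns the value that was set. */
    static String useKrb5Conf(String path) {
        if (path == null || path.isEmpty()) {
            throw new IllegalArgumentException("krb5.conf path must not be empty");
        }
        System.setProperty(KRB5_CONF_PROPERTY, path);
        return System.getProperty(KRB5_CONF_PROPERTY);
    }

    public static void main(String[] args) {
        // In Java source every backslash of a Windows path is written twice;
        // the string the JVM sees contains single backslashes.
        String windowsPath = "C:\\kerberos\\krb5.conf"; // hypothetical location
        System.out.println(useKrb5Conf(windowsPath)); // prints C:\kerberos\krb5.conf
    }
}
```

As with the realm property, this has to run before UGI is initialized.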
JVM Kerberos Library logging
You can turn Kerberos low-level logging on
This doesn't come out via Log4J or
java.util logging; it just comes out on the console. That is somewhat inconvenient, but bear in mind that this is logging from a very low-level part of the system. And it does at least log.
If you find yourself down at this level you are in trouble. Bear that in mind.
JVM SPNEGO Logging
If you want to debug what is happening in SPNEGO, another system property lets you enable this:
You can ask for both of these in the
HADOOP_OPTS environment variable
export HADOOP_OPTS="-Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true"
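If you control the entry point of the application, the same two properties can in principle be set programmatically. This is a sketch, not a recommendation: as far as I know the JDK reads these flags when its Kerberos and SPNEGO classes are first loaded, so the call has to come before any security code runs, and the command-line form is the more reliable one. The class and method names are hypothetical.

```java
final class KerberosDebugFlags {
    /** Turns on the JVM's low-level Kerberos and SPNEGO console logging. */
    static void enableJvmKerberosDebug() {
        System.setProperty("sun.security.krb5.debug", "true");
        System.setProperty("sun.security.spnego.debug", "true");
    }

    public static void main(String[] args) {
        // Must be the first thing the program does: once the JDK's
        // Kerberos classes are loaded, the flags have already been read.
        enableJvmKerberosDebug();
        System.out.println("krb5 debug: " + Boolean.getBoolean("sun.security.krb5.debug"));
        System.out.println("spnego debug: " + Boolean.getBoolean("sun.security.spnego.debug"));
    }
}
```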
Hadoop-side JAAS debugging
Set the env variable
HADOOP_JAAS_DEBUG to true and UGI will set the "debug" flag on any JAAS
files it creates.
You can do this on the client, before issuing a
and set it in the environment script of a YARN service to turn it on there.
On the next Hadoop command, you'll see a trace like
[UnixLoginModule]: succeeded importing info:
    uid = 503
    gid = 20
    supp gid = 20
    supp gid = 501
    supp gid = 12
    supp gid = 61
    supp gid = 79
    supp gid = 80
    supp gid = 81
    supp gid = 98
    supp gid = 399
    supp gid = 33
    supp gid = 100
    supp gid = 204
    supp gid = 395
    supp gid = 398
Debug is true storeKey false useTicketCache true useKeyTab false doNotPrompt true ticketCache is null isInitiator true KeyTab is null refreshKrb5Config is false principal is null tryFirstPass is false useFirstPass is false storePass is false clearPass is false
Acquire TGT from Cache
Principal is [email protected]
[UnixLoginModule]: added UnixPrincipal, UnixNumericUserPrincipal, UnixNumericGroupPrincipal(s), to Subject
Commit Succeeded
[UnixLoginModule]: logged out Subject
[Krb5LoginModule]: Entering logout
[Krb5LoginModule]: logged out Subject
[UnixLoginModule]: succeeded importing info:
    uid = 503
    gid = 20
    supp gid = 20
    supp gid = 501
    supp gid = 12
    supp gid = 61
    supp gid = 79
    supp gid = 80
    supp gid = 81
    supp gid = 98
    supp gid = 399
    supp gid = 33
    supp gid = 100
    supp gid = 204
    supp gid = 395
    supp gid = 398
Debug is true storeKey false useTicketCache true useKeyTab false doNotPrompt true ticketCache is null isInitiator true KeyTab is null refreshKrb5Config is false principal is null tryFirstPass is false useFirstPass is false storePass is false clearPass is false
Acquire TGT from Cache
Principal is [email protected]
[UnixLoginModule]: added UnixPrincipal, UnixNumericUserPrincipal, UnixNumericGroupPrincipal(s), to Subject
Commit Succeeded
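To make the "debug" flag less mysterious, here is a rough sketch of what setting it on a JAAS entry looks like. This is not UGI's actual code: the class and method names are hypothetical, the case-insensitive match on the environment variable is an assumption, and the options are simply the ones that show up as true in the trace above. `AppConfigurationEntry` and the `Krb5LoginModule` class name are standard JDK.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;

final class JaasDebugSketch {
    static final String KRB5_LOGIN_MODULE = "com.sun.security.auth.module.Krb5LoginModule";

    /** True if the supplied environment asks for JAAS debugging. */
    static boolean jaasDebugRequested(Map<String, String> env) {
        // Assumption: any capitalisation of "true" is accepted.
        return "true".equalsIgnoreCase(env.get("HADOOP_JAAS_DEBUG"));
    }

    /** Builds a ticket-cache login entry, adding debug=true on request. */
    static AppConfigurationEntry ticketCacheEntry(boolean debug) {
        Map<String, String> options = new HashMap<>();
        options.put("useTicketCache", "true");
        options.put("doNotPrompt", "true");
        if (debug) {
            // This option is what makes the login module print its settings.
            options.put("debug", "true");
        }
        return new AppConfigurationEntry(
            KRB5_LOGIN_MODULE, LoginModuleControlFlag.OPTIONAL, options);
    }

    public static void main(String[] args) {
        boolean debug = jaasDebugRequested(System.getenv());
        AppConfigurationEntry entry = ticketCacheEntry(debug);
        System.out.println(entry.getLoginModuleName() + " " + entry.getOptions());
    }
}
```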
OS-level Kerberos Debugging
Starting with MIT Kerberos v1.9, the Kerberos libraries introduced a debug option which is a boon to anyone wrestling with a nasty Kerberos issue. It is also a good way to understand how the Kerberos libraries work under the hood. You can set an environment variable called
KRB5_TRACE to a filename or to
/dev/stdout and Kerberos programs (such as kinit, klist and kvno) as well as the Kerberos libraries (libkrb5*) will start printing more interesting details.
This is a very powerful feature and can be used to debug any program which uses the Kerberos libraries (e.g. curl). It can also be used in conjunction with other debug options like
After setting this up in the terminal, the kinit command will produce something similar to this:
# kinit admin/admin
Password for admin/[email protected]:
# cat /tmp/kinit.log
1488484765.450285: Getting initial credentials for admin/[email protected]
1488484765.450556: Sending request (200 bytes) to MYKDC.COM
1488484765.450613: Resolving hostname sandbox.hortonworks.com
1488484765.450954: Initiating TCP connection to stream 172.17.0.2:88
1488484765.451060: Sending TCP request to stream 172.17.0.2:88
1488484765.461681: Received answer from stream 172.17.0.2:88
1488484765.461724: Response was not from master KDC
1488484765.461752: Processing preauth types: 19
1488484765.461764: Selected etype info: etype aes256-cts, salt "(null)", params ""
1488484765.461767: Produced preauth for next request: (empty)
1488484765.461771: Salt derived from principal: MYKDC.COMadminadmin
1488484765.461773: Getting AS key, salt "MYKDC.COMadminadmin", params ""
1488484770.985461: AS key obtained from gak_fct: aes256-cts/93FB
1488484770.985518: Decrypted AS reply; session key is: aes256-cts/2C56
1488484770.985531: FAST negotiation: available
1488484770.985555: Initializing FILE:/tmp/krb5cc_0 with default princ admin/[email protected]
1488484770.985682: Removing admin/[email protected] -> krbtgt/[email protected] from FILE:/tmp/krb5cc_0
1488484770.985688: Storing admin/[email protected] -> krbtgt/[email protected] in FILE:/tmp/krb5cc_0
1488484770.985742: Storing config in FILE:/tmp/krb5cc_0 for krbtgt/[email protected]: fast_avail: yes
1488484770.985758: Removing admin/[email protected] -> krb5_ccache_conf_data/fast_avail/krbtgt\/MYKDC.COM\@[email protected]: from FILE:/tmp/krb5cc_0
1488484770.985763: Storing admin/[email protected] -> krb5_ccache_conf_data/fast_avail/krbtgt\/MYKDC.COM\@[email protected]: in FILE:/tmp/krb5cc_0
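KRB5_TRACE is read by the native MIT libraries, not (as far as I know) by the JVM's own Kerberos code, so from Java the useful trick is to set it on a child process. A hedged sketch using the standard `ProcessBuilder` API; the class and method names are hypothetical, and `klist` is only assumed to be on the PATH.

```java
import java.io.IOException;

final class Krb5TraceLauncher {
    /** Prepares a command whose environment has KRB5_TRACE pointing at traceTarget. */
    static ProcessBuilder withKrb5Trace(String traceTarget, String... command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.environment().put("KRB5_TRACE", traceTarget);
        // Share this process's stdin/stdout/stderr so prompts and traces are visible.
        builder.inheritIO();
        return builder;
    }

    public static void main(String[] args) throws InterruptedException {
        ProcessBuilder klist = withKrb5Trace("/dev/stdout", "klist");
        try {
            int exitCode = klist.start().waitFor();
            System.out.println("klist exited with " + exitCode);
        } catch (IOException e) {
            // Typically means no MIT Kerberos client tools are installed.
            System.out.println("Could not launch klist: " + e.getMessage());
        }
    }
}
```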
The environment variable
As the docs say:
If the KRB5CCNAME environment variable is set, its value is used to name the default ticket cache.
IP addresses vs. Hostnames
Kerberos principals are traditionally defined with hostnames of the form
[email protected]/EXAMPLE.COM, not
The issue of whether Hadoop should support IP addresses has been raised in HADOOP-9019 and HADOOP-7510. The current consensus is no: you need DNS set up, or at least a consistent and valid /etc/hosts file on every node in the cluster.
- Windows does not reverse-DNS 127.0.0.1 to localhost or the local machine name; this can cause problems with MiniKDC tests in Windows, where adding a
user/[email protected] principal will be needed.
- Windows hostnames are often upper case.
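A sketch of building a hostname-based principal from the machine's canonical name. The lower-casing is an assumption on my part: it matches my understanding of how Hadoop expands hostnames in principals, and it sidesteps the upper-case Windows names noted above. The class name, method name and realm are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

final class PrincipalNames {
    /** Builds service/hostname@REALM with the hostname forced to lower case. */
    static String servicePrincipal(String service, String hostname, String realm) {
        return service + "/" + hostname.toLowerCase(Locale.ROOT) + "@" + realm;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Needs working DNS or a valid /etc/hosts entry for this machine.
        String fqdn = InetAddress.getLocalHost().getCanonicalHostName();
        System.out.println(servicePrincipal("hdfs", fqdn, "EXAMPLE.COM"));
    }
}
```

If the printed name is an IP address or a short hostname, that is a hint the DNS or /etc/hosts setup needs attention before Kerberos will behave.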
Kerberos's defences against replay attacks
From the javadocs of
/** * If multiple clients with the same principal try to connect to the same * server at the same time, the server assumes a replay attack is in * progress. This is a feature of kerberos. In order to work around this, * what is done is that the client backs off randomly and tries to initiate * the connection again. */
That's a good defence on the surface, "multiple connections from same principal == attack", which
doesn't scale to Hadoop clusters. Hence the sleep. It is also why large Hadoop clusters define
a different principal for every service/host pair in the keytab, giving the principal
for the HDFS blockserver on host1 an identity such as
hdfs/host1, for host 2
When a cluster is completely restarted, instead of the same principal trying to authenticate from
1000+ hosts, only the HDFS services on a single node try to authenticate as the same principal.
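The random back-off described in the javadoc quoted above can be sketched in a few lines. This is an illustration of the idea, not Hadoop's implementation: the class name, method name, retry limits and use of `Supplier` are all my own choices.

```java
import java.util.Random;
import java.util.function.Supplier;

final class RandomBackoff {
    /**
     * Calls the action until it succeeds or maxAttempts is reached,
     * sleeping a random 1..maxSleepMillis milliseconds between attempts.
     */
    static <T> T callWithRandomBackoff(Supplier<T> action, int maxAttempts,
                                       int maxSleepMillis, Random random) {
        if (maxAttempts < 1 || maxSleepMillis < 1) {
            throw new IllegalArgumentException("attempts and sleep must be positive");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure; retry if attempts remain
            }
            if (attempt < maxAttempts) {
                try {
                    // Random delay so that clients sharing a principal do not
                    // all retry at the same instant.
                    Thread.sleep(1 + random.nextInt(maxSleepMillis));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while backing off", ie);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = callWithRandomBackoff(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("server suspects a replay");
            }
            return "connected";
        }, 5, 50, new Random());
        System.out.println(result + " after " + calls[0] + " attempts"); // connected after 3 attempts
    }
}
```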
Asymmetric Kerberos Realm Trust
It is possible to configure Kerberos KDCs such that one realm, e.g
can trust principals from a remote realm, but for that
remote realm not to trust the principals from that
What does that permit? It means that a Hadoop-cluster-specific KDC can be created and configured
to trust principals from the enterprise-wide (Active-Directory Managed) KDC infrastructure.
The hadoop cluster KDC will contain the principals for the various services, with these exported
As a result, even if the keytabs are compromised, they do not grant any access to any enterprise-wide Kerberos-authenticated services.