Low-Level Secrets

Among the agonies of these after days is that chief of torments — inarticulateness. What I learned and saw in those hours of impious exploration can never be told — for want of symbols or suggestions in any language.

The Shunned House, H. P. Lovecraft, 1924.

krb5.conf and system property java.security.krb5.conf

There are two things you can do when configuring how the JVM binds to the krb5.conf Kerberos configuration file.

1. Change the realm with the system property java.security.krb5.realm

This system property sets the realm for Kerberos, allowing you to use a different realm from the default in the krb5.conf file. Note that whenever java.security.krb5.realm is set, the JDK requires the companion property java.security.krb5.kdc to be set as well: the two must be set together.

Examples

-Djava.security.krb5.realm=PRODUCTION

System.setProperty("java.security.krb5.realm", "DEVELOPMENT");

The JVM property MUST be set before UGI (UserGroupInformation) is initialized.
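A minimal sketch of that ordering, assuming a keytab login; the realm name, KDC host, principal and keytab path are all illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class RealmOverrideSketch {
  public static void main(String[] args) throws Exception {
    // Both properties must be set together, before UGI is initialized.
    System.setProperty("java.security.krb5.realm", "DEVELOPMENT");
    System.setProperty("java.security.krb5.kdc", "kdc.example.org");

    // Only now is it safe to trigger UGI initialization.
    Configuration conf = new Configuration();
    conf.set("hadoop.security.authentication", "kerberos");
    UserGroupInformation.setConfiguration(conf);
    UserGroupInformation.loginUserFromKeytab(
        "hdfs@DEVELOPMENT", "/etc/security/keytabs/hdfs.keytab");
  }
}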

2. Switch to an alternate krb5.conf file.

The JVM's Kerberos operations are configured via the krb5.conf file specified in the system property java.security.krb5.conf, which can be set on the JVM command line, or inside the JVM:

System.setProperty("java.security.krb5.conf", krbfilepath);

The JVM property MUST be set before UGI is initialized.
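A similar sketch for this property, again with an illustrative path; what matters is that the property is set before anything triggers UGI initialization:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class AlternateKrb5ConfSketch {
  public static void main(String[] args) throws Exception {
    // Illustrative path; set this before any Kerberos operation runs.
    System.setProperty("java.security.krb5.conf", "/etc/alternate/krb5.conf");

    Configuration conf = new Configuration();
    conf.set("hadoop.security.authentication", "kerberos");
    UserGroupInformation.setConfiguration(conf);
    // Forces UGI initialization; a login from the ticket cache is attempted.
    System.out.println("Current user: " + UserGroupInformation.getLoginUser());
  }
}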

Notes

  • use double backslash to escape paths on Windows platforms, e.g. C:\\keys\\key1, or \\\\server4\\shared\\tokens
  • Different JVMs (e.g. the IBM JVM) expect different fields in their krb5.conf file. How can you tell? Kerberos will fail with an error message.

JVM Kerberos Library logging

You can turn on low-level Kerberos logging with:

-Dsun.security.krb5.debug=true

This doesn't come out via Log4J or java.util logging; it just comes out on the console, which is somewhat inconvenient. Bear in mind, though, that this is logging from a very low-level part of the system, and it does at least log. If you find yourself down at this level, you are in trouble. Bear that in mind.

JVM SPNEGO Logging

If you want to debug what is happening in SPNEGO, another system property lets you enable this:

-Dsun.security.spnego.debug=true

You can ask for both of these in the HADOOP_OPTS environment variable; note the quotes needed to keep both options in the one variable:

export HADOOP_OPTS="-Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true"
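As these are ordinary system properties, they can also be set programmatically, provided it happens before any Kerberos activity; a minimal sketch, e.g. for use in test setup:

// Enable the JDK's low-level Kerberos and SPNEGO debug output from code.
// This must run before any Kerberos operation takes place.
System.setProperty("sun.security.krb5.debug", "true");
System.setProperty("sun.security.spnego.debug", "true");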

Hadoop-side JAAS debugging

Set the environment variable HADOOP_JAAS_DEBUG to true and UGI will set the "debug" flag on any JAAS configuration it creates.

You can do this on the client before issuing a hadoop, hdfs or yarn command, or set it in the environment script of a YARN service to turn it on there.

export HADOOP_JAAS_DEBUG=true

On the next Hadoop command, you'll see a trace like

    [UnixLoginModule]: succeeded importing info: 
      uid = 503
      gid = 20
      supp gid = 20
      supp gid = 501
      supp gid = 12
      supp gid = 61
      supp gid = 79
      supp gid = 80
      supp gid = 81
      supp gid = 98
      supp gid = 399
      supp gid = 33
      supp gid = 100
      supp gid = 204
      supp gid = 395
      supp gid = 398
Debug is  true storeKey false useTicketCache true useKeyTab false doNotPrompt true ticketCache is null isInitiator true KeyTab is null refreshKrb5Config is false principal is null tryFirstPass is false useFirstPass is false storePass is false clearPass is false
Acquire TGT from Cache
Principal is stevel@COTHAM
    [UnixLoginModule]: added UnixPrincipal,
        UnixNumericUserPrincipal,
        UnixNumericGroupPrincipal(s),
       to Subject
Commit Succeeded 

    [UnixLoginModule]: logged out Subject
    [Krb5LoginModule]: Entering logout
    [Krb5LoginModule]: logged out Subject
    [UnixLoginModule]: succeeded importing info: 
      uid = 503
      gid = 20
      supp gid = 20
      supp gid = 501
      supp gid = 12
      supp gid = 61
      supp gid = 79
      supp gid = 80
      supp gid = 81
      supp gid = 98
      supp gid = 399
      supp gid = 33
      supp gid = 100
      supp gid = 204
      supp gid = 395
      supp gid = 398
Debug is  true storeKey false useTicketCache true useKeyTab false doNotPrompt true ticketCache is null isInitiator true KeyTab is null refreshKrb5Config is false principal is null tryFirstPass is false useFirstPass is false storePass is false clearPass is false
Acquire TGT from Cache
Principal is stevel@COTHAM
    [UnixLoginModule]: added UnixPrincipal,
        UnixNumericUserPrincipal,
        UnixNumericGroupPrincipal(s),
       to Subject
Commit Succeeded 

OS-level Kerberos Debugging

Starting with MIT Kerberos 1.9, the Kerberos libraries introduced a debug option which is a boon to anyone breaking their head over a nasty Kerberos issue. It is also a good way to understand how the Kerberos libraries work under the hood. Set the environment variable KRB5_TRACE to a filename or to /dev/stdout, and Kerberos programs (kinit, klist, kvno, etc.) as well as the Kerberos libraries (libkrb5*) will start printing more interesting details.

This is a very powerful feature and can be used to debug any program which uses the Kerberos libraries (e.g. curl). It can also be used in conjunction with the other debug options, such as HADOOP_JAAS_DEBUG and sun.security.krb5.debug.

export KRB5_TRACE=/tmp/kinit.log

After setting this up in the terminal, the kinit command will produce something similar to this:

# kinit admin/admin
Password for admin/[email protected]:

# cat /tmp/kinit.log
[5709] 1488484765.450285: Getting initial credentials for admin/[email protected]
[5709] 1488484765.450556: Sending request (200 bytes) to MYKDC.COM
[5709] 1488484765.450613: Resolving hostname sandbox.hortonworks.com
[5709] 1488484765.450954: Initiating TCP connection to stream 172.17.0.2:88
[5709] 1488484765.451060: Sending TCP request to stream 172.17.0.2:88
[5709] 1488484765.461681: Received answer from stream 172.17.0.2:88
[5709] 1488484765.461724: Response was not from master KDC
[5709] 1488484765.461752: Processing preauth types: 19
[5709] 1488484765.461764: Selected etype info: etype aes256-cts, salt "(null)", params ""
[5709] 1488484765.461767: Produced preauth for next request: (empty)
[5709] 1488484765.461771: Salt derived from principal: MYKDC.COMadminadmin
[5709] 1488484765.461773: Getting AS key, salt "MYKDC.COMadminadmin", params ""
[5709] 1488484770.985461: AS key obtained from gak_fct: aes256-cts/93FB
[5709] 1488484770.985518: Decrypted AS reply; session key is: aes256-cts/2C56
[5709] 1488484770.985531: FAST negotiation: available
[5709] 1488484770.985555: Initializing FILE:/tmp/krb5cc_0 with default princ admin/[email protected]
[5709] 1488484770.985682: Removing admin/[email protected] -> krbtgt/[email protected] from FILE:/tmp/krb5cc_0
[5709] 1488484770.985688: Storing admin/[email protected] -> krbtgt/[email protected] in FILE:/tmp/krb5cc_0
[5709] 1488484770.985742: Storing config in FILE:/tmp/krb5cc_0 for krbtgt/[email protected]: fast_avail: yes
[5709] 1488484770.985758: Removing admin/[email protected] -> krb5_ccache_conf_data/fast_avail/krbtgt\/MYKDC.COM\@MYKDC.COM@X-CACHECONF: from FILE:/tmp/krb5cc_0
[5709] 1488484770.985763: Storing admin/[email protected] -> krb5_ccache_conf_data/fast_avail/krbtgt\/MYKDC.COM\@MYKDC.COM@X-CACHECONF: in FILE:/tmp/krb5cc_0

KRB5CCNAME

The environment variable KRB5CCNAME selects the Kerberos ticket cache. As the docs say:

If the KRB5CCNAME environment variable is set, its value is used to name the default ticket cache.

IP addresses vs. Hostnames

Kerberos principals are traditionally defined with hostnames, of the form hbase/worker3@EXAMPLE.COM, not IP addresses such as hbase/10.10.15.1@EXAMPLE.COM.

The issue of whether Hadoop should support IP addresses has been raised in HADOOP-9019 and HADOOP-7510. The current consensus is no: you need DNS set up, or at least a consistent and valid /etc/hosts file on every node in the cluster.

Windows

  1. Windows does not reverse-DNS 127.0.0.1 to localhost or the local machine name; this can cause problems with MiniKDC tests on Windows, where adding a user/127.0.0.1@REALM principal will be needed.
  2. Windows hostnames are often upper case.

Kerberos's defences against replay attacks

From the javadocs of org.apache.hadoop.ipc.Client.handleSaslConnectionFailure():

/**
 * If multiple clients with the same principal try to connect to the same
 * server at the same time, the server assumes a replay attack is in
 * progress. This is a feature of kerberos. In order to work around this,
 * what is done is that the client backs off randomly and tries to initiate
 * the connection again.
 */

That's a good defence on the surface: "multiple connections from the same principal == attack". But it is one which doesn't scale to Hadoop clusters, hence the sleep. It is also why large Hadoop clusters define a different principal for every service/host pair in the keytab, giving the principal for the HDFS blockserver on host1 an identity such as hdfs/host1, on host2 hdfs/host2, and so on. When a cluster is completely restarted, instead of the same principal trying to authenticate from 1000+ hosts, only the HDFS services on a single node try to authenticate as the same principal.
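As an illustration of the pattern, here is a minimal sketch of randomized backoff around a connection attempt. The class, method names and retry bounds are invented for this example; the real logic lives in org.apache.hadoop.ipc.Client.

import java.io.IOException;
import java.util.Random;

public final class SaslBackoffSketch {
  private static final Random RANDOM = new Random();

  // Retry the connection, backing off for a random interval between
  // attempts so that many clients sharing a principal do not all
  // re-present their authentication at the same moment.
  static void connectWithBackoff(int maxRetries)
      throws IOException, InterruptedException {
    for (int attempt = 0; ; attempt++) {
      try {
        attemptSaslConnection(); // hypothetical connection attempt
        return;
      } catch (IOException e) {
        if (attempt >= maxRetries) {
          throw e;
        }
        Thread.sleep(100 + RANDOM.nextInt(900)); // random backoff, ~0.1-1s
      }
    }
  }

  // Placeholder for the real SASL/Kerberos connection setup.
  private static void attemptSaslConnection() throws IOException {
    throw new IOException("simulated GSS initiate failure");
  }
}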

Asymmetric Kerberos Realm Trust

It is possible to configure Kerberos KDCs such that one realm, e.g. "hadoop-kdc", can trust principals from a remote realm, without that remote realm trusting the principals from the "hadoop-kdc" realm. What does that permit? It means that a Hadoop-cluster-specific KDC can be created and configured to trust principals from the enterprise-wide (Active Directory-managed) KDC infrastructure. The Hadoop cluster's KDC will contain the principals for the various services, with these exported into keytabs.

As a result, even if those keytabs are compromised, they do not grant any access to enterprise-wide Kerberos-authenticated services.
