Cloudera Enterprise 6.3.x

Kafka Security

Client-Broker Security with TLS

Kafka allows clients to connect over TLS. By default, TLS is disabled, but can be turned on as needed.

Step 1: Generating Keys and Certificates for Kafka Brokers

Generate the key and the certificate for each machine in the cluster using the Java keytool utility. See Generate TLS Certificates.

Make sure that the common name (CN) matches the fully qualified domain name (FQDN) of your server. The client compares the CN with the DNS domain name to ensure that it is connecting to the correct server.
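One way to keep the CN aligned with the host name is to derive the distinguished name from the FQDN when you generate the key. The following is a minimal sketch under stated assumptions, not a definitive procedure: the OU and O values are hypothetical placeholders, -keyalg RSA is an added assumption, and the function only prints the keytool command so that you can review it before running it.

```shell
#!/bin/bash
# Build a distinguished name whose CN is the broker FQDN.
# OU=kafka and O=example are hypothetical placeholders.
broker_dname() {
    local fqdn="$1"
    printf 'CN=%s,OU=kafka,O=example' "$fqdn"
}

# Print (do not run) a keytool command that sets the CN through -dname.
genkey_command() {
    local fqdn="$1"
    printf 'keytool -keystore server.keystore.jks -alias localhost -validity 365 -genkey -keyalg RSA -dname "%s"\n' \
        "$(broker_dname "$fqdn")"
}
```

For example, genkey_command "$(hostname -f)" prints the command for the local machine, assuming hostname -f returns the FQDN on your system.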

Step 2: Creating Your Own Certificate Authority

You have generated a public-private key pair for each machine and a certificate to identify the machine. However, the certificate is unsigned, so an attacker can create a certificate and pretend to be any machine. Sign certificates for each machine in the cluster to prevent unauthorized access.

A Certificate Authority (CA) is responsible for signing certificates. A CA is similar to a government that issues passports. A government stamps (signs) each passport so that the passport becomes difficult to forge. Similarly, the CA signs the certificates, and the cryptography guarantees that a signed certificate is computationally difficult to forge. If the CA is a genuine and trusted authority, the clients have high assurance that they are connecting to the authentic machines.

openssl req -new -x509 -keyout ca-key -out ca-cert -days 365

The generated CA is a public-private key pair and certificate used to sign other certificates.

Add the generated CA to the client truststores so that clients can trust this CA:

keytool -keystore {client.truststore.jks} -alias CARoot -import -file {ca-cert}
  Note: If you configure Kafka brokers to require client authentication by setting ssl.client.auth to requested or required in the Kafka broker configuration, you must provide a truststore for the Kafka brokers as well. The truststore must contain all the CA certificates that signed the client keys.

The keystore created in step 1 stores each machine’s own identity. In contrast, the truststore of a client stores all the certificates that the client should trust. Importing a certificate into a truststore means trusting all certificates that are signed by that certificate. This property is called the chain of trust. It is particularly useful when deploying SSL on a large Kafka cluster. You can sign all certificates in the cluster with a single CA, and have all machines share the same truststore that trusts the CA. That way, all machines can authenticate all other machines.

Step 3: Signing the Certificate

Now you can sign all certificates generated by step 1 with the CA generated in step 2.

  1. Create a certificate request from the keystore:
    keytool -keystore server.keystore.jks -alias localhost -certreq -file cert-file

    where:

    • keystore: the location of the keystore
    • cert-file: the exported, unsigned certificate of the server
  2. Sign the resulting certificate with the CA (in the real world, this can be done using a real CA):
    openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-file -out cert-signed -days validity -CAcreateserial -passin pass:ca-password

    where:

    • ca-cert: the certificate of the CA
    • ca-key: the private key of the CA
    • cert-signed: the signed certificate of the server
    • validity: the validity period of the signed certificate, in days
    • ca-password: the passphrase of the CA
  3. Import both the certificate of the CA and the signed certificate into the keystore:
    keytool -keystore server.keystore.jks -alias CARoot -import -file ca-cert
    keytool -keystore server.keystore.jks -alias localhost -import -file cert-signed

The following Bash script demonstrates the steps described above. One of the commands assumes a password of SamplePassword123, so either use that password or edit the command before running it.

#!/bin/bash
# Step 1: generate the broker key pair and certificate in the keystore
keytool -keystore server.keystore.jks -alias localhost -validity 365 -genkey
# Step 2: create the CA, then import its certificate into both truststores
openssl req -new -x509 -keyout ca-key -out ca-cert -days 365
keytool -keystore server.truststore.jks -alias CARoot -import -file ca-cert
keytool -keystore client.truststore.jks -alias CARoot -import -file ca-cert
# Step 3: export a signing request, sign it with the CA, then import the
# CA certificate and the signed certificate into the keystore
keytool -keystore server.keystore.jks -alias localhost -certreq -file cert-file
openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-file -out cert-signed -days 365 -CAcreateserial -passin pass:SamplePassword123
keytool -keystore server.keystore.jks -alias CARoot -import -file ca-cert
keytool -keystore server.keystore.jks -alias localhost -import -file cert-signed
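After the script finishes, you may want to confirm that the signed certificate chains back to the CA before importing it anywhere else. The following is a hedged sketch: the function name is hypothetical, and it relies on the standard openssl verify subcommand with its -CAfile option.

```shell
#!/bin/bash
# Check that a signed certificate was issued by the given CA certificate.
verify_signed_cert() {
    local ca_cert="$1" signed_cert="$2" f
    # Fail early with a clear message if either file is missing.
    for f in "$ca_cert" "$signed_cert"; do
        if [ ! -r "$f" ]; then
            echo "cannot read $f" >&2
            return 1
        fi
    done
    # openssl verify prints "<file>: OK" when the certificate chains to the CA.
    openssl verify -CAfile "$ca_cert" "$signed_cert"
}
```

With the file names used in the script above, the call would be verify_signed_cert ca-cert cert-signed.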

Step 4: Configuring Kafka Brokers

Kafka Brokers support listening for connections on multiple ports. If SSL is enabled for inter-broker communication (see below for how to enable it), both PLAINTEXT and SSL ports are required.

To configure the listeners from Cloudera Manager, perform the following steps:

  1. In Cloudera Manager, go to Kafka > Instances.
  2. Go to Kafka Broker > Configurations.
  3. In the Kafka Broker Advanced Configuration Snippet (Safety Valve) for Kafka Properties, enter the following information:
    listeners=PLAINTEXT://kafka-broker-host-name:9092,SSL://kafka-broker-host-name:9093
    advertised.listeners=PLAINTEXT://kafka-broker-host-name:9092,SSL://kafka-broker-host-name:9093

    where kafka-broker-host-name is the FQDN of the broker that you selected from the Instances page in Cloudera Manager. In the above sample configurations we used PLAINTEXT and SSL protocols for the SSL enabled brokers.

    For information about other supported security protocols, see Using Kafka’s Inter-Broker Security.

  4. Repeat the previous step for each broker.

    The advertised.listeners configuration is needed to connect the brokers from external clients.

  5. Deploy the above client configurations and perform a rolling restart of the Kafka service from Cloudera Manager.
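Because the listener lines differ only in the host name, they can be generated per broker instead of being typed by hand. This is a small sketch, assuming the ports 9092 and 9093 from the sample configuration; the function names are hypothetical.

```shell
#!/bin/bash
# Print the two safety-valve lines for one broker FQDN.
# Ports 9092 and 9093 follow the sample configuration above.
listener_lines() {
    local host="$1"
    local endpoints="PLAINTEXT://${host}:9092,SSL://${host}:9093"
    printf 'listeners=%s\n' "$endpoints"
    printf 'advertised.listeners=%s\n' "$endpoints"
}

# Print a labelled snippet for every broker FQDN given as an argument.
for_each_broker() {
    local host
    for host in "$@"; do
        printf '# %s\n' "$host"
        listener_lines "$host"
    done
}
```

Each labelled snippet can then be pasted into the Advanced Configuration Snippet of the matching broker.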

Kafka CSD auto-generates listeners for Kafka brokers, depending on your SSL and Kerberos configuration. To enable SSL for Kafka installations, do the following:

  1. Turn on SSL for the Kafka service by turning on the ssl_enabled configuration for the Kafka CSD.
  2. Set security.inter.broker.protocol to SSL if Kerberos is disabled; otherwise, set it to SASL_SSL.

The following SSL configurations are required on each broker. Each of these values can be set in Cloudera Manager. Be sure to replace the example password with your own keystore and truststore passwords.

For instructions, see Changing the Configuration of a Service or Role Instance.

ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks
ssl.keystore.password=SamplePassword123
ssl.key.password=SamplePassword123
ssl.truststore.location=/var/private/ssl/server.truststore.jks
ssl.truststore.password=SamplePassword123
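A wrong path in either location property is a common reason for a broker failing to start with SSL. The sketch below checks that the keystore and truststore paths in a properties file point at readable files. It assumes the settings have been saved to a local properties file, and the function name is hypothetical.

```shell
#!/bin/bash
# Check that the keystore and truststore locations listed in a properties
# file point at readable files. Reports each bad path; returns 1 if any.
check_ssl_locations() {
    local props="$1" status=0 key value
    while IFS='=' read -r key value || [ -n "$key" ]; do
        case $key in
            ssl.keystore.location|ssl.truststore.location)
                if [ ! -r "$value" ]; then
                    echo "missing or unreadable: $value" >&2
                    status=1
                fi
                ;;
        esac
    done < "$props"
    return "$status"
}
```

Run it as the user that owns the Kafka broker process so that the readability check reflects what the broker can actually open.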

Other configuration settings may also be needed, depending on your requirements:

  • ssl.client.auth=none: The other options for client authentication are required and requested. With requested, clients without certificates can still connect. The use of requested is discouraged because it provides a false sense of security: misconfigured clients can still connect.
  • ssl.cipher.suites: A cipher suite is a named combination of authentication, encryption, MAC, and a key exchange algorithm used to negotiate the security settings for a network connection using TLS or SSL network protocol. This list is empty by default.
  • ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1: Provide a list of SSL protocols that your brokers accept from clients.
  • ssl.keystore.type=JKS
  • ssl.truststore.type=JKS

Communication between Kafka brokers defaults to PLAINTEXT. To enable secured communication, modify the broker properties file by adding security.inter.broker.protocol=SSL.

For a list of the supported communication protocols, see Using Kafka’s Inter-Broker Security.

  Note: Due to import regulations in some countries, the Oracle implementation of JCA limits the strength of cryptographic algorithms. If you need stronger algorithms, you must obtain the JCE Unlimited Strength Jurisdiction Policy Files and install them in the JDK/JRE as described in JCA Providers Documentation.

After SSL is configured on your broker, the logs should show an endpoint for SSL communication:

with addresses: PLAINTEXT -> EndPoint(192.168.1.1,9092,PLAINTEXT),SSL -> EndPoint(192.168.1.1,9093,SSL)

You can also check the SSL communication to the broker by running the following command:

openssl s_client -debug -connect localhost:9093 -tls1

This check can indicate that the server keystore and truststore are set up properly.

  Note: ssl.enabled.protocols should include TLSv1.

The output of this command should show the server certificate:

-----BEGIN CERTIFICATE-----
{variable sized random bytes}
-----END CERTIFICATE-----
subject=/C=US/ST=CA/L=Palo Alto/O=org/OU=org/CN=Franz Kafka
issuer=/C=US/ST=CA/L=Palo Alto/O=org/OU=org/CN=kafka/emailAddress=kafka@your-domain.com

If the certificate does not appear, or if there are any other error messages, your keystore is not set up properly.
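The check can be scripted by filtering the openssl s_client output instead of reading it by eye. The helper names below are hypothetical, and the sketch only inspects text on standard input, so it makes no claim about whether the certificate is valid.

```shell
#!/bin/bash
# Read `openssl s_client` output on stdin and succeed only if it contains
# a PEM-encoded certificate.
has_server_cert() {
    grep -q -e '-----BEGIN CERTIFICATE-----'
}

# Print only the subject and issuer lines from the same output.
subject_and_issuer() {
    grep -E '^(subject|issuer)='
}

# Example (requires a running broker with an SSL listener):
#   openssl s_client -connect localhost:9093 -tls1 </dev/null 2>/dev/null | has_server_cert
```

Redirecting standard input from /dev/null makes s_client exit after the handshake rather than wait for input.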

Step 5: Configuring Kafka Clients

SSL is supported only for the new Kafka producer and consumer APIs. The configurations for SSL are the same for both the producer and consumer.

If client authentication is not required in the broker, the following example shows a minimal configuration:

security.protocol=SSL
ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks
ssl.truststore.password=SamplePassword123

If client authentication is required, create a keystore for the client as in step 1, sign its certificate with the CA as in step 3, and configure the following properties:

ssl.keystore.location=/var/private/ssl/kafka.client.keystore.jks
ssl.keystore.password=SamplePassword123
ssl.key.password=SamplePassword123
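The two client variants differ only in the keystore entries, so a single helper can write either one. This is a sketch with a hypothetical function name. It reuses one password for every entry, as the samples above do, which you would likely change in practice.

```shell
#!/bin/bash
# Write a client SSL properties file.
# Arguments: output file, truststore path, password, optional keystore path.
write_client_ssl_props() {
    local out="$1" truststore="$2" password="$3" keystore="${4:-}"
    {
        printf 'security.protocol=SSL\n'
        printf 'ssl.truststore.location=%s\n' "$truststore"
        printf 'ssl.truststore.password=%s\n' "$password"
        # The keystore entries are needed only when the broker requires
        # client authentication.
        if [ -n "$keystore" ]; then
            printf 'ssl.keystore.location=%s\n' "$keystore"
            printf 'ssl.keystore.password=%s\n' "$password"
            printf 'ssl.key.password=%s\n' "$password"
        fi
    } > "$out"
    # The file contains passwords, so restrict it to the owner.
    chmod 600 "$out"
}
```

Omitting the fourth argument produces the minimal configuration; passing a keystore path adds the client authentication entries.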

Other configuration settings might also be needed, depending on your requirements and the broker configuration:

  • ssl.provider (Optional). The name of the security provider used for SSL connections. Default is the default security provider of the JVM.

  • ssl.cipher.suites (Optional). A cipher suite is a named combination of authentication, encryption, MAC, and a key exchange algorithm used to negotiate the security settings for a network connection using TLS or SSL network protocol.

  • ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1. This property should list at least one of the protocols configured on the broker side.

  • ssl.truststore.type=JKS

  • ssl.keystore.type=JKS

Using Kafka’s Inter-Broker Security

Kafka can expose multiple communication endpoints, each supporting a different protocol. Supporting multiple communication endpoints enables you to use different communication protocols for client-to-broker communications and broker-to-broker communications. Set the Kafka inter-broker communication protocol using the security.inter.broker.protocol property. Use this property primarily for the following scenarios:

  • Enabling SSL encryption for client-broker communication but keeping broker-broker communication as PLAINTEXT. Because SSL has performance overhead, you might want to keep inter-broker communication as PLAINTEXT if your Kafka brokers are behind a firewall and not susceptible to network snooping.
  • Migrating from a non-secure Kafka configuration to a secure Kafka configuration without requiring downtime. Use a rolling restart and keep security.inter.broker.protocol set to a protocol that is supported by all brokers until all brokers are updated to support the new protocol.

    For example, if you have a Kafka cluster that needs to be configured to enable Kerberos without downtime, follow these steps:

    1. Set security.inter.broker.protocol to PLAINTEXT.
    2. Update the Kafka service configuration to enable Kerberos.
    3. Perform a rolling restart.
    4. Set security.inter.broker.protocol to SASL_PLAINTEXT.

Kafka 2.0 and higher supports the combinations of protocols listed here.

  Protocol         SSL   Kerberos
  PLAINTEXT        No    No
  SSL              Yes   No
  SASL_PLAINTEXT   No    Yes
  SASL_SSL         Yes   Yes

These protocols can be defined for broker-to-client interaction and for broker-to-broker interaction. The property security.inter.broker.protocol allows the broker-to-broker communication protocol to be different than the broker-to-client protocol, allowing rolling upgrades from non-secure to secure clusters. In most cases, set security.inter.broker.protocol to the protocol you are using for broker-to-client communication. Set security.inter.broker.protocol to a protocol different than the broker-to-client protocol only when you are performing a rolling upgrade from a non-secure to a secure Kafka cluster.
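The protocol table reduces to a two-input lookup, which can be handy in deployment scripts. The function below is an illustrative sketch with a hypothetical name. It only restates the table and does not query the cluster.

```shell
#!/bin/bash
# Map the two yes/no choices from the table to a protocol name.
# Usage: kafka_protocol SSL_ENABLED KERBEROS_ENABLED
kafka_protocol() {
    local ssl="$1" kerberos="$2"
    case "$ssl/$kerberos" in
        no/no)   echo PLAINTEXT ;;
        yes/no)  echo SSL ;;
        no/yes)  echo SASL_PLAINTEXT ;;
        yes/yes) echo SASL_SSL ;;
        *)       echo "usage: kafka_protocol yes|no yes|no" >&2; return 1 ;;
    esac
}
```

For example, kafka_protocol yes no prints the value to use for security.inter.broker.protocol on an SSL-only cluster.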

Enabling Kerberos Authentication

Apache Kafka supports Kerberos authentication, but it is supported only for the new Kafka Producer and Consumer APIs.

If you already have a Kerberos server, you can add Kafka to your current configuration. If you do not have a Kerberos server, install it before proceeding. See Enabling Kerberos Authentication for CDH.

If you have already configured the mapping from Kerberos principals to short names using the hadoop.security.auth_to_local HDFS configuration property, configure the same rules for Kafka by adding the sasl.kerberos.principal.to.local.rules property to the Kafka Broker Advanced Configuration Snippet using Cloudera Manager. Specify the rules as a comma-separated list.
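If the existing rules are kept one per line, they can be joined into the comma-separated form mechanically. The sketch below assumes that input layout, which may not match how your rules are stored. The function name is hypothetical, and a rule that itself contains a comma would need manual handling.

```shell
#!/bin/bash
# Read auth_to_local rules on stdin, one rule per line (an assumed input
# format), and print them as a single comma-separated property line.
rules_to_property() {
    local joined
    # Trim surrounding whitespace, drop blank lines, then join with commas.
    joined=$(sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        grep -v '^$' | paste -sd, -)
    printf 'sasl.kerberos.principal.to.local.rules=%s\n' "$joined"
}
```

The printed line can then be pasted into the Advanced Configuration Snippet.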

To enable Kerberos authentication for Kafka:

  1. In Cloudera Manager, navigate to Kafka > Configuration.
  2. Set SSL Client Authentication to none.
  3. Set Inter Broker Protocol to SASL_PLAINTEXT.
  4. Click Save Changes.
  5. Restart the Kafka service (Action > Restart).
  6. Make sure that listeners = SASL_PLAINTEXT is present in the Kafka broker logs, by default in /var/log/kafka/server.log.
  7. Create a jaas.conf file with either cached credentials or keytabs.

    To use cached Kerberos credentials, where you use kinit first, use this configuration.

    KafkaClient {
    com.sun.security.auth.module.Krb5LoginModule required
    useTicketCache=true;
    };
    

    If you use a keytab, use this configuration. To generate keytabs, see Step 6: Get or Create a Kerberos Principal for Each User Account.

    KafkaClient {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="/etc/security/keytabs/mykafkaclient.keytab"
    principal="mykafkaclient/clients.hostname.com@EXAMPLE.COM";
    };
    
  8. Create the client.properties file containing the following properties.
    security.protocol=SASL_PLAINTEXT
    sasl.kerberos.service.name=kafka
    
  9. Test with the Kafka console producer and consumer.

    To obtain a Kerberos ticket-granting ticket (TGT):

    kinit user
  10. Verify that your topic exists.

    This does not use security features, but it is a best practice.

    kafka-topics --list --zookeeper zkhost:2181
  11. Verify that the jaas.conf file is used by setting the environment.
    export KAFKA_OPTS="-Djava.security.auth.login.config=/home/user/jaas.conf"
  12. Run a Kafka console producer.
    kafka-console-producer --broker-list anybroker:9092 --topic test1 --producer.config client.properties
  13. Run a Kafka console consumer.
    kafka-console-consumer --topic test1 --from-beginning --bootstrap-server anybroker:9092 --consumer.config client.properties
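The file-creation steps above (7, 8, and 11) can be collected into one helper. This is a sketch using the ticket-cache variant of jaas.conf. The function name and the choice of target directory are assumptions, and you still need to run kinit before starting a client.

```shell
#!/bin/bash
# Create jaas.conf (ticket-cache variant) and client.properties in a
# directory, then print the export line that points KAFKA_OPTS at jaas.conf.
write_kerberos_client_files() {
    local dir="$1"
    cat > "$dir/jaas.conf" <<'EOF'
KafkaClient {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true;
};
EOF
    cat > "$dir/client.properties" <<'EOF'
security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka
EOF
    printf 'export KAFKA_OPTS="-Djava.security.auth.login.config=%s/jaas.conf"\n' "$dir"
}
```

The function prints the export line rather than setting the variable, so you can review it and then paste it into the shell that runs the console producer or consumer.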

Enabling Encryption at Rest

Data encryption is increasingly recognized as an optimal method for protecting data at rest. You can encrypt Kafka data using Cloudera Navigator Encrypt.

Perform the following steps to encrypt Kafka data that is not in active use.

  1. Stop the Kafka service.
  2. Archive the Kafka data to an alternate location, using TAR or another archive tool.
  3. Unmount the affected drives.
  4. Install and configure Navigator Encrypt.

    See Installing Cloudera Navigator Encrypt.

  5. Expand the TAR archive into the encrypted directories.
  6. Restart the Kafka service.
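Steps 2 and 5 can be sketched with tar as follows. The function names are hypothetical, and the data directory, archive location, and encrypted mount point are whatever applies to your cluster. Stopping the service, unmounting drives, and configuring Navigator Encrypt are not covered here.

```shell
#!/bin/bash
# Step 2: archive the contents of a Kafka data directory.
# -C makes the paths inside the archive relative to the data directory.
# Keep the archive outside the directory being archived.
archive_kafka_data() {
    local data_dir="$1" archive="$2"
    tar -cf "$archive" -C "$data_dir" .
}

# Step 5: expand the archive into the (encrypted) target directory.
# -p restores the recorded file permissions on extraction.
restore_kafka_data() {
    local archive="$1" target_dir="$2"
    mkdir -p "$target_dir"
    tar -xpf "$archive" -C "$target_dir"
}
```

Run both functions as a user that can read the original data and write to the target directory. Restoring file ownership generally requires root.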

Topic Authorization with Kerberos and Sentry

Apache Sentry includes a Kafka binding you can use to enable authorization in Kafka with Sentry. For more information, see Authorization With Apache Sentry.

Configuring Kafka to Use Sentry Authorization

The following steps describe how to configure Kafka to use Sentry authorization. These steps assume you have installed Kafka and Sentry on your cluster.

Sentry requires that your cluster include HDFS. After you install and start Sentry with the correct configuration, you can stop the HDFS service. For more information, see Installing and Upgrading the Sentry Service.

  Note: Cloudera's distribution of Kafka can make use of LDAP-based user groups when the LDAP directory is synchronized to Linux via tools such as SSSD. CDK does not support direct integration with LDAP, either through Kafka's own LDAP authentication or via Hadoop's group mapping (when hadoop.group.mapping is set to LdapGroupMapping). For more information, see Configuring LDAP Group Mappings.

To configure Sentry authorization for Kafka:

  1. In Cloudera Manager, go to Kafka > Configuration.
  2. Select Enable Kerberos Authentication.
  3. Select a Sentry service in the Kafka service configuration.
  4. Add superusers.

    Superusers can perform any action on any resource in the Kafka cluster. The kafka user is added as a superuser by default. Superuser requests are authorized without going through Sentry, which provides enhanced performance.

  5. Select Enable Sentry Privileges Caching to enhance performance.
  6. Restart the Sentry services.

Authorizable Resources

Authorizable resources are resources or entities in a Kafka cluster that require special permissions for a user to be able to perform actions on them. Kafka has four authorizable resources.

  • Cluster: controls who can perform cluster-level operations such as creating or deleting a topic. This resource can only have one value, kafka-cluster, as one Kafka cluster cannot have more than one cluster resource.
  • Topic: controls who can perform topic-level operations such as producing to and consuming from topics. Its value must exactly match the topic name in the Kafka cluster.

    With CDH 5.15.0 and CDK 3.1 and later, wildcards (*) can be used to refer to any topic in the privilege.

  • Consumergroup: controls who can perform consumergroup-level operations such as joining or describing a consumergroup. Its value must exactly match the group.id of a consumergroup.

    With CDH 5.14.1 and later, you can use a wildcard (*) to refer to any consumer groups in the privilege. This resource is useful when used with Spark Streaming, where a generated group.id may be needed.

  • Host: controls from where specific operations can be performed. Think of this as a way to achieve IP filtering in Kafka. You can set the value of this resource to the wildcard (*), which represents all hosts.
      Note: Specify only IP addresses in the host component of Kafka Sentry privileges; hostnames are not supported.

Authorized Actions

You can perform multiple actions on each resource. The following operations are supported by Kafka, though not all actions are valid on all resources.

  • ALL is a wildcard action, and represents all possible actions on a resource.
  • read
  • write
  • create
  • delete
  • alter
  • describe
  • clusteraction

Authorizing Privileges

Privileges define what actions are allowed on a resource. A privilege is represented as a string in Sentry. The following rules apply to a valid privilege.

  • Can have at most one Host resource. If you do not specify a Host resource in your privilege string, Host=* is assumed.
  • Must have exactly one non-Host resource.
  • Must have exactly one action specified at the end of the privilege string.

For example, the following are valid privilege strings:

Host=*->Topic=myTopic->action=ALL
Topic=test->action=ALL
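The three rules can be checked mechanically before a privilege string is passed to Sentry. The function below is a sketch, not Sentry's own parser: its name is hypothetical, and it assumes the key names are spelled exactly as in this document (Host, Cluster, Topic, Consumergroup, action).

```shell
#!/bin/bash
# Check a Kafka Sentry privilege string against the three rules above.
# Key names are matched exactly as written in this document (an assumption).
valid_privilege() {
    local rest="$1" part key last=""
    local hosts=0 resources=0 actions=0
    while [ -n "$rest" ]; do
        # Peel off the next "->"-separated component.
        case $rest in
            *'->'*) part=${rest%%'->'*}; rest=${rest#*'->'} ;;
            *)      part=$rest; rest="" ;;
        esac
        # Each component must look like key=value with a non-empty value.
        case $part in
            ?*=?*) key=${part%%=*} ;;
            *)     return 1 ;;
        esac
        case $key in
            Host)                        hosts=$((hosts + 1)) ;;
            Cluster|Topic|Consumergroup) resources=$((resources + 1)) ;;
            action)                      actions=$((actions + 1)) ;;
            *)                           return 1 ;;
        esac
        last=$key
    done
    # At most one Host, exactly one other resource, one action at the end.
    [ "$hosts" -le 1 ] && [ "$resources" -eq 1 ] &&
        [ "$actions" -eq 1 ] && [ "$last" = action ]
}
```

Quote the privilege string when calling the function, for example valid_privilege 'Host=*->Topic=myTopic->action=ALL', so that the shell does not expand the wildcard.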

Granting Privileges to a Role

The following examples grant privileges to the role test, so that users in testGroup can create a topic named testTopic and produce to it.

The user executing these commands must be added to the Sentry parameter sentry.service.allow.connect and also be a member of a group defined in sentry.service.admin.group.

Before you can assign the test role, you must first create it. To create the test role:

kafka-sentry -cr -r test

To confirm that the role was created, list the roles:

kafka-sentry -lr

If Sentry privileges caching is enabled, as recommended, the new privileges you assign take some time to appear in the system. The delay is the time-to-live interval of the Sentry privileges cache, which is set using sentry.kafka.caching.ttl.ms. By default, this interval is 30 seconds. For test clusters, it is beneficial to have changes appear as quickly as possible. Cloudera therefore recommends that you either use a lower time interval or disable caching with sentry.kafka.caching.enable.

  1. Allow users in testGroup to write to testTopic from localhost, which allows users to produce to testTopic. Users need both write and describe permissions.
    kafka-sentry -gpr -r test -p "Host=127.0.0.1->Topic=testTopic->action=write"
    kafka-sentry -gpr -r test -p "Host=127.0.0.1->Topic=testTopic->action=describe"
  2. Assign the test role to the group testGroup:
    kafka-sentry -arg -r test -g testGroup
  3. Verify that the test role is part of the group testGroup:
    kafka-sentry -lr -g testGroup
  4. Create testTopic.
    $ kafka-topics --create --zookeeper localhost:2181 \
      --replication-factor 1 \
      --partitions 1 --topic testTopic
    $ kafka-topics --list --zookeeper localhost:2181
    testTopic

Now you can produce to and consume from the Kafka cluster.

  1. Produce to testTopic.

    Note that you have to pass a configuration file, producer.properties, with information on JAAS configuration and other Kerberos authentication related information. See SASL Configuration for Kafka Clients.

    $ kafka-console-producer --broker-list localhost:9092 \
      --topic testTopic --producer.config producer.properties
      This is a message
      This is another message
    
  2. Grant the create privilege to the test role.
    $ kafka-sentry -gpr -r test -p \
      "Host=127.0.0.1->Cluster=kafka-cluster->action=create"
  3. Allow users in testGroup to describe testTopic from localhost, which the user creates and uses.
    $ kafka-sentry -gpr -r test -p \
      "Host=127.0.0.1->Topic=testTopic->action=describe"
  4. Grant the describe privilege on the consumer group testconsumergroup to the test role.
    $ kafka-sentry -gpr -r test -p \
      "Host=127.0.0.1->Consumergroup=testconsumergroup->action=describe"
  5. Allow users in testGroup to read from a consumer group, testconsumergroup, that they will start and join.
    $ kafka-sentry -gpr -r test -p \
      "Host=127.0.0.1->Consumergroup=testconsumergroup->action=read"
  6. Allow users in testGroup to read from testTopic from localhost, which allows them to consume from testTopic.
    $ kafka-sentry -gpr -r test -p \
      "Host=127.0.0.1->Topic=testTopic->action=read"
  7. Consume from testTopic.

    Note that you have to pass a configuration file, consumer.properties, with information on JAAS configuration and other Kerberos authentication related information. The configuration file must also specify group.id as testconsumergroup.

    kafka-console-consumer --topic testTopic \
      --from-beginning --bootstrap-server anybroker-host:9092 \
      --consumer.config consumer.properties
      This is a message
      This is another message
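The individual grants in this walkthrough follow one pattern, so they can be generated from a short list. The helper below is a sketch with hypothetical function names. It prints the kafka-sentry commands instead of running them, so the output can be reviewed first.

```shell
#!/bin/bash
# Print (not run) one kafka-sentry grant per action for a resource.
# Usage: grant_commands ROLE HOST RESOURCE ACTION...
grant_commands() {
    local role="$1" host="$2" resource="$3" action
    shift 3
    for action in "$@"; do
        printf 'kafka-sentry -gpr -r %s -p "Host=%s->%s->action=%s"\n' \
            "$role" "$host" "$resource" "$action"
    done
}

# The privileges granted in the walkthrough above.
walkthrough_grants() {
    grant_commands test 127.0.0.1 Cluster=kafka-cluster create
    grant_commands test 127.0.0.1 Topic=testTopic write describe read
    grant_commands test 127.0.0.1 Consumergroup=testconsumergroup describe read
}
```

After reviewing the output, it could be piped to a shell on a host where kafka-sentry is available and the calling user meets the Sentry admin requirements described earlier.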

Troubleshooting Kafka with Sentry

If Kafka requests are failing due to authorization, the following steps can provide insight into the error:

  • Make sure you have run kinit as a user who has privileges to perform an operation.
  • Identify which broker is hosting the leader of the partition you are trying to produce to or consume from, because this leader authorizes your request against Sentry. One easy way of debugging is to run just one Kafka broker.
  • Change the log level of the Kafka broker by adding the following entry to the Kafka Broker Logging Advanced Configuration Snippet (Safety Valve), and then restart the broker:
    log4j.logger.org.apache.sentry=DEBUG

    Setting just Sentry to DEBUG mode avoids the debug output from undesired dependencies, such as Jetty.

  • Run the Kafka client or Kafka CLI with the required arguments and capture the Kafka log, which should be similar to:
    /var/log/kafka/kafka-broker-host-name.log
  • Look for the following information in the filtered logs:
    • Groups that the Kafka client user or CLI is running as.
    • Required privileges for the operation.
    • Retrieved privileges from Sentry.
    • Required and retrieved privileges comparison result.

This log information can provide insight into which privilege is not assigned to a user, causing a particular operation to fail.
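To narrow the broker log down to the Sentry entries, a simple filter is often enough. This sketch assumes that the logger name (org.apache.sentry...) appears in each log line, which depends on the log4j layout in use. The function name is hypothetical.

```shell
#!/bin/bash
# Show only the Sentry-related lines from a broker log.
# Assumes the logger name is part of each line (layout dependent).
sentry_log_lines() {
    local log_file="$1"
    grep 'org\.apache\.sentry' "$log_file"
}
```

For example, sentry_log_lines /var/log/kafka/kafka-broker-host-name.log prints the lines to inspect for group, required privilege, and retrieved privilege information.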

Page generated August 29, 2019.