In Chapter 9, we covered how cluster services provide authentication, authorization, and confidentiality. These security mechanisms rely heavily on a common understanding between clients, services, and operating systems of which users and groups exist. To decide how best to configure clusters within the enterprise context, cluster architects need to understand how cluster services use identity services for authentication and authorization, and which providers are available. In this chapter, we examine these interactions and outline some common integration architectures.
We need identity management providers in the following areas:
As we have seen, integration with a KDC is essential to secure authentication in most Hadoop services. Every user wishing to use the cluster must have a principal in one of the trusted realms, and ideally this principal maps to an existing enterprise user account with the same password. Each server in the cluster must be configured to allow users and servers to authenticate to a KDC.
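To make the idea of a principal mapping to a user account concrete, the following is a minimal sketch in Python. It assumes principals of the form `primary[/instance]@REALM` and a hypothetical set of trusted realm names (`EXAMPLE.COM` and `CORP.EXAMPLE.COM` are placeholders, not values from this chapter). The function names are illustrative only, and the logic is a deliberate simplification rather than the mapping rules that Hadoop itself applies.

```python
from typing import NamedTuple, Optional, Set

# Hypothetical trusted realms, for illustration only.
TRUSTED_REALMS: Set[str] = {"EXAMPLE.COM", "CORP.EXAMPLE.COM"}


class Principal(NamedTuple):
    primary: str             # user or service name, e.g. "alice" or "hdfs"
    instance: Optional[str]  # optional second component, often a hostname
    realm: str               # Kerberos realm, e.g. "EXAMPLE.COM"


def parse_principal(name: str) -> Principal:
    """Split 'primary[/instance]@REALM' into its components."""
    local, sep, realm = name.rpartition("@")
    if not sep or not local or not realm:
        raise ValueError(f"not a fully qualified principal: {name!r}")
    primary, _, instance = local.partition("/")
    if not primary:
        raise ValueError(f"principal has no primary component: {name!r}")
    return Principal(primary, instance or None, realm)


def short_name(name: str, trusted_realms: Set[str] = TRUSTED_REALMS) -> Optional[str]:
    """Return the account name for a principal, or None if its realm is untrusted.

    Simplifying assumption: the account name is the primary component.
    """
    principal = parse_principal(name)
    if principal.realm not in trusted_realms:
        return None
    return principal.primary
```

Under these assumptions, `short_name("alice@EXAMPLE.COM")` yields `alice`, a service principal such as `hdfs/node1.example.com@EXAMPLE.COM` yields `hdfs`, and a principal from a realm outside the trusted set yields `None`.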
Cluster services use users and groups when making authentication and authorization decisions and when executing work on behalf of users. For example, YARN requires that users exist on every node, to ensure security isolation between running jobs. We therefore need a way of resolving enterprise user accounts on each cluster node, and furthermore these need to correspond ...