Azure Kubernetes Service (AKS) Security Best Practices Part 3 of 4: Runtime Security

Welcome to part three of our four-part series on best practices and recommendations for Azure Kubernetes Service (AKS) cluster security. Previous posts have discussed how to plan and create secure AKS clusters and container images, and how to lock down AKS cluster networking infrastructure. This post will cover the critical topic of securing the application runtimes for AKS cluster workloads, and the tools and controls available to help enforce best practices in multi-tenant AKS clusters.

Runtime Security for Workloads

The security health of your AKS clusters depends just as much on the workloads deployed to the cluster as on the secure configuration of the AKS infrastructure. This section covers how to plan and deploy your applications to keep your nodes, network, cluster, and applications safe.

Use Namespaces

Why: Kubernetes namespaces provide crucial scope for cluster objects, allowing for fine-grained cluster object management. Kubernetes RBAC rules for most resource types apply at the namespace level. Controls like network policies and many add-on tools and frameworks like service meshes also often apply at the namespace scope.

What to do: Plan out how you want to assign namespaces before you start deploying workloads to your clusters. Having one namespace per application provides the most opportunity for control, although it does bring extra management overhead when assigning RBAC role privileges and default network policies. If you do decide to group more than one application into a namespace, the main consideration should be whether those applications have common RBAC requirements and whether it would be safe to grant those privileges to the service accounts and users who need Kubernetes API access in that namespace.
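As a minimal sketch of the one-namespace-per-application approach, the Python below builds a Namespace manifest for each application and emits them as JSON, which kubectl accepts alongside YAML. The application names, team names, and label keys are hypothetical and chosen only for illustration.

```python
import json


def namespace_manifest(app_name, team):
    """Build a Namespace manifest dedicated to a single application."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": app_name,
            # Labels are optional; these keys are an assumption for this sketch.
            "labels": {"app": app_name, "team": team},
        },
    }


# Hypothetical applications mapped to their owning teams.
apps = {"payments": "billing", "storefront": "web"}
manifests = [namespace_manifest(name, team) for name, team in sorted(apps.items())]

if __name__ == "__main__":
    # A v1 List wraps several objects so they can be applied in one call.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

If the script were saved as namespaces.py (a hypothetical file name), its output could be piped to kubectl apply -f - to create the namespaces.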

Kubernetes RBAC

Why: Kubernetes Role-Based Access Control provides the standard method for managing authorization for the Kubernetes API endpoints. The practice of creating and managing comprehensive RBAC roles that follow the principle of least privilege, in addition to performing regular audits of how those roles are delegated with role bindings, provides some of the most critical protections possible for your AKS clusters, both from external bad actors and internal misconfigurations and accidents.

What to do: Configuring Kubernetes RBAC effectively and securely requires some understanding of the Kubernetes API. Start with the official documentation, read about some best practices, and consider working through some tutorials.
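To make the principle of least privilege concrete, here is a hedged sketch of a namespaced Role that grants only read access to pods, together with a RoleBinding that delegates it to a single service account. The namespace, role, and service account names are hypothetical; the manifest fields follow the rbac.authorization.k8s.io/v1 API.

```python
import json

RBAC_API = "rbac.authorization.k8s.io"


def read_only_pod_role(namespace, name="pod-reader"):
    """Build a Role that can only get, list, and watch pods in one namespace."""
    return {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            # The empty string selects the core API group, where pods live.
            {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]}
        ],
    }


def bind_role_to_service_account(role, service_account):
    """Build a RoleBinding granting the Role to a service account in the same namespace."""
    namespace = role["metadata"]["namespace"]
    role_name = role["metadata"]["name"]
    return {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{role_name}-{service_account}", "namespace": namespace},
        "subjects": [
            {"kind": "ServiceAccount", "name": service_account, "namespace": namespace}
        ],
        "roleRef": {"apiGroup": RBAC_API, "kind": "Role", "name": role_name},
    }


# Hypothetical namespace and service account names.
role = read_only_pod_role("payments")
binding = bind_role_to_service_account(role, "payments-api")

if __name__ == "__main__":
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [role, binding]}, indent=2))
```

Because both objects are namespaced, the permissions stop at the boundary of the payments namespace rather than applying cluster-wide.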

Once your team has a solid working knowledge of RBAC, create some internal policies and guidelines. Make sure you also regularly audit your Role permissions and RoleBindings. Pay special attention to minimizing the use of ClusterRoles and ClusterRoleBindings, as these apply globally across all namespaces and to resources that do not support namespaces. (You can run kubectl api-resources --namespaced=false in your cluster to list the resources that are not namespace-scoped.)
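One way to start a regular audit is to scan the cluster's ClusterRoleBindings for subjects bound to highly privileged roles. The sketch below assumes its input has the shape of kubectl get clusterrolebindings -o json output; the sample data and the choice of cluster-admin as the only "risky" role are assumptions for illustration, not a complete audit.

```python
def risky_cluster_bindings(binding_list, risky_roles=("cluster-admin",)):
    """Return (binding, subject kind, subject name) for bindings to risky ClusterRoles."""
    findings = []
    for binding in binding_list.get("items", []):
        role_name = binding.get("roleRef", {}).get("name")
        if role_name not in risky_roles:
            continue
        # 'subjects' may be missing or null on a binding, so fall back to an empty list.
        for subject in binding.get("subjects") or []:
            findings.append((binding["metadata"]["name"], subject["kind"], subject["name"]))
    return findings


# Hypothetical sample shaped like `kubectl get clusterrolebindings -o json` output.
sample = {
    "items": [
        {
            "metadata": {"name": "ops-admins"},
            "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
            "subjects": [{"kind": "Group", "name": "ops"}],
        },
        {
            "metadata": {"name": "viewers"},
            "roleRef": {"kind": "ClusterRole", "name": "view"},
            "subjects": [{"kind": "User", "name": "alice"}],
        },
        {
            "metadata": {"name": "orphaned"},
            "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
            "subjects": None,
        },
    ]
}

findings = risky_cluster_bindings(sample)

if __name__ == "__main__":
    for binding_name, kind, name in findings:
        print(f"{binding_name}: {kind} {name} is bound to a risky ClusterRole")
```

Against a live cluster, the same function could be fed json.load(sys.stdin) with the kubectl output piped in.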

Azure also supports using Azure Active Directory with your AKS cluster RBAC for user authentication and authorization management.

Use Kubernetes Network Policies

Why: By default, all pods in a Kubernetes cluster can make network connections to any listening ports in containers in other pods. The Kubernetes Network Policy API provides the ability to create network firewall rules for the ingress and egress traffic for your cluster’s pods. Limiting access to pods based on your distributed applications’ requirements reduces the potential for damage if a pod in the cluster becomes compromised by a malicious agent.

What to do: As noted in the Cluster Design section earlier in this series, a network policy option must be selected at cluster creation time.

See our posts on guidelines for writing ingress and egress network policies.
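As a hedged illustration of what such policies look like, the sketch below builds two networking.k8s.io/v1 NetworkPolicy manifests: a default-deny policy for all ingress traffic in a namespace, and a narrower policy that re-opens one port to one set of pods. The namespace, pod labels, and port number are hypothetical.

```python
import json


def default_deny_ingress(namespace):
    """Deny all ingress to every pod in the namespace (an empty podSelector matches all pods)."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
    }


def allow_ingress(namespace, target_app, source_app, port):
    """Allow TCP ingress to pods labeled app=target_app from pods labeled app=source_app."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-{source_app}-to-{target_app}", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": target_app}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": {"app": source_app}}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical namespace, labels, and port.
policies = [
    default_deny_ingress("payments"),
    allow_ingress("payments", target_app="api", source_app="frontend", port=8080),
]

if __name__ == "__main__":
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": policies}, indent=2))
```

Network policies are additive, so the second policy carves an exception out of the first rather than conflicting with it.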

Limit Container Runtime Privileges

Why: Most containerized applications will not need any special host privileges on the node to function properly. By following the principle of least privilege and minimizing the capabilities of your cluster’s running containers, you can greatly reduce the risk of exploitation or accidental damage by misbehaving applications.

What to do: Use the PodSpec Security Context to define the exact runtime requirements for each workload. Use Pod Security Policies and/or admission controllers like Open Policy Agent (OPA) Gatekeeper to have the Kubernetes API enforce those best practices at object creation time.

Some guidelines:

Do not run application processes as root.
Do not allow privilege escalation.
Use a read-only root filesystem.
Use the default (masked) /proc filesystem mount.
Do not use the host network or process space.
Drop unused Linux capabilities and do not add optional capabilities that your application does not absolutely require. (The available capabilities depend on the container runtime in use. AKS uses the Docker/Moby runtime, which supports the capabilities described in the Docker documentation on runtime privilege and Linux capabilities. The first table there lists capabilities granted by default, while the second table shows optional capabilities that may be added.)
Use SELinux options for more fine-grained process controls.
Give each application its own Kubernetes Service Account.
Do not mount the service account credentials in a container if it does not need to access the Kubernetes API.
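Several of the guidelines above map directly onto fields of the pod spec and the container security context. The following is a sketch of a Pod manifest that applies them, not a drop-in template: the pod name, namespace, image, service account, and user ID are all hypothetical, and a real workload may need writable volumes or specific capabilities added back.

```python
import json


def hardened_pod(name, namespace, image, service_account):
    """Build a Pod manifest that follows the runtime guidelines listed above."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Dedicated service account, with no API token mounted into the pod.
            "serviceAccountName": service_account,
            "automountServiceAccountToken": False,
            # Stay out of the host's network, process, and IPC namespaces.
            "hostNetwork": False,
            "hostPID": False,
            "hostIPC": False,
            # Pod-level settings: refuse to start containers as root.
            "securityContext": {"runAsNonRoot": True, "runAsUser": 10001},
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "securityContext": {
                        "allowPrivilegeEscalation": False,
                        "readOnlyRootFilesystem": True,
                        # Drop every capability; add back only what is required.
                        "capabilities": {"drop": ["ALL"]},
                    },
                }
            ],
        },
    }


# Hypothetical workload values, for illustration only.
pod = hardened_pod(
    name="payments-api",
    namespace="payments",
    image="example.azurecr.io/payments-api:1.0.0",
    service_account="payments-api",
)

if __name__ == "__main__":
    print(json.dumps(pod, indent=2))
```

The default masked /proc mount needs no field at all: it stays in effect as long as procMount is left unset.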

Use Pod Security Policies

Note that AKS support for Kubernetes Pod Security Policy is currently in preview.

Why: Kubernetes Pod Security Policy provides a method to enforce best practices around minimizing container runtime privileges, including not running as the root user, not sharing the host node’s process or network space, not being able to access the host filesystem, enforcing SELinux, and other options. Most cluster workloads will not need special permissions, and forcing containers to run with the least privilege they require minimizes their potential for malicious exploitation or accidental damage.

What to do: Follow the instructions for enabling Pod Security Policy in your AKS cluster. Make sure you test your PSPs in a non-production cluster before enabling them in production. Pod Security Policy support can be enabled in an existing AKS cluster at any time.
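For a sense of what a restrictive policy might contain, the sketch below builds a PodSecurityPolicy manifest that covers the restrictions described above. It assumes a cluster version that still serves the policy/v1beta1 PodSecurityPolicy API (the API was removed in Kubernetes 1.25). The policy name and the list of allowed volume types are assumptions to adapt, and a PSP only takes effect for a pod once RBAC grants the pod's service account or user the use verb on it.

```python
import json


def restricted_psp(name="restricted"):
    """Build a PodSecurityPolicy that blocks root, host namespaces, and host paths."""
    return {
        "apiVersion": "policy/v1beta1",
        "kind": "PodSecurityPolicy",
        "metadata": {"name": name},
        "spec": {
            "privileged": False,
            "allowPrivilegeEscalation": False,
            "requiredDropCapabilities": ["ALL"],
            "readOnlyRootFilesystem": True,
            "hostNetwork": False,
            "hostPID": False,
            "hostIPC": False,
            # hostPath is deliberately absent, so pods cannot mount the node filesystem.
            "volumes": ["configMap", "secret", "emptyDir", "projected", "downwardAPI"],
            "runAsUser": {"rule": "MustRunAsNonRoot"},
            # The three strategies below are required fields; RunAsAny leaves them unrestricted.
            "seLinux": {"rule": "RunAsAny"},
            "supplementalGroups": {"rule": "RunAsAny"},
            "fsGroup": {"rule": "RunAsAny"},
        },
    }


psp = restricted_psp()

if __name__ == "__main__":
    print(json.dumps(psp, indent=2))
```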

Use an Admission Controller to Enforce Best Practices

Note that Azure Policy for AKS is currently in preview.

Why: Kubernetes supports using admission controllers, which can be configured to evaluate requests to the Kubernetes API before the requested objects are persisted. A validating admission controller can deny requests that fail to meet certain requirements, while a mutating admission controller can change the request, such as by injecting a sidecar container into a pod or adding labels to an object, before it is accepted.
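The decision a validating controller makes can be sketched as a pure function over an AdmissionReview object. The example below shows only that decision logic, under the assumption that it would sit behind an HTTPS webhook registered with a ValidatingWebhookConfiguration, neither of which is shown. The two checks and the sample request are hypothetical; the request and response fields follow the admission.k8s.io/v1 API.

```python
def review_pod(admission_review):
    """Return an AdmissionReview response that denies pods with risky containers."""
    request = admission_review["request"]
    pod = request["object"]
    problems = []
    for container in pod.get("spec", {}).get("containers", []):
        context = container.get("securityContext") or {}
        if context.get("privileged"):
            problems.append(f"container {container['name']} is privileged")
        # Kubernetes allows privilege escalation unless it is explicitly disabled.
        if context.get("allowPrivilegeEscalation", True):
            problems.append(f"container {container['name']} allows privilege escalation")

    # The response must echo the request uid so the API server can match them up.
    response = {"uid": request["uid"], "allowed": not problems}
    if problems:
        response["status"] = {"code": 403, "message": "; ".join(problems)}
    return {"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview", "response": response}


# Hypothetical request for a pod whose single container sets no security context.
sample_review = {
    "apiVersion": "admission.k8s.io/v1",
    "kind": "AdmissionReview",
    "request": {
        "uid": "example-uid",
        "object": {
            "kind": "Pod",
            "metadata": {"name": "demo"},
            "spec": {"containers": [{"name": "app", "image": "example/app:1.0"}]},
        },
    },
}

result = review_pod(sample_review)

if __name__ == "__main__":
    print(result["response"])
```

A mutating controller would return the same envelope with a JSON patch attached instead of a denial.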

While users can deploy their own admission controllers to AKS clusters to perform a variety of tasks, AKS also now offers an integration with Azure Policy, Azure’s governance service. Azure Policy for AKS uses the Open Policy Agent (OPA) Gatekeeper admission controller to enforce a variety of best practices by preventing non-conforming objects from being created in an AKS cluster. While some overlap with Pod Security Policy capabilities exists, OPA allows restrictions not just on pods, but on virtually any attribute of any cluster resource.

What to do: Follow these instructions for enabling Azure Policy for AKS in your cluster.

At least for now, Azure Policy for AKS only supports using AKS-supplied policy definitions. If you want to leverage the full power of OPA, you can install and manage the Gatekeeper admission controller yourself.
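If you do run Gatekeeper yourself, policies are expressed as constraints that instantiate a ConstraintTemplate. The sketch below builds one such constraint and assumes that a K8sRequiredLabels ConstraintTemplate, like the one in the Gatekeeper project's examples, is already installed in the cluster; the template is not shown here. The constraint name and the required label key are hypothetical.

```python
import json


def required_labels_constraint(name, labels, kinds=("Namespace",)):
    """Build a Gatekeeper constraint requiring the given labels on the matched kinds.

    Assumes a K8sRequiredLabels ConstraintTemplate already exists in the cluster.
    """
    return {
        "apiVersion": "constraints.gatekeeper.sh/v1beta1",
        "kind": "K8sRequiredLabels",
        "metadata": {"name": name},
        "spec": {
            # The empty string selects the core API group, where Namespace lives.
            "match": {"kinds": [{"apiGroups": [""], "kinds": list(kinds)}]},
            # The shape of 'parameters' is defined by the assumed template.
            "parameters": {"labels": list(labels)},
        },
    }


# Hypothetical constraint: every Namespace must carry a 'team' label.
constraint = required_labels_constraint("namespaces-must-have-team", ["team"])

if __name__ == "__main__":
    print(json.dumps(constraint, indent=2))
```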

Up next, in the final post in this series, we will cover the user maintenance and operational tasks required to keep AKS clusters secure and healthy.