This is part three of our four-part blog series on Google Kubernetes Engine (GKE) security. You can find the previous two parts below:
- GKE security best practices: designing secure clusters
- GKE networking best practices for security and operations
Adhering to security best practices for running your workloads on GKE plays a critical role in safeguarding your cluster and all its workloads. Misconfigured pods, for example, pose a huge danger if they are compromised. Follow our recommendations below to protect your GKE workloads at runtime.
Why: Kubernetes namespaces provide scoping for cluster objects, allowing fine-grained cluster object management. Kubernetes Role-based Access Control (RBAC) rules for most resource types apply at the namespace level. Controls like network policies, as well as many add-on tools and frameworks such as service meshes, also often apply at the namespace scope.
What to do: Plan out how you want to assign namespaces before you start deploying workloads to your clusters. Having one namespace per application provides the best opportunity for control, although it does bring extra management overhead when assigning RBAC role privileges and default network policies. If you do decide to group more than one application into a namespace, the main criteria should be whether those applications have common RBAC requirements and whether it would be safe to grant those privileges to the service accounts and users which need Kubernetes API access in that namespace.
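Creating a dedicated namespace per application can be as simple as applying a manifest. A minimal sketch, using a hypothetical `payments` application (the name and labels are illustrative, not from the original):

```yaml
# Hypothetical example: a dedicated namespace per application,
# labeled so RBAC rules and network policies can target it cleanly.
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    team: payments
    environment: production
```

Labels like these make it easier to select the namespace later in network policies or admission-controller constraints.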
Why: Kubernetes Role-Based Access Control provides the standard method for managing authorization for the Kubernetes API endpoints. The practice of creating and managing comprehensive RBAC roles that follow the principle of least privilege, in addition to performing regular audits of how those roles are delegated with role bindings, provides some of the most critical protections possible for your GKE clusters, both from external bad actors and internal misconfigurations and accidents.
What to do: Kubernetes RBAC is enabled by default in new GKE clusters. If you have existing clusters which currently do not have RBAC enabled, you will want to enable it. First, make sure you have created all the necessary RBAC resource objects for your cluster’s workloads and tested them in a non-production environment.
Configuring RBAC effectively and securely requires some understanding of the Kubernetes API. You can start with the official documentation, read about some best practices, and you may also want to work through some tutorials.
Once your team has solid working knowledge of RBAC, create some internal policies and guidelines. Make sure you also regularly audit your Role permissions and RoleBindings. Pay special attention to minimizing the use of ClusterRoles and ClusterRoleBindings, as these apply globally across all namespaces and to resources that do not support namespaces. (You can use the output of kubectl api-resources in your cluster to see which resources are not namespace-scoped.)
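As a concrete illustration of least-privilege RBAC, the sketch below defines a namespaced Role that grants read-only access to pods and their logs, bound to a single service account. The namespace, role, and service account names are hypothetical:

```yaml
# Hypothetical least-privilege Role in the "payments" namespace:
# read-only access to pods and pod logs, nothing cluster-wide.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: payments
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
---
# Bind the Role to exactly one service account, not a broad group.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payments-pod-reader
  namespace: payments
subjects:
- kind: ServiceAccount
  name: payments-app
  namespace: payments
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Because this uses a Role and RoleBinding rather than their ClusterRole counterparts, the grant cannot leak outside the namespace.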
Why: Most containerized applications will not need any special host privileges on the node to function properly. By following the principle of least privilege and minimizing the capabilities of your cluster's running containers, you can greatly reduce the potential for exploitation by malicious containers and for accidental damage from misbehaving applications.
What to do: Use the PodSpec Security Context to define the exact runtime requirements for each workload. Use Pod Security Policies and/or admission controllers like Open Policy Agent (OPA) Gatekeeper to enforce those best practices by the Kubernetes API at object creation time.
- Do not allow containers to run as root. Running as root creates by far the greatest risk, because root in a container is root on the node.
- Do not use the host network or process space. Again, these settings create the potential for compromising the node and every container running on it.
- Do not allow privilege escalation.
- Use a read-only root filesystem in the container.
- Use the default (masked) /proc filesystem mount.
- Drop unused Linux capabilities and do not add optional capabilities that your application does not absolutely require. (Available capabilities depend on the container runtime in use on the nodes. GKE nodes can use either Docker or containerd, depending on the node image.)
- Use SELinux options for more fine-grained process controls.
- Give each application its own Kubernetes Service Account rather than sharing or using the namespace’s default service account.
- Do not mount the service account token in a container if the container does not need to access the Kubernetes API.
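A pod spec applying most of these recommendations might look like the following sketch. The pod name, service account, and image are hypothetical placeholders:

```yaml
# Hypothetical pod spec applying the runtime-privilege recommendations above:
# non-root user, no privilege escalation, read-only root filesystem,
# all capabilities dropped, a dedicated service account, no mounted API token.
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app
spec:
  serviceAccountName: payments-app        # dedicated, not "default"
  automountServiceAccountToken: false     # pod does not call the Kubernetes API
  containers:
  - name: app
    image: gcr.io/example-project/app:1.0  # hypothetical image
    securityContext:
      runAsNonRoot: true
      runAsUser: 10001
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
```

If the application needs to write temporary files, mount an `emptyDir` volume at the paths it writes to rather than relaxing `readOnlyRootFilesystem`.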
Why: Kubernetes Pod Security Policy provides a method to enforce best practices around minimizing container runtime privileges, including not running as the root user, not sharing the host node’s process or network space, not being able to access the host filesystem, enforcing SELinux, and other options. Most cluster workloads will not need special permissions. By forcing containers to use the least-required privilege, their potential for malicious exploitability or accidental damage can be minimized.
What to do: Enable PSPs in your GKE cluster. Create policies which enforce the recommendations under Limit Container Runtime Privileges, shown above. Policies are best tested in a non-production environment running the same applications as your production cluster, after which you can deploy them in production.
Alternatively, because Pod Security Policies (PSPs) are slated for deprecation and because they only apply to a subset of runtime controls, consider deploying a configurable admission controller instead, as described below.
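A restrictive PSP enforcing the earlier recommendations might look like this sketch (policy name and ID ranges are illustrative; remember that a PSP takes effect only once RBAC allows a pod's service account to use it):

```yaml
# Hypothetical restrictive PodSecurityPolicy enforcing the runtime
# recommendations above (policy/v1beta1 API).
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities: ["ALL"]
  hostNetwork: false
  hostPID: false
  hostIPC: false
  readOnlyRootFilesystem: true
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: MustRunAs
    ranges:
    - min: 1
      max: 65535
  fsGroup:
    rule: MustRunAs
    ranges:
    - min: 1
      max: 65535
  volumes: ["configMap", "secret", "emptyDir", "projected"]
```

On GKE, PSP enforcement is turned on at the cluster level (via the `--enable-pod-security-policy` flag on cluster create or update); until then, policies exist but are not enforced.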
Why: Kubernetes supports using admission controllers, which can be configured to evaluate requests to the Kubernetes API. In the case of validating controllers, an admission controller can deny requests that fail to meet certain requirements, while mutating controllers can make changes to the request, such as injecting a sidecar container into a pod or adding labels to an object, before sending it to the Kubernetes API.
One increasingly popular option to use for a validating admission controller is Open Policy Agent (OPA) Gatekeeper. The Gatekeeper admission controller uses custom Kubernetes resources to configure the requirements for Kubernetes resources. Users can create policies tailored to their needs and applications to enforce a variety of best practices by preventing non-conforming objects from getting created in a cluster. While some overlap of PSP capabilities exists, OPA allows restrictions not just on pods, but on any cluster resource using virtually any field.
What to do: You can write a custom admission controller to suit your specific needs, or install Gatekeeper or similar tool in your cluster. Note that while some example resources for enforcing common requirements in Gatekeeper exist, the policy configuration language and management come with a rather steep learning curve.
Note that Gatekeeper requires Kubernetes version 1.14 or higher.
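To give a flavor of Gatekeeper configuration, the sketch below assumes the community Gatekeeper policy library's `K8sPSPPrivilegedContainer` constraint template has already been installed, and uses it to reject privileged containers outside `kube-system`. The constraint name and namespace exclusion are hypothetical choices:

```yaml
# Hypothetical Gatekeeper constraint (assumes the K8sPSPPrivilegedContainer
# template from the community policy library is installed) that denies
# privileged containers everywhere except kube-system.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPPrivilegedContainer
metadata:
  name: deny-privileged-containers
spec:
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Pod"]
    excludedNamespaces: ["kube-system"]
```

Because constraints are plain Kubernetes resources, they can be version-controlled and rolled out through the same pipelines as your other manifests.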
Why: While most workloads in your GKE cluster will likely not require direct access to other GCP service APIs, some will, whether to use Cloud Storage, a database, or some other service. Managing these GCP IAM permissions on a per-workload basis, following the principle of least privilege and using Workload Identity, provides critical security controls that prevent workloads in your cluster from gaining unauthorized access to your GCP project's resources and services. Workload Identity allows you to associate a custom GCP service account in your GCP project with a Kubernetes service account in your GKE cluster.
What to do: Using Workload Identity effectively requires several steps, beginning with identifying which workloads, if any, need Cloud IAM permissions and exactly which permissions they need.
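Once the cluster and node pool have Workload Identity enabled, wiring a Kubernetes service account to a GCP service account takes two steps. The sketch below uses hypothetical names throughout (`example-project`, `payments-gsa`, `payments` namespace, `payments-app` service account):

```shell
# Hypothetical setup: let the Kubernetes service account "payments-app"
# in namespace "payments" act as the GCP service account "payments-gsa"
# in project "example-project". Assumes Workload Identity is already
# enabled on the cluster and node pool.

# 1. Allow the Kubernetes service account to impersonate the GCP one.
gcloud iam service-accounts add-iam-policy-binding \
  payments-gsa@example-project.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:example-project.svc.id.goog[payments/payments-app]"

# 2. Annotate the Kubernetes service account so GKE knows which GCP
#    service account its pods should authenticate as.
kubectl annotate serviceaccount payments-app \
  --namespace payments \
  iam.gke.io/gcp-service-account=payments-gsa@example-project.iam.gserviceaccount.com
```

After this, pods running under `payments-app` obtain GCP credentials for `payments-gsa` automatically, with no exported service account keys to manage or leak.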
Why: Running containers are just processes on your node, the same as the kernel or OS system daemons that may also be running. Container runtimes rely on Linux cgroups to provide some controls and restrictions around these container processes, but they do not provide solid isolation security and could still allow malicious code or intruders to gain access to the node’s operating system kernel or other processes’ in-memory data.
GKE Sandbox can run pod containers in isolation from the node's kernel. Sandbox uses gVisor, which re-implements the Linux kernel API in an unprivileged userspace process, creating a "sandbox" that has limited direct access to the node's kernel.
What to do: You can enable GKE Sandbox in node pools on new or existing GKE clusters.
Sandbox only supports Linux nodes and containers. Note that most, but not all, applications can run in Sandbox.
Containers running in Sandbox may experience performance impact. If you do not want to run all your pods in Sandbox, you should prioritize using it for containers that may run untrusted, third-party code.
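After creating a gVisor-enabled node pool (for example with `gcloud container node-pools create ... --sandbox type=gvisor`), individual pods opt in through the `gvisor` RuntimeClass. A minimal sketch, with a hypothetical pod name and image:

```yaml
# Hypothetical pod that opts into GKE Sandbox via the gvisor RuntimeClass.
# Requires a node pool created with: --sandbox type=gvisor
apiVersion: v1
kind: Pod
metadata:
  name: untrusted-workload
spec:
  runtimeClassName: gvisor
  containers:
  - name: app
    image: gcr.io/example-project/third-party-app:1.0  # hypothetical image
```

Pods without `runtimeClassName: gvisor` continue to run on regular nodes, so you can reserve the sandboxed pool for the untrusted, third-party workloads mentioned above.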