Kubernetes is a portable, extensible open-source platform built from a set of loosely coupled components. Its main advantage is the ability to run distributed systems resiliently. It also makes deployment, scaling and monitoring easier, and it supports detailed logging through Kubernetes audit logs.
A growing number of modern software applications are made up of many discrete services, each packaged in a separate container. Kubernetes manages these containers and their resources at scale while maintaining security. It is arguably the most widely used container orchestration platform in the world. Although Kubernetes is not without vulnerabilities, understanding and leveraging the data in Kubernetes logs can bring value to your organization on multiple fronts.
When it comes to digital security, the best way to determine if a system was compromised or abused is to analyze the activity logs. This is essentially a form of digital forensics: it establishes which actions were taken by the system’s admins, users and various automated services. This allows you to answer four crucial questions:
- What happened to your system and when?
- Who initiated the action and from where?
- On what part of the system did it happen and where was it identified?
- Where was the action going?
The Growing Challenges of Kubernetes
As popular and widely used as it may be, Kubernetes still comes with many risks that are a serious concern for security teams worldwide.
Security teams need to be able to rapidly identify which users (and roles) have had access to sensitive resources at any given time. They have to determine if these resources were used maliciously or in error. In addition, it is important to find out which tools were seeking to access them. Some of the most common problems related to Kubernetes are:
- Hacked credentials – allowing unauthorized personnel to access Kubernetes pods and clusters
- Stolen tokens and poorly configured access control – risking lateral cluster movement and unauthorized data access
- Exploitable Kubernetes API server vulnerabilities – other users gaining access to various sensitive resources
- Timing – security teams often analyze breaches days or weeks after they have occurred
Any of these problems can cause significant damage to your software or infrastructure. The upside is that, with a suitable audit policy in place, Kubernetes audit logs record the API requests that potential attackers make. This gives your DevSecOps team the chance to limit the damage and resolve issues quickly.
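As a rough illustration of how a team might check who touched sensitive resources, the sketch below groups accesses by username. It assumes the audit events have already been parsed from JSON into dictionaries that follow the `audit.k8s.io/v1` event layout (`user.username`, `verb`, `objectRef`). The sample events, the `SENSITIVE_RESOURCES` set and the function name are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical choice of what counts as "sensitive" for this sketch.
SENSITIVE_RESOURCES = {"secrets"}

# Invented sample events, already parsed from JSON into dictionaries.
SAMPLE_EVENTS = [
    {"verb": "get", "user": {"username": "alice"},
     "objectRef": {"resource": "secrets", "namespace": "prod", "name": "db-password"}},
    {"verb": "list", "user": {"username": "ci-bot"},
     "objectRef": {"resource": "pods", "namespace": "prod"}},
    {"verb": "delete", "user": {"username": "alice"},
     "objectRef": {"resource": "secrets", "namespace": "prod", "name": "api-key"}},
    {"verb": "get", "user": {"username": "system:anonymous"},
     "objectRef": {"resource": "secrets", "namespace": "dev", "name": "token"}},
]


def sensitive_access_by_user(events, sensitive=SENSITIVE_RESOURCES):
    """Map each username to the (verb, namespace, name) tuples it used on sensitive resources."""
    accesses = defaultdict(set)
    for event in events:
        # Non-resource requests (such as /healthz) carry no objectRef.
        ref = event.get("objectRef") or {}
        if ref.get("resource") in sensitive:
            username = event.get("user", {}).get("username", "<unknown>")
            accesses[username].add((event.get("verb"), ref.get("namespace"), ref.get("name")))
    return dict(accesses)


report = sensitive_access_by_user(SAMPLE_EVENTS)
for username, actions in sorted(report.items()):
    print(username, sorted(actions))
```

A report like this only shows that a request was made. Deciding whether it was malicious or a mistake still needs context, such as the source IP and the response code.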
What Are Kubernetes Audit Logs?
Kubernetes audit logs give you detailed information about each call made to the Kubernetes API server, whether it comes from a user, an administrator or an automated service.
The Kubernetes audit logs contain data regarding each of the following steps:
- Authentication – the API server establishes the identity of the user associated with the request, using several authentication mechanisms.
- Authorization – the server determines whether the authenticated identity that initiated the request has permission to perform the requested verb (HTTP method) on the target resource.
- Admission control – admission controllers assess whether the request is acceptable and may modify it before it is processed.
- Validation – ensures the validity of specific resources in a request.
- Execution – if all previous steps pass, the operation is performed. This includes creating, deleting, listing and watching specific resources and groups of resources, opening remote shells and data streams, etc.
Audit logs also contain various metadata. This includes key attributes such as the HTTP method, requested URL and information about the user who initiated the request. This is essential for security teams and infrastructure ops to determine the chain of events that impacted the system.
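To make the metadata more concrete, here is a minimal sketch that reads one audit event and pulls out the attributes mentioned above. It assumes the log is written as one JSON object per line and uses field names from the `audit.k8s.io/v1` event layout. Every value in the sample event is invented, and the `summarize` helper is hypothetical.

```python
import json

# Hypothetical audit event shaped like an audit.k8s.io/v1 Event.
# All values are invented for illustration.
SAMPLE_LINE = json.dumps({
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "level": "Metadata",
    "stage": "ResponseComplete",
    "requestURI": "/api/v1/namespaces/default/secrets/db-password",
    "verb": "get",
    "user": {"username": "jane@example.com", "groups": ["system:authenticated"]},
    "sourceIPs": ["203.0.113.7"],
    "objectRef": {"resource": "secrets", "namespace": "default", "name": "db-password"},
    "responseStatus": {"code": 200},
    "requestReceivedTimestamp": "2024-01-01T12:00:00.000000Z",
})


def summarize(line: str) -> dict:
    """Extract the fields that answer what, when, who, from where and on what."""
    event = json.loads(line)
    ref = event.get("objectRef") or {}
    target_parts = (ref.get("namespace"), ref.get("resource"), ref.get("name"))
    return {
        "what": event.get("verb"),
        "when": event.get("requestReceivedTimestamp"),
        "who": event.get("user", {}).get("username"),
        "from_where": event.get("sourceIPs", []),
        "on_what": "/".join(part for part in target_parts if part),
        "url": event.get("requestURI"),
        "status": event.get("responseStatus", {}).get("code"),
    }


summary = summarize(SAMPLE_LINE)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Which of these fields are actually present depends on the audit level chosen for the event, so the sketch reads every field defensively with `.get`.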
You can configure the API server to record some of these requests or all of them. Additionally, the level of detail recorded for each request can be tuned to meet the admin’s requirements. Both are defined in an audit policy. The audit policy provides an ordered list of rules; each event is compared against these rules in order, and the first matching rule sets its audit level. Where the logs are sent is configured separately on the API server through an audit backend, such as a log file or a webhook. From there, various analysis tools can be deployed to harvest and process these logs.
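The ordered, first-match evaluation of policy rules can be sketched in a few lines. This is a simplified model for illustration, not the API server's implementation. The rule format below is a hypothetical flattening that matches only on verb and resource name, while real policies support more selectors such as users, namespaces and API groups. The level names (`None`, `Metadata`, `Request`, `RequestResponse`) are the ones Kubernetes uses.

```python
# Hypothetical, simplified policy: an ordered list of rules.
# A rule without "verbs" or "resources" matches any verb or resource.
POLICY_RULES = [
    {"level": "None", "verbs": ["watch"], "resources": ["endpoints"]},
    {"level": "Metadata", "resources": ["secrets", "configmaps"]},
    {"level": "RequestResponse", "verbs": ["create", "update", "delete"]},
    {"level": "Metadata"},  # catch-all rule
]


def rule_matches(rule: dict, verb: str, resource: str) -> bool:
    """Return True if every selector present on the rule accepts the request."""
    if "verbs" in rule and verb not in rule["verbs"]:
        return False
    if "resources" in rule and resource not in rule["resources"]:
        return False
    return True


def audit_level(verb: str, resource: str, rules=POLICY_RULES) -> str:
    """Return the level of the first matching rule; order matters."""
    for rule in rules:
        if rule_matches(rule, verb, resource):
            return rule["level"]
    # Assumption in this sketch: a request that matches no rule is not logged.
    return "None"


print(audit_level("watch", "endpoints"))   # None
print(audit_level("create", "secrets"))    # Metadata
print(audit_level("create", "pods"))       # RequestResponse
print(audit_level("get", "pods"))          # Metadata
```

Note how `create` on `secrets` is logged at `Metadata` rather than `RequestResponse`: the secrets rule comes first, which keeps secret payloads out of the log. Putting narrow rules ahead of broad ones is the main design decision when ordering a policy.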
Advantages of Monitoring the Kubernetes Audit Logs
Audit logs capture complete events, such as the creation, deletion or update of a resource, so you can reconstruct what happened earlier. However, Kubernetes itself retains events for just one hour by default. This limits your ability to go back and comb through the details of something that took place a while ago. Luckily, you can commit your audit logs to long-term retention. Storing them gives you a buffer to review and react in case something important happens. Longer retention can also help you answer key questions, such as which lifecycle operations take place when you update a deployment.
Bear in mind that these logs are fairly complex and, in some cases, extremely extensive. The most efficient way to monitor your Kubernetes logs is to create an automated scanning system. The system would then continuously scan the logs and alert admins if there’s a potential security breach. There’s a rising trend in employing machine learning algorithms to strengthen the scanning tools and help them identify multi-step attacks. This dramatically shortens the time between diagnosing the breach and fixing it.
You can set up automated alerts and notifications to keep you in the loop in real time. For example, your security team can deploy monitors to detect surges in 401 Unauthorized or 403 Forbidden response codes. Monitors can also flag anonymous API server calls, which your Kubernetes audit logs will show.
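A monitor of this kind could look roughly like the sketch below. It assumes audit events are already parsed into dictionaries with `responseStatus.code`, `user.username` and `requestReceivedTimestamp` fields, and that anonymous requests appear under the username `system:anonymous`. The threshold, the per-minute bucketing and the sample events are hypothetical; a production monitor would normally live in your log platform rather than in a script.

```python
from collections import Counter

DENIED_CODES = {401, 403}
THRESHOLD = 3  # hypothetical: denied responses per minute that count as a surge


def make_event(timestamp, username, code, uri="/api/v1/pods"):
    """Build an invented audit event containing only the fields this sketch reads."""
    return {
        "requestReceivedTimestamp": timestamp,
        "user": {"username": username},
        "responseStatus": {"code": code},
        "requestURI": uri,
    }


SAMPLE_EVENTS = [
    make_event("2024-01-01T12:00:05.000000Z", "mallory", 403),
    make_event("2024-01-01T12:00:20.000000Z", "mallory", 403),
    make_event("2024-01-01T12:00:41.000000Z", "mallory", 401),
    make_event("2024-01-01T12:01:10.000000Z", "alice", 403),
    make_event("2024-01-01T12:02:00.000000Z", "system:anonymous", 200, "/version"),
]


def find_alerts(events, threshold=THRESHOLD):
    """Return (minutes with a surge of denied requests, URIs called anonymously)."""
    denied_per_minute = Counter()
    anonymous_calls = []
    for event in events:
        code = event.get("responseStatus", {}).get("code")
        if code in DENIED_CODES:
            # The first 16 characters of the timestamp are "YYYY-MM-DDTHH:MM".
            denied_per_minute[event["requestReceivedTimestamp"][:16]] += 1
        if event.get("user", {}).get("username") == "system:anonymous":
            anonymous_calls.append(event.get("requestURI"))
    surges = {minute: n for minute, n in denied_per_minute.items() if n >= threshold}
    return surges, anonymous_calls


surges, anonymous_calls = find_alerts(SAMPLE_EVENTS)
print("surges:", surges)
print("anonymous calls:", anonymous_calls)
```

Fixed one-minute buckets keep the example short. A sliding window or a baseline learned from normal traffic would reduce false alarms on busy clusters.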
Finally, when it comes to policies, you’ll want to focus on relevance. This means understanding what to log and when to log it. Creating a policy requires gathering enough reliable data to act on. Tools such as Datadog offer plenty of utilities to help shape your policy. Once you have a robust policy in place, it’ll be much easier to focus on scaling and upgrading.