Problem

This technote describes an issue that you may face with ICP logging.
You encounter anomalies with Logstash: after a few minutes, one of your three pods reports a liveness error and continually restarts.

Symptom

Events: Warning Unhealthy 13s (x536 over 23h) kubelet, ... Liveness probe failed: HTTP probe failed with statuscode: 504

Diagnosing The Problem

When you delete and re-create the Helm release on the cluster, it works fine again for a few minutes.
Resolving The Problem

To resolve this issue you will need to edit the logstash deployment to:
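The list of fields to change was not preserved in this copy of the technote. A common mitigation for recurring 504 liveness failures is to relax the probe's timing on the deployment. A minimal sketch, assuming the deployment is named logstash in the kube-system namespace and that raising timeoutSeconds and failureThreshold is the intended change (both values are illustrative, not the documented fix):

```shell
# Open the logstash deployment for interactive editing (name and namespace assumed)
kubectl edit deployment logstash -n kube-system

# Or patch the liveness probe timing in place (illustrative values)
kubectl patch deployment logstash -n kube-system --type=json -p='[
  {"op": "replace", "path": "/spec/template/spec/containers/0/livenessProbe/timeoutSeconds", "value": 30},
  {"op": "replace", "path": "/spec/template/spec/containers/0/livenessProbe/failureThreshold", "value": 5}
]'
```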
Document Location

Worldwide

Product Synonym

ICP, IBM Cloud Private, Cloud Private, Common services, ICP Logging

This page shows how to configure liveness, readiness and startup probes for containers.

The kubelet uses liveness probes to know when to restart a container. For example, liveness probes could catch a deadlock, where an application is running but unable to make progress. Restarting a container in such a state can help to make the application more available despite bugs.

The kubelet uses readiness probes to know when a container is ready to start accepting traffic. A Pod is considered ready when all of its containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.

The kubelet uses startup probes to know when a container application has started. If such a probe is configured, it disables liveness and readiness checks until it succeeds, making sure those probes don't interfere with the application startup. This can be used to adopt liveness checks on slow-starting containers, avoiding them getting killed by the kubelet before they are up and running.

Before you begin

You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. It is recommended to run this tutorial on a cluster with at least two nodes that are not acting as control plane hosts. If you do not already have a cluster, you can create one by using minikube or you can use one of these Kubernetes playgrounds:
Define a liveness command

Many applications running for long periods of time eventually transition to broken states, and cannot recover except by being restarted. Kubernetes provides liveness probes to detect and remedy such situations.

In this exercise, you create a Pod that runs a container based on the registry.k8s.io/busybox image.
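The configuration file itself was not preserved in this copy. A sketch consistent with the description that follows, using the Pod name and probe values from the upstream Kubernetes example (exec-liveness.yaml), which may differ in your version:

```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:        # probe succeeds while /tmp/healthy exists
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
```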
In the configuration file, you can see that the Pod has a single container. The periodSeconds field specifies that the kubelet should perform a liveness probe every 5 seconds, and the initialDelaySeconds field tells the kubelet to wait 5 seconds before performing the first probe. To perform a probe, the kubelet executes the command cat /tmp/healthy in the target container. If the command succeeds, it returns 0, and the kubelet considers the container to be alive and healthy. If the command returns a non-zero value, the kubelet kills the container and restarts it. When the container starts, it executes this command:
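Reconstructed from the args in the sketch above:

```shell
/bin/sh -c "touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600"
```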
For the first 30 seconds of the container's life, there is a /tmp/healthy file, so the command cat /tmp/healthy returns a success code. After 30 seconds, cat /tmp/healthy returns a failure code.

Create the Pod:
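Assuming the manifest above, which the upstream docs publish at https://k8s.io/examples/pods/probe/exec-liveness.yaml:

```shell
kubectl apply -f https://k8s.io/examples/pods/probe/exec-liveness.yaml
```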
Within 30 seconds, view the Pod events:
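Assuming the Pod name liveness-exec from the sketch above:

```shell
kubectl describe pod liveness-exec
```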
The output indicates that no liveness probes have failed yet:
After 35 seconds, view the Pod events again:
At the bottom of the output, there are messages indicating that the liveness probes have failed, and the failed containers have been killed and recreated.
Wait another 30 seconds, and verify that the container has been restarted:
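Again assuming the Pod name from the sketch:

```shell
kubectl get pod liveness-exec
```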
The output shows that RESTARTS has been incremented. Note that the RESTARTS counter increments as soon as a failed container comes back to the running state.
Define a liveness HTTP request

Another kind of liveness probe uses an HTTP GET request. Here is the configuration file for a Pod that runs a container serving an HTTP health endpoint.
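The manifest is not preserved here; a sketch consistent with the description that follows, with the Pod name, image, and probe values taken from the upstream Kubernetes example (http-liveness.yaml):

```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/e2e-test-images/agnhost:2.40
    args:
    - liveness
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: Custom-Header
          value: Awesome
      initialDelaySeconds: 3
      periodSeconds: 3
```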
In the configuration file, you can see that the Pod has a single container. The periodSeconds field specifies that the kubelet should perform a liveness probe every 3 seconds, and the initialDelaySeconds field tells the kubelet to wait 3 seconds before performing the first probe. To perform a probe, the kubelet sends an HTTP GET request to the server that is running in the container and listening on port 8080. If the handler for the server's /healthz path returns a success code, the kubelet considers the container to be alive and healthy. If the handler returns a failure code, the kubelet kills the container and restarts it.

Any code greater than or equal to 200 and less than 400 indicates success. Any other code indicates failure.

You can see the source code for the server in server.go. For the first 10 seconds that the container is alive, the /healthz handler returns a status of 200. After that, the handler returns a status of 500.
The kubelet starts performing health checks 3 seconds after the container starts. So the first couple of health checks will succeed. But after 10 seconds, the health checks will fail, and the kubelet will kill and restart the container.

To try the HTTP liveness check, create a Pod:
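Using the upstream example manifest:

```shell
kubectl apply -f https://k8s.io/examples/pods/probe/http-liveness.yaml
```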
After 10 seconds, view Pod events to verify that liveness probes have failed and the container has been restarted:
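Assuming the Pod name liveness-http from the sketch above:

```shell
kubectl describe pod liveness-http
```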
In releases prior to v1.13 (including v1.13), if the environment variable http_proxy (or HTTP_PROXY) is set on the node where a Pod is running, the HTTP liveness probe uses that proxy. In releases after v1.13, local HTTP proxy environment variable settings do not affect the HTTP liveness probe.

Define a TCP liveness probe

A third type of liveness probe uses a TCP socket. With this configuration, the kubelet will attempt to open a socket to your container on the specified port. If it can establish a connection, the container is considered healthy; if it can't, it is considered a failure.
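The manifest is not preserved here; a sketch matching the description below, with names and values from the upstream example (tcp-liveness-readiness.yaml):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: goproxy
  labels:
    app: goproxy
spec:
  containers:
  - name: goproxy
    image: registry.k8s.io/goproxy:0.1
    ports:
    - containerPort: 8080
    readinessProbe:       # marks the Pod ready while the TCP connect succeeds
      tcpSocket:
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:        # restarts the container when the TCP connect fails
      tcpSocket:
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20
```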
As you can see, configuration for a TCP check is quite similar to an HTTP check. This example uses both readiness and liveness probes. The kubelet will send the first readiness probe 5 seconds after the container starts. This will attempt to connect to the goproxy container on port 8080. If the probe succeeds, the Pod will be marked as ready, and the kubelet will continue to run this check every 10 seconds. In addition to the readiness probe, this configuration includes a liveness
probe. The kubelet will run the first liveness probe 15 seconds after the container starts. Similar to the readiness probe, this will attempt to connect to the goproxy container on port 8080. If the liveness probe fails, the container will be restarted.

To try the TCP liveness check, create a Pod:
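Using the upstream example manifest:

```shell
kubectl apply -f https://k8s.io/examples/pods/probe/tcp-liveness-readiness.yaml
```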
After 15 seconds, view Pod events to verify the liveness check:
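Assuming the Pod name goproxy from the sketch above:

```shell
kubectl describe pod goproxy
```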
Define a gRPC liveness probe

FEATURE STATE:

If your application implements the gRPC Health Checking Protocol, the kubelet can be configured to use it for application liveness checks. You must enable the GRPCContainerProbe feature gate in order to configure checks that rely on gRPC.

Here is an example manifest:
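The manifest is not preserved here; a sketch matching the upstream example (grpc-liveness.yaml), with the Pod name etcd-with-grpc assumed from it:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: etcd-with-grpc
spec:
  containers:
  - name: etcd
    image: registry.k8s.io/etcd:3.5.1-0
    command:
    - /usr/local/bin/etcd
    - --data-dir
    - /var/lib/etcd
    - --listen-client-urls
    - http://0.0.0.0:2379
    - --advertise-client-urls
    - http://127.0.0.1:2379
    - --log-level
    - debug
    ports:
    - containerPort: 2379
    livenessProbe:
      grpc:
        port: 2379      # etcd's client port also serves the gRPC health service
      initialDelaySeconds: 10
```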
To use a gRPC probe, port must be configured. If the health endpoint is configured on a non-default service, you must also specify the service.

Configuration problems (for example: incorrect port and service, unimplemented health checking protocol) are considered a probe failure, similar to HTTP and TCP probes.

To try the gRPC liveness check, create a Pod using the command below. In the example below, the etcd pod is configured to use a gRPC liveness probe.
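Using the upstream example manifest:

```shell
kubectl apply -f https://k8s.io/examples/pods/probe/grpc-liveness.yaml
```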
After 15 seconds, view Pod events to verify that the liveness check has not failed:
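Assuming the Pod name etcd-with-grpc from the sketch above:

```shell
kubectl describe pod etcd-with-grpc
```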
Before Kubernetes 1.23, gRPC health probes were often implemented using grpc-health-probe, as described in the blog post Health checking gRPC servers on Kubernetes. The behavior of the built-in gRPC probes is similar to that implemented by grpc-health-probe. When migrating from grpc-health-probe to built-in probes, remember the following differences:
Use a named port

You can use a named port for HTTP and TCP probes. gRPC probes do not support named ports.

For example:
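A sketch of the upstream snippet (port name and path as in the earlier HTTP example):

```yaml
ports:
- name: liveness-port
  containerPort: 8080

livenessProbe:
  httpGet:
    path: /healthz
    port: liveness-port   # refers to the named containerPort above
```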
Protect slow starting containers with startup probes

Sometimes, you have to deal with legacy applications that might require additional startup time on their first initialization. In such cases, it can be tricky to set up liveness probe parameters without compromising the fast response to deadlocks that motivated such a probe. The trick is to set up a startup probe with the same command, HTTP or TCP check, with a failureThreshold * periodSeconds long enough to cover the worst-case startup time.

So, the previous example would become:
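A sketch based on the upstream snippet (port name and path carried over from the named-port example above):

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: liveness-port
  failureThreshold: 1
  periodSeconds: 10

startupProbe:             # gives the app failureThreshold * periodSeconds = 300s to start
  httpGet:
    path: /healthz
    port: liveness-port
  failureThreshold: 30
  periodSeconds: 10
```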
Thanks to the startup probe, the application will have a maximum of 5 minutes (30 * 10 = 300s) to finish its startup. Once the startup probe has succeeded once, the liveness probe takes over to provide a fast response to container deadlocks. If the startup probe never succeeds, the container is killed after 300s and subject to the pod's restartPolicy.

Define readiness probes

Sometimes, applications are temporarily unable to serve traffic. For example, an application might need to load large data or configuration files during startup, or depend on external services after startup. In such cases, you don't want to kill the application, but you don't want to send it requests either. Kubernetes provides readiness probes to detect and mitigate these situations. A pod with containers reporting that they are not ready does not receive traffic through Kubernetes Services.

Readiness probes are configured similarly to liveness probes. The only difference is that you use the readinessProbe field instead of the livenessProbe field.
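A sketch of the upstream snippet, reusing the exec check from the liveness-command exercise above:

```yaml
readinessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5
```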
Configuration for HTTP and TCP readiness probes also remains identical to liveness probes. Readiness and liveness probes can be used in parallel for the same container. Using both can ensure that traffic does not reach a container that is not ready for it, and that containers are restarted when they fail.

Configure Probes

Probes have a number of fields that you can use to more precisely control the behavior of liveness and readiness checks: initialDelaySeconds (seconds to wait after the container starts before probes are initiated; default 0), periodSeconds (how often to perform the probe; default 10), timeoutSeconds (seconds after which the probe times out; default 1), successThreshold (minimum consecutive successes for the probe to be considered successful after having failed; default 1), and failureThreshold (number of consecutive failures after which Kubernetes gives up; default 3).
HTTP probes

HTTP probes have additional fields that can be set on httpGet: host, scheme, path, httpHeaders, and port.
For an HTTP probe, the kubelet sends an HTTP request to the specified path and port to perform the check. The kubelet sends the probe to the pod's IP address, unless the address is overridden by the optional host field in httpGet.

For an HTTP probe, the kubelet sends two request headers in addition to the mandatory Host header: User-Agent and Accept. The default values are kube-probe/ followed by the kubelet's version, and */*, respectively. You can override the default headers by defining httpHeaders for the probe. You can also remove these two headers by defining them with an empty value.
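A sketch of overriding a default probe header (header value illustrative):

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
    httpHeaders:
    - name: Accept
      value: application/json   # overrides the default */* Accept header
```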
TCP probes

For a TCP probe, the kubelet makes the probe connection at the node, not in the pod, which means that you cannot use a service name in the host parameter since the kubelet is unable to resolve it.

Probe-level terminationGracePeriodSeconds

FEATURE STATE:

Prior to release 1.21, the pod-level terminationGracePeriodSeconds was used for terminating a container that failed its liveness or startup probe. This coupling was unintended and may have resulted in failed containers taking an unusually long time to restart when a pod-level terminationGracePeriodSeconds was set.

In 1.25 and beyond, users can specify a probe-level terminationGracePeriodSeconds as part of the probe specification. When both a pod- and probe-level terminationGracePeriodSeconds are set, the kubelet will use the probe-level value.

For example,
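A sketch based on the upstream snippet (image and values illustrative):

```yaml
spec:
  terminationGracePeriodSeconds: 3600   # pod-level
  containers:
  - name: test
    image: ...
    ports:
    - name: liveness-port
      containerPort: 8080
    livenessProbe:
      httpGet:
        path: /healthz
        port: liveness-port
      failureThreshold: 1
      periodSeconds: 60
      # probe-level value overrides the pod-level setting for this probe
      terminationGracePeriodSeconds: 60
```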
Probe-level terminationGracePeriodSeconds cannot be set for readiness probes. It will be rejected by the API server.

What's next
You can also read the API references for:
How do I fix a failed readiness probe in Kubernetes?

To increase the readiness probe failure threshold, configure the managed controller item and update the value of "Readiness Failure Threshold". By default, it is set to 100 (100 times). You may increase it to, for example, 300.

What happens if a readiness probe fails?

failureThreshold: when a probe fails, Kubernetes will try failureThreshold times before giving up. Giving up in the case of a liveness probe means restarting the container. In the case of a readiness probe, the Pod is marked Unready.

What is a readiness probe?

A readiness probe indicates whether applications running in a container are ready to receive traffic. If so, Services in Kubernetes can send traffic to the pod; if not, the endpoint controller removes the pod from all services.

Why do liveness probes fail?

The liveness probe is marked as failed when the container returns an unhealthy response. The probe is also considered failed if the service doesn't implement the gRPC health checking protocol.