Common Questions about Monitoring
1. Unable to collect monitoring information for kube-scheduler
The monitoring component cannot collect monitoring information for kube-scheduler. The likely cause is that Prometheus is configured to scrape /metrics over the HTTPS port (10259), but kube-scheduler is unable to serve the /metrics interface over the HTTPS port because it lacks the authentication and authorization configuration needed to authorize the scrape.
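As a quick check before editing anything, you can confirm on a master node whether the two flags are already present in the unit file (a minimal sketch, assuming the scheduler is managed by systemd via the service file referenced below):
# Run on each master node; no output means the flags are missing
$ systemctl cat kube-scheduler | grep -E 'authentication-kubeconfig|authorization-kubeconfig'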
Solution: Log in to the three master nodes in turn, modify the /usr/lib/systemd/system/kube-scheduler.service file, and add two parameters: --authentication-kubeconfig and --authorization-kubeconfig.
[Service]
EnvironmentFile=-/etc/kubernetes/config
ExecStart=/usr/local/bin/kube-scheduler \
$KUBE_LOGTOSTDERR \
$KUBE_LOG_LEVEL \
--config=/etc/kubernetes/kube-scheduler.conf \
--authentication-kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
--authorization-kubeconfig=/etc/kubernetes/kubelet.kubeconfig
Then execute systemctl restart kube-scheduler.
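After the restart, a minimal check on each master node confirms that the service is active and running with the new flags:
# Verify the service restarted cleanly
$ systemctl status kube-scheduler --no-pager
# Verify the new flags appear on the running process
$ ps -ef | grep '[k]ube-scheduler'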
2. node-exporter keeps restarting (OOM)
Use the following command to check why node-exporter restarted: if the status.containerStatuses.lastState.terminated.reason field is "OOMKilled", the container was killed for exceeding its memory limit and its resources need to be adjusted.
## Replace <pod-name> with the actual pod name queried (e.g., uk8s-monitor-prometheus-node-exporter-sbncp):
$ kubectl -n uk8s-monitor get po <pod-name> -o yaml
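Alternatively, a jsonpath query prints only the termination reason (the index 0 assumes node-exporter is the first container in the pod):
$ kubectl -n uk8s-monitor get po <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'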
Adjust the resources of node-exporter (modify cpu and memory in the command to the desired values):
$ kubectl -n uk8s-monitor set resources daemonset uk8s-monitor-prometheus-node-exporter --limits=cpu=500m,memory=1Gi --requests=cpu=250m,memory=512Mi
Verify that the modification succeeded by checking whether the DaemonSet resources have been updated to the new values:
$ kubectl -n uk8s-monitor get daemonset uk8s-monitor-prometheus-node-exporter -o yaml
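To print only the resources block of the node-exporter container instead of the full object (assuming a single container in the DaemonSet template):
$ kubectl -n uk8s-monitor get daemonset uk8s-monitor-prometheus-node-exporter -o jsonpath='{.spec.template.spec.containers[0].resources}'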
3. Monitoring Storage Expansion
Expand PVC
The block storage used by Prometheus deployed via the console supports online expansion. Use the following command to check if block storage is being used:
$ kubectl -n uk8s-monitor get pvc | grep "prometheus-uk8s-prometheus-0" |grep "csi-udisk"
prometheus-uk8s-prometheus-db-prometheus-uk8s-prometheus-0 Bound pvc-1584d2af-4f12-476d-abc1-0a4711feca2e 100Gi RWO ssd-csi-udisk 9m3s
Use the following command to edit the storage size:
$ kubectl -n uk8s-monitor edit pvc prometheus-uk8s-prometheus-db-prometheus-uk8s-prometheus-0
Then modify the spec.resources.requests.storage field in the PVC configuration to a larger value.
spec:
resources:
requests:
storage: 200Gi # Change to your desired size, must be larger than the original
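If you prefer a non-interactive change, the same edit can be applied with kubectl patch (a sketch; replace 200Gi with your target size):
$ kubectl -n uk8s-monitor patch pvc prometheus-uk8s-prometheus-db-prometheus-uk8s-prometheus-0 -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'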
After the expansion, check the PVC status with:
$ kubectl -n uk8s-monitor get pvc prometheus-uk8s-prometheus-db-prometheus-uk8s-prometheus-0 -o yaml
Verify the status.capacity.storage field reflects the new size.
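A shorter check prints only the current capacity:
$ kubectl -n uk8s-monitor get pvc prometheus-uk8s-prometheus-db-prometheus-uk8s-prometheus-0 -o jsonpath='{.status.capacity.storage}'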
Adjust Data Retention Size
To retain more historical data after expansion, modify the retentionSize parameter in the Prometheus CR (uk8s-prometheus):
Edit the uk8s-prometheus CR:
$ kubectl -n uk8s-monitor edit prometheus uk8s-prometheus
Modify the spec.retentionSize field to a higher value:
spec:
retentionSize: 150GB
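The same change can also be applied non-interactively with a merge patch (a sketch; keep the value below the expanded disk size to leave headroom for the TSDB):
$ kubectl -n uk8s-monitor patch prometheus uk8s-prometheus --type merge -p '{"spec":{"retentionSize":"150GB"}}'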
Verification
⚠️ If “no space left on device” errors existed in monitoring logs before expansion, restart all Prometheus pods to ensure data recovery.
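To check whether that error is present, grep the Prometheus container logs before restarting (the container name prometheus is an assumption based on the standard Prometheus Operator pod layout):
$ kubectl -n uk8s-monitor logs prometheus-uk8s-prometheus-0 -c prometheus | grep -i "no space left on device"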
Delete pods one by one and wait for each to fully recover before deleting the next:
$ kubectl -n uk8s-monitor delete po prometheus-uk8s-prometheus-0
# Check the monitoring logs to verify if the monitoring service has started successfully.
$ kubectl -n uk8s-monitor logs -f prometheus-uk8s-prometheus-0
...
msg="Starting Prometheus" version="(version=2.18.2, branch=HEAD, revision=a6600f564e3c483cc820bae6c7a551db701a22b3)"
...
msg="Starting TSDB ..."
....
msg="TSDB started"
...
msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
...
msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
...
msg="Server is ready to receive web requests."
...
Finally, verify monitoring data is displayed correctly in the dashboard.
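If the dashboard is not reachable, you can also inspect Prometheus directly through a port-forward (the service name prometheus-operated is an assumption; it is the default service created by the Prometheus Operator):
$ kubectl -n uk8s-monitor port-forward svc/prometheus-operated 9090:9090
# Then open http://localhost:9090/targets locally to confirm the scrape targets are up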
Prometheus CPU and Memory Expansion
If you encounter resource limitations while using Prometheus, you can adjust the resources as follows:
Edit the Prometheus object uk8s-prometheus using the command:
$ kubectl -n uk8s-monitor edit prometheus uk8s-prometheus
Then modify the spec.resources field in the configuration to larger values. Example:
spec:
resources:
limits:
cpu: 1000m # Adjust CPU according to your needs
memory: 2048Mi # Adjust memory according to your needs
requests:
cpu: 1000m # Adjust CPU request
memory: 2048Mi # Adjust memory request
After modification, the corresponding Prometheus pod will restart. Check if the pod’s resources have been updated to the new values:
# Replace <pod-name> with the actual pod name (e.g., prometheus-uk8s-prometheus-0)
$ kubectl -n uk8s-monitor get po <pod-name> -o yaml
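As with node-exporter, a jsonpath query shows just the resources of the prometheus container (the container name prometheus is assumed here):
$ kubectl -n uk8s-monitor get po <pod-name> -o jsonpath='{.spec.containers[?(@.name=="prometheus")].resources}'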