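【Overview】: TiDB Operator keeps trying to scale out TiKV but never completes, and I have not found a way to stop it.
【Background】: After a server power outage, the operator automatically scaled out TiKV, but the scale-out could not complete because no PV was available. After power was restored, the TiKV nodes came back online, but the newly added node stayed in Pending and could not be repaired. I therefore edited the StatefulSet by hand to reduce the TiKV replica count and manually deleted the Pending pod.
【Problem】: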
Checking the operator log with the command below shows that the operator is still trying to scale out:
kubectl logs tidb-controller-manager-67d596c978-28nhk -n tidb-admin
I0529 23:52:04.042789 1 scaler.go:163] scale statefulset: push-namespace/push-tidb-tikv replicas from 5 to 6
I0529 23:52:05.722662 1 tidbcluster_control.go:66] TidbCluster: [pay-back/pay-bk] updated successfully
I0529 23:52:06.720871 1 tidbcluster_control.go:66] TidbCluster: [push-namespace/push-tidb] updated successfully
I0529 23:52:08.444044 1 tikv_scaler.go:61] scaling out tikv statefulset push-namespace/push-tidb-tikv, ordinal: 5 (replicas: 6, delete slots: [])
I0529 23:52:08.444372 1 scaler.go:163] scale statefulset: push-namespace/push-tidb-tikv replicas from 5 to 6
I0529 23:52:09.519014 1 tidbcluster_control.go:66] TidbCluster: [pay-back/pay-bk] updated successfully
I0529 23:52:10.113590 1 tidbcluster_control.go:66] TidbCluster: [push-namespace/push-tidb] updated successfully
I0529 23:52:11.842568 1 tikv_scaler.go:61] scaling out tikv statefulset push-namespace/push-tidb-tikv, ordinal: 5 (replicas: 6, delete slots: [])
I0529 23:52:11.842706 1 scaler.go:163] scale statefulset: push-namespace/push-tidb-tikv replicas from 5 to 6
I0529 23:52:12.482600 1 tidbcluster_control.go:66] TidbCluster: [pay-back/pay-bk] updated successfully
I0529 23:52:13.528035 1 tidbcluster_control.go:66] TidbCluster: [push-namespace/push-tidb] updated successfully
I0529 23:52:15.242257 1 tikv_scaler.go:61] scaling out tikv statefulset push-namespace/push-tidb-tikv, ordinal: 5 (replicas: 6, delete slots: [])
I0529 23:52:15.242394 1 scaler.go:163] scale statefulset: push-namespace/push-tidb-tikv replicas from 5 to 6
I0529 23:52:15.482759 1 tidbcluster_control.go:66] TidbCluster: [pay-back/pay-bk] updated successfully
I0529 23:52:16.915460 1 tidbcluster_control.go:66] TidbCluster: [push-namespace/push-tidb] updated successfully
I0529 23:52:18.443082 1 tikv_scaler.go:61] scaling out tikv statefulset push-namespace/push-tidb-tikv, ordinal: 5 (replicas: 6, delete slots: [])
I0529 23:52:18.443308 1 scaler.go:163] scale statefulset: push-namespace/push-tidb-tikv replicas from 5 to 6
I0529 23:52:18.684710 1 tidbcluster_control.go:66] TidbCluster: [pay-back/pay-bk] updated successfully
I0529 23:52:20.106858 1 tidbcluster_control.go:66] TidbCluster: [push-namespace/push-tidb] updated successfully
I0529 23:52:21.642483 1 tikv_scaler.go:61] scaling out tikv statefulset push-namespace/push-tidb-tikv, ordinal: 5 (replicas: 6, delete slots: [])
I0529 23:52:21.642580 1 scaler.go:163] scale statefulset: push-namespace/push-tidb-tikv replicas from 5 to 6
I0529 23:52:21.883085 1 tidbcluster_control.go:66] TidbCluster: [pay-back/pay-bk] updated successfully
I0529 23:52:23.312042 1 tidbcluster_control.go:66] TidbCluster: [push-namespace/push-tidb] updated successfully
I0529 23:52:24.885513 1 tidbcluster_control.go:66] TidbCluster: [pay-back/pay-bk] updated successfully
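The retry loop in the log above can be made explicit by counting the scale attempts per StatefulSet. This is a minimal sketch and not part of TiDB Operator: the function name `count_scale_attempts` is hypothetical, and the regular expression is an assumption derived only from the log lines shown here.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... scaler.go:163] scale statefulset: ns/name replicas from 5 to 6
# The pattern is derived from the log excerpt above, not from the operator source.
SCALE_RE = re.compile(
    r"scale statefulset: (?P<sts>\S+) replicas from (?P<old>\d+) to (?P<new>\d+)"
)


def count_scale_attempts(log_text):
    """Count (statefulset, from, to) scale attempts found in the log text."""
    attempts = Counter()
    for line in log_text.splitlines():
        m = SCALE_RE.search(line)
        if m:
            attempts[(m["sts"], int(m["old"]), int(m["new"]))] += 1
    return attempts


# Three lines copied from the log above.
sample = """\
I0529 23:52:04.042789 1 scaler.go:163] scale statefulset: push-namespace/push-tidb-tikv replicas from 5 to 6
I0529 23:52:05.722662 1 tidbcluster_control.go:66] TidbCluster: [pay-back/pay-bk] updated successfully
I0529 23:52:08.444372 1 scaler.go:163] scale statefulset: push-namespace/push-tidb-tikv replicas from 5 to 6
"""

for (sts, old, new), n in count_scale_attempts(sample).items():
    print(f"{sts}: {old} -> {new} attempted {n} time(s)")
```

Run against the full `kubectl logs` output, the same 5 to 6 request repeating every few seconds suggests that the target replica count is recomputed on every sync rather than being a one-off request.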
tikv:
  failureStores:
    "736542":
      createdAt: "2021-05-24T07:52:29Z"
      podName: push-tidb-tikv-4
      storeID: "736542"
  image: harbor.fcbox.com/tidb/pingcap/tikv:v4.0.9
  phase: Scale
  statefulSet:
    collisionCount: 0
    currentReplicas: 5
    currentRevision: push-tidb-tikv-8db8bc99d
    observedGeneration: 22
    readyReplicas: 5
    replicas: 5
    updateRevision: push-tidb-tikv-8db8bc99d
    updatedReplicas: 5
  stores:
    "1":
      id: "1"
      ip: push-tidb-tikv-0.push-tidb-tikv-peer.push-namespace.svc
      lastHeartbeatTime: "2021-06-03T15:48:15Z"
      lastTransitionTime: "2021-01-13T03:23:07Z"
      leaderCount: 3458
      podName: push-tidb-tikv-0
      state: Up
    "1362":
      id: "1362"
      ip: push-tidb-tikv-2.push-tidb-tikv-peer.push-namespace.svc
      lastHeartbeatTime: "2021-06-03T15:48:13Z"
      lastTransitionTime: "2021-01-13T03:19:30Z"
      leaderCount: 3463
      podName: push-tidb-tikv-2
      state: Up
    "1363":
      id: "1363"
      ip: push-tidb-tikv-1.push-tidb-tikv-peer.push-namespace.svc
      lastHeartbeatTime: "2021-06-03T15:48:15Z"
      lastTransitionTime: "2021-01-13T03:21:20Z"
      leaderCount: 3467
      podName: push-tidb-tikv-1
      state: Up
    "724701":
      id: "724701"
      ip: push-tidb-tikv-3.push-tidb-tikv-peer.push-namespace.svc
      lastHeartbeatTime: "2021-06-03T15:48:18Z"
      lastTransitionTime: "2021-04-28T02:51:28Z"
      leaderCount: 3461
      podName: push-tidb-tikv-3
      state: Up
    "736542":
      id: "736542"
      ip: push-tidb-tikv-4.push-tidb-tikv-peer.push-namespace.svc
      lastHeartbeatTime: "2021-06-03T15:48:17Z"
      lastTransitionTime: "2021-05-24T11:52:28Z"
      leaderCount: 3468
      podName: push-tidb-tikv-4
      state: Up
  synced: true
【Business impact】: The cluster configuration cannot be changed.
【TiDB version】: v4.0.9
【TiDB Operator version】: v1.1.4
【K8s version】: v1.18.8
【Attachments】:
Name:               push-tidb-tikv
Namespace:          push-namespace
CreationTimestamp:  Thu, 24 Sep 2020 15:15:31 +0800
Selector:           app.kubernetes.io/component=tikv,app.kubernetes.io/instance=push-tidb,app.kubernetes.io/managed-by=tidb-operator,app.kubernetes.io/name=tidb-cluster
Labels:             app.kubernetes.io/component=tikv
                    app.kubernetes.io/instance=push-tidb
                    app.kubernetes.io/managed-by=tidb-operator
                    app.kubernetes.io/name=tidb-cluster
Annotations:        pingcap.com/last-applied-configuration:
                      {"replicas":6,"selector":{"matchLabels":{"app.kubernetes.io/component":"tikv","app.kubernetes.io/instance":"push-tidb","app.kubernetes.io/...
Replicas:           5 desired | 5 total
Update Strategy:    RollingUpdate
  Partition:        6
Pods Status:        5 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:       app.kubernetes.io/component=tikv
                app.kubernetes.io/instance=push-tidb
                app.kubernetes.io/managed-by=tidb-operator
                app.kubernetes.io/name=tidb-cluster
  Annotations:  prometheus.io/path: /metrics
                prometheus.io/port: 20180
                prometheus.io/scrape: true
  Containers:
   tikv:
    Image:      harbor.fcbox.com/tidb/pingcap/tikv:v4.0.9
    Port:       20160/TCP
    Host Port:  0/TCP
    Command:
      /bin/sh
      /usr/local/bin/tikv_start_script.sh
    Requests:
      cpu:     8
      memory:  45Gi
    Environment:
      NAMESPACE:              (v1:metadata.namespace)
      CLUSTER_NAME:           push-tidb
      HEADLESS_SERVICE_NAME:  push-tidb-tikv-peer
      CAPACITY:               0
      TZ:                     Asia/Shanghai
    Mounts:
      /etc/podinfo from annotations (ro)
      /etc/tikv from config (ro)
      /usr/local/bin from startup-script (ro)
      /var/lib/tikv from tikv (rw)
  Volumes:
   annotations:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.annotations -> annotations
   config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      push-tidb-tikv
    Optional:  false
   startup-script:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      push-tidb-tikv
    Optional:  false
Volume Claims:
  Name:          tikv
  StorageClass:  kv-storage
  Labels:        <none>
  Annotations:   <none>
  Capacity:      50Gi
  Access Modes:  [ReadWriteOnce]
Events:          <none>
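The status in the TC (TidbCluster) also looks odd: why is push-tidb-tikv-4 judged to be down? It is listed under failureStores although its store state is Up.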
tikv:
  failureStores:
    "736542":
      createdAt: "2021-05-24T07:52:29Z"
      podName: push-tidb-tikv-4
      storeID: "736542"
  image: harbor.fcbox.com/tidb/pingcap/tikv:v4.0.9
  phase: Scale
  statefulSet:
    collisionCount: 0
    currentReplicas: 5
    currentRevision: push-tidb-tikv-8db8bc99d
    observedGeneration: 22
    readyReplicas: 5
    replicas: 5
    updateRevision: push-tidb-tikv-8db8bc99d
    updatedReplicas: 5
  stores:
    "1":
      id: "1"
      ip: push-tidb-tikv-0.push-tidb-tikv-peer.push-namespace.svc
      lastHeartbeatTime: "2021-06-04T03:27:41Z"
      lastTransitionTime: "2021-01-13T03:23:07Z"
      leaderCount: 3571
      podName: push-tidb-tikv-0
      state: Up
    "1362":
      id: "1362"
      ip: push-tidb-tikv-2.push-tidb-tikv-peer.push-namespace.svc
      lastHeartbeatTime: "2021-06-04T03:27:38Z"
      lastTransitionTime: "2021-01-13T03:19:30Z"
      leaderCount: 3571
      podName: push-tidb-tikv-2
      state: Up
    "1363":
      id: "1363"
      ip: push-tidb-tikv-1.push-tidb-tikv-peer.push-namespace.svc
      lastHeartbeatTime: "2021-06-04T03:27:40Z"
      lastTransitionTime: "2021-01-13T03:21:20Z"
      leaderCount: 3570
      podName: push-tidb-tikv-1
      state: Up
    "724701":
      id: "724701"
      ip: push-tidb-tikv-3.push-tidb-tikv-peer.push-namespace.svc
      lastHeartbeatTime: "2021-06-04T03:27:43Z"
      lastTransitionTime: "2021-04-28T02:51:28Z"
      leaderCount: 3561
      podName: push-tidb-tikv-3
      state: Up
    "736542":
      id: "736542"
      ip: push-tidb-tikv-4.push-tidb-tikv-peer.push-namespace.svc
      lastHeartbeatTime: "2021-06-04T03:27:43Z"
      lastTransitionTime: "2021-05-24T11:52:28Z"
      leaderCount: 3569
      podName: push-tidb-tikv-4
      state: Up
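To make the inconsistency in the status above concrete, the sketch below lists pods that are still recorded under `failureStores` although `stores` reports them as `Up`. It works on the status as a plain Python dict, for example parsed from `kubectl get tidbcluster push-tidb -n push-namespace -o json`. The helper name `stale_failure_stores` is hypothetical, and the sample dict is abbreviated to two of the five stores.

```python
def stale_failure_stores(tikv_status):
    """Return pod names recorded in failureStores whose store state is Up."""
    stores = tikv_status.get("stores", {})
    stale = []
    for store_id, failure in tikv_status.get("failureStores", {}).items():
        store = stores.get(store_id)
        # A failure record is "stale" here if the same store ID reports Up.
        if store is not None and store.get("state") == "Up":
            stale.append(failure["podName"])
    return stale


# Abbreviated copy of the status shown above (only two of the five stores).
tikv_status = {
    "failureStores": {
        "736542": {"podName": "push-tidb-tikv-4", "storeID": "736542"},
    },
    "stores": {
        "724701": {"podName": "push-tidb-tikv-3", "state": "Up"},
        "736542": {"podName": "push-tidb-tikv-4", "state": "Up"},
    },
}

print(stale_failure_stores(tikv_status))  # prints ['push-tidb-tikv-4']
```

In the status above, the failure record's `createdAt` (2021-05-24T07:52:29Z) is earlier than the store's `lastTransitionTime` (2021-05-24T11:52:28Z). That suggests the record was written during the outage and was not cleared when the store came back, though the post itself does not confirm this.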