PD cannot publish its member info to the cluster after creation, and the subsequent TiKV and TiDB components are never created

To improve efficiency, please provide the following information when asking a question; clearly described problems get a faster response.

  • [TiDB version]: v4.0.8
  • [Problem description]: After the PD Pod starts it fails to publish its member info to the cluster ("failed to publish local member to cluster through raft"), tidb-controller-manager fails to sync the TidbCluster ("failed to sync TidbCluster"), and the subsequent TiKV and TiDB components are never created.

The cluster currently has 6 nodes; the plan is to run 1 PD, 3 TiKV, and 1 TiDB.

[root@master tidb-cluster]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master.kscn Ready master 37m v1.19.4 10.38.163.179 <none> CentOS Linux 7 (Core) 3.10.0-693.21.1.std7a.el7.0.x86_64 docker://19.3.13
follower1.kscn Ready <none> 33m v1.19.4 10.38.163.136 <none> CentOS Linux 7 (Core) 3.10.0-1127.19.1.el7.x86_64 docker://19.3.11
follower2.kscn Ready <none> 33m v1.19.4 10.38.163.250 <none> CentOS Linux 7 (Core) 3.10.0-1127.19.1.el7.x86_64 docker://19.3.11
follower3.kscn Ready <none> 33m v1.19.4 10.38.163.198 <none> CentOS Linux 7 (Core) 3.10.0-1127.19.1.el7.x86_64 docker://19.3.13
follower4.kscn Ready <none> 32m v1.19.4 10.38.163.170 <none> CentOS Linux 7 (Core) 3.10.0-1127.19.1.el7.x86_64 docker://19.3.11
follower5.kscn Ready <none> 32m v1.19.4 10.38.163.68 <none> CentOS Linux 7 (Core) 3.10.0-1127.19.1.el7.x86_64 docker://19.3.11

Local PVs are used as the persistent volumes.

[root@master ~]# kubectl get pv -o wide
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE VOLUMEMODE
local-pv-2b2d3403 98Gi RWO Delete Available local-storage 24m Filesystem
local-pv-897f2516 98Gi RWO Delete Available local-storage 24m Filesystem
local-pv-949fb3ba 98Gi RWO Delete Available local-storage 24m Filesystem
local-pv-94b5324e 98Gi RWO Delete Available local-storage 24m Filesystem
local-pv-9a7ad128 98Gi RWO Retain Bound k8s-staging-local-pv/pd-k8s-staging-local-pv-pd-0 local-storage 24m Filesystem

[root@master ~]# kubectl get pvc --all-namespaces -o wide
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE VOLUMEMODE
k8s-staging-local-pv pd-k8s-staging-local-pv-pd-0 Bound local-pv-9a7ad128 98Gi RWO local-storage 12m Filesystem

PD has started, but the subsequent TiKV and TiDB Pods have not been created.

[root@master ~]# kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
k8s-staging-local-pv k8s-staging-local-pv-discovery-59db44bd54-qddq8 1/1 Running 0 11m 10.244.1.5 follower1.kscn
k8s-staging-local-pv k8s-staging-local-pv-monitor-57dbcd8f4f-2k99x 3/3 Running 0 11m 10.244.4.5 follower4.kscn
k8s-staging-local-pv k8s-staging-local-pv-pd-0 1/1 Running 2 11m 10.244.2.4 follower2.kscn
kube-system coredns-f9fd979d6-56c9c 1/1 Running 0 29m 10.244.0.2 master.kscn
kube-system coredns-f9fd979d6-l7knp 1/1 Running 0 29m 10.244.0.3 master.kscn
kube-system etcd-master.kscn 1/1 Running 0 29m 10.38.163.179 master.kscn
kube-system kube-apiserver-master.kscn 1/1 Running 0 29m 10.38.163.179 master.kscn
kube-system kube-controller-manager-master.kscn 1/1 Running 0 29m 10.38.163.179 master.kscn
kube-system kube-flannel-ds-5zmlt 1/1 Running 0 25m 10.38.163.250 follower2.kscn
kube-system kube-flannel-ds-bdh9k 1/1 Running 0 25m 10.38.163.198 follower3.kscn
kube-system kube-flannel-ds-gf8cm 1/1 Running 0 25m 10.38.163.136 follower1.kscn
kube-system kube-flannel-ds-jbq7j 1/1 Running 0 25m 10.38.163.170 follower4.kscn
kube-system kube-flannel-ds-mhk22 1/1 Running 0 24m 10.38.163.68 follower5.kscn
kube-system kube-flannel-ds-r7tf2 1/1 Running 0 29m 10.38.163.179 master.kscn
kube-system kube-proxy-c496h 1/1 Running 0 25m 10.38.163.136 follower1.kscn
kube-system kube-proxy-cdqgx 1/1 Running 0 25m 10.38.163.250 follower2.kscn
kube-system kube-proxy-fqw8c 1/1 Running 0 25m 10.38.163.198 follower3.kscn
kube-system kube-proxy-h668t 1/1 Running 0 24m 10.38.163.68 follower5.kscn
kube-system kube-proxy-jw2vd 1/1 Running 0 25m 10.38.163.170 follower4.kscn
kube-system kube-proxy-wp55t 1/1 Running 0 29m 10.38.163.179 master.kscn
kube-system kube-scheduler-master.kscn 1/1 Running 0 29m 10.38.163.179 master.kscn
kube-system local-volume-provisioner-dn2px 1/1 Running 0 24m 10.244.3.2 follower3.kscn
kube-system local-volume-provisioner-h8757 1/1 Running 0 24m 10.244.4.2 follower4.kscn
kube-system local-volume-provisioner-kxw5g 1/1 Running 0 24m 10.244.5.2 follower5.kscn
kube-system local-volume-provisioner-pvcvp 1/1 Running 0 24m 10.244.1.2 follower1.kscn
kube-system local-volume-provisioner-wtkp8 1/1 Running 0 24m 10.244.2.2 follower2.kscn
kube-system tiller-deploy-7b56c8dfb7-hhfth 1/1 Running 0 21m 10.244.2.3 follower2.kscn
tidb-admin tidb-controller-manager-85ffcb7557-w6wgb 1/1 Running 0 11m 10.244.1.4 follower1.kscn
tidb-admin tidb-scheduler-7bb75dcb4c-2j9gj 2/2 Running 0 11m 10.244.3.4 follower3.kscn

Check the PD log:

[root@master ~]# tail -n 20 log
[2020/11/17 10:18:19.667 +00:00] [INFO] [raft.go:729] ["4d3c70dd860f340 became pre-candidate at term 138"]
[2020/11/17 10:18:19.667 +00:00] [INFO] [raft.go:824] ["4d3c70dd860f340 received MsgPreVoteResp from 4d3c70dd860f340 at term 138"]
[2020/11/17 10:18:19.667 +00:00] [INFO] [raft.go:811] ["4d3c70dd860f340 [logterm: 5, index: 1140] sent MsgPreVote request to 83619ec08faef07b at term 138"]
[2020/11/17 10:18:19.667 +00:00] [INFO] [raft.go:811] ["4d3c70dd860f340 [logterm: 5, index: 1140] sent MsgPreVote request to a587e9700c61a843 at term 138"]
[2020/11/17 10:18:22.667 +00:00] [INFO] [raft.go:923] ["4d3c70dd860f340 is starting a new election at term 138"]
[2020/11/17 10:18:22.667 +00:00] [INFO] [raft.go:729] ["4d3c70dd860f340 became pre-candidate at term 138"]
[2020/11/17 10:18:22.667 +00:00] [INFO] [raft.go:824] ["4d3c70dd860f340 received MsgPreVoteResp from 4d3c70dd860f340 at term 138"]
[2020/11/17 10:18:22.667 +00:00] [INFO] [raft.go:811] ["4d3c70dd860f340 [logterm: 5, index: 1140] sent MsgPreVote request to 83619ec08faef07b at term 138"]
[2020/11/17 10:18:22.667 +00:00] [INFO] [raft.go:811] ["4d3c70dd860f340 [logterm: 5, index: 1140] sent MsgPreVote request to a587e9700c61a843 at term 138"]
[2020/11/17 10:18:23.683 +00:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_SNAPSHOT] [remote-peer-id=83619ec08faef07b] [rtt=0s] [error="dial tcp: lookup k8s-staging-local-pv-pd-2.k8s-staging-local-pv-pd-peer.k8s-staging-local-pv.svc on 10.96.0.10:53: no such host"]
[2020/11/17 10:18:23.683 +00:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_RAFT_MESSAGE] [remote-peer-id=83619ec08faef07b] [rtt=0s] [error="dial tcp: lookup k8s-staging-local-pv-pd-2.k8s-staging-local-pv-pd-peer.k8s-staging-local-pv.svc on 10.96.0.10:53: no such host"]
[2020/11/17 10:18:23.683 +00:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_SNAPSHOT] [remote-peer-id=a587e9700c61a843] [rtt=0s] [error="dial tcp: lookup k8s-staging-local-pv-pd-1.k8s-staging-local-pv-pd-peer.k8s-staging-local-pv.svc on 10.96.0.10:53: no such host"]
[2020/11/17 10:18:23.683 +00:00] [WARN] [probing_status.go:70] ["prober detected unhealthy status"] [round-tripper-name=ROUND_TRIPPER_RAFT_MESSAGE] [remote-peer-id=a587e9700c61a843] [rtt=0s] [error="dial tcp: lookup k8s-staging-local-pv-pd-1.k8s-staging-local-pv-pd-peer.k8s-staging-local-pv.svc on 10.96.0.10:53: no such host"]
[2020/11/17 10:18:25.667 +00:00] [INFO] [raft.go:923] ["4d3c70dd860f340 is starting a new election at term 138"]
[2020/11/17 10:18:25.667 +00:00] [INFO] [raft.go:729] ["4d3c70dd860f340 became pre-candidate at term 138"]
[2020/11/17 10:18:25.667 +00:00] [INFO] [raft.go:824] ["4d3c70dd860f340 received MsgPreVoteResp from 4d3c70dd860f340 at term 138"]
[2020/11/17 10:18:25.667 +00:00] [INFO] [raft.go:811] ["4d3c70dd860f340 [logterm: 5, index: 1140] sent MsgPreVote request to 83619ec08faef07b at term 138"]
[2020/11/17 10:18:25.667 +00:00] [INFO] [raft.go:811] ["4d3c70dd860f340 [logterm: 5, index: 1140] sent MsgPreVote request to a587e9700c61a843 at term 138"]
[2020/11/17 10:18:25.677 +00:00] [WARN] [server.go:2045] ["failed to publish local member to cluster through raft"] [local-member-id=4d3c70dd860f340] [local-member-attributes="{Name:k8s-staging-local-pv-pd-0 ClientURLs:[http://k8s-staging-local-pv-pd-0.k8s-staging-local-pv-pd-peer.k8s-staging-local-pv.svc:2379]}"] [request-path=/0/members/4d3c70dd860f340/attributes] [publish-timeout=11s] [error="etcdserver: request timed out"]
[2020/11/17 10:18:28.663 +00:00] [FATAL] [main.go:120] ["run server failed"] [error="[PD:server:ErrCancelStartEtcd]etcd start canceled"] [stack="github.com/pingcap/log.Fatal
\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/pkg/mod/github.com/pingcap/log@v0.0.0-20200511115504-543df19646ad/global.go:59
main.main
\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.8/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:120
runtime.main
\t/usr/local/go/src/runtime/proc.go:203"]
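The PD log above keeps requesting votes from two other member IDs (83619ec08faef07b and a587e9700c61a843) and tries to resolve the k8s-staging-local-pv-pd-1 / k8s-staging-local-pv-pd-2 peer hostnames, even though only one PD replica is configured, so the DNS lookups fail; this suggests the PD data directory still holds membership info from an earlier multi-member deployment. A quick way to see what actually exists behind the peer Service (only a suggested sketch; the names are taken from the log above):

# Endpoints registered under the PD peer (headless) Service:
kubectl -n k8s-staging-local-pv get endpoints k8s-staging-local-pv-pd-peer

# Resolve one of the peer hostnames from inside the cluster with a throwaway busybox Pod:
kubectl -n k8s-staging-local-pv run dns-test --rm -it --restart=Never --image=busybox:1.31.1 -- \
  nslookup k8s-staging-local-pv-pd-0.k8s-staging-local-pv-pd-peer.k8s-staging-local-pv.svc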

Check the tidb-controller-manager log:

[root@master ~]# tail -n 20 log-controllermanager
E1117 10:02:57.279363 1 pd_member_manager.go:183] failed to sync TidbCluster: [k8s-staging-local-pv/k8s-staging-local-pv]'s status, error: Get http://k8s-staging-local-pv-pd.k8s-staging-local-pv:2379/pd/health: dial tcp 10.103.83.254:2379: connect: connection refused
E1117 10:02:57.279439 1 tidb_cluster_controller.go:123] TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv, sync failed TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv's pd status sync failed, can't failover, requeuing
E1117 10:02:57.606836 1 pd_member_manager.go:183] failed to sync TidbCluster: [k8s-staging-local-pv/k8s-staging-local-pv]'s status, error: Get http://k8s-staging-local-pv-pd.k8s-staging-local-pv:2379/pd/health: dial tcp 10.103.83.254:2379: connect: connection refused
E1117 10:02:57.606971 1 tidb_cluster_controller.go:123] TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv, sync failed TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv's pd status sync failed, can't failover, requeuing
E1117 10:02:58.146065 1 pd_member_manager.go:183] failed to sync TidbCluster: [k8s-staging-local-pv/k8s-staging-local-pv]'s status, error: Get http://k8s-staging-local-pv-pd.k8s-staging-local-pv:2379/pd/health: dial tcp 10.103.83.254:2379: connect: connection refused
E1117 10:02:58.146187 1 tidb_cluster_controller.go:123] TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv, sync failed TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv's pd status sync failed, can't failover, requeuing
E1117 10:02:58.748620 1 pd_member_manager.go:183] failed to sync TidbCluster: [k8s-staging-local-pv/k8s-staging-local-pv]'s status, error: Get http://k8s-staging-local-pv-pd.k8s-staging-local-pv:2379/pd/health: dial tcp 10.103.83.254:2379: connect: connection refused
E1117 10:02:58.748737 1 tidb_cluster_controller.go:123] TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv, sync failed TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv's pd status sync failed, can't failover, requeuing
E1117 10:03:01.355901 1 pd_member_manager.go:183] failed to sync TidbCluster: [k8s-staging-local-pv/k8s-staging-local-pv]'s status, error: Get http://k8s-staging-local-pv-pd.k8s-staging-local-pv:2379/pd/health: dial tcp 10.103.83.254:2379: connect: connection refused
E1117 10:03:01.356015 1 tidb_cluster_controller.go:123] TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv, sync failed TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv's pd status sync failed, can't failover, requeuing
E1117 10:03:11.526368 1 pd_member_manager.go:183] failed to sync TidbCluster: [k8s-staging-local-pv/k8s-staging-local-pv]'s status, error: Get http://k8s-staging-local-pv-pd.k8s-staging-local-pv:2379/pd/health: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E1117 10:03:11.526502 1 tidb_cluster_controller.go:123] TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv, sync failed TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv's pd status sync failed, can't failover, requeuing
E1117 10:03:26.826740 1 pd_member_manager.go:183] failed to sync TidbCluster: [k8s-staging-local-pv/k8s-staging-local-pv]'s status, error: Get http://k8s-staging-local-pv-pd.k8s-staging-local-pv:2379/pd/health: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E1117 10:03:26.826909 1 tidb_cluster_controller.go:123] TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv, sync failed TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv's pd status sync failed, can't failover, requeuing
E1117 10:03:52.363518 1 pd_member_manager.go:183] failed to sync TidbCluster: [k8s-staging-local-pv/k8s-staging-local-pv]'s status, error: Get http://k8s-staging-local-pv-pd.k8s-staging-local-pv:2379/pd/health: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E1117 10:03:52.363660 1 tidb_cluster_controller.go:123] TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv, sync failed TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv's pd status sync failed, can't failover, requeuing
E1117 10:04:38.371342 1 pd_member_manager.go:183] failed to sync TidbCluster: [k8s-staging-local-pv/k8s-staging-local-pv]'s status, error: Get http://k8s-staging-local-pv-pd.k8s-staging-local-pv:2379/pd/health: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E1117 10:04:38.371486 1 tidb_cluster_controller.go:123] TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv, sync failed TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv's pd status sync failed, can't failover, requeuing
E1117 10:06:05.341224 1 pd_member_manager.go:183] failed to sync TidbCluster: [k8s-staging-local-pv/k8s-staging-local-pv]'s status, error: Get http://k8s-staging-local-pv-pd.k8s-staging-local-pv:2379/pd/health: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E1117 10:06:05.341360 1 tidb_cluster_controller.go:123] TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv, sync failed TidbCluster: k8s-staging-local-pv/k8s-staging-local-pv's pd status sync failed, can't failover, requeuing
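For reference, 10.103.83.254 in the errors above is most likely the ClusterIP of the k8s-staging-local-pv-pd Service (the controller is calling the Service DNS name k8s-staging-local-pv-pd.k8s-staging-local-pv), not a Pod IP; the connection is refused or times out because the PD process behind it is not serving on port 2379 at that moment. A quick check (suggested commands only):

# Confirm which object owns that IP and whether any endpoints sit behind the PD Service:
kubectl -n k8s-staging-local-pv get svc k8s-staging-local-pv-pd -o wide
kubectl -n k8s-staging-local-pv get endpoints k8s-staging-local-pv-pd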

The TiDB cluster was installed with Helm (following https://book.tidb.io/session2/chapter1/tidb-operator-deployment-private-tidb.html); the exact values used are as follows:
[root@master tidb-cluster]# cat values.yaml
rbac:
  create: true
  crossNamespace: false
extraLabels: {}
schedulerName: tidb-scheduler
timezone: UTC
pvReclaimPolicy: Retain
enablePVReclaim: false

services:
  - name: pd
    type: ClusterIP

discovery:
  image: pingcap/tidb-operator:v1.1.7
  imagePullPolicy: IfNotPresent
  resources:
    limits:
      cpu: 250m
      memory: 150Mi
    requests:
      cpu: 80m
      memory: 50Mi
  affinity: {}
  tolerations:

enableConfigMapRollout: true
haTopologyKey: kubernetes.io/hostname

tlsCluster:
  enabled: false

pd:
  config: |
    [log]
    level = "info"
    [replication]
    location-labels = ["region", "zone", "rack", "host"]
  service: {}
  replicas: 1
  image: pingcap/pd:v4.0.8
  storageClassName: local-storage
  imagePullPolicy: IfNotPresent
  resources:
    limits: {}
    #cpu: 8000m
    #memory: 8Gi
    requests:
      cpu: 2000m
      memory: 2Gi
      storage: 1Gi
  affinity: {}
  nodeSelector: {}
  tolerations:
  annotations: {}
  hostNetwork: false
  podSecurityContext: {}
  priorityClassName: ""

tikv:
  config: |
    log-level = "info"
    [readpool.coprocessor]
    high-concurrency = 3
    normal-concurrency = 3
    low-concurrency = 3
    [storage.block-cache]
    shared = true
    capacity = "8GB"
  replicas: 3
  image: pingcap/tikv:v4.0.8
  storageClassName: local-storage
  imagePullPolicy: IfNotPresent
  resources:
    limits:
      cpu: 4000m
      memory: 16Gi
      storage: 90Gi # We can set capacity here.
    requests:
      cpu: 2000m
      memory: 12Gi
      storage: 80Gi
  affinity: {}
  nodeSelector: {}
  tolerations:
  annotations: {}
  hostNetwork: false
  podSecurityContext: {}
  priorityClassName: ""
  maxFailoverCount: 3
  postArgScript: |
    if [ ! -z "${STORE_LABELS:-}" ]; then
      LABELS=" --labels ${STORE_LABELS} "
      ARGS="${ARGS}${LABELS}"
    fi

tidb:
  config: |
    [log]
    level = "info"
    [performance]
    max-procs = 4
  replicas: 1
  image: pingcap/tidb:v4.0.8
  imagePullPolicy: IfNotPresent
  resources:
    limits: {}
    #cpu: 4000m
    #memory: 16Gi
    requests: {}
    #cpu: 3000m
    #memory: 12Gi
  affinity: {}
  nodeSelector: {}
  tolerations:
  annotations: {}
  hostNetwork: false
  podSecurityContext: {}
  priorityClassName: ""
  maxFailoverCount: 3
  service:
    type: NodePort
    exposeStatus: true
  separateSlowLog: true
  slowLogTailer:
    image: busybox:1.26.2
    resources:
      limits:
        cpu: 100m
        memory: 50Mi
      requests:
        cpu: 20m
        memory: 5Mi
  initializer:
    resources: {}
  plugin:
    enable: false
    directory: /plugins
    list: ["allowlist-1"]
  tlsClient:
    enabled: false

mysqlClient:
  image: tnir/mysqlclient
  imagePullPolicy: IfNotPresent

busybox:
  image: busybox:1.31.1
  imagePullPolicy: IfNotPresent

monitor:
  create: true
  persistent: false
  storageClassName: local-storage
  storage: 10Gi
  initializer:
    image: pingcap/tidb-monitor-initializer:v4.0.8
    imagePullPolicy: IfNotPresent
    config:
      K8S_PROMETHEUS_URL: http://prometheus-k8s.monitoring.svc:9090
    resources: {}
  reloader:
    create: true
    image: pingcap/tidb-monitor-reloader:v1.0.1
    imagePullPolicy: IfNotPresent
    service:
      type: NodePort
      portName: tcp-reloader
    resources: {}
  grafana:
    create: true
    image: grafana/grafana:6.1.6
    imagePullPolicy: IfNotPresent
    logLevel: info
    resources:
      limits: {}
      requests: {}
    username: admin
    password: admin
    config:
      GF_AUTH_ANONYMOUS_ENABLED: "true"
      GF_AUTH_ANONYMOUS_ORG_NAME: "Main Org."
      GF_AUTH_ANONYMOUS_ORG_ROLE: "Viewer"
    service:
      type: NodePort
      portName: http-grafana
  prometheus:
    image: prom/prometheus:v2.18.1
    imagePullPolicy: IfNotPresent
    logLevel: info
    resources:
      limits: {}
      requests: {}
    service:
      type: NodePort
      portName: http-prometheus
    reserveDays: 12
  nodeSelector: {}
  tolerations:

binlog:
  pump:
    create: false
    replicas: 1
    image: pingcap/tidb-binlog:v4.0.8
    imagePullPolicy: IfNotPresent
    logLevel: info
    storageClassName: local-storage
    storage: 20Gi
    affinity: {}
    tolerations:
    syncLog: true
    gc: 7
    heartbeatInterval: 2
    resources:
      limits: {}
      requests: {}
  drainer:
    create: false
    image: pingcap/tidb-binlog:v4.0.8
    imagePullPolicy: IfNotPresent
    logLevel: info
    storageClassName: local-storage
    storage: 10Gi
    affinity: {}
    tolerations:
    workerCount: 16
    detectInterval: 10
    disableDetect: false
    disableDispatch: false
    ignoreSchemas: "INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql,test"
    initialCommitTs: 0
    safeMode: false
    txnBatch: 20
    destDBType: file
    mysql: {}
    kafka: {}
    resources:
      limits: {}
      requests: {}

scheduledBackup:
  create: false
  mydumperImage: pingcap/tidb-cloud-backup:20200229
  mydumperImagePullPolicy: IfNotPresent
  storageClassName: local-storage
  storage: 100Gi
  cleanupAfterUpload: false
  schedule: "0 0 * * *"
  suspend: false
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  startingDeadlineSeconds: 3600
  backoffLimit: 6
  restartPolicy: OnFailure
  options: "-t 16 -r 10000 --skip-tz-utc --verbose=3"
  tikvGCLifeTime: 720h
  secretName: backup-secret
  gcp: {}
  ceph: {}
  s3: {}
  resources:
    limits: {}
    requests: {}
  affinity: {}
  tolerations:

importer:
  create: false
  image: pingcap/tidb-lightning:v4.0.8
  imagePullPolicy: IfNotPresent
  storageClassName: local-storage
  storage: 200Gi
  resources: {}
  affinity: {}
  tolerations:
  pushgatewayImage: prom/pushgateway:v0.3.1
  pushgatewayImagePullPolicy: IfNotPresent
  config: |
    log-level = "info"
    [metric]
    job = "tikv-importer"
    interval = "15s"
    address = "localhost:9091"

metaInstance: "{{ $labels.instance }}"
metaType: "{{ $labels.type }}"
metaValue: "{{ $value }}"

  • [Possible cause]: The flannel network might be misconfigured. However, the tidb-controller-manager log shows that 10.103.83.254:2379 refused the connection, and that IP does not belong to any Pod.
  • [Question]: How can this problem be resolved, and which additional logs can be checked to get more information?
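A few more places that may give additional detail (suggested commands; the object names are taken from the outputs above):

# The operator's view of the cluster and recent events in the namespace:
kubectl -n k8s-staging-local-pv describe tidbcluster k8s-staging-local-pv
kubectl -n k8s-staging-local-pv get events --sort-by=.lastTimestamp

# The discovery service that PD contacts when joining the cluster:
kubectl -n k8s-staging-local-pv logs deploy/k8s-staging-local-pv-discovery

# The previous PD container log (the Pod has already restarted twice):
kubectl -n k8s-staging-local-pv logs k8s-staging-local-pv-pd-0 --previous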

  1. Starting with TiDB Operator v1.1.0 we no longer use the tidb-cluster chart to deploy TiDB clusters; instead, deploy directly with a TidbCluster CR. See the documentation at https://docs.pingcap.com/zh/tidb-in-kubernetes/stable/notes-tidb-operator-v1.1 and the example at https://github.com/pingcap/tidb-operator/blob/master/examples/advanced/tidb-cluster.yaml (a minimal sketch is included after this list).
  2. Since a cluster was deployed here before, the problem is most likely related to leftover data. Please delete the TiDB cluster, restore the PVs to Available, make absolutely sure the data on the original PVs has been wiped, and then redeploy the TiDB cluster (a rough cleanup sketch is also given below).
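For item 1, a minimal TidbCluster CR in the spirit of the linked example might look like the sketch below; metadata.name, the namespace, and the storage sizes are placeholders to adapt:

apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: basic              # placeholder
  namespace: tidb-cluster  # placeholder
spec:
  version: v4.0.8
  timezone: UTC
  pvReclaimPolicy: Retain
  pd:
    baseImage: pingcap/pd
    replicas: 1
    storageClassName: local-storage
    requests:
      storage: 1Gi
    config: {}
  tikv:
    baseImage: pingcap/tikv
    replicas: 3
    storageClassName: local-storage
    requests:
      storage: 80Gi
    config: {}
  tidb:
    baseImage: pingcap/tidb
    replicas: 1
    service:
      type: NodePort
    config: {}

For item 2, the cleanup could look roughly like this (a sketch only; the Helm release name is not shown in this thread, and the data path must be read from the PV spec before wiping anything):

# 1. Remove the TiDB cluster (Helm 2 / tiller is in use here):
helm delete <release-name> --purge        # <release-name> is a placeholder

# 2. Delete the PD PVC so the bound PV is released:
kubectl -n k8s-staging-local-pv delete pvc pd-k8s-staging-local-pv-pd-0

# 3. Find the directory backing the local PV and wipe the old PD data on that node:
kubectl get pv local-pv-9a7ad128 -o jsonpath='{.spec.local.path}{"\n"}'
#    then, on the node hosting it:  rm -rf <that-path>/*

# 4. Make the PV Available again, either by deleting the PV object (local-volume-provisioner
#    will recreate it from the wiped directory) or by clearing its claimRef:
kubectl delete pv local-pv-9a7ad128
# or:
kubectl patch pv local-pv-9a7ad128 --type=json -p='[{"op":"remove","path":"/spec/claimRef"}]'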
