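While the cluster was in use, a server power outage corrupted the LevelDB manifest files of 2 of the 3 PD instances. After the manifest files were repaired, one PD still fails to start. Some relevant basic information follows.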
kubectl get po -njinfan
NAME                                                 READY   STATUS             RESTARTS   AGE
tidb-cluster-1605234515-discovery-86468cbbf8-nshvd   1/1     Running            1          4d1h
tidb-cluster-1605234515-monitor-9b8fc57b5-jvxgv      3/3     Running            3          4d1h
tidb-cluster-1605234515-pd-0                         0/1     CrashLoopBackOff   820        2d22h
tidb-cluster-1605234515-pd-1                         1/1     Running            3          4d
tidb-cluster-1605234515-pd-2                         1/1     Running            0          2d23h
tidb-cluster-1605234515-tidb-0                       2/2     Running            2          4d1h
tidb-cluster-1605234515-tidb-1                       2/2     Running            0          4d1h
tidb-cluster-1605234515-tikv-0                       1/1     Running            1          4d1h
tidb-cluster-1605234515-tikv-1                       1/1     Running            1          4d1h
tidb-cluster-1605234515-tikv-2                       1/1     Running            0          4d1h
kubectl logs -njinfan tidb-cluster-1605234515-pd-0
Name: tidb-cluster-1605234515-pd-0.tidb-cluster-1605234515-pd-peer.jinfan.svc
Address 1: 10.244.1.226 tidb-cluster-1605234515-pd-0.tidb-cluster-1605234515-pd-peer.jinfan.svc.cluster.local
nslookup domain tidb-cluster-1605234515-pd-0.tidb-cluster-1605234515-pd-peer.jinfan.svc.svc success
test---http://tidb-cluster-1605234515-discovery.jinfan.svc:10261/new/dGlkYi1jbHVzdGVyLTE2MDUyMzQ1MTUtcGQtMC50aWRiLWNsdXN0ZXItMTYwNTIzNDUxNS1wZC1wZWVyLmppbmZhbi5zdmM6MjM4MAo=
starting pd-server ...
/pd-server --data-dir=/var/lib/pd --name=tidb-cluster-1605234515-pd-0 --peer-urls=http://0.0.0.0:2380 --advertise-peer-urls=http://tidb-cluster-1605234515-pd-0.tidb-cluster-1605234515-pd-peer.jinfan.svc:2380 --client-urls=http://0.0.0.0:2379 --advertise-client-urls=http://tidb-cluster-1605234515-pd-0.tidb-cluster-1605234515-pd-peer.jinfan.svc:2379 --config=/etc/pd/pd.toml --join=http://tidb-cluster-1605234515-pd-3.tidb-cluster-1605234515-pd-peer.jinfan.svc:2379,http://tidb-cluster-1605234515-pd-2.tidb-cluster-1605234515-pd-peer.jinfan.svc:2379
[2021/01/11 06:15:05.144 +00:00] [INFO] [util.go:42] ["Welcome to Placement Driver (PD)"]
[2021/01/11 06:15:05.144 +00:00] [INFO] [util.go:43] [PD] [release-version=v4.0.7]
[2021/01/11 06:15:05.144 +00:00] [INFO] [util.go:44] [PD] [edition=Community]
[2021/01/11 06:15:05.144 +00:00] [INFO] [util.go:45] [PD] [git-hash=8b0348f545611d5955e32fdcf3c57a3f73657d77]
[2021/01/11 06:15:05.144 +00:00] [INFO] [util.go:46] [PD] [git-branch=heads/refs/tags/v4.0.7]
[2021/01/11 06:15:05.144 +00:00] [INFO] [util.go:47] [PD] [utc-build-time="2020-09-29 06:52:41"]
[2021/01/11 06:15:05.145 +00:00] [INFO] [metricutil.go:81] ["disable Prometheus push client"]
[2021/01/11 06:15:05.145 +00:00] [ERROR] [join.go:213] ["failed to open directory"] [error="[PD:os:ErrOSOpen]open /var/lib/pd/member: no such file or directory"]
2021/01/11 06:15:05.145 grpclog.go:45: [info] parsed scheme: "endpoint"
2021/01/11 06:15:05.145 grpclog.go:45: [info] ccResolverWrapper: sending new addresses to cc: [{http://tidb-cluster-1605234515-pd-3.tidb-cluster-1605234515-pd-peer.jinfan.svc:2379 0 <nil>} {http://tidb-cluster-1605234515-pd-2.tidb-cluster-1605234515-pd-peer.jinfan.svc:2379 0 <nil>}]
2021/01/11 06:15:05.167 grpclog.go:60: [warning] grpc: addrConn.createTransport failed to connect to {http://tidb-cluster-1605234515-pd-3.tidb-cluster-1605234515-pd-peer.jinfan.svc:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 120.240.95.33:2379: connect: connection refused". Reconnecting...
{"level":"warn","ts":"2021-01-11T06:15:05.193Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-edc76111-4902-4530-b3b3-5376c0ceff52/tidb-cluster-1605234515-pd-3.tidb-cluster-1605234515-pd-peer.jinfan.svc:2379","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: unhealthy cluster"}
[2021/01/11 06:15:05.193 +00:00] [FATAL] [main.go:94] ["join meet error"] [error="etcdserver: unhealthy cluster"] [stack="github.com/pingcap/log.Fatal\
\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.7/go/pkg/mod/github.com/pingcap/log@v0.0.0-20200511115504-543df19646ad/global.go:59\
main.main\
\t/home/jenkins/agent/workspace/build_pd_multi_branch_v4.0.7/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:94\
runtime.main\
\t/usr/local/go/src/runtime/proc.go:203"]
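To see which address pd-0 announces to the discovery service, the last path segment of the URL in the "test---" log line can be Base64-decoded. The following is only a minimal sketch: the helper name `decode_discovery_token` is made up for illustration, and it assumes the segment is standard Base64 of a plain-text address, which is what the string in this log appears to be.

```python
import base64
from urllib.parse import urlsplit

# URL copied verbatim from the "test---" line of the pd-0 startup log.
DISCOVERY_URL = (
    "http://tidb-cluster-1605234515-discovery.jinfan.svc:10261/new/"
    "dGlkYi1jbHVzdGVyLTE2MDUyMzQ1MTUtcGQtMC50aWRiLWNsdXN0ZXItMTYwNTIzNDUxNS1wZC1wZWVyLmppbmZhbi5zdmM6MjM4MAo="
)


def decode_discovery_token(url: str) -> str:
    """Return the Base64-decoded last path segment of a discovery URL.

    Hypothetical helper for illustration; it assumes the segment is
    standard Base64 of a UTF-8 string (possibly with a trailing newline).
    """
    token = urlsplit(url).path.rsplit("/", 1)[-1]
    return base64.b64decode(token).decode("utf-8").strip()


if __name__ == "__main__":
    # Prints the peer address that pd-0 sent to the discovery service.
    print(decode_discovery_token(DISCOVERY_URL))
```

For the URL above this yields the pd-0 peer address on port 2380, so the request sent by pd-0 itself looks correct; the pd-3 address only appears in the --join list of the pd-server command line.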
Member list as reported by pd-ctl:
{
  "header": {
    "cluster_id": 6894429818718535692
  },
  "members": [
    {
      "name": "tidb-cluster-1605234515-pd-1",
      "member_id": 312349629294863285,
      "peer_urls": [
        "http://tidb-cluster-1605234515-pd-3.tidb-cluster-1605234515-pd-peer.jinfan.svc:2380"
      ],
      "client_urls": [
        "http://tidb-cluster-1605234515-pd-1.tidb-cluster-1605234515-pd-peer.jinfan.svc:2379"
      ],
      "deploy_path": "/",
      "binary_version": "v4.0.7",
      "git_hash": "8b0348f545611d5955e32fdcf3c57a3f73657d77"
    },
    {
      "name": "tidb-cluster-1605234515-pd-2",
      "member_id": 14664505328199855870,
      "peer_urls": [
        "http://tidb-cluster-1605234515-pd-2.tidb-cluster-1605234515-pd-peer.jinfan.svc:2380"
      ],
      "client_urls": [
        "http://tidb-cluster-1605234515-pd-2.tidb-cluster-1605234515-pd-peer.jinfan.svc:2379"
      ],
      "deploy_path": "/",
      "binary_version": "v4.0.7",
      "git_hash": "8b0348f545611d5955e32fdcf3c57a3f73657d77"
    }
  ],
  "leader": {
    "name": "tidb-cluster-1605234515-pd-1",
    "member_id": 312349629294863285,
    "peer_urls": [
      "http://tidb-cluster-1605234515-pd-3.tidb-cluster-1605234515-pd-peer.jinfan.svc:2380"
    ],
    "client_urls": [
      "http://tidb-cluster-1605234515-pd-1.tidb-cluster-1605234515-pd-peer.jinfan.svc:2379"
    ]
  },
  "etcd_leader": {
    "name": "tidb-cluster-1605234515-pd-1",
    "member_id": 312349629294863285,
    "peer_urls": [
      "http://tidb-cluster-1605234515-pd-3.tidb-cluster-1605234515-pd-peer.jinfan.svc:2380"
    ],
    "client_urls": [
      "http://tidb-cluster-1605234515-pd-1.tidb-cluster-1605234515-pd-peer.jinfan.svc:2379"
    ],
    "deploy_path": "/",
    "binary_version": "v4.0.7",
    "git_hash": "8b0348f545611d5955e32fdcf3c57a3f73657d77"
  }
}
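One detail in the member list above is easy to miss by eye: the entry named pd-1 advertises a peer URL whose host is pd-3. A small script can flag this kind of mismatch. It is only a sketch under one assumption, namely that the first DNS label of each peer URL should equal the member name, as it does for pd-2. The function name `mismatched_peer_urls` and the trimmed `SAMPLE` document are made up for illustration; the names and URLs inside `SAMPLE` are copied from the output above.

```python
import json
from urllib.parse import urlsplit


def mismatched_peer_urls(members_doc: dict) -> list:
    """Return (member name, peer URL) pairs whose host does not match the name.

    Assumption: the first DNS label of a peer URL host equals the member
    (pod) name, as it does for the pd-2 entry in the pd-ctl output.
    """
    mismatches = []
    for member in members_doc.get("members", []):
        name = member["name"]
        for url in member.get("peer_urls", []):
            host = urlsplit(url).hostname or ""
            if host.split(".", 1)[0] != name:
                mismatches.append((name, url))
    return mismatches


# Trimmed copy of the pd-ctl member output (only the fields used here).
SAMPLE = json.loads("""
{
  "members": [
    {
      "name": "tidb-cluster-1605234515-pd-1",
      "peer_urls": [
        "http://tidb-cluster-1605234515-pd-3.tidb-cluster-1605234515-pd-peer.jinfan.svc:2380"
      ]
    },
    {
      "name": "tidb-cluster-1605234515-pd-2",
      "peer_urls": [
        "http://tidb-cluster-1605234515-pd-2.tidb-cluster-1605234515-pd-peer.jinfan.svc:2380"
      ]
    }
  ]
}
""")

if __name__ == "__main__":
    for name, url in mismatched_peer_urls(SAMPLE):
        print(f"{name} advertises peer URL {url}")
```

On the sample it reports only the pd-1 entry. Whether this mismatch is related to the "unhealthy cluster" error during join is not confirmed here; it is just something that may be worth checking.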