Deployment on Kubernetes fails in an IPv4 environment: the PD container keeps restarting

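It has been a while, and today I took another look at this problem.
Versions: TiDB Operator 1.5.2, TiDB 6.5.8, Kubernetes 1.23.17.
PD restarts over and over, cycling through Running -> ContainerCreating -> Pending -> Running.
After some time it does start normally, but with luck that takes about 10 minutes, and without luck it is still in this state after several hours.
Debug mode also has no effect: with debug mode enabled, the pod still restarts automatically.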

For comparison, these are the versions I used previously:
TiDB Operator v1.4.3, TiDB 6.5.0, Kubernetes v1.20.7.

Log from the failed startup:

[2024/07/11 12:38:48.620 +00:00] [WARN] [v3_server.go:814] [“waiting for ReadIndex response took too long, retrying”] [sent-request-id=1124651556563639575] [retry-timeout=500ms]
[2024/07/11 12:38:48.620 +00:00] [INFO] [raft.go:1358] [“43c224d21a700f9b no leader at term 1584; dropping index reading msg”]
[2024/07/11 12:38:48.893 +00:00] [INFO] [raft.go:929] [“43c224d21a700f9b is starting a new election at term 1584”]
[2024/07/11 12:38:48.893 +00:00] [INFO] [raft.go:735] [“43c224d21a700f9b became pre-candidate at term 1584”]
[2024/07/11 12:38:48.893 +00:00] [INFO] [raft.go:830] [“43c224d21a700f9b received MsgPreVoteResp from 43c224d21a700f9b at term 1584”]
[2024/07/11 12:38:48.893 +00:00] [INFO] [raft.go:817] [“43c224d21a700f9b [logterm: 1584, index: 113074] sent MsgPreVote request to 51c5dd9dd1f119ae at term 1584”]
[2024/07/11 12:38:48.893 +00:00] [INFO] [raft.go:817] [“43c224d21a700f9b [logterm: 1584, index: 113074] sent MsgPreVote request to 738369e98f51ba86 at term 1584”]
[2024/07/11 12:38:49.120 +00:00] [WARN] [v3_server.go:814] [“waiting for ReadIndex response took too long, retrying”] [sent-request-id=1124651556563639575] [retry-timeout=500ms]
[2024/07/11 12:38:49.664 +00:00] [WARN] [v3_server.go:814] [“waiting for ReadIndex response took too long, retrying”] [sent-request-id=1124651556563639575] [retry-timeout=500ms]
[2024/07/11 12:38:49.665 +00:00] [WARN] [probing_status.go:70] [“prober detected unhealthy status”] [round-tripper-name=ROUND_TRIPPER_RAFT_MESSAGE] [remote-peer-id=738369e98f51ba86] [rtt=0s] [error=“dial tcp: lookup basic-pd-2.basic-pd-peer.ns.svc on 10.96.0.10:53: dial udp 10.96.0.10:53: connect: network is unreachable”]
[2024/07/11 12:38:49.665 +00:00] [WARN] [probing_status.go:70] [“prober detected unhealthy status”] [round-tripper-name=ROUND_TRIPPER_SNAPSHOT] [remote-peer-id=738369e98f51ba86] [rtt=0s] [error=“dial tcp 100.108.11.194:2380: connect: connection refused”]
[2024/07/11 12:38:50.165 +00:00] [WARN] [v3_server.go:814] [“waiting for ReadIndex response took too long, retrying”] [sent-request-id=1124651556563639575] [retry-timeout=500ms]
[2024/07/11 12:38:50.665 +00:00] [WARN] [v3_server.go:814] [“waiting for ReadIndex response took too long, retrying”] [sent-request-id=1124651556563639575] [retry-timeout=500ms]
[2024/07/11 12:38:51.188 +00:00] [WARN] [v3_server.go:814] [“waiting for ReadIndex response took too long, retrying”] [sent-request-id=1124651556563639575] [retry-timeout=500ms]
[2024/07/11 12:38:51.303 +00:00] [WARN] [util.go:163] [“apply request took too long”] [took=10.00007245s] [expected-duration=100ms] [prefix=“read-only range “] [request=“key:"/pd/7389506872184522927" range_end:"/pd/7389506872184522928" “] [response=] [error=“context deadline exceeded”]
[2024/07/11 12:38:51.303 +00:00] [INFO] [trace.go:152] [“trace[107107765] range”] [detail=”{range_begin:/pd/7389506872184522927; range_end:/pd/7389506872184522928; }”] [duration=10.000303807s] [start=2024/07/11 12:38:41.303 +00:00] [end=2024/07/11 12:38:51.303 +00:00] [steps=”["trace[107107765] ‘agreement among raft nodes before linearized reading’ (duration: 10.000092564s)"]”]
[2024/07/11 12:38:51.303 +00:00] [WARN] [retry_interceptor.go:62] [“retrying of unary invoker failed”] [target=endpoint://client-9be0eacd-99c7-4f44-9693-baf0775a244d/basic-pd-0.basic-pd-peer.ns.svc:2379] [attempt=0] [error=“rpc error: code = DeadlineExceeded desc = context deadline exceeded”]
[2024/07/11 12:38:51.303 +00:00] [WARN] [etcdutil.go:121] [“kv gets too slow”] [request-key=/pd/7389506872184522927] [cost=10.000653535s] [error=“context deadline exceeded”]
[2024/07/11 12:38:51.303 +00:00] [ERROR] [etcdutil.go:126] [“load from etcd meet error”] [key=/pd/7389506872184522927] [error=“[PD:etcd:ErrEtcdKVGet]context deadline exceeded: context deadline exceeded”]
[2024/07/11 12:38:51.303 +00:00] [ERROR] [server.go:1524] [“failed to initialize the global TSO allocator”] [error=“[PD:etcd:ErrEtcdKVGet]context deadline exceeded: context deadline exceeded”]
[2024/07/11 12:38:51.689 +00:00] [WARN] [v3_server.go:814] [“waiting for ReadIndex response took too long, retrying”] [sent-request-id=1124651556563639575] [retry-timeout=500ms]
[2024/07/11 12:38:52.190 +00:00] [WARN] [v3_server.go:814] [“waiting for ReadIndex response took too long, retrying”] [sent-request-id=1124651556563639575] [retry-timeout=500ms]
[2024/07/11 12:38:52.338 +00:00] [WARN] [retry_interceptor.go:62] [“retrying of unary invoker failed”] [target=endpoint://client-9be0eacd-99c7-4f44-9693-baf0775a244d/basic-pd-0.basic-pd-peer.ns.svc:2379] [attempt=0] [error=“rpc error: code = DeadlineExceeded desc = context deadline exceeded”]
[2024/07/11 12:38:52.338 +00:00] [WARN] [v3_server.go:830] [“timed out waiting for read index response (local node might have slow network)”] [timeout=11s]
[2024/07/11 12:38:52.338 +00:00] [INFO] [server.go:1420] [“server is closed, return pd leader loop”]
[2024/07/11 12:38:52.338 +00:00] [INFO] [etcd.go:369] [“closing etcd server”] [name=basic-pd-0] [data-dir=/var/lib/pd] [advertise-peer-urls=“[http://basic-pd-0.basic-pd-peer.ns.svc:2380]”] [advertise-client-urls=“[http://basic-pd-0.basic-pd-peer.ns.svc:2379]”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.74.135.16:35694: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.66.209.200:59794: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [INFO] [server.go:1494] [“skipped leadership transfer; local server is not leader”] [local-member-id=43c224d21a700f9b] [current-leader-member-id=0]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2379->127.0.0.1:46178: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.74.135.32:42798: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.108.11.230:53432: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [INFO] [peer.go:333] [“stopping remote peer”] [remote-peer-id=51c5dd9dd1f119ae]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.66.209.204:58928: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.108.11.230:53380: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.66.209.200:59310: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.108.11.230:53412: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.108.11.247:39956: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.74.135.16:35692: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.66.209.204:58930: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.74.135.33:58116: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.74.135.33:57852: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [stream.go:291] [“closed TCP streaming connection with remote peer”] [stream-writer-type=“stream MsgApp v2”] [remote-peer-id=51c5dd9dd1f119ae]
[2024/07/11 12:38:52.339 +00:00] [WARN] [stream.go:301] [“stopped TCP streaming connection with remote peer”] [stream-writer-type=“stream MsgApp v2”] [remote-peer-id=51c5dd9dd1f119ae]
[2024/07/11 12:38:52.339 +00:00] [WARN] [grpclog.go:60] [“transport: http2Server.HandleStreams failed to read frame: read tcp 100.74.135.16:2379->100.74.135.32:42756: use of closed network connection”]
[2024/07/11 12:38:52.339 +00:00] [WARN] [stream.go:291] [“closed TCP streaming connection with remote peer”] [stream-writer-type=“stream Message”] [remote-peer-id=51c5dd9dd1f119ae]
[2024/07/11 12:38:52.339 +00:00] [WARN] [stream.go:301] [“stopped TCP streaming connection with remote peer”] [stream-writer-type=“stream Message”] [remote-peer-id=51c5dd9dd1f119ae]
[2024/07/11 12:38:52.339 +00:00] [INFO] [pipeline.go:86] [“stopped HTTP pipelining with remote peer”] [local-member-id=43c224d21a700f9b] [remote-peer-id=51c5dd9dd1f119ae]
[2024/07/11 12:38:52.339 +00:00] [INFO] [stream.go:459] [“stopped stream reader with remote peer”] [stream-reader-type=“stream MsgApp v2”] [local-member-id=43c224d21a700f9b] [remote-peer-id=51c5dd9dd1f119ae]
[2024/07/11 12:38:52.340 +00:00] [INFO] [stream.go:459] [“stopped stream reader with remote peer”] [stream-reader-type=“stream Message”] [local-member-id=43c224d21a700f9b] [remote-peer-id=51c5dd9dd1f119ae]
[2024/07/11 12:38:52.340 +00:00] [INFO] [peer.go:340] [“stopped remote peer”] [remote-peer-id=51c5dd9dd1f119ae]
[2024/07/11 12:38:52.340 +00:00] [INFO] [peer.go:333] [“stopping remote peer”] [remote-peer-id=738369e98f51ba86]
[2024/07/11 12:38:52.340 +00:00] [WARN] [stream.go:291] [“closed TCP streaming connection with remote peer”] [stream-writer-type=“stream MsgApp v2”] [remote-peer-id=738369e98f51ba86]
[2024/07/11 12:38:52.340 +00:00] [WARN] [stream.go:301] [“stopped TCP streaming connection with remote peer”] [stream-writer-type=“stream MsgApp v2”] [remote-peer-id=738369e98f51ba86]
[2024/07/11 12:38:52.340 +00:00] [WARN] [stream.go:291] [“closed TCP streaming connection with remote peer”] [stream-writer-type=“stream Message”] [remote-peer-id=738369e98f51ba86]
[2024/07/11 12:38:52.340 +00:00] [WARN] [stream.go:301] [“stopped TCP streaming connection with remote peer”] [stream-writer-type=“stream Message”] [remote-peer-id=738369e98f51ba86]
[2024/07/11 12:38:52.340 +00:00] [INFO] [pipeline.go:86] [“stopped HTTP pipelining with remote peer”] [local-member-id=43c224d21a700f9b] [remote-peer-id=738369e98f51ba86]
[2024/07/11 12:38:52.340 +00:00] [INFO] [stream.go:459] [“stopped stream reader with remote peer”] [stream-reader-type=“stream MsgApp v2”] [local-member-id=43c224d21a700f9b] [remote-peer-id=738369e98f51ba86]
[2024/07/11 12:38:52.340 +00:00] [INFO] [stream.go:459] [“stopped stream reader with remote peer”] [stream-reader-type=“stream Message”] [local-member-id=43c224d21a700f9b] [remote-peer-id=738369e98f51ba86]
[2024/07/11 12:38:52.340 +00:00] [INFO] [peer.go:340] [“stopped remote peer”] [remote-peer-id=738369e98f51ba86]
[2024/07/11 12:38:52.353 +00:00] [INFO] [etcd.go:564] [“stopping serving peer traffic”] [address=“[::]:2380”]
[2024/07/11 12:38:52.353 +00:00] [INFO] [etcd.go:571] [“stopped serving peer traffic”] [address=“[::]:2380”]
[2024/07/11 12:38:52.353 +00:00] [INFO] [etcd.go:373] [“closed etcd server”] [name=basic-pd-0] [data-dir=/var/lib/pd] [advertise-peer-urls=“[http://basic-pd-0.basic-pd-peer.ns.svc:2380]”] [advertise-client-urls=“[http://basic-pd-0.basic-pd-peer.ns.svc:2379]”]
[2024/07/11 12:38:52.353 +00:00] [INFO] [manager.go:73] [“exit dashboard loop”]
[2024/07/11 12:38:52.353 +00:00] [INFO] [server.go:543] [“close server”]
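
In the log above, the only lines that name a concrete network failure are the two from probing_status.go: a DNS lookup through 10.96.0.10:53 that ends in "network is unreachable", and a "connection refused" on 100.108.11.194:2380. To make such lines easier to find in a longer log, here is a minimal Python sketch that counts log levels and collects the distinct dial errors. The function name `summarize` and the regular expressions are my own assumptions based on the line layout shown above; they are not part of PD or TiDB Operator. The pattern accepts both straight and curly quotes, because the forum editor converted the quotes in the pasted log.

```python
import re
from collections import Counter

# Header layout as seen in the pasted log: [timestamp] [LEVEL] [file.go:line]
HEADER = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<src>[^\]:]+):(?P<line>\d+)\]"
)
# An error field whose text starts with "dial"; straight or curly quotes allowed.
DIAL_ERROR = re.compile(r"\[error=[\"“”](?P<err>dial [^\"“”]+)[\"“”]\]")


def summarize(lines):
    """Return (level counts, dial-error counts) for an iterable of log lines."""
    levels = Counter()
    dial_errors = Counter()
    for line in lines:
        header = HEADER.match(line)
        if header is None:
            continue  # skip anything that does not look like a log line
        levels[header.group("level")] += 1
        dial = DIAL_ERROR.search(line)
        if dial is not None:
            dial_errors[dial.group("err")] += 1
    return levels, dial_errors


if __name__ == "__main__":
    import sys

    # Usage: kubectl logs <pd-pod> -n <namespace> | python summarize_pd_log.py
    level_counts, dial_counts = summarize(sys.stdin)
    print(dict(level_counts))
    for err, count in dial_counts.most_common():
        print(count, err)
```

This only summarizes what the log already contains. It does not tell you why DNS or the peer port was unreachable at that moment.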