"failed to send heartbeat" after deploying a cluster with tiup

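I'm on 4.0-rc, deployed with tiup. The cluster has only just been deployed, and I haven't even created a database yet, but this warning is already being reported. What is causing it?

I didn't have this problem when I deployed 3.0.12.

Hello,

Please upload the TiKV log; we need to see the surrounding log context.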

[2020/04/21 13:20:30.396 +08:00] [INFO] [printer.go:41] ["Welcome to TiDB."] ["Release Version"=v4.0.0-rc] ["Git Commit Hash"=79db9e30ab8f98ac07c8ae55c66dfecc24b43d56] ["Git Branch"=heads/refs/tags/v4.0.0-rc] ["UTC Build Time"="2020-04-08 07:32:25"] [GoVersion=go1.13] ["Race Enabled"=false] ["Check Table Before Drop"=false] ["TiKV Min Version"=v3.0.0-60965b006877ca7234adaced7890d7b029ed1306]
[2020/04/21 13:20:30.398 +08:00] [INFO] [printer.go:54] ["loaded config"] [config="{\"host\":\"0.0.0.0\",\"advertise-address\":\"10.3.87.202\",\"port\":4000,\"cors\":\"\",\"store\":\"tikv\",\"path\":\"10.3.87.221:2379,10.3.87.202:2379,10.3.87.34:2389\",\"socket\":\"\",\"lease\":\"45s\",\"run-ddl\":true,\"split-table\":true,\"token-limit\":1000,\"oom-use-tmp-storage\":true,\"tmp-storage-path\":\"/tmp/tidb/tmp-storage\",\"oom-action\":\"cancel\",\"mem-quota-query\":1073741824,\"enable-streaming\":false,\"enable-batch-dml\":false,\"lower-case-table-names\":2,\"server-version\":\"\",\"log\":{\"level\":\"warn\",\"format\":\"text\",\"disable-timestamp\":null,\"enable-timestamp\":null,\"disable-error-stack\":null,\"enable-error-stack\":null,\"file\":{\"filename\":\"/tidb/app/deploy/tidb-4000/log/tidb.log\",\"max-size\":300,\"max-days\":0,\"max-backups\":0},\"enable-slow-log\":true,\"slow-query-file\":\"log/tidb_slow_query.log\",\"slow-threshold\":300,\"expensive-threshold\":10000,\"query-log-max-len\":4096,\"record-plan-in-slow-log\":1},\"security\":{\"skip-grant-table\":false,\"ssl-ca\":\"\",\"ssl-cert\":\"\",\"ssl-key\":\"\",\"require-secure-transport\":false,\"cluster-ssl-ca\":\"\",\"cluster-ssl-cert\":\"\",\"cluster-ssl-key\":\"\",\"cluster-verify-cn\":null},\"status\":{\"status-host\":\"0.0.0.0\",\"metrics-addr\":\"\",\"status-port\":10080,\"metrics-interval\":15,\"report-status\":true,\"record-db-qps\":false},\"performance\":{\"max-procs\":0,\"max-memory\":0,\"stats-lease\":\"3s\",\"stmt-count-limit\":5000,\"feedback-probability\":0.05,\"query-feedback-limit\":1024,\"pseudo-estimate-ratio\":0.8,\"force-priority\":\"NO_PRIORITY\",\"bind-info-lease\":\"3s\",\"txn-total-size-limit\":104857600,\"tcp-keep-alive\":true,\"cross-join\":true,\"run-auto-analyze\":true},\"prepared-plan-cache\":{\"enabled\":false,\"capacity\":100,\"memory-guard-ratio\":0.1},\"opentracing\":{\"enable\":false,\"rpc-metrics\":false,\"sampler\":{\"type\":\"const\",\"param\":1,\"sampling-server-url\":\"\",\"max-operations\":0,\"sampling-refresh-interval\":0},\"reporter\":{\"queue-size\":0,\"buffer-flush-interval\":0,\"log-spans\":false,\"local-agent-host-port\":\"\"}},\"proxy-protocol\":{\"networks\":\"\",\"header-timeout\":5},\"tikv-client\":{\"grpc-connection-count\":4,\"grpc-keepalive-time\":10,\"grpc-keepalive-timeout\":3,\"commit-timeout\":\"41s\",\"max-batch-size\":128,\"overload-threshold\":200,\"max-batch-wait-time\":0,\"batch-wait-size\":8,\"enable-chunk-rpc\":true,\"region-cache-ttl\":600,\"store-limit\":0,\"copr-cache\":{\"enabled\":false,\"capacity-mb\":0,\"admission-max-result-mb\":0,\"admission-min-process-ms\":0}},\"binlog\":{\"enable\":false,\"ignore-error\":false,\"write-timeout\":\"15s\",\"binlog-socket\":\"\",\"strategy\":\"range\"},\"compatible-kill-query\":false,\"plugin\":{\"dir\":\"\",\"load\":\"\"},\"pessimistic-txn\":{\"enable\":true,\"max-retry-count\":256},\"check-mb4-value-in-utf8\":true,\"max-index-length\":3072,\"alter-primary-key\":false,\"treat-old-version-utf8-as-utf8mb4\":true,\"enable-table-lock\":false,\"delay-clean-table-lock\":0,\"split-region-max-num\":1000,\"stmt-summary\":{\"enable\":true,\"enable-internal-query\":false,\"max-stmt-count\":200,\"max-sql-length\":4096,\"refresh-interval\":1800,\"history-size\":24},\"repair-mode\":false,\"repair-table-list\":[],\"isolation-read\":{\"engines\":[\"tikv\",\"tiflash\",\"tidb\"]},\"max-server-connections\":0,\"new_collations_enabled_on_first_bootstrap\":false,\"experimental\":{\"allow-auto-random\":false,\"allow-expression-index\":false},\"enable-dynamic-config\":false}"]
[2020/04/21 13:20:30.726 +08:00] [WARN] [session.go:1044] ["run statement failed"] [schemaVersion=0] [error="[schema:1049]Unknown database 'mysql'"] [session="{\n \"currDBName\": \"\",\n \"id\": 0,\n \"status\": 2,\n \"strictMode\": true,\n \"user\": null\n}"]
[2020/04/21 13:20:30.727 +08:00] [WARN] [session.go:1138] ["compile SQL failed"] [error="[schema:1146]Table 'mysql.tidb' doesn't exist"] [SQL="SELECT HIGH_PRIORITY VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME=\"bootstrapped\""]
[2020/04/21 13:20:30.928 +08:00] [WARN] [session.go:1138] ["compile SQL failed"] [error="[schema:1146]Table 'mysql.tidb' doesn't exist"] [SQL="SELECT HIGH_PRIORITY VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME=\"bootstrapped\""]
[2020/04/21 13:20:31.128 +08:00] [WARN] [session.go:1138] ["compile SQL failed"] [error="[schema:1146]Table 'mysql.tidb' doesn't exist"] [SQL="SELECT HIGH_PRIORITY VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME=\"bootstrapped\""]
[2020/04/21 13:20:31.329 +08:00] [WARN] [session.go:1138] ["compile SQL failed"] [error="[schema:1146]Table 'mysql.tidb' doesn't exist"] [SQL="SELECT HIGH_PRIORITY VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME=\"bootstrapped\""]
[2020/04/21 13:20:31.530 +08:00] [WARN] [session.go:1138] ["compile SQL failed"] [error="[schema:1146]Table 'mysql.tidb' doesn't exist"] [SQL="SELECT HIGH_PRIORITY VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME=\"bootstrapped\""]
[2020/04/21 13:20:31.731 +08:00] [WARN] [session.go:1138] ["compile SQL failed"] [error="[schema:1146]Table 'mysql.tidb' doesn't exist"] [SQL="SELECT HIGH_PRIORITY VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME=\"bootstrapped\""]
[2020/04/21 13:20:34.029 +08:00] [WARN] [domain.go:506] ["reload schema in loop, schema syncer need rewatch"]
[2020/04/21 13:20:34.322 +08:00] [WARN] [session.go:464] [sql] [label=internal] [error="[kv:9007]Write conflict, txnStartTS=416139558066323488, conflictStartTS=416139558066323477, conflictCommitTS=416139558079430657, key={tableID=17, indexID=1, indexValues={tikv_gc_leader_uuid, }} primary={tableID=17, indexID=1, indexValues={tikv_gc_leader_uuid, }} [try again later]"] [txn="Txn{state=invalid}"]
[2020/04/21 13:20:34.322 +08:00] [WARN] [session.go:661] [retrying] [schemaVersion=22] [retryCnt=0] [queryNum=0] [sql="INSERT HIGH_PRIORITY INTO mysql.tidb VALUES ('tikv_gc_leader_uuid', '5c66cb927880019', 'Current GC worker leader UUID. (DO NOT EDIT)') ON DUPLICATE KEY UPDATE variable_value = '5c66cb927880019', comment = 'Current GC worker leader UUID. (DO NOT EDIT)'"]
[2020/04/21 13:20:34.324 +08:00] [WARN] [session.go:685] ["transaction association"] ["retrying txnStartTS"=416139558079430664] ["original txnStartTS"=416139558066323488]
[2020/04/21 13:20:34.369 +08:00] [WARN] [session.go:464] [sql] [label=internal] [error="[kv:9007]Write conflict, txnStartTS=416139558092537857, conflictStartTS=416139558079430708, conflictCommitTS=416139558092537858, key={tableID=17, indexID=1, indexValues={tikv_gc_safe_point, }} primary={tableID=17, indexID=1, indexValues={tikv_gc_safe_point, }} [try again later]"] [txn="Txn{state=invalid}"]
[2020/04/21 13:20:34.369 +08:00] [WARN] [session.go:661] [retrying] [schemaVersion=22] [retryCnt=0] [queryNum=0] [sql="INSERT HIGH_PRIORITY INTO mysql.tidb VALUES ('tikv_gc_safe_point', '20200421-13:10:34 +0800', 'All versions after safe point can be accessed. (DO NOT EDIT)') ON DUPLICATE KEY UPDATE variable_value = '20200421-13:10:34 +0800', comment = 'All versions after safe point can be accessed. (DO NOT EDIT)'"]
[2020/04/21 13:20:34.371 +08:00] [WARN] [session.go:685] ["transaction association"] ["retrying txnStartTS"=416139558092537860] ["original txnStartTS"=416139558092537857]

tidb.log (7.5 KB)

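Hello,

I couldn't find the error from the title anywhere in this log.

Please check whether the mysql system database exists.

I saw that message. mysql.tidb does exist.

SELECT HIGH_PRIORITY VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME="bootstrapped"

returns true.

Besides, all I did was run the tiup deploy command. I haven't done anything else, so this error shouldn't be there.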

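Hello,

Could you check whether the directories conflict with an existing cluster?

Please also try redeploying to see whether that resolves it.

I'll redeploy the cluster now and see.

Thanks, much appreciated~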

tiup cluster destroy tidb-test

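I also deleted the data, deploy, and log directories on every machine. After redeploying, the alert is still there and the log still contains the same errors. I only have this one cluster.

I never got this alert on 3.0.12. As for the error in the log, I didn't pay attention to whether it appeared back then.

Could you post a screenshot of the warning? Does the log still say that mysql.tidb doesn't exist?

This error still shows up in tidb.log: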

["compile SQL failed"] [error="[schema:1146]Table 'mysql.tidb' doesn't exist"] [SQL="SELECT HIGH_PRIORITY VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME=\"bootstrapped\""]

I also found these errors in tikv.log:

[2020/04/21 15:21:38.249 +08:00] [ERROR] [client.rs:353] ["failed to send heartbeat"] [err="Grpc(RpcFinished(Some(RpcStatus { status: 0-OK, details: None })))"]
[2020/04/21 15:21:38.249 +08:00] [ERROR] [util.rs:315] ["request failed, retry"] [err="Grpc(RpcFinished(Some(RpcStatus { status: 0-OK, details: None })))"]
[2020/04/21 15:21:38.249 +08:00] [ERROR] [util.rs:315] ["request failed, retry"] [err="Other(SendError(\"...\"))"]
[2020/04/21 15:21:38.249 +08:00] [ERROR] [util.rs:315] ["request failed, retry"] [err="Other(SendError(\"...\"))"]
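In case it helps with sorting through excerpts like these, here is a minimal Python sketch that tallies log entries by level, source location, and message. It assumes the bracketed layout seen in this thread ([timestamp] [LEVEL] [file:line] [message] ...); the names LOG_HEADER and tally are made up for illustration and are not part of any TiDB tooling.

```python
import re
from collections import Counter

# Assumed layout, based on the excerpts in this thread:
# [timestamp] [LEVEL] [file:line] ["message" or bare-message] ...
LOG_HEADER = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<src>[^\]]+)\] '
    r'\[(?P<msg>"[^"]*"|[^\]]*)\]'
)

def tally(lines):
    """Count log entries by (level, source location, message)."""
    counts = Counter()
    for line in lines:
        m = LOG_HEADER.match(line)
        if m:  # lines that do not fit the assumed layout are skipped
            counts[(m["level"], m["src"], m["msg"].strip('"'))] += 1
    return counts

# The four tikv.log lines quoted above, copied verbatim.
sample = [
    r'[2020/04/21 15:21:38.249 +08:00] [ERROR] [client.rs:353] ["failed to send heartbeat"] [err="Grpc(RpcFinished(Some(RpcStatus { status: 0-OK, details: None })))"]',
    r'[2020/04/21 15:21:38.249 +08:00] [ERROR] [util.rs:315] ["request failed, retry"] [err="Grpc(RpcFinished(Some(RpcStatus { status: 0-OK, details: None })))"]',
    r'[2020/04/21 15:21:38.249 +08:00] [ERROR] [util.rs:315] ["request failed, retry"] [err="Other(SendError(\"...\"))"]',
    r'[2020/04/21 15:21:38.249 +08:00] [ERROR] [util.rs:315] ["request failed, retry"] [err="Other(SendError(\"...\"))"]',
]

for key, n in tally(sample).most_common():
    print(n, *key)
```

Run against a whole tikv.log (for example `tally(open("tikv.log"))`), this gives a quick view of how often the heartbeat error occurs compared with everything else.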

And pd.log has these errors:

[2020/04/21 15:25:04.940 +08:00] [ERROR] [grpclog.go:75] ["transport: Got too many pings from the client, closing the connection."]
[2020/04/21 15:25:04.941 +08:00] [ERROR] [grpclog.go:75] ["transport: loopyWriter.run returning. Err: transport: Connection closing"]

Though I think I saw these on 3.0.12 as well.

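Hello,

Use tiup cluster display to check whether every component in the cluster is in the Up state.

Also check whether there is a network problem between PD and TiKV, that is, whether they can reach each other.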
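For the network check, a rough first pass is to try opening a TCP connection from each TiKV host to each PD endpoint. The sketch below is only that: the function names are illustrative, the endpoints are taken from the "path" field of the config dump in the tidb.log posted earlier, and a successful TCP connect does not prove that gRPC traffic between PD and TiKV is healthy.

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

def check_all(endpoints, timeout=3.0):
    """Map 'host:port' to a reachability flag for every endpoint given."""
    return {f"{host}:{port}": can_connect(host, port, timeout)
            for host, port in endpoints}

# PD endpoints as they appear in the "path" field of the posted tidb.log.
PD_ENDPOINTS = [("10.3.87.221", 2379), ("10.3.87.202", 2379), ("10.3.87.34", 2389)]
```

Calling `check_all(PD_ENDPOINTS)` on each TiKV machine shows at a glance whether any PD endpoint is unreachable from that host.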

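I've checked several times and everything is Up.

The network is fine too.

Hello,

If all of these checks pass, it isn't clear whether this is actually affecting your testing right now; I'd suggest keeping an eye on it for the time being.

All right then.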

:ok_hand:

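When the cluster starts for the first time, the following warning

[2020/04/23 11:53:06.038 +08:00] [WARN] [session.go:1138] ["compile SQL failed"] [error="[schema:1146]Table 'mysql.tidb' doesn't exist"] [SQL="SELECT HIGH_PRIORITY VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME=\"bootstrapped\""]

is expected: the system tables are created during the first bootstrap.

The query succeeded when you ran it later because, once the first startup had completed, those system tables already existed.

The fact that SELECT HIGH_PRIORITY VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME="bootstrapped" returns true

confirms this as well.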
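One way to check this against your own log is to see whether these warnings are confined to a short window right after startup. A minimal sketch, assuming the timestamp layout shown in this thread; parse_ts and bootstrap_window are illustrative names, and the sample reuses the six timestamps of this warning from the tidb.log posted earlier.

```python
from datetime import datetime

BOOTSTRAP_MARK = "Table 'mysql.tidb' doesn't exist"

def parse_ts(line):
    """Parse the leading [YYYY/MM/DD HH:MM:SS.mmm +08:00] timestamp of a log line."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S.%f %z")

def bootstrap_window(lines):
    """Return (count, first_ts, last_ts) for the bootstrap warnings, or (0, None, None)."""
    stamps = [parse_ts(line) for line in lines if BOOTSTRAP_MARK in line]
    if not stamps:
        return 0, None, None
    return len(stamps), min(stamps), max(stamps)

# Timestamps of the "compile SQL failed" warnings in the posted tidb.log.
# The trailing [SQL=...] field is left out of the sample lines for brevity.
times = ["13:20:30.727", "13:20:30.928", "13:20:31.128",
         "13:20:31.329", "13:20:31.530", "13:20:31.731"]
tail = ' [WARN] [session.go:1138] ["compile SQL failed"] [error="[schema:1146]Table \'mysql.tidb\' doesn\'t exist"]'
sample = [f"[2020/04/21 {t} +08:00]{tail}" for t in times]

count, first, last = bootstrap_window(sample)
print(count, (last - first).total_seconds())
```

For those six entries the window is about one second (13:20:30.727 to 13:20:31.731), and the warning does not appear again in the rest of that excerpt.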

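Also, where does the "failed to send heartbeat" from the title appear: in a log, or somewhere else?

It's in the TiKV log, and there is more besides. Please take a look. As soon as I deploy the cluster, I get the last alert (screenshot).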