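To help us resolve the issue faster, please provide the following information. A clear problem description gets a quicker answer:
【TiDB Environment】
【Overview】Scenario + problem summary
2 TiDB, 2 PD, 1 TiKV, 1 TiFlash (added via scale-out)
【Background】Operations performed
【Symptoms】Application and database symptoms
Following the error reported in the log, I tested port 10080 with telnet, and the connection succeeds.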
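A successful telnet only shows that the port accepts TCP connections, whereas the `ReadTimeout` in tiflash_cluster_manager.log is raised later, while waiting for the HTTP response. The following is a minimal Python sketch that checks the two stages separately. The helper name `probe`, the `/status` path, and the 5-second default (copied from the `read timeout=5` in the log) are assumptions for illustration, not part of any TiDB tooling.

```python
import http.client
import socket


def probe(host, port, path="/status", timeout=5.0):
    """Check TCP connect and HTTP response separately (illustrative helper)."""
    result = {"tcp_ok": False, "http_status": None, "error": None}

    # Stage 1: what telnet tests, i.e. the port accepts a TCP connection.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["tcp_ok"] = True
    except OSError as exc:
        result["error"] = f"connect failed: {exc!r}"
        return result

    # Stage 2: what an HTTP client needs, i.e. a reply within the timeout.
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        result["http_status"] = conn.getresponse().status
    except (OSError, http.client.HTTPException) as exc:
        result["error"] = f"no HTTP response: {exc!r}"
    finally:
        conn.close()
    return result
```

Calling `probe("200.100.1.13", 10080)` on the TiFlash host would show whether the connection succeeds while the HTTP reply still times out. If `tcp_ok` is true but `http_status` stays `None`, the port is reachable and the delay is on the server side of the request.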
【Business Impact】
【TiDB Version】
v4.0.11
【Attachments】
- Relevant logs and monitoring
- TiUP Cluster Display information
$ tiup cluster display test-cluster
Found cluster newer version:
The latest version: v1.5.6
Local installed version: v1.4.1
Update current component: tiup update cluster
Update all components: tiup update --all
Starting component cluster: /home/tidb/.tiup/components/cluster/v1.4.1/tiup-cluster display test-cluster
Cluster type: tidb
Cluster name: test-cluster
Cluster version: v4.0.11
SSH type: builtin
Dashboard URL: http://200.100.1.13:2379/dashboard
ID                  Role          Host          Ports                                OS/Arch       Status  Data Dir                                       Deploy Dir
200.100.1.13:9093   alertmanager  200.100.1.13  9093/9094                            linux/x86_64  Up      /tidbdata/deploy/data.alertmanager             /tidbdata/deploy
200.100.1.13:3000   grafana       200.100.1.13  3000                                 linux/x86_64  Up      -                                              /tidbdata/deploy
200.100.1.13:2379   pd            200.100.1.13  2379/2380                            linux/x86_64  Up|UI   /tidbdata/deploy/data.pd                       /tidbdata/deploy
200.100.1.17:2379   pd            200.100.1.17  2379/2380                            linux/x86_64  Up|L    /tidbdata/deploy/pd-2379/data                  /tidbdata/deploy/pd-2379
200.100.1.13:9090   prometheus    200.100.1.13  9090                                 linux/x86_64  Up      /tidbdata/deploy/prometheus2.0.0.data.metrics  /tidbdata/deploy
200.100.1.17:9090   prometheus    200.100.1.17  9090                                 linux/x86_64  Up      /tidbdata/deploy/prometheus-9090/data          /tidbdata/deploy/prometheus-9090
200.100.1.13:4000   tidb          200.100.1.13  4000/10080                           linux/x86_64  Up      -                                              /tidbdata/deploy
200.100.1.17:4000   tidb          200.100.1.17  4000/10080                           linux/x86_64  Up      -                                              /tidbdata/deploy/tidb-4000
200.100.1.17:19000  tiflash       200.100.1.17  19000/18123/13930/30170/10292/18234  linux/x86_64  Up      /tidbdata/deploy2/data                         /tidbdata/deploy2/tiflash-9000
200.100.1.13:20160  tikv          200.100.1.13  20160/20180                          linux/x86_64  Up      /tidbdata/deploy/data                          /tidbdata/deploy
- TiUP Cluster Edit Config information
- TiDB-Overview monitoring
- Logs of the relevant components (covering 1 hour before and after the issue)
vi tiflash_error.log
2021.09.23 10:25:22.024024 [ 32 ] pingcap.pd: write tso failed
2021.09.23 10:25:22.024152 [ 32 ] pd/oracle: update ts error: Exception: write tso failed
2021.09.23 10:25:22.024325 [ 31 ] pingcap.pd: get member failed: 14: failed to connect to all addresses
2021.09.23 10:25:22.024376 [ 31 ] pingcap.pd: failed to get cluster id by :http://200.100.1.13:2379
2021.09.23 10:25:22.025497 [ 31 ] pingcap.pd: failed to get cluster id by :http://200.100.1.17:2379
2021.09.23 10:25:22.025569 [ 31 ] pingcap.pd: Exception: failed to update leader
2021.09.23 10:25:23.500883 [ 4 ] pingcap.pd: get safe point failed: 2: rpc error: code = Unavailable desc = not leader
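The "failed to update leader" and "not leader" messages above suggest that TiFlash could not locate a PD leader through either endpoint at that moment. One way to cross-check is to ask each PD endpoint who it believes the leader is. The sketch below assumes PD's HTTP API exposes `/pd/api/v1/members` and that the JSON reply carries a `leader` object with `name` and `client_urls` fields. The helper names are hypothetical.

```python
import json
import urllib.request


def fetch_members(pd_url, timeout=5.0):
    """GET the member list from one PD endpoint and decode the JSON body."""
    url = f"{pd_url}/pd/api/v1/members"  # assumed PD API path
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def leader_summary(members):
    """Return (leader_name, leader_client_urls), or (None, []) if no leader."""
    leader = members.get("leader") or {}
    return leader.get("name"), leader.get("client_urls", [])


def check_pd(pd_urls, timeout=5.0):
    """Ask every PD endpoint for its view of the leader; record errors as text."""
    report = {}
    for url in pd_urls:
        try:
            report[url] = leader_summary(fetch_members(url, timeout))
        except (OSError, ValueError) as exc:
            # URLError and socket timeouts are OSError; bad JSON is ValueError.
            report[url] = f"error: {exc!r}"
    return report
```

Running `check_pd(["http://200.100.1.13:2379", "http://200.100.1.17:2379"])` from the TiFlash host would show whether both endpoints are reachable from there and whether they agree on the leader.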
vi tiflash_cluster_manager.log
2021-09-23 10:51:17,180 root: can not get tiflash replica info from tidb: [('200.100.1.13:10080', ReadTimeout(ReadTimeoutError("HTTPConnectionPool(host='200.100.1.13', port=10080): Read timed out. (read timeout=5)",),))]
Traceback (most recent call last):
  File "flash_cluster_manager.py", line 286, in main
  File "flash_cluster_manager.py", line 129, in init
  File "flash_cluster_manager.py", line 29, in wrap_func
  File "flash_cluster_manager.py", line 238, in table_update
  File "tidb_tools.py", line 42, in db_flash_replica
Exception: can not get tiflash replica info from tidb: [('200.100.1.13:10080', ReadTimeout(ReadTimeoutError("HTTPConnectionPool(host='200.100.1.13', port=10080): Read timed out. (read timeout=5)",),))]
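The excerpt above names only 200.100.1.13:10080, although the cluster has a second TiDB instance on 200.100.1.17. To see which status addresses time out across the whole log, a small scan like the following may help. It is a sketch only: the helper name `count_read_timeouts` is hypothetical, and the pattern is derived from the single message format shown above.

```python
import re
from collections import Counter

# Matches e.g. HTTPConnectionPool(host='200.100.1.13', port=10080): Read timed out
# The quote class also accepts curly quotes, which forum rendering may introduce.
_QUOTES = "'\"‘’“”"
_TIMEOUT_RE = re.compile(
    rf"HTTPConnectionPool\(host=[{_QUOTES}]([^{_QUOTES}]+)[{_QUOTES}], "
    r"port=(\d+)\): Read timed out"
)


def count_read_timeouts(lines):
    """Count 'Read timed out' occurrences per host:port in an iterable of lines."""
    counts = Counter()
    for line in lines:
        for host, port in _TIMEOUT_RE.findall(line):
            counts[f"{host}:{port}"] += 1
    return counts
```

Passing an open file works directly, for example `count_read_timeouts(open("tiflash_cluster_manager.log"))`. If only one TiDB address ever appears in the result, that points at that instance rather than at TiFlash itself.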