两台8核32G的机子带的动每秒150的并发不?

武哥哥1 · 2023 年2 月 24 日 05:24

【 TiDB 使用环境】生产环境
【 TiDB 版本】
6.6.0
【复现路径】

【遇到的问题：问题现象及影响】
现公司两台8核32g的机子,一台装的tidb1个,tipd1个,另一台装的tikv3个,装tikv的机子cpu直接满了
想的是在第一台装tidb的机子上,扩容一个tikv,将另一台装tikv的机子上,缩容一个tikv,这样能解决问题吗?

还有就是tidb总是down掉,有部分慢查询,Coprocessor 执行耗时较长,但是查看执行时间不是索引的问题,应该是服务器资源的问题

	id                            	task     	estRows	operator info                                                                                                                                                        	actRows	execution info                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     	memory  	disk
	Projection_8                  	root     	0.00   	tokvideos.videos.id, tokvideos.videos.mid                                                                                                                            	10000  	time:43.5s, loops:11, Concurrency:OFF                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              	163.4 KB	N/A
	└─Limit_14                    	root     	0.00   	offset:0, count:10000                                                                                                                                                	10000  	time:43.5s, loops:11                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               	N/A     	N/A
	  └─IndexLookUp_35            	root     	0.00   	                                                                                                                                                                     	10000  	time:43.5s, loops:10, index_task: {total_time: 43.5s, fetch_handle: 16.4s, build: 46.3ms, wait: 27s}, table_task: {total_time: 3m14.3s, num: 14, concurrency: 5}, next: {wait_index: 23.5ms, wait_table_lookup_build: 218.3µs, wait_table_lookup_resp: 43.5s}                                                                                                                                                                                                                                                                                                                                                                     	15.7 MB 	N/A
	    ├─IndexRangeScan_32(Build)	cop[tikv]	0.00   	table:v, index:mode_2(mode, status, ischeck, is_vip, ioscheck, createtime), range:(1 1 1 0 0 2023-02-19 12:34:15,1 1 1 0 0 +inf], keep order:true, desc, stats:pseudo	261604 	time:16.4s, loops:239, cop_task: {num: 67, max: 5.54s, min: 260.4µs, avg: 315ms, p95: 1.84s, max_proc_keys: 33760, p95_proc_keys: 17376, tot_proc: 19.3s, tot_wait: 1.53s, rpc_num: 67, rpc_time: 21.1s, copr_cache_hit_ratio: 0.49, build_task_duration: 29.7µs, max_distsql_concurrency: 2}, tikv_task:{proc max:5.39s, min:0s, avg: 385.6ms, p80:50ms, p95:1.84s, iters:506, tasks:67}, scan_detail: {total_process_keys: 195077, total_process_keys_size: 19507700, total_keys: 201402, get_snapshot_time: 589.4µs, rocksdb: {delete_skipped_count: 14345, key_skipped_count: 230087, block: {cache_hit_count: 647}}}       	N/A     	N/A
	    └─Selection_34(Probe)     	cop[tikv]	0.00   	isnull(tokvideos.videos.deletetime)                                                                                                                                  	140918 	time:3m14.3s, loops:162, cop_task: {num: 35, max: 43.5s, min: 377.9µs, avg: 3.95s, p95: 42.3s, max_proc_keys: 12104, p95_proc_keys: 10912, tot_proc: 2m14.7s, tot_wait: 3.29s, rpc_num: 35, rpc_time: 2m18.3s, copr_cache_hit_ratio: 0.26, build_task_duration: 2.6ms, max_distsql_concurrency: 5}, tikv_task:{proc max:42.8s, min:9ms, avg: 3.94s, p80:4.53s, p95:42.2s, iters:287, tasks:35}, scan_detail: {total_process_keys: 136239, total_process_keys_size: 53236007, total_keys: 211480, get_snapshot_time: 451.6µs, rocksdb: {delete_skipped_count: 45724, key_skipped_count: 296550, block: {cache_hit_count: 490074}}}	N/A     	N/A
	      └─TableRowIDScan_33     	cop[tikv]	0.00   	table:v, keep order:false, stats:pseudo                                                                                                                              	140918 	tikv_task:{proc max:42.8s, min:9ms, avg: 3.94s, p80:4.53s, p95:42.2s, iters:287, tasks:35}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         	N/A     	N/A

部署脚本如下

# 全局变量应用于所有部署，如果缺少特定的部署值，则将其用作部署的默认值。
global:
  # 运行tidb集群的用户.
  user: "tidb"
  # Group用于指定用户所属的组名(如果与user不相同)
  # group: "tidb"
  # 所管理集群服务器的SSH端口.
  ssh_port: 22
  # 集群部署文件、启动脚本、配置文件的存放路径。
  deploy_dir: "/tidb/tidb-deploy"
  # TiDB集群数据存储目录
  data_dir: "/tidb/tidb-data"
  # 支持值:amd64、arm64(默认值:amd64)
  arch: "amd64"
  # 资源控制用于限制实例的资源。
  # See: https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html
  # 支持使用实例级的“resource_control”覆盖全局的“resource_control”。
  resource_control:
  # See: https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#MemoryLimit=bytes
      # 内存限制
    memory_limit: "8G"
      # See: https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#CPUQuota=
      # 百分比指定相对于一个CPU上可用的总CPU时间，单元最多应该获得多少CPU时间。使用值> 100%在多个CPU上分配CPU时间。
      # Example: CPUQuota=200% ensures that the executed processes will never get more than two CPU time.
    cpu_quota: "400%"
      # See: https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#IOReadBandwidthMax=device%20bytes
  #   io_read_bandwidth_max: "/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0 100M"
  #   io_write_bandwidth_max: "/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0 100M"

# # 监视的变量应用于所有机器。
monitored:
  # TiDB集群中各节点上报系统信息的通信端口。
  node_exporter_port: 9100
  # blackbox_export通信端口，用于TiDB集群端口监控。
  blackbox_exporter_port: 9115
  # 监控组件部署文件、启动脚本和配置文件的存放路径.
  # deploy_dir: "/tidb-deploy/monitored-9100"
  # 监控组件的数据存放路径
  # data_dir: "/tidb-data/monitored-9100"
  # 监控组件的日志存放路径
  # log_dir: "/tidb-deploy/monitored-9100/log"

# 服务器配置用于指定TiDB组件的运行时配置
# 所有配置项都可以在TiDB文档中找到:
# - TiDB: https://pingcap.com/docs/stable/reference/configuration/tidb-server/configuration-file/
# - TiKV: https://pingcap.com/docs/stable/reference/configuration/tikv-server/configuration-file/
# - PD: https://pingcap.com/docs/stable/reference/configuration/pd-server/configuration-file/
# - TiFlash: https://docs.pingcap.com/tidb/stable/tiflash-configuration
# #
# # 所有配置项都使用点来表示层次结构, e.g:
# #   readpool.storage.use-unified-pool
# #           ^       ^
# # - example: https://github.com/pingcap/tiup/blob/master/embed/examples/cluster/topology.example.yaml
# 您可以通过实例级的' config '字段覆盖此配置。
server_configs:
  # tidb:
  tikv:
    # 是否使用统一的读取线程池处理存储请求
    readpool.storage.use-unified-pool: true
    # 是否使用统一的读取线程池（在 readpool.unified 中配置）处理协处理器请求
    readpool.coprocessor.use-unified-pool: true
    # 统一处理读请求的线程池最多的线程数量
    readpool.unified.max-thread-count: 2
    # 共享 block cache 的大小
    storage.block-cache.capacity: "3GB"
    # 存储容量，即允许的最大数据存储大小。如果没有设置，则使用当前磁盘容量。如果要将多个 TiKV 实例部署在同一块物理磁盘上，需要在 TiKV 配置中添加该参数
    raftstore.capacity: "50GB"
  pd:
    replication.enable-placement-rules: true
    replication.location-labels: ["host"]
  # tiflash:
  # tiflash-learner:
  # kvcdc:

# PD配置。
pd_servers:
  - host: 172.27.197.96
    # 服务器的SSH端口.
    # ssh_port: 22
    # PD服务器名称
    # name: "pd-1"
    # 用于连接TiDB服务器的通信端口。
    # client_port: 2379
    # PD Server节点之间的通信端口
    # peer_port: 2380
    # PD Server部署文件、启动脚本、配置文件存放目录
    # deploy_dir: "/tidb-deploy/pd-2379"
    # PD服务器数据存储目录。
    # data_dir: "/tidb-data/pd-2379"
    # PD服务器日志文件的存放目录
    # log_dir: "/tidb-deploy/pd-2379/log"
    # Numa节点绑定
    # numa_node: "0"
    # 下面的配置用于覆盖`server_configs.pd`的值.
    config:
    #   schedule.max-merge-region-size: 20
    #   schedule.max-merge-region-keys: 200000

# TiDB配置。
tidb_servers:
  - host: 172.27.197.96
    # ssh_port: 22
    # 访问TiDB集群端口
    port: 4000
    # TiDB服务器状态信息上报端口.
    status_port: 10080
    # TiDB服务器部署文件、启动脚本、配置文件存放目录。
    # deploy_dir: "/tidb-deploy/tidb-4000"
    # TiDB服务器日志文件存放路径
    # log_dir: "/tidb-deploy/tidb-4000/log"
    # 建议绑定numa节点
    # numa_node: "1"

# TiKV配置
tikv_servers:
  - host: 172.27.197.97
    port: 20160
    status_port: 20180
    # numa_node: "0"
    # 下面的配置用于覆盖server_configs.tikv
    config:
      log.level: warn
      server.labels:
        host: tikv1
  - host: 172.27.197.97
    port: 20161
    status_port: 20181
    # numa_node: "1"
    config:
      log.level: warn
      server.labels:
        host: tikv1
  - host: 172.27.197.97
    port: 20162
    status_port: 20182
    # numa_node: "0"
    config:
      log.level: warn
      server.labels:
        host: tikv1

# # 服务器配置用于指定TiFlash服务器的配置
tiflash_servers:
  # # The ip address of the TiFlash Server.
  - host: 172.27.197.97
    # tcp_port: 9000
    # http_port: 8123
    # # TiFlash raft service and coprocessor service listening address.
    # flash_service_port: 3930
    # # TiFlash Proxy service port.
    # flash_proxy_port: 20170
    # # Prometheus pulls TiFlash Proxy metrics port.
    # flash_proxy_status_port: 20292
    # # Prometheus pulls the TiFlash metrics port.
    # metrics_port: 8234

monitoring_servers:
  # # The ip address of the Monitoring Server.
  - host: 172.27.197.96
  
grafana_servers:
  # # The ip address of the Grafana Server.
  - host: 172.27.197.96

【资源配置】
两台8核32g的机子,一台装的tidb1个,tipd1个,另一台装的tikv3个,tiflash1个

【附件：截图/日志/监控】

啦啦啦啦啦 · 2023 年2 月 24 日 05:42

测试环境还是生产环境呢，生产环境的话机器太少了吧，也许用mysql做个主从高可用性和性能会更好

tidb狂热爱好者 · 2023 年2 月 24 日 05:42

你这问题我回答几个方面你关注一下

服务器必须全部用ssd硬盘
建议你两台机器一台用tiflash 一台装tikv 装tiflash的机器安装pd和tidb

Billmay表妹 · 2023 年2 月 24 日 05:46

他这个机器服务器跑不了 Tiflash

你这个机器不满足生产环境的部署要求呀
TiDB + PD 至少要 16 核 48G 以上，如果性能不是很满意，PD 和 TiDB 分开部署。

xingzhenxiang · 2023 年2 月 24 日 05:47

还是装MySQL吧，tidb也不是万能的

武哥哥1 · 2023 年2 月 24 日 05:47

主要不想去做mysql分库分表的配置
必须得加机器么

啦啦啦啦啦 · 2023 年2 月 24 日 05:49

目前这个拓扑挂一台机器不就凉凉了么，机器少的话也用不到分布式数据库的优势，也许单机的mysql不分库分表也比现在性能好

wzf0072 · 2023 年2 月 24 日 05:53

要加机器、加配置。这个配置TiDB测试环境的需求都没有达到。

tidb菜鸟一只 · 2023 年2 月 24 日 09:27

我觉得起码得3台吧，2台有点难

Kongdom · 2023 年2 月 24 日 13:28

这个配置建议使用mysql，业务规模扩大之后可以再用TiDB