TiDB INSERT writes are slow

【TiDB Deployment】Production environment
【TiDB Version】v6.5.0
【Problem: symptoms and impact】

One table in production is very slow to write to: large numbers of INSERTs show up as slow queries, some taking 1s or more. The servers are under no load and there is no obvious write hotspot.


SQL statement:

INSERT INTO qc_report_rds (id, advertiser_id, aweme_id, ad_id, creative_id, marketing_goal, order_platform, marketing_scene, promotion_way, creative_material_mode, image_mode, smart_bid_type, pricing_category, stat_cost, show_cnt, click_cnt, convert_cnt, pay_order_count, pay_order_amount, create_order_count, create_order_amount, prepay_order_count, prepay_order_amount, dy_follow, luban_live_enter_cnt, live_watch_one_minute_count, live_fans_club_join_cnt, luban_live_slidecart_click_cnt, luban_live_click_product_cnt, luban_live_comment_cnt, luban_live_share_cnt, luban_live_gift_cnt, luban_live_gift_amount, dy_share, dy_comment, dy_like, total_play, play_duration_3s, play_25_feed_break, play_50_feed_break, play_75_feed_break, play_over, stat_time, uid, creator, create_time, update_time, del_flag) VALUES (1642858101091893249, 1758606025911374, 62871145733, 1761881661499443, 1761881796423710, 2, 1, 1, 2, 3, 15, 0, 2, 0.03, 1, 0, 0, 0, 0.0, 0, 0.0, 0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0, 0, 0, 0, 1, 0, 0, 0, 0, 0, '2023-04-03 19:00:00.0', 'd603099454ee699432fda0563e4c5441', 0, '2023-04-03 19:54:22.0', '2023-04-03 19:54:28.785', 0);

Execution plan:

	id      	task	estRows	operator info	actRows	execution info                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 	memory 	disk
	Insert_1	root	0      	N/A          	0      	time:1.6s, loops:1, prepare: {total: 1.6s, auto_id_allocator: {rebase_cnt: 1, Get:{num_rpc:4, total_time:1.51ms}, commit_txn: {prewrite:5.79ms, get_commit_ts:1.43ms, commit:3.71ms, slowest_prewrite_rpc: {total: 0.006s, region_id: 241596, store: 172.24.196.8:20160, tikv_wall_time: 4.2ms, scan_detail: {get_snapshot_time: 6.17µs, rocksdb: {block: {cache_hit_count: 13}}}, write_detail: {store_batch_wait: 110.9µs, propose_send_wait: 0s, persist_log: {total: 847.2µs, write_leader_wait: 96.9µs, sync_log: 664.4µs, write_memtable: 4.4µs}, commit_log: 3.75ms, apply_batch_wait: 13.1µs, apply: {total:104.9µs, mutex_lock: 0s, write_leader_wait: 0s, write_wal: 13.1µs, write_memtable: 42.7µs}}}, commit_primary_rpc: {total: 0.004s, region_id: 241596, store: 172.24.196.8:20160, tikv_wall_time: 3.4ms, scan_detail: {get_snapshot_time: 4.61µs, rocksdb: {block: {}}}, write_detail: {store_batch_wait: 868µs, propose_send_wait: 0s, persist_log: {total: 261.9µs, write_leader_wait: 95ns, sync_log: 230.5µs, write_memtable: 1.99µs}, commit_log: 2.38ms, apply_batch_wait: 15.9µs, apply: {total:44.1µs, mutex_lock: 0s, write_leader_wait: 0s, write_wal: 15.9µs, write_memtable: 18.2µs}}}, region_num:1, write_keys:1, write_byte:55}}}, insert:108µs, commit_txn: {prewrite:6.62ms, slowest_prewrite_rpc: {total: 0.007s, region_id: 332115, store: 172.24.196.8:20160, tikv_wall_time: 5.85ms, scan_detail: {get_snapshot_time: 96.5µs, rocksdb: {block: {cache_hit_count: 9}}}, write_detail: {store_batch_wait: 1.75ms, propose_send_wait: 0s, persist_log: {total: 1.4ms, write_leader_wait: 167ns, sync_log: 766.5µs, write_memtable: 28.9µs}, commit_log: 3.6ms, apply_batch_wait: 59.9µs, apply: {total:102.2µs, mutex_lock: 0s, write_leader_wait: 0s, write_wal: 59.9µs, write_memtable: 53.3µs}}}, region_num:7, write_keys:7, write_byte:636}	3.50 KB	N/A
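A quick sanity check on the execution info above (a sketch; the millisecond values are copied straight from the plan): summing the timed sub-steps inside prepare accounts for only about 12 ms of the 1.6 s total, so almost all of the time falls inside prepare but outside any timed step. Given rebase_cnt: 1, one plausible suspect is waiting on the auto-ID allocator rebase triggered by inserting an explicit id value, but that is an interpretation, not something the plan states directly.

```python
# Rough accounting of the durations reported in the execution info above
# (all values in milliseconds, copied from the plan).
prepare_total = 1600.0          # prepare: {total: 1.6s, ...}
auto_id_steps = {
    "auto_id Get rpc":       1.51,   # Get:{num_rpc:4, total_time:1.51ms}
    "auto_id prewrite":      5.79,   # commit_txn: {prewrite:5.79ms, ...}
    "auto_id get_commit_ts": 1.43,
    "auto_id commit":        3.71,
}
accounted = sum(auto_id_steps.values())
unaccounted = prepare_total - accounted
print(f"accounted inside prepare:   {accounted:.2f} ms")
print(f"unaccounted inside prepare: {unaccounted:.2f} ms")
```

The gap of roughly 1.59 s is where a pprof profile would have to point.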


【Resource Configuration】

Table schema:


There are 5 regular secondary indexes and 1 composite unique index.
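For context, each inserted row in TiDB writes one record key plus one key per index, which lines up with the write_keys:7 / region_num:7 reported in the plan's final commit_txn (a back-of-the-envelope check, not taken from the schema itself):

```python
# Each row write in TiDB produces one record key plus one key per index.
secondary_indexes = 5
unique_indexes = 1
keys_per_row = 1 + secondary_indexes + unique_indexes
print(keys_per_row)  # matches write_keys:7 in the execution info
```

So every INSERT here touches up to 7 regions during prewrite, which makes the statement sensitive to any single slow region.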

【Attachments: screenshots/logs/monitoring】

How is the disk I/O performance? Any bottleneck there?
Post the write-related Grafana dashboards so we can take a look.

The time is spent in the prepare phase, which suggests the INSERT is slow while assembling the data, yet the execution info in the plan doesn't show where that time goes, which is strange. If you can, capture a pprof profile while the statement is slow (e.g. from the TiDB status port, default 10080, at /debug/pprof/profile) to see where the time actually goes.

Try inserting a single row directly on the server and see whether that is slow.

If you have the patience, you can follow the troubleshooting SOP step by step.


This is the disk read/write activity of one of the TiKV nodes. They are all dedicated SSDs, so this throughput should not be the bottleneck.


Inserting directly on the server probably won't reproduce it; it's fast there. The same simple SQL is sometimes fast and sometimes slow.

Judging from the dashboard, the problem is most likely in the prewrite phase. Concretely, I'd suggest setting up a test environment and a test table:
first, run the same SQL statement in the test environment and time it (to tell whether it is a compute or region problem);
then vary the statement: insert a new row with fewer columns written, enabling the 6 indexes one at a time (to tell whether it is a statement/index problem).

My guess is that a region write hotspot is the most likely cause.


https://docs.pingcap.com/zh/tidb/stable/latency-breakdown#提交阶段

As a first step, drop the auto-increment and drop the indexes, then compare.

These indexes are all pretty common usage patterns, though :upside_down_face:

Check the database time overview and QPS data in the Performance Overview panel:


https://docs.pingcap.com/zh/tidb/stable/performance-tuning-practices


Keep an eye on this monitoring panel to see whether one particular TiKV node is under heavy pressure.
This metric is also high.

We previously ran into frequent write hotspots that made the cluster unstable.
In severe cases,


this metric was extremely high.

Could you post the DDL?

It looks like there are too many indexes. I see the same thing here; you can drop the indexes, load the data, and rebuild the indexes after the inserts finish.

It's posted above.

There's a mixed read/write workload on this table, so the indexes can't be dropped.

The average INSERT QPS is 290 and the average UPDATE QPS is 1.7k. Are there other UPDATE statements running, or is the high UPDATE figure from index maintenance? If you have a test environment, try adding or removing a few indexes and compare.
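A speculative cross-check, assuming (and this is only an assumption) that the 1.7k figure counts per-key writes rather than SQL UPDATE statements: 290 inserts/s times 6 index keys per row lands almost exactly on 1.7k.

```python
# Hypothetical arithmetic: does index maintenance alone explain ~1.7k writes/s?
insert_qps = 290
index_keys_per_row = 6          # 5 secondary indexes + 1 unique index
index_writes_per_sec = insert_qps * index_keys_per_row
print(index_writes_per_sec)     # 1740, close to the observed 1.7k
```

If the counter really is SQL-statement-level UPDATE QPS, this coincidence means nothing and other workloads are responsible.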

That UPDATE QPS is not high; it's normal production traffic. Other tables are involved too; it's not just this one table being exercised.
The IDs are generated in code, sequentially increasing and unique. Dispersion is not a problem, but clustering is possible, because a batch of IDs generated within a short window may come from the same number segment.
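A toy illustration of the "same number segment" concern: consecutive IDs are numerically adjacent, so their row keys sort next to each other and fall into the same region. Both region_width and the ID range below are made-up illustration values, not real TiDB region boundaries.

```python
# Hypothetical illustration: IDs issued from one number segment are close
# together, so their encoded row keys fall into the same region key range.
ids = [1642858101091893249 + i for i in range(5)]   # one "segment" of IDs
region_width = 1 << 20                              # assumed region key span
regions = {i // region_width for i in ids}
print(len(regions))   # all writes hit a single region -> potential hotspot
```

Note that if the table is clustered on this application-assigned id, mitigations like SHARD_ROW_ID_BITS or AUTO_RANDOM don't apply directly, since both require TiDB to control the row key or id assignment.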