tidb 日志报 load privilege loop watch channel closed

【TiDB 版本】:

3.0.5

【问题描述】:

请问这个异常表示是什么含义, [ERROR] [domain.go:761] ["load privilege loop watch channel closed"] 引发的原因都有哪些?

TiKV.log

[2020/01/07 18:04:29.998 +08:00] [INFO] [coprocessor.go:743] ["[TIME_COP_PROCESS] resp_time:384.664609ms txnStartTS:413765852919496705 region_id:45013 store_addr:192.168.180.52:20160"]
[2020/01/07 18:04:30.044 +08:00] [INFO] [coprocessor.go:743] ["[TIME_COP_PROCESS] resp_time:431.027331ms txnStartTS:413765852919496705 region_id:43986 store_addr:192.168.180.51:20160"]
[2020/01/07 18:04:30.053 +08:00] [INFO] [coprocessor.go:743] ["[TIME_COP_PROCESS] resp_time:414.253078ms txnStartTS:413765852919496705 region_id:43986 store_addr:192.168.180.51:20160"]
[2020/01/07 18:04:30.063 +08:00] [INFO] [coprocessor.go:743] ["[TIME_COP_PROCESS] resp_time:424.800539ms txnStartTS:413765852919496705 region_id:45013 store_addr:192.168.180.52:20160"]
[2020/01/07 18:04:30.067 +08:00] [INFO] [server.go:416] ["connection closed"] [conn=54]
[2020/01/07 18:04:30.832 +08:00] [INFO] [ddl_worker.go:113] ["[ddl] DDL worker closed"] [worker="worker 1, tp general"] ["take time"=2.731µs]
[2020/01/07 18:04:30.832 +08:00] [INFO] [ddl_worker.go:113] ["[ddl] DDL worker closed"] [worker="worker 2, tp add index"] ["take time"=4.913µs]
[2020/01/07 18:04:30.832 +08:00] [INFO] [session_pool.go:85] ["[ddl] closing sessionPool"]
[2020/01/07 18:04:30.832 +08:00] [INFO] [delete_range.go:123] ["[ddl] closing delRange"]
[2020/01/07 18:04:30.832 +08:00] [INFO] [ddl.go:491] ["[ddl] DDL closed"] [ID=7fbf5dba-8b1f-4f45-b5ff-e87f70199492] ["take time"=2.498365ms]
[2020/01/07 18:04:30.832 +08:00] [INFO] [ddl.go:402] ["[ddl] stop DDL"] [ID=7fbf5dba-8b1f-4f45-b5ff-e87f70199492]
[2020/01/07 18:04:30.832 +08:00] [INFO] [manager.go:267] ["failed to campaign"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 7fbf5dba-8b1f-4f45-b5ff-e87f70199492"] [error="context canceled"]
[2020/01/07 18:04:30.832 +08:00] [INFO] [manager.go:248] ["break campaign loop, context is done"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 7fbf5dba-8b1f-4f45-b5ff-e87f70199492"]
[2020/01/07 18:04:30.834 +08:00] [ERROR] [domain.go:761] ["load privilege loop watch channel closed"] [stack="github.com/pingcap/tidb/domain.(*Domain).LoadPrivilegeLoop.func1\n\tgithub.com/pingcap/tidb@/domain/domain.go:761"]
[2020/01/07 18:04:30.834 +08:00] [INFO] [manager.go:267] ["failed to campaign"] ["owner info"="[stats] /tidb/stats/owner ownerManager 7fbf5dba-8b1f-4f45-b5ff-e87f70199492"] [error="lost watcher waiting for delete"]
[2020/01/07 18:04:30.834 +08:00] [INFO] [manager.go:239] ["etcd session is done, creates a new one"] ["owner info"="[stats] /tidb/stats/owner ownerManager 7fbf5dba-8b1f-4f45-b5ff-e87f70199492"]
[2020/01/07 18:04:30.834 +08:00] [WARN] [manager.go:170] ["failed to new session to etcd"] [ownerInfo="[stats] /tidb/stats/owner ownerManager 7fbf5dba-8b1f-4f45-b5ff-e87f70199492"] [error="context canceled"]
[2020/01/07 18:04:30.834 +08:00] [INFO] [manager.go:292] ["revoke session"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 7fbf5dba-8b1f-4f45-b5ff-e87f70199492"] [error="context canceled"]
[2020/01/07 18:04:30.921 +08:00] [INFO] [domain.go:554] ["domain closed"] ["take time"=90.899519ms]
[2020/01/07 18:04:30.921 +08:00] [INFO] [gc_worker.go:187] ["[gc worker] quit"] [uuid=5bdfdb97a64000e]

参考下这篇文章看下 http://winkyao.com/2017/09/29/tidb-bootstarp/

1赞
因为 TiDB 是分布式的, 所以在 TiDB 中的 privilege 可能被其他的 TiDB 修改了, 所以 TiDB 使用了 etcd, 在 更改了 privilege 的时候通知所有的tidb 去 load 新的 priilege

现在的引发的原因我知道了,但会对数据有什么影响吗?

初步查看是没有影响的,可以观察下业务情况。目前 tidb 日志打出的信息量比较多,后续也有改进的哈。

TIME_COP_PROCESS 这种日志 比较多,是因为 请求 TiKV 查询的时间 比较长引起的对吗?

嗯,对于查询时间比较长的,会记录 TIME_COP_PROCESS 日志

那也就是说, 有这样的信息,就是用 慢SQL了,语句 可以从 慢SQL找到的,对吧

得看慢日志的阈值,可以尝试看下慢日志

有这么多慢SQL 但是这些语句单独执行时 单个时间大概在 5秒左右, 为什么循环执行时,会越来越慢呢? SQL 语句如下

EXPLAIN ANALYZE
SELECT
	* 
FROM
	(
	SELECT
		code1,
		code2,
		normal1,
		seller_date,
		ROW_NUMBER ( ) OVER ( ORDER BY id ) AS row_num 
	FROM
		dc_f_sale 
	WHERE
		Is_Del = '0' 
	) t 
WHERE
	t.row_num > 1570000 
	LIMIT 10000;

这个问题可以新开一个帖子,提供一下语句以及 explain analyze 执行计划的结果

循环执行的 sql 条件都是一样的吗,还是条件也发生了变更

老师以创建新帖子,麻烦看一下,感谢

好的,我看下