A TiDB query keeps failing with an error

lxzkenney · April 29, 2022, 02:41

TiDB 5.0.3
A query fails with an error, recovers on its own after 20-odd minutes, then fails again. It keeps going back and forth, and I don't know what the cause is.
The query:
mysql> select count(*) countNum from sdhz_rpt.dws_sdb_crm_lp_add_wechat_clue_conver_full_d where stat_dt='2022-04-28';
ERROR 1105 (HY000): get store failed: 2: invalid store ID 181496807, not found

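Some background on how this table is used:
The job first runs a count to see how many rows there are, then deletes them in a loop by ID. Once the deletes succeed, it checks whether everything matching the delete condition is really gone. If so, it loads the new data, and after the load it runs ANALYZE TABLE on the table.
The step that fails now is the check, after the deletes succeed, that everything matching the delete condition is gone.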
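For readers following along, the job described above can be sketched roughly as below. This is only an illustration under stated assumptions, not the poster's actual job: it assumes a DB-API 2.0 cursor with the `%s` parameter style and autocommit enabled, it deletes in `LIMIT` batches rather than by ID, and `reload_day`, `count_rows`, `load_rows`, and `batch_size` are hypothetical names.

```python
# Table name taken from the query in the post.
TABLE = "sdhz_rpt.dws_sdb_crm_lp_add_wechat_clue_conver_full_d"


def count_rows(cursor, stat_dt):
    """Count the rows for one stat_dt value."""
    cursor.execute(f"SELECT COUNT(*) FROM {TABLE} WHERE stat_dt = %s", (stat_dt,))
    return cursor.fetchone()[0]


def reload_day(cursor, stat_dt, load_rows, batch_size=1000):
    """Count, delete in batches, verify, load, then analyze.

    load_rows is a hypothetical callback that inserts the new data.
    Returns the row count seen before the delete.
    """
    before = count_rows(cursor, stat_dt)

    # Assumption: batch deletes with LIMIT instead of the by-ID loop in the post.
    while True:
        cursor.execute(
            f"DELETE FROM {TABLE} WHERE stat_dt = %s LIMIT {int(batch_size)}",
            (stat_dt,),
        )
        if cursor.rowcount == 0:
            break

    # This verification count is the step that raises error 1105 in the thread.
    remaining = count_rows(cursor, stat_dt)
    if remaining != 0:
        raise RuntimeError(f"{remaining} rows left for stat_dt={stat_dt}")

    load_rows(cursor, stat_dt)
    cursor.execute(f"ANALYZE TABLE {TABLE}")
    return before
```

Laid out this way, the `COUNT(*)` runs twice per job, once before the delete and once after, and only the second one is reported as failing.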

Later the error came back again.

Christophe · April 29, 2022, 02:44

Is the cluster itself healthy?

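lxzkenney · April 29, 2022, 02:48

Yes. This query is the only one we've found with a problem. Others may be affected too, but nobody has reported any, so there probably aren't many.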

h5n1 · April 29, 2022, 02:49

Was there an abnormal scale-in operation at some point before this?

db_user · April 29, 2022, 02:52

Does admin check table come back clean?

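lxzkenney · April 29, 2022, 03:03

About 10 days ago I scaled in a TiFlash node. It went through the normal offline procedure, but the monitoring data in Grafana looked abnormal, even though pd-ctl showed no stores left in the tombstone state. Once the node was offline I ran tiup cluster prune. Later an expert had me also run store remove-tombstone once, and the monitoring data went back to normal.

If the operation from 10 days ago were the problem, it should have surfaced long before now. These are tables that are used on a fixed schedule every day.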

lxzkenney · April 29, 2022, 03:05

张雨齐0720 · April 29, 2022, 03:20

Does monitoring show every TiKV/TiFlash node online and healthy?

lxzkenney · April 29, 2022, 03:34

Yes, they are all normal.

h5n1 · April 29, 2022, 03:54

Look at store_id and address in information_schema.tikv_store_status and see whether the store from the error is still listed there.
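One way to run this check from a script is sketched below. It assumes a DB-API 2.0 cursor with the `%s` parameter style that is already connected to TiDB; `find_store` is a hypothetical helper name, and the `STORE_STATE_NAME` column is included only for convenience.

```python
def find_store(cursor, store_id):
    """Return (store_id, address, state) rows for the given store ID.

    An empty list means the table no longer lists that store.
    """
    cursor.execute(
        "SELECT STORE_ID, ADDRESS, STORE_STATE_NAME "
        "FROM information_schema.tikv_store_status "
        "WHERE STORE_ID = %s",
        (store_id,),
    )
    return list(cursor.fetchall())
```

Calling `find_store(cursor, 181496807)` and getting an empty list back would match what the original poster reports in the next reply.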

lxzkenney · April 29, 2022, 06:37

It's no longer there. We used to have 6 TiFlash nodes; one was taken offline in March and another the week before last, so 4 remain.

lxzkenney · April 29, 2022, 06:42

It failed again. Selecting a few rows directly works fine; count does not.

张雨齐0720 · April 29, 2022, 07:21

The error names store ID 181496807. Find out which storage node that ID belongs to and check its state.

h5n1 · April 29, 2022, 07:24

Take a look at the output of pd-ctl region store 181496807.

lxzkenney · April 29, 2022, 08:04

h5n1 · April 29, 2022, 08:07

Check whether the store ID from the error shows up in information_schema.cluster_info or in the output of pd-ctl store.

lxzkenney · April 29, 2022, 08:14

That store is not there.
These are all the stores currently in the cluster:

» store
{
"count": 8,
"stores": [
{
  "store": {
    "id": 1,
    "address": "xx.xx.xx227:20160",

},
{
  "store": {
    "id": 11585778,
    "address": "xx.xx.xx193:20160",
   
},
{
  "store": {
    "id": 290158285,
   
},
{
  "store": {
    "id": 290158286,
    
},
{
  "store": {
    "id": 4,
   
  }
},
{
  "store": {
    "id": 5,
    "address": "xx.xx.xx186:20160",
   
  }
},
{
  "store": {
    "id": 2811498,
    
},
{
  "store": {
    "id": 60800510,
    "address": "xx.xx.xx183:3930",

]
}
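Checking a long store list by eye is error-prone, so a small script can do the lookup. The sketch below assumes the complete, untrimmed JSON printed by pd-ctl store has been saved as text, with the `stores[].store.id` and `address` layout visible above. `store_addresses` is a hypothetical helper, and `SAMPLE` is a hand-made three-store excerpt, not the full cluster output.

```python
import json


def store_addresses(pd_store_json):
    """Map store ID -> address from the JSON text printed by pd-ctl store."""
    doc = json.loads(pd_store_json)
    return {
        item["store"]["id"]: item["store"].get("address")
        for item in doc.get("stores", [])
    }


# Hand-made excerpt using three of the stores listed above.
SAMPLE = """
{
  "count": 3,
  "stores": [
    {"store": {"id": 1, "address": "xx.xx.xx227:20160"}},
    {"store": {"id": 5, "address": "xx.xx.xx186:20160"}},
    {"store": {"id": 60800510, "address": "xx.xx.xx183:3930"}}
  ]
}
"""

stores = store_addresses(SAMPLE)
print(181496807 in stores)  # False: the store from the error is not listed
```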

h5n1 · April 29, 2022, 09:36

Run the query several more times and see whether the store ID in the error is always one of those 2, or whether it changes randomly.
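To compare store IDs across repeated failures, the error messages can be collected and tallied. The sketch below assumes the messages follow the same wording as the ERROR 1105 line in the first post; `tally_store_ids` is a hypothetical helper.

```python
import re
from collections import Counter

# Matches the "invalid store ID <n>" part of the error text from the first post.
STORE_ID_RE = re.compile(r"invalid store ID (\d+)")


def tally_store_ids(error_messages):
    """Count how often each store ID appears in a list of error messages."""
    counts = Counter()
    for message in error_messages:
        match = STORE_ID_RE.search(message)
        if match:
            counts[int(match.group(1))] += 1
    return counts


errors = [
    "ERROR 1105 (HY000): get store failed: 2: invalid store ID 181496807, not found",
    "ERROR 1105 (HY000): get store failed: 2: invalid store ID 181496807, not found",
]
print(tally_store_ids(errors))  # Counter({181496807: 2})
```

If the tally only ever contains the same one or two IDs, that points at specific removed stores rather than a random lookup failure.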

哈喽沃德 · April 29, 2022, 12:18

Check whether any scheduled jobs are running.

wisdom · April 29, 2022, 12:32

Go through the logs, or the node logs, and see whether you can find what is causing the jitter.