tidb4.0.1中同一条sql偶发性无结果返回,为什么会出现这种情况呢?
请问是什么意思? 能够详细描述下吗? 方便举个例子吗? 麻烦了,多谢。
就是一条这样的语句:
select * from a a left join b b on a.id=b.aid and a.status=1 where a.xxid=xxx and a.xxxid=xxxx;
然后我现在在tikv中看到和无结果返回的sql时间(精确到秒)对得上的日志[2020/06/30 10:58:23.114 +08:00] [INFO] [process.rs:145] [“get snapshot failed”] [err=“Request(message: “\”[components/raftstore/src/store/util.rs:267]: mismatch peer id 267291 != 267314\”")"] [cid=168799190]
[2020/06/30 10:58:23.138 +08:00] [INFO] [process.rs:145] [“get snapshot failed”] [err=“Request(message: “region 261382 is missing” region_not_found { region_id: 261382 })”] [cid=168799195]
[2020/06/30 10:58:23.229 +08:00] [INFO] [process.rs:145] [“get snapshot failed”] [err=“Request(message: “region 261382 is missing” region_not_found { region_id: 261382 })”] [cid=168799196][cid=168799195],我猜测很大概率是这个影响到,但是不是很确定
请问,这条sql 有并发执行吗? 还是单条 sql 循环执行?
并发执行的时候发现这个问题。然后单条循环执行也复现了。
- 麻烦您帮忙反馈下 explain analyze sql 的执行计划,辛苦了。
- 循环大概多少次,就会出现呢?
Projection_6 root 0.8780977734753147 course.users.id, course.users.username, course.users.phone, course.users.role, course.users.verify_code, course.users.verify_time, course.users.is_verified, course.users.avatar_oss_id, course.users.access_token, course.users.is_staff, course.users.status, course.users.school_id, course.users.school_name, course.users.score, course.users.grade, course.users.sex, course.users.real_name, course.users.create_time, course.users.is_destroy, course.users.destroy_time, course.users.register_activity, course.users.register_source, course.users.ppt_id, course.users.member_id, course.users.secret, course.users.hash, course.users_official_account.id, course.users_official_account.wx_open_id, course.users_official_account.wx_username, course.users_official_account.wx_avatar, course.users_official_account.wx_phone
└─IndexMergeJoin_18 root 0.8780977734753147 left outer join, inner:Projection_16, outer key:course.users.id, inner key:course.users_official_account.user_id
├─Selection_24 root 0.7024782187802517 eq(course.users.role, 3)
│ └─Point_Get_23 root 1 table:users, index:udx_memberid(member_id)
└─Projection_16 root 1.25 course.users_official_account.id, course.users_official_account.user_id, course.users_official_account.wx_username, course.users_official_account.wx_phone, course.users_official_account.wx_open_id, course.users_official_account.wx_avatar, course.users_official_account.status
└─IndexLookUp_15 root 1.25
├─IndexScan_12 cop 1250 table:official_a, index:user_openid(user_id, wx_open_id), range: decided by [eq(course.users_official_account.user_id, course.users.id)], keep order:true, stats:pseudo
└─Selection_14 cop 1.25 eq(course.users_official_account.status, 1)
└─TableScan_13 cop 1250 table:official_a, keep order:false, stats:pseudo
我直接手动工具连执行个10-20次左右都会出现一次。有时候会次数要多点才能复现。
能麻烦您多执行几次 explain analyze 吗; 看下当 actRows 是 0 的时候,也就是没结果时,explain analyze 是什么,麻烦了
我取消了查询条件中的唯一索引然后正常。我现在加回去看看。
现在复现不了了。 有没有办法可以看到索引加的有问题的? 除了通过压测之外(压测必现)。
麻烦上传下执行计划,我们看看是否执行计划改变了,导致无法复现,多谢。
Projection_6 root 0.8711923201032591 course.users.id, course.users.username, course.users.phone, course.users.role, course.users.verify_code, course.users.verify_time, course.users.is_verified, course.users.avatar_oss_id, course.users.access_token, course.users.is_staff, course.users.status, course.users.school_id, course.users.school_name, course.users.score, course.users.grade, course.users.sex, course.users.real_name, course.users.create_time, course.users.is_destroy, course.users.destroy_time, course.users.register_activity, course.users.register_source, course.users.ppt_id, course.users.member_id, course.users.secret, course.users.hash, course.users_official_account.id, course.users_official_account.wx_open_id, course.users_official_account.wx_username, course.users_official_account.wx_avatar, course.users_official_account.wx_phone
└─IndexMergeJoin_19 root 0.8711923201032591 left outer join, inner:Projection_17, outer key:course.users.id, inner key:course.users_official_account.user_id
├─IndexLookUp_40 root 0.6969538560826073
│ ├─IndexScan_37 cop 1 table:user, index:udx_memberid(member_id), range:["93046C02-E9DA-4D4B-9DC4-EA8E8EE9490F","93046C02-E9DA-4D4B-9DC4-EA8E8EE9490F"], keep order:false
│ └─Selection_39 cop 0.6969538560826073 eq(course.users.role, 3)
│ └─TableScan_38 cop 1 table:user, keep order:false
└─Projection_17 root 1.25 course.users_official_account.id, course.users_official_account.user_id, course.users_official_account.wx_username, course.users_official_account.wx_phone, course.users_official_account.wx_open_id, course.users_official_account.wx_avatar, course.users_official_account.status
└─IndexLookUp_16 root 1.25
├─IndexScan_13 cop 1250 table:official_a, index:user_openid(user_id, wx_open_id), range: decided by [eq(course.users_official_account.user_id, course.users.id)], keep order:true, stats:pseudo
└─Selection_15 cop 1.25 eq(course.users_official_account.status, 1)
└─TableScan_14 cop 1250 table:official_a, keep order:false, stats:pseudo
这边应该是走到 pointget时遇到的问题,4.0.2会修复一个类似问题,应该是一致的,麻烦到时升级看看,多谢。
是否执行计划涉及到pointget的都会有影响?
应该是涉及到join 并发,pointget 才有问题
那除了删索引加索引来改变执行计划还有其他令join表的执行计划正常(避开pointget)的方法吗?
除非您这边加hint了,找一种执行计划
point get 作为并行算子的孩子节点的时候会有可能触发
在 4.0.2 版本里面应该修复掉了
此话题已在最后回复的 1 分钟后被自动关闭。不再允许新回复。