semi join 的执行计算解析bug ?

Bug 反馈
一个比较简单的语句:

SELECT order_id FROM  fin_dws.dws_fin_east_multi_year_acv_lcy
WHERE data_source = 'EAST'

直接报错, Unknown column ‘order_id’ in ‘field list’

 SELECT account_id,
                acv_gsr_year AS gs_year,
                contract_no,
                lcy_total_acv AS acv2
         FROM fin_dws.dws_fin2_global_acv_1d t
         WHERE order_id NOT IN (
             SELECT order_id FROM  fin_dws.dws_fin_east_multi_year_acv_lcy
             WHERE data_source = 'EAST'
         )
           AND acv_gsr_year IN (2023, 2024)
           AND t.type = 'EAST'
		   LIMIT 10

执行计划如下:

id,estRows,actRows,task,access object,execution info,operator info,memory,disk
Projection_11,10.00,0,root,,"time:5.31s, loops:1, RU:11.888240, Concurrency:OFF","fin_dws.dws_fin2_global_acv_1d.account_id, fin_dws.dws_fin2_global_acv_1d.acv_gsr_year, fin_dws.dws_fin2_global_acv_1d.contract_no, fin_dws.dws_fin2_global_acv_1d.lcy_total_acv",9.14 KB,N/A
└─Limit_14,10.00,0,root,,"time:5.31s, loops:1","offset:0, count:10",N/A,N/A
  └─HashJoin_15,10.00,0,root,,"time:5.31s, loops:1, build_hash_table:{total:11ms, fetch:9.77ms, build:1.23ms}, probe:{concurrency:16, total:1m9.7s, max:5.31s, probe:1m8.8s, fetch:899.3ms}",CARTESIAN anti semi join,438.4 KB,0 Bytes
    ├─TableReader_21(Build),12426.00,12426,root,,"time:10.1ms, loops:14, cop_task: {num: 7, max: 2.66ms, min: 1.23ms, avg: 1.55ms, p95: 2.66ms, tot_proc: 11µs, tot_wait: 1.73ms, rpc_num: 7, rpc_time: 10.7ms, copr_cache_hit_ratio: 1.00, build_task_duration: 6µs, max_distsql_concurrency: 1}",data:Selection_20,95.9 KB,N/A
    │ └─Selection_20,12426.00,12426,cop[tikv],,"tikv_task:{proc max:20ms, min:0s, avg: 7.43ms, p80:12ms, p95:20ms, iters:39, tasks:7}, scan_detail: {get_snapshot_time: 1.38ms, rocksdb: {block: {}}}","eq(fin_dws.dws_fin_east_multi_year_acv_lcy.data_source, ""EAST"")",N/A,N/A
    │   └─TableFullScan_19,12426.00,12426,cop[tikv],table:dws_fin_east_multi_year_acv_lcy,"tikv_task:{proc max:20ms, min:0s, avg: 7.43ms, p80:12ms, p95:20ms, iters:39, tasks:7}",keep order:false,N/A,N/A
    └─TableReader_18(Probe),12.50,22239,root,,"time:62.8ms, loops:23, cop_task: {num: 18, max: 40.9ms, min: 502.5µs, avg: 8.66ms, p95: 40.9ms, tot_proc: 28.7µs, tot_wait: 4.89ms, rpc_num: 18, rpc_time: 155.3ms, copr_cache_hit_ratio: 1.00, build_task_duration: 17.5µs, max_distsql_concurrency: 3}",data:Selection_17,830.3 KB,N/A
      └─Selection_17,12.50,22239,cop[tikv],,"tikv_task:{proc max:120ms, min:0s, avg: 35.6ms, p80:68ms, p95:120ms, iters:141, tasks:18}, scan_detail: {get_snapshot_time: 4.05ms, rocksdb: {block: {}}}","eq(fin_dws.dws_fin2_global_acv_1d.type, ""EAST""), or(eq(cast(fin_dws.dws_fin2_global_acv_1d.acv_gsr_year, double BINARY), 2023), eq(cast(fin_dws.dws_fin2_global_acv_1d.acv_gsr_year, double BINARY), 2024))",N/A,N/A
        └─TableFullScan_16,73.55,70424,cop[tikv],table:t,"tikv_task:{proc max:112ms, min:0s, avg: 34.2ms, p80:64ms, p95:112ms, iters:141, tasks:18}",keep order:false,N/A,N/A

所以CARTESIAN anti semi join 这里出现了问题, 没有识别到.

换一个写法:

SELECT account_id,
                acv_gsr_year AS gs_year,
                contract_no,
                lcy_total_acv AS acv2
         FROM fin_dws.dws_fin2_global_acv_1d t
         LEFT JOIN (
             SELECT order_id FROM  fin_dws.dws_fin_east_multi_year_acv_lcy
             WHERE data_source = 'EAST'
         ) multi
		    ON t.order_id = multi.order_id
		 WHERE multi.order_id IS NULL
           AND acv_gsr_year IN (2023, 2024)
           AND t.type = 'EAST'
		   LIMIT 10

这样就不会出错.

表DDL文件:
dws_fin_east_multi_year_acv_lcy.ddl.sql (2.7 KB)
dws_fin2_global_acv_1d.ddl.sql (7.9 KB)

【 TiDB 版本】
7.5.1
【 Bug 的影响】
影响统计结果准确性.
【可能的问题复现步骤】
简单的SQL 可以复现:

-- NOT IN 子查询
WITH t1 AS (
	SELECT 1 AS n
),

t2 AS (
  SELECT 2 AS b 
  )
  
SELECT *
FROM t1
WHERE n  NOT IN (SELECT n FROM t2);

以及In 子查询

 WITH t1 AS (
	SELECT 1 AS n
),

t2 AS (
  SELECT 2 AS b 
  )
  
SELECT *
FROM t1
WHERE n   IN (SELECT n FROM t2)

【看到的非预期行为】
语句可以顺利执行不报错, 产生意外结果.
【期望看到的行为】
语法错误
【相关组件及具体版本】
tidb
【其他背景信息或者截图】
如集群拓扑,系统和内核版本,应用 app 信息等;如果问题跟 SQL 有关,请提供 SQL 语句和相关表的 Schema 信息;如果节点日志存在关键报错,请提供相关节点的日志内容或文件;如果一些业务敏感信息不便提供,请留下联系方式,我们与您私下沟通。

表结构来下呢?

多谢~~ 好的, 更新到帖子里了 @WalterWj

(root@127.0.0.1) [fin_dws]>CREATE TABLE fin_dws.dws_fin_east_multi_year_acv_lcy
    -> (
    ->     date_time                 DATETIME COMMENT '更新日期',
    ->     data_source               VARCHAR(20) COMMENT '东西方分类',
    ->     data_source_lv2               VARCHAR(20) COMMENT '东西方二级分类',
    ->     contract_no               VARCHAR(32)    NOT NULL COMMENT '合同编号',
    ->     account_id                VARCHAR(32)    NOT NULL COMMENT '客户ID',
    ->     account_name              VARCHAR(128)   NOT NULL COMMENT '客户名称',
    ->    `account_owner_name`            VARCHAR(256) COMMENT '客户负责人名称',
    ->     contract_type             VARCHAR(32)    NOT NULL COMMENT '合同类型',
    ->     contract_business_type    VARCHAR(32)    NOT NULL COMMENT '合同业务类型',
    ->     lcy_currency_code         VARCHAR(32) COMMENT 'LCY货币符号',
    ->     effective_date            DATETIME COMMENT '生效日期',
    ->     LCY_TCV                   DECIMAL(24, 6) COMMENT 'LCY_TCV',
    ->     LCY_SaaS_TCV              DECIMAL(24, 6) COMMENT 'LCY_SaaS_TCV',
    ->     LCY_SaaS_ACV              DECIMAL(24, 6) COMMENT 'LCY_SaaS_ACV',
    ->     contract_period           DECIMAL(24, 6) COMMENT '合同年限',
    ->     product_id                BIGINT         NOT NULL COMMENT '产品id',
    ->     op_serial                 INT            NOT NULL COMMENT '同一订单内相同产品不同明细的编号',
    ->     product_type              VARCHAR(128) COMMENT '产品类型',
    ->     product_catalogue1        VARCHAR(128) COMMENT '产品一级目录',
    ->     product_account_type2     VARCHAR(128) COMMENT '产品版本分类',
    ->     product_line              VARCHAR(128) COMMENT '辅助产品分类',
    ->     product_name              VARCHAR(256) COMMENT ' 产品名称 ',
    ->     product_cjj               DECIMAL(24, 6) NOT NULL COMMENT '产品成交价-本币',
    ->     `unit_quantity`           DECIMAL(24, 6) NOT NULL COMMENT '服务年限<unitQuantity>',
    ->     `terminate_order_no`      VARCHAR(255) COMMENT '中止_合同编号',
    ->     termination_date          DATE COMMENT '中止日期',
    ->     termination_product_cjj   DECIMAL(24, 6) COMMENT '中止_产品成交价',
    ->     termination_unit_quantity DECIMAL(24, 6) COMMENT '中止_服务年限',
    ->     actual_product_cjj_lcy    DECIMAL(24, 6) COMMENT 'LCY_实际_产品成交价',
    ->     actual_unit_quantity      DECIMAL(24, 6) COMMENT '实际_服务年限',
    ->     actual_services_months    INT COMMENT '实际_服务月数',
    ->     monthly_acv_lcy           DECIMAL(24, 6) COMMENT 'LCY_每月ACV',
    ->     mty_start_date            DATE COMMENT 'MTY_开始日期',
    ->     mty_end_date              DATE COMMENT 'MTY_结束日期',
    ->     PRIMARY KEY idx_multi_year_acv_lcy (contract_no, account_id, product_id, op_serial)
    -> ) COMMENT '基础层-Global_Multi-year_LCY(东方部分)';
Query OK, 0 rows affected (0.15 sec)

(root@127.0.0.1) [fin_dws]>SELECT order_id FROM  fin_dws.dws_fin_east_multi_year_acv_lcy
    -> WHERE data_source = 'EAST';
ERROR 1054 (42S22): Unknown column 'order_id' in 'field list'

你这个表是没有 order_id 列啊。

我想知道问题是啥 :thinking:,是觉得执行计划有问题,还是结果不正确或者说是 SQL 报错了。

应该表述的是,单独执行的时候会报错每列,带入到子查询后不报错了

应该报错的,不过这个地方应该是因为 Deferred Name Resolution 导致的。数据库可能不会在子查询中立即检查列的存在性,特别是如果子查询没有实际执行(由于优化或者其他原因)

我测试了下 MySQL 也没有报错。

这个跟tidb或者mysql没关系,是你的sql_mode开启了NO_ENGINE_SUBSTITUTION非严格模式
SHOW GLOBAL VARIABLES LIKE ‘sql_mode’;
将NO_ENGINE_SUBSTITUTION去掉重新设置写sql_mode就会报错了。
SET sql_mode=‘ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER’;

没太明白是什么问题。。。TiDB 最新版本试了一下最下面的sql和mysql是一致的

测试了这个方法, tidb也没报错 …


SET sql_mode='ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO';
SELECT account_id,
                acv_gsr_year AS gs_year,
                contract_no,
                lcy_total_acv AS acv2
         FROM fin_dws.dws_fin2_global_acv_1d t
         WHERE order_id NOT IN (
             SELECT order_id FROM  fin_dws.dws_fin_east_multi_year_acv_lcy
             WHERE data_source = 'EAST'
         )
           AND acv_gsr_year IN (2023, 2024)
           AND t.type = 'EAST'
           LIMIT 10

抱歉,上面回复有误,应该是子查询里面的order_id被当成了外面表的order_id 字段,所以没报错,可以改成这样就会报错
SELECT account_id,
acv_gsr_year AS gs_year,
contract_no,
lcy_total_acv AS acv2
FROM fin_dws.dws_fin2_global_acv_1d t
WHERE order_id NOT IN (
SELECT a.order_id FROM fin_dws.dws_fin_east_multi_year_acv_lcy a
WHERE data_source = ‘EAST’
)
AND acv_gsr_year IN (2023, 2024)
AND t.type = ‘EAST’
LIMIT 10

嗯. 是这样, 这个很奇怪 :sweat_smile: 居然解析到外层去了

感谢帅哥, 是我对SQL 语法解析不清楚. 字段找不到就到外层去找, 写这种代码得谨慎了…

多谢多谢.

多表最好是加别名,字段名加别名.字段