TISPARK TiFlash 针对时间分区等值下推不正确

[TiDB 版本] 4.0.10
[spark 版本] 2.4.0
[TiSpark jar包] tispark-assembly-2.3.11.jar

[问题描述]
建表
CREATE TABLE inspector_batch_sessions (
session_id varchar(64) NOT NULL,
app_name varchar(64) NOT NULL,
dt timestamp NOT NULL COMMENT ‘对话日期’,
batch_id bigint(13) NOT NULL,
created_at datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT ‘创建时间’,
updated_at datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT ‘更新时间’,
primary key (dt, app_name, batch_id, session_id)
)
PARTITION BY RANGE(UNIX_TIMESTAMP(dt)) (
PARTITION dt_0 VALUES LESS THAN (0)
);

在使用如下参数进入 spark-shell
–conf spark.sql.extensions=org.apache.spark.sql.TiExtensions
–conf spark.tispark.isolation_read_engines=tiflash
–conf spark.tispark.db_prefix=tidb_

spark-shell 执行结果
scala> spark.sql(“select count(1) from tidb_tb_inspect.inspector_batch_sessions where dt = ‘2021-02-22’”).show
±-------+
|count(1)|
±-------+
| 20968|
±-------+

tidb 执行结果
mysql> select count(1) from inspector_batch_sessions where dt = ‘2021-02-22’;
±---------+
| count(1) |
±---------+
| 580596 |
±---------+

两次执行结果差距较大
目前的现象是 --conf spark.tispark.isolation_read_engines=tikv 没问题, 且数据条数在 2-3万以下没问题, 3万以上两边的执行结果就会出现误差

1 个赞

您好,请问您有安装 tidb-server 吗? 如果有的话,使用tidb查询 tikv 和tiflash 的值是否相同呢?
麻烦先帮忙确认下 tikv 和 tiflash 中的数据是否同步完成,相同呢?

tidb-server 是安装过的
tikv 和 tiflash 中的数据已经同步完成
分别制定 tikv 和 tiflush 查询出来的值也是一样的, 如下

mysql> set @@session.tidb_isolation_read_engines = “tikv, tidb”;

Query OK, 0 rows affected (0.00 sec)

mysql> select count(1) from inspector_batch_sessions where dt = ‘2021-02-22’;

±---------+

| count(1) |

±---------+

| 580596 |

±---------+

1 row in set (0.53 sec)

mysql> set @@session.tidb_isolation_read_engines = “tiflash”;

Query OK, 0 rows affected (0.00 sec)

mysql> select count(1) from inspector_batch_sessions where dt = ‘2021-02-22’;

±---------+

| count(1) |

±---------+

| 580596 |

±---------+

1 row in set (0.06 sec)

1 个赞

抱歉还在分析中,有进展回尽快回复,多谢。

能否分别在tidb和tispark里看一下这个sql的plan?
explain select count(1) from inspector_batch_sessions where dt = ‘2021-02-22’;

tidb sql plan

mysql> explain select count(1) from inspector_batch_sessions where dt = ‘2021-02-22’;
±-----------------------------±----------±-------------±------------------------------------------------------±-----------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
±-----------------------------±----------±-------------±------------------------------------------------------±-----------------------------------------------------------------------+
| StreamAgg_28 | 1.00 | root | | funcs:count(Column#10)->Column#8 |
| └─TableReader_29 | 1.00 | root | | data:StreamAgg_10 |
| └─StreamAgg_10 | 1.00 | cop[tiflash] | | funcs:count(1)->Column#10 |
| └─Selection_27 | 580682.00 | cop[tiflash] | | eq(tb_inspect.inspector_batch_sessions.dt, 2021-02-22 00:00:00.000000) |
| └─TableFullScan_26 | 580682.00 | cop[tiflash] | table:inspector_batch_sessions, partition:dt_20210222 | keep order:false |
±-----------------------------±----------±-------------±------------------------------------------------------±-----------------------------------------------------------------------+
5 rows in set (0.00 sec)

tispark plan

scala> spark.sql(“select count(1) from tidb_tb_inspect.inspector_batch_sessions where dt = ‘2021-02-22’”).explain

== Physical Plan ==

*(2) HashAggregate(keys=, functions=[count(1)])

± Exchange SinglePartition

± *(1) HashAggregate(keys=, functions=[partial_count(1)])

± *(1) Project

± *(1) partition table[

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210201([t\200\000\000\000\000\000\f\341_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\341_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210202([t\200\000\000\000\000\000\f\342_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\342_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210203([t\200\000\000\000\000\000\f\343_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\343_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210204([t\200\000\000\000\000\000\f\344_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\344_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210205([t\200\000\000\000\000\000\f\345_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\345_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210206([t\200\000\000\000\000\000\f\346_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\346_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210207([t\200\000\000\000\000\000\f\347_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\347_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210208([t\200\000\000\000\000\000\f\350_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\350_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210209([t\200\000\000\000\000\000\f\351_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\351_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210210([t\200\000\000\000\000\000\f\352_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\352_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210211([t\200\000\000\000\000\000\f\353_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\353_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210212([t\200\000\000\000\000\000\f\354_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\354_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210213([t\200\000\000\000\000\000\f\355_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\355_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210214([t\200\000\000\000\000\000\f\356_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\356_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210215([t\200\000\000\000\000\000\f\357_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\357_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210216([t\200\000\000\000\000\000\f\360_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\360_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210217([t\200\000\000\000\000\000\f\361_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\361_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210218([t\200\000\000\000\000\000\f\362_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\362_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210219([t\200\000\000\000\000\000\f\363_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\363_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210220([t\200\000\000\000\000\000\f\364_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\364_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210221([t\200\000\000\000\000\000\f\365_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\365_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210222([t\200\000\000\000\000\000\f\366_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\366_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210223([t\200\000\000\000\000\000\f\367_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\367_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210224([t\200\000\000\000\000\000\f\370_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\370_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210225([t\200\000\000\000\000\000\f\371_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\371_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210226([t\200\000\000\000\000\000\f\372_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\372_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210227([t\200\000\000\000\000\000\f\373_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\373_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210228([t\200\000\000\000\000\000\f\374_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\f\374_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210301([t\200\000\000\000\000\000\030\327_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\327_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210302([t\200\000\000\000\000\000\030\331_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\331_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210303([t\200\000\000\000\000\000\030\333_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\333_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210304([t\200\000\000\000\000\000\030\344_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\344_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210305([t\200\000\000\000\000\000\030\346_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\346_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210306([t\200\000\000\000\000\000\030\350_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\350_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210307([t\200\000\000\000\000\000\030\352_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\352_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210308([t\200\000\000\000\000\000\030\354_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\354_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210309([t\200\000\000\000\000\000\030\356_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\356_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210310([t\200\000\000\000\000\000\030\360_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\360_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210311([t\200\000\000\000\000\000\030\363_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\363_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210312([t\200\000\000\000\000\000\030\365_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\365_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210313([t\200\000\000\000\000\000\030\367_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\367_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210314([t\200\000\000\000\000\000\030\371_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\371_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210315([t\200\000\000\000\000\000\030\373_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\373_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210316([t\200\000\000\000\000\000\030\375_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\375_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210317([t\200\000\000\000\000\000\030\377_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\030\377_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210318([t\200\000\000\000\000\000\031\001_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\001_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210319([t\200\000\000\000\000\000\031\003_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\003_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210320([t\200\000\000\000\000\000\031\005_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\005_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210321([t\200\000\000\000\000\000\031\a_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\a_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210322([t\200\000\000\000\000\000\031\t_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\t_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210323([t\200\000\000\000\000\000\031\v_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\v_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210324([t\200\000\000\000\000\000\031\r_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\r_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210325([t\200\000\000\000\000\000\031\017_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\017_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210326([t\200\000\000\000\000\000\031\021_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\021_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210327([t\200\000\000\000\000\000\031\023_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\023_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210328([t\200\000\000\000\000\000\031\025_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\025_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210329([t\200\000\000\000\000\000\031\027_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\027_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210330([t\200\000\000\000\000\000\031\031_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\031_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

TiFlash CoprocessorRDD{[table: inspector_batch_sessions] TableScan, Columns: dt@TIMESTAMP, Residual Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], PushDown Filter: [dt@TIMESTAMP EQUAL 2021-02-22 00:00:00.0], KeyRange: [ partition: dt_20210331([t\200\000\000\000\000\000\031\033_r\000\000\000\000\000\000\000\000], [t\200\000\000\000\000\000\031\033_s\000\000\000\000\000\000\000\000])], Aggregates: , startTs: 423271727502983179}

]

1 个赞

请问能分别查下 2021-02-21 和02021-02-23 的结果吗?

查 tispark 的那天数据条数吗?

tidb 执行
mysql> select count(1) from inspector_batch_sessions where dt = ‘2021-02-21’;

±---------+

| count(1) |

±---------+

| 851216 |

±---------+

1 row in set (0.08 sec)

mysql> select count(1) from inspector_batch_sessions where dt = ‘02021-02-23’;

±---------+

| count(1) |

±---------+

| 644511 |

±---------+

1 row in set (0.08 sec)

mysql> select count(1) from inspector_batch_sessions where dt = ‘2021-02-23’;

±---------+

| count(1) |

±---------+

| 644511 |

±---------+

1 row in set (0.07 sec)

tispark 执行
scala> spark.sql(“select count(1) from tidb_tb_inspect.inspector_batch_sessions where dt = ‘2021-02-21’”).show
±-------+
|count(1)|
±-------+
| 52326|
±-------+

scala> spark.sql(“select count(1) from tidb_tb_inspect.inspector_batch_sessions where dt = ‘2021-02-23’”).show
±-------+
|count(1)|
±-------+
| 4884|
±-------+

scala> spark.sql(“select count(1) from tidb_tb_inspect.inspector_batch_sessions where dt = ‘02021-02-23’”).show
±-------+
|count(1)|
±-------+
| 0|
±-------+

1 个赞

select count(1) from inspector_batch_sessions where dt = date’2021-02-22’;
能否用上面的sql在tispark里执行一下?

1 个赞

scala> spark.sql(“select count(1) from tidb_tb_inspect.inspector_batch_sessions where dt = date’2021-02-22’”).show
±-------+
|count(1)|
±-------+
| 20968|
±-------+

不好意思,刚才语句有点错误

select count(1) from inspector_batch_sessions where to_date(dt) = date’2021-02-22’;

麻烦再试一下,感谢

scala> spark.sql(“select count(1) from tidb_tb_inspect.inspector_batch_sessions where to_date(dt) = date’2021-02-22’”).show
±-------+
|count(1)|
±-------+
| 580596|
±-------+

to_date 这个算子会导致下推的时候会scan 全表吧

目前 tispark 还不支持这种类型分区的下推,RANGE(UNIX_TIMESTAMP(dt))

https://github.com/pingcap/tispark/blob/master/docs/userguide.md#reading-partition-table-from-tidb

问下之后会针对这种做下推优化吗? 或者说有什么可以使用tispark 这种情况
RANGE(UNIX_TIMESTAMP(dt)) 下推到分区的解决办法吗? 目前我们使用tispark查询数据会由于分区越来越多会变慢

收到需求,后续我们安排开发支持一下

tispark 在使用 --conf spark.tispark.isolation_read_engines=tikv 时能查询正确结果, 那个差距的是 tiflash 模式
问下这个是什么原因?
scala> spark.sql(“select count(1) from tidb_tb_inspect.inspector_batch_sessions where dt = ‘2021-02-22’”).show
±-------+
|count(1)|
±-------+
| 580596|
±-------+

这个应该是tispark的问题,可以先用上面的方法绕过,后面有进展我在更新,多谢。

好的, 多谢, 我目前是通过 dt >= ‘2021-02-22’ AND dt < ‘2021-02-23’ 绕过的, 只是会全表scan