TiFlash alert: 【TiDB ERR】[emergency]TiFlash_schema_error

TiDB version:
5.7.25-TiDB-v5.2.1

Received this TiDB alert:
【TiDB ERR】[emergency]TiFlash_schema_error

Errors in the TiFlash log:
2022.04.02 15:50:34.870170 [ 10 ] SchemaSyncService: DB::SchemaSyncService::SchemaSyncService(DB::Context&)::<lambda()>: Sync schemas failed by basic_string::_M_replace_aux
2022.04.02 15:50:34.900537 [ 11 ] SchemaSyncer: apply diff meets exception : DB::TiFlashException: miss table in TiKV : 311
stack is
  0. bin/tiflash/tiflash(StackTrace::StackTrace()+0x16) [0x3921fe6]
  1. bin/tiflash/tiflash(DB::TiFlashException::TiFlashException(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, DB::TiFlashError const&)+0x35) [0x4160365]
  2. bin/tiflash/tiflash(DB::SchemaBuilder<DB::SchemaGetter, DB::SchemaNameMapper>::applyAlterTable(std::shared_ptr<TiDB::DBInfo>, long)+0x171) [0x7e7f7f1]
  3. bin/tiflash/tiflash(DB::SchemaBuilder<DB::SchemaGetter, DB::SchemaNameMapper>::applyDiff(DB::SchemaDiff const&)+0x1c4) [0x7e7fb64]
  4. bin/tiflash/tiflash(DB::TiDBSchemaSyncer::tryLoadSchemaDiffs(DB::SchemaGetter&, long, DB::Context&)+0x1e3) [0x7ae42d3]
  5. bin/tiflash/tiflash(DB::TiDBSchemaSyncer::syncSchemas(DB::Context&)+0x3fa) [0x7ae4e1a]
  6. bin/tiflash/tiflash(DB::SchemaSyncService::syncSchemas()+0x2f) [0x7ad2b7f]
  7. bin/tiflash/tiflash() [0x7ad52e1]
  8. bin/tiflash/tiflash(DB::BackgroundProcessingPool::threadFunction()+0x947) [0x7991427]
  9. bin/tiflash/tiflash() [0x8e571bf]
  10. /lib64/libpthread.so.0(+0x7ea5) [0x7f1d48e67ea5]
  11. /lib64/libc.so.6(clone+0x6d) [0x7f1d4888e9fd]
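
When the log is large, it can help to pull out just the schema-sync failures before sharing it. The following is a minimal sketch, not an official tool: the function name `summarize_schema_errors` is made up for illustration, and the two regular expressions only cover the message shapes quoted in this thread.

```python
import re
from collections import Counter

# Matches the "miss table" message quoted above and captures the table ID.
MISS_TABLE = re.compile(r"miss table in TiKV : (\d+)")
# Matches the generic sync-failure line and captures the reason text.
SYNC_FAILED = re.compile(r"Sync schemas failed by (.+)$")


def summarize_schema_errors(lines):
    """Count missing table IDs and sync-failure reasons in TiFlash log lines."""
    missing = Counter()
    reasons = Counter()
    for line in lines:
        m = MISS_TABLE.search(line)
        if m:
            missing[int(m.group(1))] += 1
        m = SYNC_FAILED.search(line)
        if m:
            reasons[m.group(1).strip()] += 1
    return missing, reasons


if __name__ == "__main__":
    # The two log lines quoted in this thread, used as sample input.
    sample = [
        "2022.04.02 15:50:34.870170 [ 10 ] SchemaSyncService: "
        "DB::SchemaSyncService::SchemaSyncService(DB::Context&)::<lambda()>: "
        "Sync schemas failed by basic_string::_M_replace_aux",
        "2022.04.02 15:50:34.900537 [ 11 ] SchemaSyncer: apply diff meets "
        "exception : DB::TiFlashException: miss table in TiKV : 311",
    ]
    missing, reasons = summarize_schema_errors(sample)
    print(dict(missing))
    print(dict(reasons))
```

For a real file, the same function can be fed `open("tiflash.log", encoding="utf-8", errors="replace")`. The table IDs it reports are the ones to compare against the tables that actually exist in the cluster.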

The application team reports that queries fail:
MySQL > select * from xxxx limit 1;
ERROR 1105 (HY000): basic_string::_M_replace_aux

Has there been any DDL recently that renamed a table?

No, nothing like that was run.

Could you post the TiFlash and TiDB logs from the time of the alert?

What I posted above is what the TiFlash log shows. The TiDB log has no abnormal errors.

Could you upload the log files themselves? They would give more context.

I didn't save them. I took TiFlash offline first and tried to redeploy it, but the deployment failed as well. The TiFlash log shows:
2022.04.02 17:06:09.511406 [ 1 ] Application: The configuration “path” is deprecated. Check [storage] section for new style.
2022.04.02 17:06:15.572535 [ 1 ] Application: basic_string::_M_replace_aux
2022.04.02 17:06:31.640767 [ 1 ] Application: The configuration “path” is deprecated. Check [storage] section for new style.
2022.04.02 17:06:37.881039 [ 1 ] Application: basic_string::_M_replace_aux
2022.04.02 17:06:53.973737 [ 1 ] Application: The configuration “path” is deprecated. Check [storage] section for new style.
2022.04.02 17:07:00.214609 [ 1 ] Application: basic_string::_M_replace_aux
2022.04.02 17:07:16.357215 [ 1 ] Application: The configuration “path” is deprecated. Check [storage] section for new style.
2022.04.02 17:07:22.595234 [ 1 ] Application: basic_string::_M_replace_aux
2022.04.02 17:07:38.674409 [ 1 ] Application: The configuration “path” is deprecated. Check [storage] section for new style.
2022.04.02 17:07:44.909858 [ 1 ] Application: basic_string::_M_replace_aux

Could we take a look at tiflash.log?

tiflash.log (3.9 MB) Hello, this is the error log from the TiFlash deployment I just attempted.

From the log, I suspect the table db_252.t_330 is the problem. Would you be able to share its CREATE TABLE statement?

There is no such database or table.

What about wdapk.apk_icon_sample? :joy:

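Yes, that table exists. The application team confirmed they had just created it at the time. Is something wrong with it? I also noticed something odd. I deployed another cluster of the same version, and the table has the column check_time datetime NOT NULL DEFAULT '0000-00-00 00:00:00'. The newly deployed cluster rejects the statement because the default value cannot be '0000', while the old cluster still accepts it.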

CREATE TABLE apk_icon_sample (
  id int(16) NOT NULL AUTO_INCREMENT,
  apk_md5 varchar(32) NOT NULL DEFAULT '',
  check_time datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  apk_remark varchar(100) NOT NULL DEFAULT '',
  apk_state varchar(100) NOT NULL DEFAULT '105',
  apk_type varchar(100) NOT NULL DEFAULT '',
  add_time datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  add_user varchar(50) NOT NULL DEFAULT '',
  update_user varchar(50) NOT NULL DEFAULT '',
  update_time datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  state varchar(10) NOT NULL DEFAULT '0',
  icon_md5 varchar(32) NOT NULL DEFAULT '',
  icon longblob NOT NULL,
  PRIMARY KEY (id) /*T![clustered_index] CLUSTERED */,
  UNIQUE KEY idx_uniq (icon_md5),
  KEY idx_update_time (update_time)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=90001

Are both the new and the old cluster on v5.2.1?

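Same version. You could deploy a nightly build and test the statement above. Also, yesterday I deployed a v5.2.2 cluster to check whether this is a version issue. I migrated the large tables that used TiFlash on the old cluster over to it and then set up TiFlash. Everything was fine for a while after that, and then TiFlash went down again.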

tiflash.log (3.9 MB)

What error did the new deployment report? Was it ERROR 1067 (42000): Invalid default value for 'check_time'?

Yes, that is the error.

Did you use the same client to create the table on the new cluster and on the old one?
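
One possible reason the client matters is that different clients or connection settings can leave the session with a different sql_mode. In MySQL, a zero-date default such as '0000-00-00 00:00:00' is rejected when NO_ZERO_DATE is enabled together with a strict mode. The sketch below encodes only that MySQL rule. Whether TiDB v5.2.1 applies exactly the same rule is an assumption to verify, and `rejects_zero_date_default` is a hypothetical helper name. The input string would be the output of `SELECT @@sql_mode;` run from each client.

```python
# Strict modes as named in MySQL's sql_mode documentation.
STRICT_MODES = {"STRICT_TRANS_TABLES", "STRICT_ALL_TABLES"}


def rejects_zero_date_default(sql_mode: str) -> bool:
    """Return True if this sql_mode would reject DEFAULT '0000-00-00 00:00:00'.

    Assumes the MySQL rule: NO_ZERO_DATE plus at least one strict mode.
    """
    modes = {m.strip().upper() for m in sql_mode.split(",") if m.strip()}
    return "NO_ZERO_DATE" in modes and bool(modes & STRICT_MODES)


if __name__ == "__main__":
    # Example values only; paste the real output of SELECT @@sql_mode here.
    with_no_zero_date = "ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_DATE"
    without_no_zero_date = "ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES"
    print(rejects_zero_date_default(with_no_zero_date))
    print(rejects_zero_date_default(without_no_zero_date))
```

If the two clients return different sql_mode strings, that difference alone could explain why the same CREATE TABLE succeeds on one connection and fails with ERROR 1067 on the other.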