TiDB Lightning data import fails: mydumper dir does not exist

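To speed things up, please provide the following information when asking a question. Clearly described problems are answered first.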

  • [TiDB version]: v3.1.0-beta.1
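  • [Problem description]: I am trying to use TiDB Lightning to import data quickly. Importing the first data file, A, went smoothly. To import data file B next, I changed both data_source_dir in group_vars/lightning_server.yml and data-source-dir in tidb-lightning.toml on the server where Lightning runs, and confirmed that the new path is correct. When I then ran Lightning again, it failed with the following error: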

[2020/03/26 23:38:19.974 +08:00] [WARN] [config.go:288] ["currently only per-task configuration can be applied, global configuration changes can only be made on startup"] ["global config changes"="[lightning.file,lightning.level,lightning.max-backups,lightning.max-days,lightning.max-size,lightning.pprof-port,tidb.log-level]"]

[2020/03/26 23:38:19.975 +08:00] [INFO] [version.go:48] ["Welcome to lightning"] ["Release Version"=v3.1.0-beta.1] ["Git Commit Hash"=605760d1b2025d1e1a8b7d0c668c74863d7d1271] ["Git Branch"=HEAD] ["UTC Build Time"="2020-01-10 12:16:24"] ["Go Version"="go version go1.12 linux/amd64"]

[2020/03/26 23:38:19.975 +08:00] [INFO] [lightning.go:165] [cfg] [cfg="{"id":1585237099975036558,"lightning":{"table-concurrency":6,"index-concurrency":2,"region-concurrency":32,"io-concurrency":5,"check-requirements":true},"tidb":{"host":"10.12.5.233","port":4000,"user":"root","status-port":10080,"pd-addr":"10.12.5.234:2379","sql-mode":"ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION","max-allowed-packet":67108864,"distsql-scan-concurrency":100,"build-stats-concurrency":20,"index-serial-scan-concurrency":20,"checksum-table-concurrency":16},"checkpoint":{"enable":true,"schema":"tidb_lightning_checkpoint","driver":"file","keep-after-success":false},"mydumper":{"read-block-size":65536,"batch-size":107374182400,"batch-import-ratio":0,"data-source-dir":"/home/daslab/curve","no-schema":false,"character-set":"auto","csv":{"separator":",","delimiter":"\"","header":true,"trim-last-separator":false,"not-null":false,"null":"\\N","backslash-escape":true},"case-sensitive":false},"black-white-list":{"do-tables":null,"do-dbs":null,"ignore-tables":null,"ignore-dbs":["mysql","information_schema","performance_schema","sys"]},"tikv-importer":{"addr":"10.12.5.112:8287","backend":"importer","on-duplicate":"replace"},"post-restore":{"level-1-compact":false,"compact":false,"checksum":true,"analyze":true},"cron":{"switch-mode":"5m0s","log-progress":"5m0s"},"routes":null}"]

[2020/03/26 23:38:19.975 +08:00] [INFO] [lightning.go:194] ["load data source start"]

[2020/03/26 23:38:19.975 +08:00] [ERROR] [lightning.go:197] ["load data source failed"] [takeTime=109.047µs] [error="/home/daslab/curve: mydumper dir does not exist"]

[2020/03/26 23:38:19.975 +08:00] [ERROR] [main.go:59] ["tidb lightning encountered error"] [error="/home/daslab/curve: mydumper dir does not exist"] [errorVerbose="/home/daslab/curve: mydumper dir does not exist\ngithub.com/pingcap/tidb-lightning/lightning/mydump.(*mdLoaderSetup).setup\n\t/home/jenkins/agent/workspace/release_tidb_3.1/go/src/github.com/pingcap/tidb-lightning/lightning/mydump/loader.go:163\ngithub.com/pingcap/tidb-lightning/lightning/mydump.NewMyDumpLoader\n\t/home/jenkins/agent/workspace/release_tidb_3.1/go/src/github.com/pingcap/tidb-lightning/lightning/mydump/loader.go:105\ngithub.com/pingcap/tidb-lightning/lightning.(*Lightning).run\n\t/home/jenkins/agent/workspace/release_tidb_3.1/go/src/github.com/pingcap/tidb-lightning/lightning/lightning.go:196\ngithub.com/pingcap/tidb-lightning/lightning.(*Lightning).RunOnce\n\t/home/jenkins/agent/workspace/release_tidb_3.1/go/src/github.com/pingcap/tidb-lightning/lightning/lightning.go:138\nmain.main\n\t/home/jenkins/agent/workspace/release_tidb_3.1/go/src/github.com/pingcap/tidb-lightning/cmd/tidb-lightning/main.go:56\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:200\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337"]
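The error above is raised while Lightning loads the data source, so one quick thing to rule out is whether the configured directory really exists on the machine that runs tidb-lightning. The sketch below is only an illustration under some assumptions: the helper names `read_data_source_dir` and `check_data_source_dir` are made up for this post, the config is matched with a simple regular expression rather than a real TOML parser, and the check is only meaningful when run on the Lightning host as the same user that starts Lightning.

```python
import os
import re

# Simplified lookup (not a full TOML parser): matches a line such as
#   data-source-dir = "/some/path"
_DIR_RE = re.compile(r'^\s*data-source-dir\s*=\s*["\']([^"\']*)["\']', re.MULTILINE)


def read_data_source_dir(toml_text):
    """Return the first data-source-dir value found in the text, or None."""
    match = _DIR_RE.search(toml_text)
    return match.group(1) if match else None


def check_data_source_dir(toml_text):
    """Return (ok, detail): detail is the path on success, a message otherwise."""
    path = read_data_source_dir(toml_text)
    if path is None:
        return (False, "data-source-dir not found in config")
    if not os.path.isdir(path):
        return (False, path + ": not a directory on this host")
    return (True, path)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python check_dir.py conf/tidb-lightning.toml
    with open(sys.argv[1], encoding="utf-8") as f:
        print(check_data_source_dir(f.read()))
```

If this reports that the path is not a directory, the path in the config that Lightning actually reads does not match what is on disk, for example because the data was copied to a different server or the config file being edited is not the one in use.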

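Stranger still, if I rename the directory holding the data I want to import so that it matches the original data-source-dir, the log shows that the import succeeds, but the data that ends up in the database is still the data from the previous data file A.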
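When the imported data does not match the expected source, it can help to confirm which directory each run actually used. Lightning prints its effective configuration in the `[cfg]` log line, and that line includes `data-source-dir`. The following is a rough sketch, not an official tool: the function name `data_source_dirs_from_log` is invented here, and the pattern is deliberately loose so that it accepts plain quotes, backslash-escaped quotes, and the curly quotes that forum software sometimes substitutes.

```python
import re

# Loose pattern: the key name, an optional backslash plus a quote character,
# a colon, another optional backslash plus quote, then the path itself.
_CFG_DIR_RE = re.compile(
    r'data-source-dir[\\]*["“”]\s*:\s*[\\]*["“”]([^"“”\\]*)'
)


def data_source_dirs_from_log(log_text):
    """Return every data-source-dir value found in the log text, in order."""
    return _CFG_DIR_RE.findall(log_text)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python cfg_dirs.py tidb_lightning.log
    with open(sys.argv[1], encoding="utf-8") as f:
        for path in data_source_dirs_from_log(f.read()):
            print(path)
```

Comparing the printed paths across runs shows whether the second run picked up the new directory or was still started with the old one.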

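I hope you can help me with the following questions: 1) How can I import the data successfully? 2) The next time I want to import new data, what do I need to change in the configuration files?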

@ gangshen-PingCAP

Resolved.

OK. Would you mind explaining what caused the problem and how you resolved it?