Importing data with Lightning on a Kubernetes cluster fails

To help resolve issues more efficiently, please provide the following information; a clear problem description gets answered faster:

【TiDB version】
v4.0.10
【Problem description】

Importing data with Lightning on a Kubernetes cluster fails.
It's probably because the password isn't configured, but I don't know how to configure it.


The configuration file is as follows:

# Default values for tidb-lightning.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

# timezone is the default system timezone
timezone: UTC

image: pingcap/tidb-lightning:v4.0.10
imagePullPolicy: IfNotPresent

imagePullSecrets: []

service:
  type: NodePort

# failFast causes the lightning pod to fail when any error happens.
# When disabled, the lightning pod keeps running after an error to allow manual
# intervention; users have to check the logs to see the job status.
failFast: true

dataSource:
  # For a local source, nodeName should be the label value of kubernetes.io/hostname.
  local:
    nodeName: node4
    hostPath: /apps/data/tpcc/csv50G
  # The backup data is on a PVC which is exported and unarchived from tidb-backup or scheduled backup.
  # Note: when using this mode, lightning needs to be deployed in the same namespace as the PVC,
  # and targetTidbCluster.namespace needs to be configured explicitly.
  adhoc: {}
    # pvcName: tidb-cluster-scheduled-backup
    # backupName: scheduled-backup-20190822-041004
  remote: {}
    # rcloneImage: pingcap/tidb-cloud-backup:20200229
    # storageClassName: local-storage
    # storage: 100Gi
    # secretName: cloud-storage-secret
    # path: s3:bench-data-us/sysbench/sbtest_16_1e7.tar.gz
    # directory supports downloading all files in a remote directory; it shadows dataSource.remote.path if present
    # directory: s3:bench-data-us
    # If rcloneConfig is configured, then secretName will be ignored.
    # rcloneConfig should only be used for cases where no sensitive
    # information needs to be configured, e.g. the configuration below,
    # where the Pod gets the credentials from the infrastructure.
    # rcloneConfig: |
    #   [s3]
    #   type = s3
    #   provider = AWS
    #   env_auth = true
    #   region = us-west-2

targetTidbCluster:
  name: a8c8a834c17
  # namespace is the target tidb cluster namespace; it can be omitted if lightning is
  # deployed in the same namespace as the target tidb cluster.
  namespace: "cidcucd8e01a325ea4eb0bcf0b83f973d5d9f"
  user: root
  # If secretName and secretUserKey are set, user is ignored and the user in the
  # secretName is used by lightning.
  # If secretName and secretPwdKey are set, the password in the secretName is used
  # by lightning.
  # tls="false"

tlsCluster: {}
  # enabled: true

# Whether to enable the TLS connection with the TiDB MySQL protocol port.
# If enabled, a Secret named tlsClientSecretName (if specified) or ${targetTidbCluster.name}-tidb-client-secret must exist.
# To create this Secret: kubectl create secret generic ${targetTidbCluster.name}-tidb-client-secret --namespace= --from-file=tls.crt=<path/to/tls.crt> --from-file=tls.key=<path/to/tls.key> --from-file=ca.crt=<path/to/ca.crt>
tlsClient: {}
  # enabled: true
  # tlsClientSecretName: ${targetTidbCluster.name}-tidb-client-secret

resources:
  limits:
    cpu: 4000m
    memory: 32Gi
  requests:
    cpu: 4000m
    memory: 32Gi

nodeSelector: {}
  # kubernetes.io/hostname: node4

annotations: {}

tolerations: []
affinity: {}

# The delivery backend used to import data (valid options include importer, local and tidb).
# If set to local, then the following sortedKV should be set.
backend: local

# For the local backend, an extra PV is needed for local KV sorting.
sortedKV:
  storageClassName: lvm-hostpath
  storage: 100Gi

# Specify a Service Account for lightning
serviceAccount:

# For TiDB Lightning v3.0.18+ or v4.0.3+, if you want to log to stdout, set file = "-".
# If you want to store the checkpoint under the source data directory, set CHECKPOINT_USE_DATA_DIR as the prefix of dsn.
# If you do not update CHECKPOINT_USE_DATA_DIR in the following dsn field, it is replaced automatically in the startup script with the following rules:
#   for a local data source, CHECKPOINT_USE_DATA_DIR is replaced by hostPath;
#   for an adhoc data source, CHECKPOINT_USE_DATA_DIR is replaced by the path to backupName (pvcName if backupName is not specified) in the pvcName PVC;
#   for a remote data source, CHECKPOINT_USE_DATA_DIR is replaced by the path of the backup data in the PVC requested by this chart.
config: |
  [lightning]
  level = "info"
  file = "-"
  [checkpoint]
  enable = true
  driver = "file"
  dsn = "CHECKPOINT_USE_DATA_DIR/tidb_lightning_checkpoint.pb"
  keep-after-success = false
  [mydumper]
  no-schema = true
  batch-size = 21_474_836_480
  strict-format = true
  max-region-size = 33_554_432
  filter = ['*.*']
  [mydumper.csv]
  separator = ','
  delimiter = ''
  header = false
  not-null = false
  null = 'NULL'
  backslash-escape = true
  trim-last-separator = false
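
On the password question: judging from the chart comments in this values file, the TiDB user and password are normally supplied through a Kubernetes Secret referenced by secretName / secretUserKey / secretPwdKey under targetTidbCluster, rather than written into values.yaml in plain text. A minimal sketch, assuming a Secret named tidb-lightning-credential (that Secret name, the key names and the password placeholder below are illustrative, not taken from the original post):

# Create a Secret holding the TiDB user and password (placeholder values):
kubectl create secret generic tidb-lightning-credential \
  --namespace=cidcucd8e01a325ea4eb0bcf0b83f973d5d9f \
  --from-literal=user=root \
  --from-literal=password=<tidb-root-password>

# Then reference it in values.yaml under targetTidbCluster:
targetTidbCluster:
  name: a8c8a834c17
  namespace: "cidcucd8e01a325ea4eb0bcf0b83f973d5d9f"
  secretName: tidb-lightning-credential
  secretUserKey: user
  secretPwdKey: password

Using a Secret this way also keeps the credential out of the values file and out of version control.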



Judging from the configuration, you're using local mode, right?
https://docs.pingcap.com/zh/tidb-in-kubernetes/stable/restore-data-using-tidb-lightning#本地模式
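
For reference, local mode there means deploying Lightning with the tidb-lightning Helm chart and a values file like the one above. A rough sketch of the deployment commands, assuming Helm 3 and the PingCAP chart repository (the release name, values file name and chart version below are placeholders):

helm repo add pingcap https://charts.pingcap.org/
helm repo update
helm install tidb-lightning pingcap/tidb-lightning \
  --namespace=cidcucd8e01a325ea4eb0bcf0b83f973d5d9f \
  --version=<chart-version> \
  -f tidb-lightning-values.yaml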

Got it working. The problem was that the TiDB address and password weren't configured.

:call_me_hand:
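
In case it helps anyone hitting the same issue: assuming the chart derives the TiDB address from targetTidbCluster.name and namespace (i.e. the <name>-tidb service in that namespace), it's worth confirming that service exists and, as the failFast comment in the values file notes, reading the Lightning pod logs to see the job status. A sketch with placeholder names:

# Confirm the target cluster's TiDB service is reachable:
kubectl get svc -n cidcucd8e01a325ea4eb0bcf0b83f973d5d9f | grep tidb

# Find the lightning pod and follow its logs:
kubectl get pods -n cidcucd8e01a325ea4eb0bcf0b83f973d5d9f | grep lightning
kubectl logs -n cidcucd8e01a325ea4eb0bcf0b83f973d5d9f <lightning-pod-name> -f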


I suggest updating this page:
https://book.tidb.io/session2/chapter1/tidb-operator-lightning.html
A lot of what's written there is unclear.

Could you say specifically which parts are unclear?
https://docs.pingcap.com/zh/tidb/dev/tidb-lightning-configuration

That page is for non-Kubernetes deployments; the Kubernetes docs don't cover this.

We recommend treating the official documentation as the primary reference; other docs may not be kept up to date.

https://book.tidb.io/session2/chapter1/tidb-operator-lightning.html How is this page currently maintained? It really is missing how to operate under Kubernetes, and it's hard to connect the Lightning configuration docs with using Lightning on Kubernetes. @yilong

@handlerww could you help out with the Lightning on Kubernetes docs?