我們?cè)跀?shù)據(jù)庫(kù)運(yùn)維過程中難免會(huì)涉及到需要對(duì)文本數(shù)據(jù)進(jìn)行處理,并導(dǎo)入到數(shù)據(jù)庫(kù)中,本文整理了一些導(dǎo)入導(dǎo)出時(shí)常見的場(chǎng)景進(jìn)行示例演示。
文章后續(xù)示例均使用以下命令導(dǎo)出的 csv 格式樣例數(shù)據(jù)(以 , 逗號(hào)做分隔符,以 " 雙引號(hào)作為界定符)
-- 導(dǎo)出基礎(chǔ)參數(shù) select * into outfile '/data/mysql/3306/tmp/employees.txt' character set utf8mb4 fields terminated by ',' enclosed by '"' lines terminated by '\n' from employees.employees limit 10; -- 導(dǎo)入基礎(chǔ)參數(shù) load data infile '/data/mysql/3306/tmp/employees.txt' replace into table demo.emp character set utf8mb4 fields terminated by ',' enclosed by '"' lines terminated by '\n' ...
以下為示例數(shù)據(jù),表結(jié)構(gòu)及對(duì)應(yīng)關(guān)系信息
-- 導(dǎo)出的文件數(shù)據(jù)內(nèi)容 [root@10-186-61-162 tmp]# cat employees.txt "10001","1953-09-02","Georgi","Facello","M","1986-06-26" "10002","1964-06-02","Bezalel","Simmel","F","1985-11-21" "10003","1959-12-03","Parto","Bamford","M","1986-08-28" "10004","1954-05-01","Chirstian","Koblick","M","1986-12-01" "10005","1955-01-21","Kyoichi","Maliniak","M","1989-09-12" "10006","1953-04-20","Anneke","Preusig","F","1989-06-02" "10007","1957-05-23","Tzvetan","Zielinski","F","1989-02-10" "10008","1958-02-19","Saniya","Kalloufi","M","1994-09-15" "10009","1952-04-19","Sumant","Peac","F","1985-02-18" "10010","1963-06-01","Duangkaew","Piveteau","F","1989-08-24" -- 示例表結(jié)構(gòu) SQL > desc demo.emp; +-------------+---------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------------+---------------+------+-----+---------+-------+ | emp_no | int | NO | PRI | NULL | | | birth_date | date | NO | | NULL | | | first_name | varchar(16) | NO | | NULL | | | last_name | varchar(16) | NO | | NULL | | | fullname | varchar(32) | YES | | NULL | | -- 表新增字段,導(dǎo)出數(shù)據(jù)文件中不存在 | gender | enum('M','F') | NO | | NULL | | | hire_date | date | NO | | NULL | | | modify_date | datetime | YES | | NULL | | -- 表新增字段,導(dǎo)出數(shù)據(jù)文件中不存在 | delete_flag | char(1) | YES | | NULL | | -- 表新增字段,導(dǎo)出數(shù)據(jù)文件中不存在 +-------------+---------------+------+-----+---------+-------+ -- 導(dǎo)出的數(shù)據(jù)與字段對(duì)應(yīng)關(guān)系 emp_no birth_date first_name last_name gender hire_date "10001" "1953-09-02" "Georgi" "Facello" "M" "1986-06-26" "10002" "1964-06-02" "Bezalel" "Simmel" "F" "1985-11-21" "10003" "1959-12-03" "Parto" "Bamford" "M" "1986-08-28" "10004" "1954-05-01" "Chirstian" "Koblick" "M" "1986-12-01" "10005" "1955-01-21" "Kyoichi" "Maliniak" "M" "1989-09-12" "10006" "1953-04-20" "Anneke" "Preusig" "F" "1989-06-02" "10007" "1957-05-23" "Tzvetan" "Zielinski" "F" "1989-02-10" "10008" "1958-02-19" "Saniya" "Kalloufi" "M" "1994-09-15" "10009" "1952-04-19" "Sumant" "Peac" "F" "1985-02-18" "10010" "1963-06-01" "Duangkaew" "Piveteau" "F" "1989-08-24"
場(chǎng)景1. LOAD 文件中的字段比數(shù)據(jù)表中的字段多
只需要文本文件中部分?jǐn)?shù)據(jù)導(dǎo)入到數(shù)據(jù)表中
-- 臨時(shí)創(chuàng)建2個(gè)字段的表結(jié)構(gòu) SQL > create table emp_tmp select emp_no,hire_date from emp; SQL > desc emp_tmp; +-----------+------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-----------+------+------+-----+---------+-------+ | emp_no | int | NO | | NULL | | | hire_date | date | NO | | NULL | | +-----------+------+------+-----+---------+-------+ -- 導(dǎo)入數(shù)據(jù)語(yǔ)句 load data infile '/data/mysql/3306/tmp/employees.txt' replace into table demo.emp_tmp character set utf8mb4 fields terminated by ',' enclosed by '"' lines terminated by '\n' (@C1,@C2,@C3,@C4,@C5,@C6) -- 該部分對(duì)應(yīng)employees.txt文件中6列數(shù)據(jù) -- 只對(duì)導(dǎo)出數(shù)據(jù)中指定的2個(gè)列與表中字段做匹配,mapping關(guān)系指定的順序不影響導(dǎo)入結(jié)果 set hire_date=@C6, emp_no=@C1; -- 導(dǎo)入數(shù)據(jù)結(jié)果示例 SQL > select * from emp_tmp; +--------+------------+ | emp_no | hire_date | +--------+------------+ | 10001 | 1986-06-26 | | 10002 | 1985-11-21 | | 10003 | 1986-08-28 | | 10004 | 1986-12-01 | | 10005 | 1989-09-12 | | 10006 | 1989-06-02 | | 10007 | 1989-02-10 | | 10008 | 1994-09-15 | | 10009 | 1985-02-18 | | 10010 | 1989-08-24 | +--------+------------+ 10 rows in set (0.0016 sec)
場(chǎng)景 2. LOAD 文件中的字段比數(shù)據(jù)表中的字段少
表字段不僅包含文本文件中所有數(shù)據(jù),還包含了額外的字段
-- 導(dǎo)入數(shù)據(jù)語(yǔ)句 load data infile '/data/mysql/3306/tmp/employees.txt' replace into table demo.emp character set utf8mb4 fields terminated by ',' enclosed by '"' lines terminated by '\n' (@C1,@C2,@C3,@C4,@C5,@C6) -- 該部分對(duì)應(yīng)employees.txt文件中6列數(shù)據(jù) -- 將文件中的字段與表中字段做mapping對(duì)應(yīng),表中多出的字段不做處理 set emp_no=@C1, birth_date=@C2, first_name=@C3, last_name=@C4, gender=@C5, hire_date=@C6;
場(chǎng)景3. LOAD 生成自定義字段數(shù)據(jù)
從場(chǎng)景 2 的驗(yàn)證可以看到,emp 表中新增的字段
fullname,modify_date,delete_flag
字段在導(dǎo)入時(shí)并未做處理,被置為了 NULL 值,如果需要對(duì)其進(jìn)行處理,可在 LOAD 時(shí)通過MySQL支持的函數(shù)
或給定固定值
自行定義數(shù)據(jù),對(duì)于文件中存在的字段也可做函數(shù)處理,結(jié)合導(dǎo)入導(dǎo)出,實(shí)現(xiàn)簡(jiǎn)單的 ETL 功能,如下所示:
-- 導(dǎo)入數(shù)據(jù)語(yǔ)句 load data infile '/data/mysql/3306/tmp/employees.txt' replace into table demo.emp character set utf8mb4 fields terminated by ',' enclosed by '"' lines terminated by '\n' (@C1,@C2,@C3,@C4,@C5,@C6) -- 該部分對(duì)應(yīng)employees.txt文件中6列數(shù)據(jù) -- 以下部分明確對(duì)表中字段與數(shù)據(jù)文件中的字段做Mapping關(guān)系,不存在的數(shù)據(jù)通過函數(shù)處理生成(也可設(shè)置為固定值) set emp_no=@C1, birth_date=@C2, first_name=upper(@C3), -- 將導(dǎo)入的數(shù)據(jù)轉(zhuǎn)為大寫 last_name=lower(@C4), -- 將導(dǎo)入的數(shù)據(jù)轉(zhuǎn)為小寫 fullname=concat(first_name,' ',last_name), -- 對(duì)first_name和last_name做拼接 gender=@C5, hire_date=@C6 , modify_date=now(), -- 生成當(dāng)前時(shí)間數(shù)據(jù) delete_flag=if(hire_date'1988-01-01','Y','N'); -- 對(duì)需要生成的值基于某一列做條件運(yùn)算
場(chǎng)景4. LOAD 定長(zhǎng)數(shù)據(jù)
定長(zhǎng)數(shù)據(jù)的特點(diǎn)如下所示,可以使用函數(shù)取出字符串中固定長(zhǎng)度來生成指定列數(shù)據(jù)
SQL > select c1 as sample_data, substr(c1,1,3) as c1, substr(c1,4,3) as c2, substr(c1,7,2) as c3, substr(c1,9,5) as c4, substr(c1,14,3) as c5, substr(c1,17,3) as c6 from t1 *************************** 1. row *************************** sample_data: ABC余振興CDMySQLEFG數(shù)據(jù)庫(kù) c1: ABC c2: 余振興 c3: CD c4: MySQL c5: EFG c6: 數(shù)據(jù)庫(kù)
定長(zhǎng)數(shù)據(jù)導(dǎo)入需要明確每列數(shù)據(jù)占用的字符個(gè)數(shù),以下直接使用 rpad 對(duì)現(xiàn)有的表數(shù)據(jù)填充空格的方式生成定長(zhǎng)數(shù)據(jù)用作示例使用
-- 生成定長(zhǎng)數(shù)據(jù) SQL > select concat(rpad(emp_no,10,' '), rpad(birth_date,19,' '), rpad(first_name,14,' '), rpad(last_name,16,' '), rpad(gender,2,' '), rpad(hire_date,19,' ')) as fixed_length_data from employees.employees limit 10; +----------------------------------------------------------------------------------+ | fixed_length_data | +----------------------------------------------------------------------------------+ | 10001 1953-09-02 Georgi Facello M 1986-06-26 | | 10002 1964-06-02 Bezalel Simmel F 1985-11-21 | | 10003 1959-12-03 Parto Bamford M 1986-08-28 | | 10004 1954-05-01 Chirstian Koblick M 1986-12-01 | | 10005 1955-01-21 Kyoichi Maliniak M 1989-09-12 | | 10006 1953-04-20 Anneke Preusig F 1989-06-02 | | 10007 1957-05-23 Tzvetan Zielinski F 1989-02-10 | | 10008 1958-02-19 Saniya Kalloufi M 1994-09-15 | | 10009 1952-04-19 Sumant Peac F 1985-02-18 | | 10010 1963-06-01 Duangkaew Piveteau F 1989-08-24 | +----------------------------------------------------------------------------------+ -- 導(dǎo)出定長(zhǎng)數(shù)據(jù) select concat(rpad(emp_no,10,' '), rpad(birth_date,19,' '), rpad(first_name,14,' '), rpad(last_name,16,' '), rpad(gender,2,' '), rpad(hire_date,19,' ')) as fixed_length_data into outfile '/data/mysql/3306/tmp/employees_fixed.txt' character set utf8mb4 lines terminated by '\n' from employees.employees limit 10; -- 導(dǎo)出數(shù)據(jù)示例 [root@10-186-61-162 tmp]# cat employees_fixed.txt 10001 1953-09-02 Georgi Facello M 1986-06-26 10002 1964-06-02 Bezalel Simmel F 1985-11-21 10003 1959-12-03 Parto Bamford M 1986-08-28 10004 1954-05-01 Chirstian Koblick M 1986-12-01 10005 1955-01-21 Kyoichi Maliniak M 1989-09-12 10006 1953-04-20 Anneke Preusig F 1989-06-02 10007 1957-05-23 Tzvetan Zielinski F 1989-02-10 10008 1958-02-19 Saniya Kalloufi M 1994-09-15 10009 1952-04-19 Sumant Peac F 1985-02-18 10010 1963-06-01 Duangkaew Piveteau F 1989-08-24 -- 導(dǎo)入定長(zhǎng)數(shù)據(jù) load data infile '/data/mysql/3306/tmp/employees_fixed.txt' replace into table demo.emp character set utf8mb4 fields terminated by ',' enclosed by '"' lines terminated by '\n' (@row) -- 對(duì)一行數(shù)據(jù)定義為一個(gè)整體 set emp_no = trim(substr(@row,1,10)),-- 使用substr取前10個(gè)字符,并去除頭尾空格數(shù)據(jù) birth_date = trim(substr(@row,11,19)),-- 后續(xù)字段以此類推 first_name = trim(substr(@row,30,14)), last_name = trim(substr(@row,44,16)), fullname = concat(first_name,' ',last_name), -- 對(duì)first_name和last_name做拼接 gender = trim(substr(@row,60,2)), hire_date = trim(substr(@row,62,19)), modify_date = now(), delete_flag = if(hire_date'1988-01-01','Y','N'); -- 對(duì)需要生成的值基于某一列做條件運(yùn)算
1.默認(rèn)情況下導(dǎo)入的順序以文本文件 列-從左到右,行-從上到下
的順序?qū)?/p>
2.如果表結(jié)構(gòu)和文本數(shù)據(jù)不一致,建議將文本文件中的各列依次順序編號(hào)并與表中字段建立 mapping 關(guān)系,以防數(shù)據(jù)導(dǎo)入到錯(cuò)誤的字段
3.對(duì)于待導(dǎo)入的文本文件較大的場(chǎng)景,建議將文件 按行拆分
為多個(gè)小文件,如用 split 拆分
4.對(duì)文件導(dǎo)入后建議執(zhí)行以下語(yǔ)句驗(yàn)證導(dǎo)入的數(shù)據(jù)是否有 Warning
,ERROR
以及導(dǎo)入的數(shù)據(jù)量
GET DIAGNOSTICS @p1=NUMBER,@p2=ROW_COUNT;
select @p1 AS ERROR_COUNT,@p2 as ROW_COUNT;
5.文本文件數(shù)據(jù)與表結(jié)構(gòu)存在過大的差異或數(shù)據(jù)需要做清洗轉(zhuǎn)換,建議還是用專業(yè)的 ETL 工具或先粗略導(dǎo)入 MySQL 中再進(jìn)行加工轉(zhuǎn)換處理
以上就是 MySQL Load Data 數(shù)據(jù)的多種用法的詳細(xì)內(nèi)容,更多關(guān)于MySQL Load Data 的用法的資料請(qǐng)關(guān)注腳本之家其它相關(guān)文章!,希望大家以后多多支持腳本之家!
標(biāo)簽:溫州 定西 三明 揚(yáng)州 福州 山西 阿里 無錫
巨人網(wǎng)絡(luò)通訊聲明:本文標(biāo)題《MySQL數(shù)據(jù)庫(kù) Load Data 多種用法》,本文關(guān)鍵詞 MySQL,數(shù)據(jù)庫(kù),Load,Data,多種,;如發(fā)現(xiàn)本文內(nèi)容存在版權(quán)問題,煩請(qǐng)?zhí)峁┫嚓P(guān)信息告之我們,我們將及時(shí)溝通與處理。本站內(nèi)容系統(tǒng)采集于網(wǎng)絡(luò),涉及言論、版權(quán)與本站無關(guān)。