Shell/Python實現Mysql讀
- 2020 年 1 月 6 日
- 筆記
文本:ttt.txt 共7774865條記錄
顯示部分文本內容如下:
"OBJECT_ID","OBJECT_NAME","CREATED"
"20","ICOL$","2013/10/9 18:23:42"
"46","I_USER1","2013/10/9 18:23:42"
"28","CON$","2013/10/9 18:23:42"
"15","UNDO$","2013/10/9 18:23:42"
"29","C_COBJ#","2013/10/9 18:23:42"
"3","I_OBJ#","2013/10/9 18:23:42"
"25","PROXY_ROLE_DATA$","2013/10/9 18:23:42"
"41","I_IND1","2013/10/9 18:23:42"
"54","I_CDEF2","2013/10/9 18:23:42"
"40","I_OBJ5","2013/10/9 18:23:42"
"26","I_PROXY_ROLE_DATA$_1","2013/10/9 18:23:42"
"17","FILE$","2013/10/9 18:23:42"
"13","UET$","2013/10/9 18:23:42"
"9","I_FILE#_BLOCK#","2013/10/9 18:23:42"
"43","I_FILE1","2013/10/9 18:23:42"
"51","I_CON1","2013/10/9 18:23:42"
"38","I_OBJ3","2013/10/9 18:23:42"
"7","I_TS#","2013/10/9 18:23:42"
"56","I_CDEF4","2013/10/9 18:23:42"
……
……
為了方便測試:
我們選取前10001行,共10000條數據(第一條為欄位名)
[11:05:12 wsdf@localhost Desktop]$ head -10001 ttt.txt > test.txt
如下是本人分別用python與shell編寫的腳本,不代表腳本執行的方法是最優的,所作的比較僅為個人主觀意見
1.Shell版
#!/bin/sh #mysql_import.sh #讀ttt.txt文件,將其中各項寫入資料庫 i=0 # ###################設定新分隔符#################### SAVEDIFS=$IFS IFS=',' #mysql連接函數 mysql_conn(){ mysql -e $1; } echo "Begin time: `date`" #顯示開始時間 ################創建資料庫myimport################# mysql -e "drop database if exists myimport;" #mysql_conn "drop database if exists myimport;" mysql -e "create database myimport;" #mysql_conn "create database myimport;" #mysql -e "show databases;" ###############對ttt.txt進行處理################### sed -e 's/"//g' -e 's#/#-#g' ttt.txt >t.txt #i=0時,讀文本第一行,創建表import_obj,設置i=1 #i=1,讀文本其他行,插入數據到import_obj中 #逐行讀數據,並通過mysql -e command插入到表中 while read ID NAME CREATED do if [ $i -eq 0 ]; then #echo $ID,$NAME,$CREATED id=$ID name=$NAME created=$CREATED mysql -e "create table if not exists myimport.import_obj(id int unsigned auto_increment primary key,$ID int unsigned unique not null,$NAME varchar(60) not null,$CREATED datetime not null);" i=1 #mysql -e "desc myimport.import_obj;" else #echo "$CREATED" #mysql -e "select str_to_date($CREATED,'%Y-%m-%d %H:%i:%s');" mysql -e "insert into myimport.import_obj($id,$name,$created) value($ID,'$NAME','$CREATED');" fi done <t.txt echo "END time: `date`" #顯示結束時間 #mysql -e "select * from myimport.import_obj;" rm -f t.txt #刪除臨時文件 IFS=$SAVEDIFS #還原分隔符
2.Python版(Python 2.7.5版本)
#!/usr/bin/python # mysql_import.py #使用MySQLdb去實現 import MySQLdb,os,time os.system("sed -e 's/"//g' -e 's#/#-#g' test.txt >t.txt"); try: print time.strftime("%Y-%m-%d %H:%M:%S",time.localtime(time.time())) #記錄開始時間 conn=MySQLdb.connect(host='127.0.0.1',user='root',passwd='zhang1992') print "conn success" except: print "conn error!" exit() cur=conn.cursor() cur.execute('drop database if exists myimport') cur.execute('create database myimport') i=0 count=0 with open('t.txt','r') as ft: for row in ft.readlines(): ID,NAME,CREATED=list(row.strip("n").split(",")) if i==0: cid=ID cname=NAME created=CREATED cur.execute("create table if not exists myimport.import_obj(id int unsigned auto_increment primary key,%s int unsigned unique not null,%s varchar(60) not null,%s datetime not null)"%(ID,NAME,CREATED)) i=1 else: sql="insert into myimport.import_obj(%s,%s,%s) value(%d,'%s','%s')"%(cid,cname,created,int(ID),NAME,CREATED) cur.execute(sql) os.system("rm -f t.txt") cur.close() conn.commit() conn.close() print time.strftime("%Y-%m-%d %H:%M:%S",time.localtime(time.time())) #記錄結束始時間
3.測試
10000條數據測試
[11:08:32 wsdf@localhost Desktop]$ bash mysql_import.sh Begin time: Tue Jul 19 11:08:35 CST 2016 End time: Tue Jul 19 11:11:50 CST 2016 [11:11:50 wsdf@localhost Desktop]$ python mysql_import.py 2016-07-19 11:16:46 conn success 2016-07-19 11:16:52
python處理完整文本測試
[13:06:30 wsdf@localhost Desktop]$ python mysql_import.py 2016-07-19 13:06:35 conn success 2016-07-19 13:14:57 [13:14:57 wsdf@localhost Desktop]$ mysql -e "select count(*) from myimport.import_obj"; +----------+ | count(*) | +----------+ | 774864 | +----------+
從上面的測試結果可以看出,python的效率明顯高於shell。
shell無法保存mysql的連接狀態,導致每執行一條插入語句都需要重新連接mysql及斷開。這裡的python中的MySQLdb通過事務,全部執行完畢才提交,一次提交完成那個所有的插入,節省連接與斷開的時間。