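Once the data gets large, Excel files struggle to keep up during data processing.
This post compares the read speed of three file formats: pickle, CSV and Excel.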
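For the comparison to be fair, the three files need to hold the same data. A minimal sketch of one way to produce them, assuming a made-up table: the helper name `write_sample_tables`, the column names, the row count and the fixed seed are all illustrative and not from the original test. Writing `.xlsx` also assumes an Excel engine such as openpyxl is installed.

```python
from pathlib import Path

import numpy as np
import pandas as pd


def write_sample_tables(directory, rows=100_000, formats=("pkl", "csv", "xlsx")):
    """Write one random DataFrame as table.<ext> for each requested format.

    Hypothetical helper: the columns and row count are assumptions.
    """
    directory = Path(directory)
    rng = np.random.default_rng(0)  # fixed seed so every run writes the same data
    df = pd.DataFrame({
        "id": np.arange(rows),
        "value": rng.random(rows),
        "label": rng.choice(["a", "b", "c"], size=rows),
    })
    writers = {
        "pkl": lambda path: df.to_pickle(path),
        "csv": lambda path: df.to_csv(path, index=False),
        # to_excel needs an Excel engine (for example openpyxl) to be installed
        "xlsx": lambda path: df.to_excel(path, index=False),
    }
    paths = {}
    for ext in formats:
        path = directory / f"table.{ext}"
        writers[ext](path)
        paths[ext] = path
    return df, paths
```

Calling `write_sample_tables('.')` would create `table.pkl`, `table.csv` and `table.xlsx` in the current directory, matching the file names used in the timing script.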
    import time

    import pandas as pd

    # Speed comparison: csv vs. excel vs. pkl
    # time.perf_counter() is used because time.clock() was removed in Python 3.8.

    # Time reading the pickle file
    start = time.perf_counter()
    df = pd.read_pickle('table.pkl')
    elapsed = time.perf_counter() - start
    print("PKL Time used:", elapsed)

    # Time reading the CSV file
    start = time.perf_counter()
    df = pd.read_csv('table.csv')
    elapsed = time.perf_counter() - start
    print("CSV Time used:", elapsed)

    # Time reading the Excel file
    start = time.perf_counter()
    df = pd.read_excel('table.xlsx')
    elapsed = time.perf_counter() - start
    print("EXCEL Time used:", elapsed)
Output
    PKL Time used: 0.0913808
    CSV Time used: 0.2128232
    EXCEL Time used: 10.9964416
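pickle wins easily: in this run it loaded about 2.3 times faster than CSV and about 120 times faster than Excel. The article linked in the reference has a more detailed comparison.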
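A single timed read can be skewed by disk caching or other activity on the machine. A minimal sketch of a repeat-and-keep-the-fastest timer, assuming a hypothetical helper named `best_of` (not part of pandas or the original script) and an arbitrary default of three repeats:

```python
import time


def best_of(func, *args, repeats=3, **kwargs):
    """Call func(*args, **kwargs) `repeats` times.

    Returns (fastest_seconds, result_of_last_call).
    Hypothetical helper; the default of 3 repeats is an arbitrary choice.
    """
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        # Keep the shortest run, which is the one least disturbed by noise
        best = min(best, time.perf_counter() - start)
    return best, result
```

For example, `elapsed, df = best_of(pd.read_csv, 'table.csv')` would read the CSV three times and report the fastest run. The absolute numbers will still depend on the table's size, column types and the hardware.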
Reference
https://www.jianshu.com/p/d857c0f472f4