Reading CSV in Python (3.x):
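```python
import csv

filename = 'xxx.csv'

fi = []  # header row
ls = []  # data rows
# newline='' is what the csv module documentation recommends for file objects
with open(filename, 'r', encoding='gbk', newline='') as csvfile:
    reader = csv.reader(csvfile)
    fi = next(reader)  # the first row holds the column names
    for lr in reader:
        ls.append(lr)

# Print the first five data rows
for lr in ls[:5]:
    print(lr)
```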
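If the columns are easier to handle by name than by position, the standard library's csv.DictReader is an alternative. The following is only a minimal sketch: the read_rows helper and the sample data are made up for illustration, and it reads from an in-memory string so that it runs without a file.

```python
import csv
import io

# Hypothetical sample data standing in for the contents of a CSV file
sample = "name,age\nAlice,30\nBob,25\n"

def read_rows(text):
    """Return the CSV rows as a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]

rows = read_rows(sample)
for row in rows[:5]:
    print(row)
```

With a real file, the same DictReader call can be given the file object opened as in the snippet above.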
Of course, a dedicated library makes this even simpler:
```python
import pandas as pd

filename = 'xxx.csv'

# read_csv treats the first row as the header by default
data = pd.read_csv(filename, encoding='gbk')

print(data.head(5))
```
Reading JSON:
```python
import json

filename = 'xxx.json'

# open() uses the platform default encoding here; pass encoding=... if the file differs
with open(filename) as f:
    jsondata = json.load(f)

print(jsondata)
```
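pandas also offers read_json() as a more efficient way to read JSON. However, when it is used on nested JSON with an irregular structure, it raises "ValueError: arrays must all be same length", so it is not demonstrated here.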
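One possible way around that error, sketched here under the assumption that the JSON is a list of nested records, is to flatten each record into a single-level dict before building a table. The flatten helper and the sample data below are hypothetical and are not part of pandas or of the original snippets.

```python
import json

def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into one dict with dotted keys (hypothetical helper)."""
    flat = {}
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        # Scalar value: store it under the key accumulated so far
        flat[prefix] = obj
        return flat
    for key, value in items:
        new_prefix = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, new_prefix))
    return flat

# Made-up sample: the two records have different shapes
text = '[{"id": 1, "tags": ["a", "b"]}, {"id": 2, "info": {"city": "X"}}]'
records = [flatten(r) for r in json.loads(text)]
print(records)
```

A list of flat dicts like records can be passed to pd.DataFrame(), which fills keys missing from a record with NaN. Note that this sketch drops empty dicts and lists.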
Reading XML and converting it to JSON:
```python
import json
import xml.etree.ElementTree as ET

import xmltodict

filename = 'xxx/s1.xml'

# Parse the file and serialize the root element back to bytes
tree = ET.parse(filename)
xmldata = tree.getroot()
xmlstr = ET.tostring(xmldata, encoding='utf8', method='xml')

# Convert the XML into a dict
xmldict = xmltodict.parse(xmlstr, encoding='utf8')
datadict = dict(xmldict)

print(datadict)

filename2 = 'xxxx/s1.json'

# Write the dict out as indented JSON with sorted keys
with open(filename2, 'w') as jsonfile:
    json.dump(datadict, jsonfile, indent=4, sort_keys=True)
```
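When installing a third-party package is not an option, a rough conversion can be sketched with xml.etree.ElementTree alone. This is a simplified illustration and not a replacement for xmltodict: the element_to_dict helper is hypothetical, it ignores attributes, and its output shape is not the same as what xmltodict produces. The sample XML is made up.

```python
import json
import xml.etree.ElementTree as ET

def element_to_dict(elem):
    """Convert an Element to nested dicts (hypothetical helper; attributes are ignored)."""
    children = list(elem)
    if not children:
        # Leaf node: keep its text content
        return elem.text
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # Repeated tags are collected into a list
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result

# Made-up sample XML
xml_text = "<root><item>1</item><item>2</item><name>demo</name></root>"
root = ET.fromstring(xml_text)
data = {root.tag: element_to_dict(root)}
print(json.dumps(data, indent=4, sort_keys=True))
```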
Notes:
1. xmltodict may need to be installed first: pip install xmltodict
2. xmltodict.parse is fairly strict about the XML format: if a node's string begins with a digit, createElement may raise an error
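The second note can be illustrated with the standard library alone: XML element names may not begin with a digit, so a conforming parser rejects such a document. The sketch below uses xml.etree.ElementTree rather than xmltodict, and both the is_well_formed helper and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Tag name starting with a letter: accepted
print(is_well_formed("<root><a1>x</a1></root>"))
# Tag name starting with a digit: rejected by the parser
print(is_well_formed("<root><1a>x</1a></root>"))
# Text content starting with a digit is fine
print(is_well_formed("<root><a>1x</a></root>"))
```

Running a check like this on the input before conversion makes it easier to tell a malformed document apart from a problem in the conversion step.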