Seorang pemula ingin meminta nasihat tentang cara menulis gelung dict ke dalam fail csv dalam python3 (masalah yang dihadapi semasa merangkak)?
我想大声告诉你
我想大声告诉你 2017-05-18 10:49:20
0
3
1264

Selepas perangkak menjana dict, saya ingin menulisnya ke dalam fail csv, tetapi ralat berlaku
Gunakan jupyter notebook dan persekitaran tetingkap.

Kod khusus adalah seperti berikut

import requests from multiprocessing.dummy import Pool as ThreadPool from lxml import etree import sys import time import random import csv def spider(url): header={ 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36' } timeout=random.choice(range(31,50)) html = requests.get(url,header,timeout=timeout) time.sleep(random.choice(range(8,16))) selector = etree.HTML(html.text) content_field = selector.xpath('//*[@class="inner"]/p[3]/p[2]/ul/li') item ={} for each in content_field: g = each.xpath('a/p[1]/p[1]/h3/span/text()') go = each.xpath('a/p[1]/p[2]/p/h3/text()') h = each.xpath('a/p[1]/p[2]/p/p/text()[1]') j= each.xpath('a/p[1]/p[1]/p/text()[2]') ge = each.xpath('a/p[1]/p[2]/p/p/text()[3]') x = each.xpath('a/p[1]/p[1]/p/text()[3]') city = each.xpath('a/p[1]/p[1]/p/text()[1]') gg = each.xpath('a/p[2]/span/text()') item['city']="".join(city) item['hangye']="".join(hangye) item['guimo']="".join(guimo) item['gongsi']="".join(gongsi) item['gongzi']="".join(gongzi) item['jingyan']="".join(jingyan) item['xueli']="".join(xueli) item['gongzuoneirong']="".join(gongzuoneirong) fieldnames =['city','hangye','guimo','gongsi','gongzi','jingyan','xueli','gongzuoneirong'] with open('bj.csv','a',newline='',errors='ignore')as f: f_csv=csv.DictWriter(f,fieldnames=fieldnames) f_csv.writeheader() f_csv.writerow(item) if __name__ == '__main__': pool = ThreadPool(4) f=open('bj.csv','w') page = [] for i in range(1,100): newpage = 'https://www.zhipin.com/c101010100/h_101010100/?query=%E6%95%B0%E6%8D%AE%E8%BF%90%E8%90%A5&page='+str(i) + '&ka=page-' + str(i) page.append(newpage) results = pool.map(spider,page) pool.close() pool.join() f.close()

Jalankan kod di atas dan mesej ralat ialah

ValueError: terlalu banyak nilai untuk dibongkar (dijangka 2)
Sebab pertanyaan ialah untuk melintasi dict, bentuk dict.items() diperlukan. Tetapi bagaimana untuk melaksanakannya dalam kod di atas belum diluruskan. Sila beri saya nasihat

我想大声告诉你
我想大声告诉你

membalas semua (3)
習慣沉默

不好意思哈,现在才有时间来回答你的问题,看到你根据我的建议把代码改过来了,下面我把改过的代码贴出来,我运行过,是没问题的

import requests from multiprocessing.dummy import Pool from lxml import etree import time import random import csv def spider(url): header = { 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36' } timeout = random.choice(range(31, 50)) html = requests.get(url, headers=header, timeout=timeout) time.sleep(random.choice(range(8, 16))) selector = etree.HTML(html.text) content_field = selector.xpath('//*[@class="inner"]/p[3]/p[2]/ul/li') item = {} for each in content_field: g = each.xpath('a/p[1]/p[1]/h3/span/text()') go = each.xpath('a/p[1]/p[2]/p/h3/text()') h = each.xpath('a/p[1]/p[2]/p/p/text()[1]') j = each.xpath('a/p[1]/p[1]/p/text()[2]') ge = each.xpath('a/p[1]/p[2]/p/p/text()[3]') x = each.xpath('a/p[1]/p[1]/p/text()[3]') city = each.xpath('a/p[1]/p[1]/p/text()[1]') gg = each.xpath('a/p[2]/span/text()') item['city'] = "".join(city) item['hangye'] = "".join(g) item['guimo'] = "".join(go) item['gongsi'] = "".join(h) item['gongzi'] = "".join(j) item['jingyan'] = "".join(ge) item['xueli'] = "".join(x) item['gongzuoneirong'] = "".join(gg) fieldnames = ['city', 'hangye', 'guimo', 'gongsi', 'gongzi', 'jingyan', 'xueli', 'gongzuoneirong'] with open('bj.csv', 'a', newline='', errors='ignore')as f: f_csv = csv.DictWriter(f, fieldnames=fieldnames) f_csv.writeheader() f_csv.writerow(item) if __name__ == '__main__': f = open('bj.csv', 'w') page = [] for i in range(1, 100): newpage = 'https://www.zhipin.com/c101010100/h_101010100/?query=%E6%95%B0%E6%8D%AE%E8%BF%90%E8%90%A5&page=' + str( i) + '&ka=page-' + str(i) page.append(newpage) print(page) pool = Pool(4) results = pool.map(spider, page) pool.close() pool.join() f.close()

这里主要是header,你原来是set类型,我修改后是dict类型

这里还需要给你一些建议

  1. 你的代码是放到ide还是文本编辑器中运行的?有的东西在ide下明显会报错啊

  2. 建议新手从开始学的时候就遵守PEP8规范,别养成了坏习惯,你看看你的命名

    过去多啦不再A梦
    item = {'a':1, 'b':2} fieldnames = ['a', 'b'] with open('test.csv', 'a') as f: f_csv = DictWriter(f, fieldnames=fieldnames) f_csv.writeheader() f_csv.writerow(item)

    我这样写并没报错喔

    writerow就是直接接收dict的吧,你这个问题,我感觉是因为item的key与你表头不对应

      漂亮男人

      因为在 fields 中指定的某些列名在 item 中不存在

        Muat turun terkini
        Lagi>
        kesan web
        Kod sumber laman web
        Bahan laman web
        Templat hujung hadapan
        Tentang kita Penafian Sitemap
        Laman web PHP Cina:Latihan PHP dalam talian kebajikan awam,Bantu pelajar PHP berkembang dengan cepat!