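How to extract text that is not inside a tag with Scrapy
Answers: 2 Bounty: 70
Resolved: 2021-02-21 10:40
- Asked by user: 相思似海深
- 2021-02-21 00:26
How can I use Scrapy to extract text that is not inside a tag?
Best answer
- Five-star expert user: 洎扰庸人
- 2021-02-21 01:09
The code is as follows: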
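    def parse(self, response):
        states = {}
        list1 = []  # labels whose value is bare text directly under #info
        list2 = []  # the bare text nodes themselves, in document order
        for row in response.xpath("//*[@id='info']/*"):
            if row.xpath("span[@class='pl']/text()"):
                # Label wrapped in a nested span, value inside a link.
                title = row.xpath("span[@class='pl']/text()").extract()[0].strip()
                text = row.xpath("a/text()").extract_first(default="").strip()
                states[title] = text
            elif row.xpath("text()"):
                # Plain label: drop the trailing colon.
                list1.append(row.xpath("text()").extract()[0].strip()[:-1])
        # Text nodes that are direct children of #info are not inside any tag.
        for row in response.xpath("//*[@id='info']/text()").extract():
            if row.strip():
                list2.append(row.strip())
        # Pair labels with values by position; zip stops at the shorter list.
        for label, value in zip(list1, list2):
            states[label] = value
        for n in states:
            print(n, states[n])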
All answers
- Reply 1, user: 动情书生
- 2021-02-21 02:10
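If the XPath returns multiple elements, as it does in your case, you need to loop over them and join the text of each one:
    content = ""
    for selector in sel.xpath('//div[@class="document"]//p'):
        content = content + "".join(selector.xpath("text()").extract())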