python - A small question about re; I'm a beginner, so please go easy on me
天蓬老师 2017-04-17 17:56:47

I'm trying to scrape the image URL from an Instagram share page so that I can download the image.

# -*- coding: utf-8 -*-
import urllib2
import re

response = urllib2.urlopen('https://www.instagram.com/p/BG5SpsYuSr-/')
html = response.read()  
#print html

catch = re.compile(r'//*[display_src="(.+?\.jpg)"]')
urls = re.findall(catch,html)
for i, url in enumerate(urls):
    print url
    

Looking at the page source, I found that the image URL appears in these two places.

How can I extract the image's download URL?


All replies (2)
PHPzhong
from pyquery import PyQuery as Q
import urllib2

response = urllib2.urlopen('https://www.instagram.com/p/BG5SpsYuSr-/')
html = response.read()
print Q(html).find('meta[property="og:image"]').attr('content')
黄舟

As the second screenshot shows, the image URL sits inside a JS object, so in my experience the image is probably inserted by JavaScript. I can't access the target site, so I don't know what the page actually looks like.
You can try the regular expression below to extract the JS object, parse it as JSON, and then read the data you want just as you would from a dictionary:

<script type="text/javascript">[\w ]+=([\s\S]+?);</script>
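As a rough sketch of that idea, the snippet below applies the pattern to a made-up page fragment and parses the captured text with `json.loads`. The variable name `pageData`, the `media` nesting, and the example.com URL are assumptions for illustration only; `display_src` is borrowed from the question's code. It only works when the assigned JS object happens to be valid JSON.

```python
import json
import re

# Hypothetical page fragment: the variable name and nesting are invented for this example.
SAMPLE_HTML = (
    '<html><body>'
    '<script type="text/javascript">'
    'pageData = {"media": {"display_src": "https://example.com/photo.jpg"}};'
    '</script>'
    '</body></html>'
)

# The pattern suggested above: capture everything after "name =" up to ";</script>".
SCRIPT_OBJECT = re.compile(
    r'<script type="text/javascript">[\w ]+=([\s\S]+?);</script>'
)


def extract_js_object(html):
    """Return the first assigned JS object parsed as a dict, or None if nothing matches."""
    match = SCRIPT_OBJECT.search(html)
    if match is None:
        return None
    # json.loads tolerates the leading whitespace left after the "=" sign.
    return json.loads(match.group(1))


data = extract_js_object(SAMPLE_HTML)
image_url = data["media"]["display_src"]
print(image_url)
```

Note that `[\w ]+` does not match a dot, so if the real page assigns to a dotted name (something like `window.someName`), the character class would need to be widened, for example to `[\w. ]+`.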