Regular expressions - Matching a complete URL with a Python regex
迷茫 2017-04-18 09:36:17

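If a URL is allowed to contain any Unicode character, then whitespace is the only thing that marks where it ends. In that case, how can I extract exactly one complete URL from a block of text, no more and no less?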

"http://www.w3schools.com/学习HTML技术,做一个有趣的网站"

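Take the example above: it contains no spaces, so is there no way to extract only the leading part?

Addendum:

Of course, the above is only an example. I was hoping for something more general, but that seems impossible?

As a compromise, I could extract up to the domain, i.e. up to something like .com/?

Is there really no way to identify a complete URL?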
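To make the problem concrete, here is a minimal sketch of the "only whitespace ends a URL" rule. The pattern `https?://\S+` is an assumption about what such a rule would look like, not something given in the question. On the example string it swallows the trailing Chinese text as well, because that text contains no whitespace.

```python
import re

# Example string from the question: no whitespace anywhere.
text = "http://www.w3schools.com/学习HTML技术,做一个有趣的网站"

# Assumed "stop at whitespace" pattern: scheme, then any run of non-whitespace.
match = re.search(r"https?://\S+", text)

# The whole string is returned, Chinese sentence included.
print(match.group(0) == text)
```

So with this rule alone, the regex cannot tell where the URL stops and the surrounding sentence begins.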


All replies (2)
左手右手慢动作
  • I'm not sure whether the asker means only this particular website. If so, you can match the part that starts with 'http' and ends with '/'.

  • Personally, I think the asker wants to know how to find the end of a URL in the general case, where there is no special character such as '/' at the end.
    I think the end of the match can be pinned to alphanumeric characters or to a specific suffix such as com/cn (although enumerating suffixes can never cover every case).
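The "alphanumeric" idea in the second bullet could be sketched roughly as follows, assuming you are willing to treat only ASCII characters as part of the URL. The name `ASCII_URL` and the exact character class are illustrative choices, not something from this thread, and the sketch will cut off URLs that legitimately contain non-ASCII characters.

```python
import re

# Hypothetical pattern: scheme, ASCII host, optional port, optional ASCII-only path.
# Matching stops at the first character outside these ASCII classes.
ASCII_URL = re.compile(
    r"https?://[A-Za-z0-9.-]+(?::\d+)?(?:/[A-Za-z0-9._~%!$&'()*+,;=:@/?#-]*)?"
)

text = "http://www.w3schools.com/学习HTML技术,做一个有趣的网站"
found = ASCII_URL.search(text)
print(found.group(0))  # http://www.w3schools.com/

# A URL followed by a space is cut at the space.
other = ASCII_URL.search("see http://example.com/a/b?x=1 now")
print(other.group(0))  # http://example.com/a/b?x=1
```

Note that this is a heuristic: ASCII punctuation directly after a URL, such as a sentence-ending period, would still be included in the match.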

伊谢尔伦

The asker did not describe clearly what kind of URL is wanted.
If you only want the non-Chinese part, you can write it like this:
'https?://[^/]+?/'
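For reference, this is how that pattern behaves when run with Python's `re` module. It is only a quick check on the example string, and the second input is a made-up sentence. The pattern needs a '/' after the host, so a URL without one is not matched at all.

```python
import re

pattern = r"https?://[^/]+?/"

text = "http://www.w3schools.com/学习HTML技术,做一个有趣的网站"
m = re.search(pattern, text)
print(m.group(0))  # http://www.w3schools.com/

# No '/' after the host, so there is nothing for the final '/' to match.
no_slash = re.search(pattern, "visit http://example.com now")
print(no_slash)  # None
```

Because `[^/]` also matches spaces, the pattern can run past the end of the host when a '/' appears later in the surrounding text, so it suits inputs like the example better than free-form prose.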


Asker, you should know that URLs can contain Chinese characters. The example you gave:
http://www.w3schools.com/学习HTML技术,做一个有趣的网站
is entirely usable as a URL.
So what exactly do you want to extract? You still haven't described it clearly.
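As a small illustration of that point, the standard library's `urllib.parse` splits the example without complaint and treats the Chinese text as the path. This is only a sketch of how Python parses the string, and it says nothing about whether such a page exists on the server.

```python
from urllib.parse import urlsplit, quote, unquote

url = "http://www.w3schools.com/学习HTML技术,做一个有趣的网站"

parts = urlsplit(url)
print(parts.netloc)  # www.w3schools.com
print(parts.path)    # /学习HTML技术,做一个有趣的网站

# Percent-encode the path as UTF-8, the form usually sent over the wire.
encoded = quote(parts.path)
print(encoded)

# Decoding restores the original path.
print(unquote(encoded) == parts.path)
```

Since the parser accepts the whole string as one URL, a regex has no syntactic basis for cutting it after ".com/". Any cut-off point has to come from an extra rule you choose yourself.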
