How do I find the text "New York City, USA" in Python using BeautifulSoup?
I tried copying the code from a video for practice, but it no longer works.
I looked through the official documentation without success. Or is my get_html_content function not working properly, with Google simply blocking me and therefore returning an empty list / None?
This is my current code:
from django.shortcuts import render
import requests
from bs4 import BeautifulSoup


def get_html_content(city):
    USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"
    LANGUAGE = "en-US,en;q=0.5"
    session = requests.Session()
    session.headers['User-Agent'] = USER_AGENT
    session.headers['Accept-Language'] = LANGUAGE
    session.headers['Content-Language'] = LANGUAGE
    # str.replace returns a new string, so the result has to be assigned
    city = city.replace(" ", "+")
    html_content = session.get(f"https://www.google.com/search?q=weather+in+{city}").text
    return html_content


def home(request):
    result = None
    if 'city' in request.GET:
        city = request.GET.get('city')
        html_content = get_html_content(city)
        soup = BeautifulSoup(html_content, 'html.parser')
        # Attempt 1: match on the class attribute
        soup.find_all('div', attrs={'class': 'wob_loc q8U8x'})
        # OR
        # Attempt 2: match on the id attribute
        soup.find_all('div', attrs={'id': 'wob_loc'})
--> Both return an empty list (and the .find method returns None).
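One way to tell "Google is blocking me" apart from "the page layout changed" is to look at the response itself before parsing it. Below is a minimal sketch of such a check. The function name `diagnose_search_response` and its return labels are made up for illustration, and the `consent.google.com` redirect and the `wob_loc` marker are assumptions about what Google serves, not documented behaviour.

```python
def diagnose_search_response(status_code, final_url, html):
    """Roughly classify a search response (hypothetical helper).

    status_code -- HTTP status of the response
    final_url   -- URL after redirects
    html        -- response body as text
    """
    # Anything other than 200 suggests an error or a rate-limit / block page.
    if status_code != 200:
        return "blocked-or-error"
    # Assumption: a consent interstitial is served from consent.google.com.
    if "consent.google.com" in final_url:
        return "consent-page"
    # Assumption: the weather widget, when present, mentions "wob_loc".
    if "wob_loc" not in html:
        return "no-weather-widget"
    return "ok"


if __name__ == "__main__":
    # Hand-written sample inputs, not real Google responses.
    print(diagnose_search_response(429, "https://www.google.com/search", ""))
    print(diagnose_search_response(200, "https://www.google.com/search",
                                   '<div id="wob_loc">New York City, USA</div>'))
```

With requests, the three arguments would come from `response.status_code`, `response.url` and `response.text`. If the result is "no-weather-widget", the selectors are probably not the problem, because the element never reached BeautifulSoup in the first place.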
The layout of the Google page may have changed in the meantime, so you need to update your code to get the weather data. For example:
import requests
from bs4 import BeautifulSoup

params = {'q': 'weather in New York City, New York, USA', 'hl': 'en'}
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:108.0) Gecko/20100101 Firefox/108.0'}
cookies = {'CONSENT': "YES+cb.20220419-08-p0.cs+FX+111"}
url = 'https://www.google.com/search'

soup = BeautifulSoup(requests.get(url, params=params, headers=headers, cookies=cookies).content, 'html.parser')

for t in soup.select('#wob_dp [aria-label]'):
    how = t.find_next('img')['alt']
    temp = t.find_next('span').get_text(strip=True)
    print('{:
Print:
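Since the question asks specifically for the location text, here is a small sketch of how that lookup could be written so that a missing element does not raise an error. It runs against a hand-written HTML snippet rather than a live Google page. The `wob_loc` id is taken from the question; whether Google still uses it is an assumption, and `extract_location` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for the weather widget markup (an assumption,
# not a copy of what Google actually returns).
SAMPLE_HTML = """
<div>
  <div class="wob_loc q8U8x" id="wob_loc">New York City, USA</div>
</div>
"""


def extract_location(html):
    """Return the text of the #wob_loc element, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    node = soup.select_one("#wob_loc")  # CSS id selector; returns None on no match
    if node is None:
        return None
    return node.get_text(strip=True)


if __name__ == "__main__":
    print(extract_location(SAMPLE_HTML))
    print(extract_location("<html><body>no widget here</body></html>"))
```

If this works on the sample but returns None for the HTML you actually download, the element is not in the response, which points to a block or consent page, or to a changed layout, rather than to a mistake in the selector.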