The previous strategy is coded as follows:
- Fetch the headline data from the Guardian's content API:
    from bs4 import BeautifulSoup
    import urllib.request
    import json

    dates = []
    titles = []
    # Walk through up to 100 result pages (200 headlines per page)
    for i in range(100):
        try:
            # Business-section articles mentioning "amazon" since 2010-01-01, newest first
            url = 'https://content.guardianapis.com/search?from-date=2010-01-01&section=business&page-size=200&order-by=newest&page='+str(i+1)+'&q=amazon&api-key=207b6047-a2a6-4dd2-813b-5cd006b780d7'
            response = urllib.request.urlopen(url)
            # Fall back to UTF-8 if the response does not declare a charset
            encoding = response.info().get_content_charset('utf8')
            data = json.loads(response.read().decode(encoding))
            # Keep the publication date and headline of every result on this page
            for result in data['response']['results']:
                dates.append(result['webPublicationDate'])
                titles.append(result['webTitle'])