# webscraping


Web scraping arXiv.org with Python

I scraped arXiv.org with Python and a few other modules (requests, BeautifulSoup, and webbrowser). The script prints all the article titles, asks the user which title they want, then opens the matching link in their browser.


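    import webbrowser
    from urllib.parse import urljoin

    import requests
    from bs4 import BeautifulSoup

    url = 'https://arxiv.org/'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")

    # Quick sanity check: look at the first link in the range of interest.
    one_a_tag = soup.find_all('a')[12]
    print(one_a_tag.text)
    link = urljoin(url, one_a_tag['href'])

    # Map each link's text to its href so the user can pick one by title.
    # The 12..237 window is a hard-coded offset that depends on the page
    # layout, so it needs adjusting if the page changes.
    n = 1
    counter = 0
    y = {}
    for i in soup.find_all('a'):
        counter += 1
        if counter > 12 and counter < 237:
            print(str(n))
            print(i.text)
            y.update({i.text: i.get('href', '')})
            n += 1

    z = input('What article title do you want to view? ')
    print(y[z])
    # The hrefs can be relative, so resolve them against the base URL.
    # webbrowser needs the full address, including the https:// scheme.
    website = urljoin(url, y[z])
    webbrowser.open_new(website)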
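One weak spot in the script above is the prompt: the user has to type a title exactly, and any typo raises a `KeyError`. Here is a minimal sketch of a friendlier lookup, assuming the scraped links live in a dict like `y`. The function name `resolve_choice` and the sample data are hypothetical, made up for illustration, and are not part of the original script.

```python
from urllib.parse import urljoin

BASE_URL = "https://arxiv.org/"  # same base URL the script uses


def resolve_choice(links, answer, base_url=BASE_URL):
    """Return the absolute URL for the user's answer, or None if nothing matches.

    links  -- dict mapping link text to href, like the `y` dict in the script
    answer -- either the printed number (1-based) or the exact link text
    """
    answer = answer.strip()
    # Dicts keep insertion order, so position N is the N-th link stored.
    titles = list(links)
    if answer.isdecimal():
        index = int(answer) - 1
        if 0 <= index < len(titles):
            return urljoin(base_url, links[titles[index]])
        return None
    if answer in links:
        return urljoin(base_url, links[answer])
    return None


# Hypothetical sample data standing in for the scraped dict.
sample = {"Astrophysics": "/archive/astro-ph", "Mathematics": "/archive/math"}
print(resolve_choice(sample, "2"))
print(resolve_choice(sample, "Astrophysics"))
print(resolve_choice(sample, "99"))
```

In the script, I could call `resolve_choice(y, z)` and only open the browser when the result is not `None`. One caveat: the numbers only line up with the printed list if every link text is unique, because duplicate texts overwrite each other in the dict.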