
Python 3 for fun - Screen Scraping



Download Python 3: https://www.python.org/downloads/

Open cmd and install the two libraries used below: pip install requests lxml
Then start the interpreter by typing: python

from lxml import html
import requests

#Fetch the page, then parse its HTML into an element tree
page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
tree = html.fromstring(page.content)

To check that the request worked, type page at the prompt; a successful request shows <Response [200]>.
(Excerpts taken from Python.org; I've updated the instructions for Python 3+.)

#This will create a list of buyer names (the text of divs with a specific 'title' attribute):
buyers = tree.xpath('//div[@title="buyer-name"]/text()')
#This will create a list of prices (the text of spans with a specific class):
prices = tree.xpath('//span[@class="item-price"]/text()')
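
To see what those two queries select without touching the network, here is a minimal sketch that runs the same kind of lookup against a small hand-written snippet. The markup, names, and prices are made up for illustration. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml. ElementTree supports only a subset of XPath and has no /text() step, so the text is read from each element's .text instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup; the real page will differ.
SAMPLE = """
<html>
  <body>
    <div title="buyer-name">Alice Example</div>
    <span class="item-price">$10.00</span>
    <div title="buyer-name">Bob Example</div>
    <span class="item-price">$7.50</span>
  </body>
</html>
"""

root = ET.fromstring(SAMPLE.strip())

# ElementTree has no /text() step, so select the elements and read .text.
sample_buyers = [div.text for div in root.findall('.//div[@title="buyer-name"]')]
sample_prices = [span.text for span in root.findall('.//span[@class="item-price"]')]

print('Buyers:', sample_buyers)
print('Prices:', sample_prices)
```

With lxml, the /text() step at the end of the expression does that last part for you, which is why tree.xpath returns plain strings rather than elements.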

print('Buyers:', buyers)
print('Prices:', prices)
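
Since both lists come back in page order, they can be paired up by position. This is a minimal sketch, assuming the two lists are the same length and each price is a string such as '$10.00'. The sample values and the helper name pair_buyers_with_prices are made up for illustration.

```python
def pair_buyers_with_prices(buyers, prices):
    """Pair each buyer with a numeric price, assuming equal-length lists."""
    if len(buyers) != len(prices):
        raise ValueError('buyers and prices must be the same length')
    # Drop surrounding whitespace, the leading dollar sign, and any
    # thousands separators before converting to a number.
    return [(name, float(price.strip().lstrip('$').replace(',', '')))
            for name, price in zip(buyers, prices)]

# Hypothetical sample data standing in for the scraped lists.
sample_buyers = ['Alice Example', 'Bob Example']
sample_prices = ['$10.00', '$7.50']

for name, price in pair_buyers_with_prices(sample_buyers, sample_prices):
    print(f'{name}: {price:.2f}')
```

In the interpreter session above you would call pair_buyers_with_prices(buyers, prices) instead of passing the sample lists.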


Expected output:

You can get a lot of useful things done in Python with relatively few lines of code.