Extract Information From Website Using Xpath, Python
I'm trying to extract some useful information from a website. I've made some progress, but now I'm stuck and need your help! I need the information from this table: http://gbgfotboll.se/serier/?sc
Solution 1:
I think this is what you want:
# coding: utf-8
from lxml import etree
import lxml.html

collected = []  # list of rows: [[col1, col2, ...], [col1, col2, ...]]
dom = lxml.html.parse("http://gbgfotboll.se/serier/?scr=scorers&ftid=57700")

# all table rows
xpatheval = etree.XPathDocumentEvaluator(dom)
rows = xpatheval('//div[@id="content-primary"]/div/table[1]/tbody/tr')

# If there are 12 rows or fewer: take all the rows except the last.
if len(rows) <= 12:
    rows.pop()
else:
    # If there are more than 12 rows: simply take the first 12 rows.
    rows = rows[0:12]

for row in rows:
    # all columns of the current table row (Spelare, Lag, Mal, straffmal)
    columns = row.findall("td")
    # pick the textual data from each <td>
    collected.append([column.text for column in columns])

for i in collected:
    print(i)
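The row-selection rule in the script above (drop the last row when there are 12 or fewer, otherwise keep the first 12) can be checked without touching the network by isolating it in a small helper. This is only a sketch: `select_rows` and its `limit` parameter are hypothetical names, not part of the original answer. It uses slicing instead of `rows.pop()`, so, unlike the script, it does not raise on an empty result.

```python
def select_rows(rows, limit=12):
    """Return the rows to keep, following the rule from the script above.

    Hypothetical helper: the name and the `limit` parameter are assumptions.
    """
    if len(rows) <= limit:
        # Short table: keep everything except the last row.
        # Slicing returns [] for an empty list instead of raising.
        return rows[:-1]
    # Long table: keep only the first `limit` rows.
    return rows[:limit]


# Plain lists stand in for the <tr> elements returned by the XPath query.
print(select_rows(list(range(5))))        # [0, 1, 2, 3]
print(len(select_rows(list(range(20)))))  # 12
```

In the script, this would replace the `if`/`else` block with `rows = select_rows(rows)`.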