Scrapy "missing Scheme In Request Url"
Here's my code below:
import scrapy
from scrapy.http import Request
class lyricsFetch(scrapy.Spider):
    name = 'lyricsFetch'
    allowed_domains = ['metrolyrics.com']
    print '\
Solution 1:
As @tintin said, you are missing the http scheme in the URLs. Scrapy needs fully qualified URLs in order to process the requests.
As far as I can see, you are missing the scheme in:
start_urls = ["www.lyricsmode.com/lyrics/ ...
and
yield Request("www.lyricsmode.com/feed.xml")
If you are parsing URLs from the HTML content, you should use urljoin to ensure you get a fully qualified URL, for example:
next_url = response.urljoin(href)
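To illustrate what that join does, here is a minimal standalone sketch using urllib.parse.urljoin from the Python 3 standard library, which follows the same resolution rules that response.urljoin applies against the page URL. The page URL and the href values are made up for the example; inside a spider callback the base would be the response's own URL.

```python
from urllib.parse import urljoin

# Hypothetical page URL, standing in for response.url inside a spider callback.
page_url = "https://www.lyricsmode.com/lyrics/index.html"

# Hypothetical href values as they might appear in the page's HTML:
# a root-relative path, a path relative to the current page,
# and a scheme-relative URL pointing at another host.
hrefs = ["/feed.xml", "a/artist.html", "//list.jd.com/list.html"]

# Resolve each href against the page URL so that every result has a scheme.
absolute = [urljoin(page_url, href) for href in hrefs]

for url in absolute:
    print(url)
```

Each resulting URL inherits the https scheme from the page it was found on, which is what Scrapy needs before it can schedule the request.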
Solution 2:
I ran into this problem today as well. A URL normally carries a scheme such as http or https.
The error most likely means that the URLs you extract from the start_urls response come without a scheme, for example //list.jd.com/list.html.
You should add the scheme so that the URL becomes https://list.jd.com/list.html
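As a rough sketch of that fix, the helper below prepends a scheme when a URL has none. The name ensure_scheme and its default_scheme parameter are hypothetical, not part of Scrapy, and the check is deliberately simple: it only handles scheme-relative URLs (starting with //) and bare host/path strings. It is written for Python 3.

```python
def ensure_scheme(url, default_scheme="https"):
    """Return url unchanged if it has a scheme, else prepend default_scheme.

    Hypothetical helper for illustration; not part of Scrapy.
    """
    if url.startswith("//"):
        # Scheme-relative URL such as //list.jd.com/list.html
        return f"{default_scheme}:{url}"
    if "://" not in url:
        # Bare host and path such as www.lyricsmode.com/feed.xml
        return f"{default_scheme}://{url}"
    # Already fully qualified
    return url


print(ensure_scheme("//list.jd.com/list.html"))
print(ensure_scheme("www.lyricsmode.com/feed.xml", default_scheme="http"))
```

The returned string can then be used in start_urls or passed to Request. For links taken from a page, joining against the response URL is usually the safer choice, since it also resolves relative paths.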