Downloading Files From Filetype Fields?
Solution 1:
As @JohnZwinck suggested, you can use the re module to build a list of the links on a given page and urllib.urlretrieve to download each file. Below is an example.
#!/usr/bin/python
"""
This script scrapes a page and downloads the files its anchor links point to.
"""
# Imports
import os, re, sys
import urllib, urllib2

# Config
base_url = "http://www.google.com/"
destination_directory = "downloads"


def _usage():
    """
    Print the usage information.
    """
    print "USAGE: %s <url>" % sys.argv[0]


def _create_url_list(url):
    """
    Create a list of downloads from the anchor links found on the URL passed.
    """
    raw_data = urllib2.urlopen(url).read()
    # Only anchors with exactly this style attribute are matched.
    raw_list = re.findall('<a style="display:inline; position:relative;" href="(.+?)"', raw_data)
    # The hrefs are assumed to be relative to base_url.
    url_list = [base_url + x for x in raw_list]
    return url_list


def _get_file_name(url):
    """
    Return the filename extracted from a passed URL (its last path segment).
    """
    parts = url.split('/')
    return parts[-1]


def _download_file(url, filename):
    """
    Given a URL and a filename, save the file locally to the
    destination_directory path.
    """
    if not os.path.exists(destination_directory):
        print 'Directory [%s] does not exist, Creating directory...' % destination_directory
        os.makedirs(destination_directory)
    try:
        print 'Downloading File [%s]' % (filename)
        urllib.urlretrieve(url, os.path.join(destination_directory, filename))
    except Exception:
        # Report the failure and carry on with the next file.
        print 'Error Downloading File [%s]' % (filename)


def _download_all(main_url):
    """
    Given a page URL, download each linked file into the destination
    directory.
    """
    url_list = _create_url_list(main_url)
    for url in url_list:
        _download_file(url, _get_file_name(url))


def main(argv):
    """
    This is the script's launcher method. argv holds the arguments
    without the script name.
    """
    if len(argv) != 1:
        _usage()
        sys.exit(1)
    _download_all(argv[0])
    print 'Finished Downloading.'


if __name__ == '__main__':
    main(sys.argv[1:])
You can change base_url and destination_directory according to your needs and save the script as download.py. Then run it from the terminal like this:
python download.py "http://www.example.com/?page=1"
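The regular expression in _create_url_list only matches anchors that carry one exact style attribute, and prefixing base_url goes wrong for links that are already absolute. As a rough alternative, here is a minimal sketch that collects every anchor's href with the standard library's HTML parser. It assumes Python 3, and the names LinkCollector and extract_links are made up for illustration:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag (hypothetical helper name)."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin resolves relative links against the page URL
                # and leaves absolute links unchanged.
                self.links.append(urljoin(self.page_url, value))


def extract_links(html, page_url):
    """Return absolute URLs for all anchors found in the HTML text."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    return collector.links
```

You would pass it the decoded page source and the URL the page was fetched from, then filter the result down to the file types you actually want.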
Solution 2:
We can't know what service you got that first image from, but we'll assume it's on a website of some kind--probably one internal to your company.
The easiest thing to try is urllib.urlretrieve, which "gets" a file based on its URL. This may work if you can right-click the link on that page, copy the URL, and paste it into your code.
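A minimal sketch of that approach, assuming Python 3, where urllib.urlretrieve lives at urllib.request.urlretrieve. The helper names filename_from_url and download, the fallback name download.bin, and the downloads directory are illustrative choices, not anything taken from your site:

```python
import os
from urllib.parse import urlsplit, unquote
from urllib.request import urlretrieve


def filename_from_url(url, default="download.bin"):
    """Take the last path segment of the URL, ignoring any query string."""
    path = urlsplit(url).path
    name = os.path.basename(unquote(path))
    # Fall back to a placeholder when the URL path ends in a slash.
    return name or default


def download(url, destination_directory="downloads"):
    """Save the file behind url into destination_directory; return its path."""
    os.makedirs(destination_directory, exist_ok=True)
    target = os.path.join(destination_directory, filename_from_url(url))
    urlretrieve(url, target)
    return target
```

With the copied link pasted in, a call would look like download("http://www.example.com/files/report.pdf"), where the URL is only a placeholder.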
However, that may not work, for example if there is complex authentication required before accessing that page. You might need to write Python code that actually does the login (as if the user were controlling it, typing a password). If you get that far, you should post that as a separate question.
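For the simplest case, a plain HTML login form that sets a session cookie, the login step might be sketched like this in Python 3. Everything site-specific here is an assumption: the form field names "username" and "password", and the URLs you pass in, have to come from the real login page. Sites that use CSRF tokens, single sign-on, or JavaScript-driven logins will need more than this:

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session_opener():
    """Build an opener that remembers cookies between requests."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def build_login_request(login_url, username, password):
    """Create the POST request a simple login form would send.

    The field names "username" and "password" are assumptions; read the
    real names from the login form's HTML.
    """
    form = urllib.parse.urlencode({"username": username, "password": password})
    # Supplying data makes urllib send the request as a POST.
    return urllib.request.Request(login_url, data=form.encode("utf-8"))


def download_after_login(login_url, file_url, username, password, target):
    """Log in, then fetch file_url with the same cookies and save it."""
    opener = make_session_opener()
    opener.open(build_login_request(login_url, username, password)).close()
    with opener.open(file_url) as response, open(target, "wb") as out:
        out.write(response.read())
    return target
```

The same opener is used for both requests so that the session cookie set by the login response is sent along with the file request.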