Optimizing HTTP Requests in Python
Sending a large number of HTTP requests quickly is a common need in Python, especially when processing large datasets of URLs. Python offers several concurrency options (threads, processes, async I/O), and choosing among them can be confusing. One simple and effective approach is a fixed pool of worker threads consuming URLs from a bounded queue.
Efficient HTTP Request Implementation
The following script implements this worker-pool pattern and targets Python 2.6 (it uses the Python 2 `urlparse`, `httplib`, and `Queue` modules):
from urlparse import urlparse
from threading import Thread
import httplib, sys
from Queue import Queue

concurrent = 200

def doWork():
    # Each worker loops forever, pulling URLs off the shared queue.
    while True:
        url = q.get()
        status, url = getStatus(url)
        doSomethingWithResult(status, url)
        q.task_done()

def getStatus(ourl):
    try:
        url = urlparse(ourl)
        conn = httplib.HTTPConnection(url.netloc)
        # HEAD fetches only the status line and headers, not the body.
        conn.request("HEAD", url.path)
        res = conn.getresponse()
        return res.status, ourl
    except Exception:
        return "error", ourl

def doSomethingWithResult(status, url):
    print status, url

# Bounded queue: the producer blocks instead of loading every URL into memory.
q = Queue(concurrent * 2)
for i in range(concurrent):
    t = Thread(target=doWork)
    t.daemon = True  # workers die with the main thread
    t.start()

try:
    for url in open('urllist.txt'):
        q.put(url.strip())
    q.join()  # wait until every queued URL has been processed
except KeyboardInterrupt:
    sys.exit(1)
Explanation
This approach is fast for a few concrete reasons: a fixed pool of 200 daemon threads is started once rather than spawning a thread per URL; the bounded queue (`Queue(concurrent * 2)`) keeps memory flat even for huge URL lists, because the producer blocks when the workers fall behind; and `HEAD` requests retrieve only status and headers, skipping response bodies entirely. Since the workload is I/O-bound, the GIL is not a bottleneck here.
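On Python 3, the same worker-pool idea can be expressed more compactly with `concurrent.futures.ThreadPoolExecutor`, which manages the queue and threads internally. The sketch below is a hypothetical port, not part of the original article: the names `get_status` and `check_all` are illustrative, and reading `urllist.txt` is left to the caller.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse
import http.client

def get_status(ourl, timeout=5):
    """Return (status, url); HEAD avoids downloading the response body."""
    try:
        parts = urlparse(ourl)
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
        conn.request("HEAD", parts.path or "/")
        status = conn.getresponse().status
        conn.close()
        return status, ourl
    except Exception:
        return "error", ourl

def check_all(urls, concurrent=200):
    # The executor plays the role of the hand-rolled queue + daemon threads;
    # map() preserves the input order of the URLs.
    with ThreadPoolExecutor(max_workers=concurrent) as pool:
        return list(pool.map(get_status, urls))
```

A caller would feed it the stripped lines of a URL file, e.g. `check_all(line.strip() for line in open('urllist.txt'))`.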