urllib - URL Handling Modules
The urllib package is a collection of standard-library modules for working with URLs: opening and reading them, parsing them into components, and handling the errors those operations raise.
🔍 Submodules
Core Modules
- urllib.request - Open URLs and handle HTTP requests
- urllib.parse - Parse URLs into components
- urllib.error - Exception classes for urllib operations
Additional Modules
- urllib.robotparser - Parse robots.txt files
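As a rough illustration of urllib.robotparser, the sketch below feeds a small set of rules straight to the parser instead of downloading a real robots.txt file. The rules, the user agent name "example-bot", and the example.com URLs are assumptions made up for this example.

```python
import urllib.robotparser

# Hypothetical robots.txt content, supplied inline so no network access is needed.
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules)  # parse() accepts an iterable of lines

# can_fetch() reports whether the given user agent may request the URL.
blocked = parser.can_fetch("example-bot", "https://example.com/private/page")
allowed = parser.can_fetch("example-bot", "https://example.com/public/page")
print(blocked, allowed)
```

For a live site, set_url() followed by read() would typically replace the inline parse() call.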
Request Classes
- urllib.request.Request - Abstraction of a URL request
- urllib.request.OpenerDirector - URL opening director
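The two classes above can be sketched together as follows. This is only a minimal illustration: the URL and the User-Agent value are placeholders, and the request is built and inspected but not sent.

```python
import urllib.request

# Placeholder URL and header value, chosen only for illustration.
req = urllib.request.Request(
    "https://api.example.com/data",
    headers={"User-Agent": "example-client/0.1"},
    method="GET",
)

# build_opener() returns an OpenerDirector with the default handlers installed.
opener = urllib.request.build_opener()

print(req.full_url)
print(req.get_method())
# Request stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))

# Sending the request requires network access, so it is left commented out:
# with opener.open(req, timeout=10) as response:
#     body = response.read()
```

Passing a Request to urllib.request.urlopen() uses a default opener. Building one explicitly is mainly useful when custom handlers are needed.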
🚀 Quick Reference
| Module | Purpose | Key Functions and Classes |
|---|---|---|
| urllib.request | HTTP requests | urlopen(), urlretrieve() |
| urllib.parse | URL parsing | urlparse(), urljoin(), quote() |
| urllib.error | Error handling | URLError, HTTPError |
| urllib.robotparser | Robots.txt | RobotFileParser |
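The urllib.error row can be illustrated with a small wrapper around urlopen(). The helper name fetch and the two test URLs are assumptions for this sketch. The data: and file: URLs are used only so the example runs without network access.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=10):
    """Return the response body as bytes, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as exc:
        # HTTPError is a subclass of URLError, so it has to be caught first.
        print(f"Server returned HTTP {exc.code}: {exc.reason}")
    except urllib.error.URLError as exc:
        # Raised for lower-level failures such as an unreachable host.
        print(f"Could not open {url}: {exc.reason}")
    return None


# A data: URL is decoded locally, so this call succeeds offline.
body = fetch("data:text/plain,hello")
# A file: URL for a path that should not exist raises URLError.
missing = fetch("file:///nonexistent/urllib-example.txt")
```

Catching HTTPError before URLError matters because the more general handler would otherwise swallow HTTP status errors.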
🌐 Basic Usage
import urllib.request
import urllib.parse
# Simple GET request (the with block closes the connection when done)
with urllib.request.urlopen('https://api.example.com/data') as response:
    data = response.read()
# Parse URL
parsed = urllib.parse.urlparse('https://example.com/path?query=value')
print(parsed.scheme, parsed.netloc, parsed.path)
# Encode URL parameters
params = urllib.parse.urlencode({'key': 'value', 'foo': 'bar'})
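The other urllib.parse helpers listed in the quick reference can be combined to build and inspect a full URL. The sketch below is one possible arrangement. The base URL, endpoint name, and parameter values are made up for illustration.

```python
from urllib.parse import parse_qs, quote, urlencode, urljoin, urlparse

# Hypothetical base URL and endpoint.
base = "https://example.com/api/v1/"
endpoint = urljoin(base, "search")

# urlencode() escapes values and joins the pairs with "&".
query = urlencode({"q": "url handling", "page": 2})
full_url = f"{endpoint}?{query}"
print(full_url)

# quote() escapes a single path segment; "/" is left alone by default.
escaped = quote("a b/c")
print(escaped)

# parse_qs() reverses the encoding; each value comes back as a list of strings.
decoded = parse_qs(urlparse(full_url).query)
print(decoded)
```

Note that urlencode() encodes spaces as "+", while quote() encodes them as "%20".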
💡 Alternatives
For more advanced HTTP operations, consider:
- requests - Third-party library with a more user-friendly API
- httpx - Modern async-capable HTTP client
- aiohttp - Async HTTP client/server framework