Overview
collections.UserString is a wrapper around string objects that provides a convenient base class for creating custom string-like classes. Unlike inheriting from str directly, UserString stores the underlying data in a regular string accessible via the data attribute, making it easier to override methods without worrying about circular calls or immutability constraints of the built-in str type.
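To make the contrast concrete, here is a minimal sketch comparing a direct str subclass with a UserString subclass. The class names ShoutStr and ShoutUserString are made up for illustration; the behavior shown is what CPython's implementation does, where UserString methods rebuild their result through self.__class__.

```python
from collections import UserString


class ShoutStr(str):
    """Hypothetical str subclass."""


class ShoutUserString(UserString):
    """Hypothetical UserString subclass."""


plain = ShoutStr("hello").upper()
wrapped = ShoutUserString("hello").upper()

# str methods return a plain str, so the subclass is lost
print(type(plain).__name__)    # str
# UserString methods rebuild the result with self.__class__
print(type(wrapped).__name__)  # ShoutUserString
print(wrapped.data)            # HELLO
```

With a str subclass, every method whose result should keep the custom type has to be overridden by hand; with UserString that happens by default.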
🎯 Key Characteristics
- String Wrapper - Wraps a regular string in the data attribute
- Safe Subclassing - Avoids complex inheritance issues with immutable str
- Method Override Friendly - Easy to customize specific string behaviors
- Full String Interface - Supports all standard string operations
- Mutable Design - The data attribute can be reassigned in place; the wrapped str object itself stays immutable
- Initialization Flexibility - Multiple ways to initialize with string data
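A short sketch of what the wrapper design means in practice, assuming CPython's implementation: equality and hashing follow the wrapped string, and reassigning data changes the value in place.

```python
from collections import UserString

name = UserString("alice")

# Equality and hashing follow the wrapped string
print(name == "alice")              # True
print(hash(name) == hash("alice"))  # True

# Reassigning .data changes the value in place
name.data = "bob"
print(name)                         # bob

# A UserString key can be looked up with the equivalent plain str
lookup = {UserString("k"): 1}
print(lookup["k"])                  # 1
```

Because the hash is derived from data, an instance that is used as a dict key or set member should not have its data reassigned afterwards.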
📚 Basic Usage
Simple Example
from collections import UserString

# Basic UserString usage
user_str = UserString("Hello, World!")
print(user_str)          # Hello, World!
print(len(user_str))     # 13
print(user_str.upper())  # HELLO, WORLD!

# Access underlying data
print(user_str.data)     # Hello, World!


# Custom string with validation
class SafeString(UserString):
    def __init__(self, data):
        # Remove potentially dangerous characters
        cleaned = str(data).replace('<', '').replace('>', '').replace('&', '')
        super().__init__(cleaned)

    def append(self, text):
        """Add append method to string (normally immutable)."""
        cleaned = str(text).replace('<', '').replace('>', '').replace('&', '')
        self.data += cleaned
        return self


# Usage with validation
safe = SafeString("Hello <script>")
print(safe)  # Hello script (angle brackets removed)
safe.append(" World!")
print(safe)  # Hello script World!
Core Methods
from collections import UserString
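# Initialize with different values
empty_str = UserString("")   # the initial value is required; UserString() raises TypeError
from_string = UserString("Python")
from_number = UserString(42)  # non-str values are converted with str(), so data == "42"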
# Access data attribute
print(from_string.data) # Python
# All regular string operations work
print(len(from_string)) # 6
print('Py' in from_string) # True
print(from_string[0]) # P
🔧 UserString API Reference
Methods
UserString supports all standard string methods. Here are the most commonly used ones:
| Method | Description | Return Type | Example |
|---|---|---|---|
| __init__(seq) | Initialize UserString with data (required; non-str values are converted with str()) | UserString | UserString("hello") |
| __str__() | String representation | str | str(user_str) |
| __len__() | Get string length | int | len(user_str) |
| __getitem__(index) | Get character or slice by index | UserString | user_str[0] |
| __contains__(substr) | Check if substring exists | bool | 'sub' in user_str |
| upper() | Convert to uppercase | UserString | user_str.upper() |
| lower() | Convert to lowercase | UserString | user_str.lower() |
| strip(chars=None) | Remove whitespace/chars | UserString | user_str.strip() |
| replace(old, new, count=-1) | Replace substring | UserString | user_str.replace('a', 'b') |
| split(sep=None, maxsplit=-1) | Split into list of plain str | list | user_str.split(' ') |
| join(iterable) | Join iterable with string | str | user_str.join(['a', 'b']) |
| startswith(prefix) | Check if starts with prefix | bool | user_str.startswith('Hello') |
| endswith(suffix) | Check if ends with suffix | bool | user_str.endswith('!') |
| find(substr, start=0, end=sys.maxsize) | Find substring index | int | user_str.find('sub') |
| count(substr, start=0, end=sys.maxsize) | Count occurrences | int | user_str.count('a') |
| format(*args, **kwargs) | Format string | str | user_str.format(name='Alice') |
Properties/Attributes
| Attribute | Description | Type | Example |
|---|---|---|---|
| data | Underlying string storing the data | str | user_str.data |
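Return types are easy to get wrong, so it can help to check them directly. The sketch below prints the type each call produces; the results reflect CPython's implementation, where most transforming methods rebuild a UserString while join, format, and split hand back built-in types.

```python
from collections import UserString

us = UserString("a,b")

results = {
    "upper": us.upper(),
    "getitem": us[0],
    "concat": us + "!",
    "join": us.join(["x", "y"]),
    "format": UserString("{}").format("z"),
    "split": us.split(","),
}

for name, value in results.items():
    print(f"{name:8} -> {type(value).__name__}")
# upper    -> UserString
# getitem  -> UserString
# concat   -> UserString
# join     -> str
# format   -> str
# split    -> list
```

A subclass that must keep its type through join or format needs to override those methods itself.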
Detailed Method Examples
from collections import UserString
# Initialize test UserString (two leading and two trailing spaces)
us = UserString("  Hello, Python World!  ")
print(f"Original: '{us}'")  # '  Hello, Python World!  '
print(f"Length: {len(us)}")  # 24

# String transformations
print(f"Upper: '{us.upper()}'")  # '  HELLO, PYTHON WORLD!  '
print(f"Lower: '{us.lower()}'")  # '  hello, python world!  '
print(f"Stripped: '{us.strip()}'")  # 'Hello, Python World!'
print(f"Title: '{us.title()}'")  # '  Hello, Python World!  '

# String searching
print(f"Find 'Python': {us.find('Python')}")  # 9
print(f"Count 'o': {us.count('o')}")  # 3
print(f"Starts with '  Hello': {us.startswith('  Hello')}")  # True
print(f"Ends with '!  ': {us.endswith('!  ')}")  # True

# String manipulation
replaced = us.replace('Python', 'Programming')
print(f"Replaced: '{replaced}'")  # '  Hello, Programming World!  '

# Split and join (split returns plain str items, join returns a plain str)
words = us.strip().split(' ')
print(f"Words: {words}")  # ['Hello,', 'Python', 'World!']
separator = UserString(" | ")
joined = separator.join(words)
print(f"Joined: {joined}")  # Hello, | Python | World!

# Formatting (format returns a plain str)
template = UserString("Hello, {name}! Welcome to {place}.")
formatted = template.format(name="Alice", place="Python")
print(f"Formatted: {formatted}")  # Hello, Alice! Welcome to Python.

# Slicing
print(f"Slice [2:7]: '{us[2:7]}'")  # 'Hello'
print(f"Step slice [::2]: '{us[::2]}'")  # ' Hlo yhnWrd '

# Boolean operations
print(f"Is alpha: {us.strip().replace(',', '').replace('!', '').replace(' ', '').isalpha()}")  # True
print(f"Is digit: {UserString('12345').isdigit()}")  # True
print(f"Is space: {UserString(' ').isspace()}")  # True
🎯 Primary Use Cases
1. Input Sanitization and Validation
Use Case: Strings that automatically sanitize input to help prevent injection attacks.
Why UserString: Allows transparent string operations while enforcing security constraints.
from collections import UserString
import re
import html


class SanitizedString(UserString):
    """String that automatically sanitizes HTML and prevents XSS.

    Inherited methods such as upper() rebuild their result through
    __init__, so they sanitize (and escape) the text again.
    """

    def __init__(self, data=''):
        # Sanitize the input
        super().__init__(self._sanitize(str(data)))

    @classmethod
    def _from_clean(cls, text: str) -> 'SanitizedString':
        """Wrap text that is already sanitized, without escaping it again."""
        obj = cls()
        obj.data = text
        return obj

    def _clean(self, value) -> str:
        """Sanitize a value unless it is already a SanitizedString."""
        if isinstance(value, SanitizedString):
            return value.data
        return self._sanitize(str(value))

    def _sanitize(self, text: str) -> str:
        """Sanitize text by removing dangerous content, then escaping HTML."""
        # Strip dangerous markup first: once the text is escaped, these
        # patterns can no longer match the original tags and quotes.
        # Remove script tags completely
        text = re.sub(r'<script[^>]*>.*?</script>', '', text, flags=re.IGNORECASE | re.DOTALL)
        # Remove on* event handlers
        text = re.sub(r'\s*on\w+\s*=\s*["\'][^"\']*["\']', '', text, flags=re.IGNORECASE)
        # Remove javascript: URLs
        text = re.sub(r'javascript\s*:', '', text, flags=re.IGNORECASE)
        # HTML escape whatever remains
        return html.escape(text)

    def __add__(self, other):
        """Override addition to maintain sanitization."""
        return self._from_clean(self.data + self._clean(other))

    def __iadd__(self, other):
        """Override in-place addition."""
        self.data += self._clean(other)
        return self

    def replace(self, old, new, count=-1):
        """Override replace to maintain sanitization."""
        return self._from_clean(self.data.replace(old, self._clean(new), count))

    def format(self, *args, **kwargs):
        """Override format to sanitize inserted values."""
        # Sanitize all arguments; the template itself is already clean
        sanitized_args = [self._clean(arg) for arg in args]
        sanitized_kwargs = {k: self._clean(v) for k, v in kwargs.items()}
        return self._from_clean(self.data.format(*sanitized_args, **sanitized_kwargs))


class ValidatedEmail(UserString):
    """String that validates email format."""

    EMAIL_PATTERN = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')

    def __init__(self, data=''):
        email = str(data).strip().lower()
        if email and not self.EMAIL_PATTERN.match(email):
            raise ValueError(f"Invalid email format: {email}")
        super().__init__(email)

    def set_domain(self, domain: str):
        """Change the domain part of the email."""
        if '@' in self.data:
            local_part = self.data.split('@')[0]
            new_email = f"{local_part}@{domain}"
            if not self.EMAIL_PATTERN.match(new_email):
                raise ValueError(f"Invalid domain: {domain}")
            self.data = new_email
        else:
            raise ValueError("Cannot set domain on invalid email")

    @property
    def local_part(self) -> str:
        """Get the local part (before @) of the email."""
        return self.data.split('@')[0] if '@' in self.data else self.data

    @property
    def domain(self) -> str:
        """Get the domain part (after @) of the email."""
        return self.data.split('@')[1] if '@' in self.data else ''


# Usage examples
# Sanitized string
unsafe_input = '<script>alert("XSS")</script>Hello <b>World</b>!'
safe_str = SanitizedString(unsafe_input)
print(f"Sanitized: {safe_str}")  # Hello &lt;b&gt;World&lt;/b&gt;!

# Adding more unsafe content (the onerror handler is removed, the tag is escaped)
safe_str += ' <img src="x" onerror="alert(1)">'
print(f"After addition: {safe_str}")

# Format with sanitization
template = SanitizedString("Welcome {name}!")
result = template.format(name='<script>alert("hack")</script>Alice')
print(f"Formatted safely: {result}")  # Welcome Alice!

# Email validation
try:
    email = ValidatedEmail("user@example.com")
    print(f"Valid email: {email}")
    print(f"Local part: {email.local_part}")
    print(f"Domain: {email.domain}")
    email.set_domain("newdomain.com")
    print(f"Updated email: {email}")
    # This will raise an error
    invalid_email = ValidatedEmail("invalid-email")
except ValueError as e:
    print(f"Validation error: {e}")
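One pitfall with wrappers that transform their input in __init__: inherited UserString methods construct their result with self.__class__(...), which runs the transformation again on text that was already processed. The sketch below illustrates this with a deliberately small, hypothetical Escaped class and a trusted keyword argument (both are assumptions for the example, not part of UserString).

```python
import html
from collections import UserString


class Escaped(UserString):
    """Hypothetical wrapper that HTML-escapes its input on construction."""

    def __init__(self, data="", *, trusted=False):
        text = str(data)
        # trusted=True marks text that has already been escaped
        super().__init__(text if trusted else html.escape(text))

    def upper(self):
        # Rebuild the result without escaping it a second time
        return type(self)(self.data.upper(), trusted=True)


tag = Escaped("<b>")
print(tag)          # &lt;b&gt;
print(tag.upper())  # &LT;B&GT;
# lower() is inherited: it calls Escaped(...) and escapes a second time
print(tag.lower())  # &amp;lt;b&amp;gt;
```

Any method that should preserve already-processed text needs an override like upper() here, or the constructor needs a way to recognise trusted input.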
2. Rich Text and Markup Strings
Use Case: Strings that maintain formatting information and support rich text operations.
Why UserString: Allows string-like interface while managing additional metadata.
from collections import UserString
import re
from typing import Dict, List, Tuple


class MarkdownString(UserString):
    """String that supports Markdown formatting operations."""

    def __init__(self, data=''):
        super().__init__(str(data))
        self._formatting_history = []

    def bold(self, text: str) -> 'MarkdownString':
        """Wrap text in bold formatting."""
        result = self.data.replace(text, f"**{text}**")
        new_string = MarkdownString(result)
        new_string._formatting_history = self._formatting_history + [('bold', text)]
        return new_string

    def italic(self, text: str) -> 'MarkdownString':
        """Wrap text in italic formatting."""
        result = self.data.replace(text, f"*{text}*")
        new_string = MarkdownString(result)
        new_string._formatting_history = self._formatting_history + [('italic', text)]
        return new_string

    def code(self, text: str) -> 'MarkdownString':
        """Wrap text in code formatting."""
        result = self.data.replace(text, f"`{text}`")
        new_string = MarkdownString(result)
        new_string._formatting_history = self._formatting_history + [('code', text)]
        return new_string

    def link(self, text: str, url: str) -> 'MarkdownString':
        """Convert text to a link."""
        result = self.data.replace(text, f"[{text}]({url})")
        new_string = MarkdownString(result)
        new_string._formatting_history = self._formatting_history + [('link', (text, url))]
        return new_string

    def heading(self, level: int = 1) -> 'MarkdownString':
        """Convert to heading."""
        if not 1 <= level <= 6:
            raise ValueError("Heading level must be 1-6")
        prefix = "#" * level + " "
        result = prefix + self.data
        new_string = MarkdownString(result)
        new_string._formatting_history = self._formatting_history + [('heading', level)]
        return new_string

    def strip_formatting(self) -> 'MarkdownString':
        """Remove all Markdown formatting."""
        result = self.data
        # Remove bold first so the italic pattern does not split '**'
        result = re.sub(r'\*\*(.*?)\*\*', r'\1', result)  # Bold
        result = re.sub(r'\*(.*?)\*', r'\1', result)  # Italic
        result = re.sub(r'`(.*?)`', r'\1', result)  # Code
        result = re.sub(r'\[([^\]]+)\]\([^)]+\)', r'\1', result)  # Links
        result = re.sub(r'^#+\s*', '', result, flags=re.MULTILINE)  # Headings
        return MarkdownString(result)

    def to_html(self) -> str:
        """Convert Markdown to basic HTML."""
        result = self.data
        # Convert formatting
        result = re.sub(r'\*\*(.*?)\*\*', r'<strong>\1</strong>', result)  # Bold
        result = re.sub(r'\*(.*?)\*', r'<em>\1</em>', result)  # Italic
        result = re.sub(r'`(.*?)`', r'<code>\1</code>', result)  # Code
        result = re.sub(r'\[([^\]]+)\]\(([^)]+)\)', r'<a href="\2">\1</a>', result)  # Links
        result = re.sub(r'^(#+)\s*(.*)',
                        lambda m: f'<h{len(m.group(1))}>{m.group(2)}</h{len(m.group(1))}>',
                        result, flags=re.MULTILINE)  # Headings
        return result

    def get_formatting_history(self) -> List[Tuple]:
        """Get history of formatting operations."""
        return self._formatting_history.copy()
class ColoredString(UserString):
    """String with ANSI color support for terminal output."""

    COLORS = {
        'black': 30, 'red': 31, 'green': 32, 'yellow': 33,
        'blue': 34, 'magenta': 35, 'cyan': 36, 'white': 37,
        'bright_black': 90, 'bright_red': 91, 'bright_green': 92,
        'bright_yellow': 93, 'bright_blue': 94, 'bright_magenta': 95,
        'bright_cyan': 96, 'bright_white': 97
    }
    STYLES = {
        'bold': 1, 'dim': 2, 'italic': 3, 'underline': 4,
        'blink': 5, 'reverse': 7, 'strikethrough': 9
    }

    def __init__(self, data='', color=None, bg_color=None, style=None):
        super().__init__(str(data))
        self.color = color
        self.bg_color = bg_color
        self.style = style

    def _apply_formatting(self) -> str:
        """Apply ANSI formatting codes."""
        codes = []
        if self.style and self.style in self.STYLES:
            codes.append(str(self.STYLES[self.style]))
        if self.color and self.color in self.COLORS:
            codes.append(str(self.COLORS[self.color]))
        if self.bg_color and self.bg_color in self.COLORS:
            codes.append(str(self.COLORS[self.bg_color] + 10))  # Background colors are +10
        if codes:
            return f"\033[{';'.join(codes)}m{self.data}\033[0m"
        return self.data

    def __str__(self):
        """Return formatted string with ANSI codes."""
        return self._apply_formatting()

    def colored(self, color: str) -> 'ColoredString':
        """Return new instance with specified color."""
        return ColoredString(self.data, color=color, bg_color=self.bg_color, style=self.style)

    def on_color(self, bg_color: str) -> 'ColoredString':
        """Return new instance with specified background color."""
        return ColoredString(self.data, color=self.color, bg_color=bg_color, style=self.style)

    def styled(self, style: str) -> 'ColoredString':
        """Return new instance with specified style."""
        return ColoredString(self.data, color=self.color, bg_color=self.bg_color, style=style)

    def plain(self) -> str:
        """Return plain text without formatting."""
        return self.data
# Usage examples
# Markdown string
md = MarkdownString("This is a sample text with Python code.")
md = md.bold("sample").italic("Python").code("code")
print(f"Markdown: {md}")
print(f"HTML: {md.to_html()}")
print(f"Plain: {md.strip_formatting()}")
print(f"History: {md.get_formatting_history()}")

# Add link and heading
md = md.link("Python", "https://python.org")
md = md.heading(2)
print(f"With link and heading: {md}")

# Colored string
colored = ColoredString("Hello, World!")
print(colored.colored('red').styled('bold'))
print(colored.colored('blue').on_color('yellow'))
print(colored.colored('green').styled('underline'))

# Chain formatting
message = ColoredString("Success!").colored('green').styled('bold')
warning = ColoredString("Warning!").colored('yellow').styled('italic')
error = ColoredString("Error!").colored('red').styled('underline')
print("Status messages:")
print(f"  {message}")
print(f"  {warning}")
print(f"  {error}")
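When a subclass overrides __str__ to add markup, the rendered text and the stored text diverge. The following sketch uses a small hypothetical Red class to show that len() and comparisons keep operating on the plain data, while str() includes the escape sequences.

```python
from collections import UserString


class Red(UserString):
    """Hypothetical wrapper that renders its text in red on str()."""

    def __str__(self):
        return f"\033[31m{self.data}\033[0m"


label = Red("OK")
print(len(label))       # 2  (length of the plain text)
print(len(str(label)))  # 11 (plain text plus escape sequences)
print(label == "OK")    # True (comparison uses the plain text)
```

This matters for column alignment: compute padding from len(label) rather than from the rendered string.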
3. Template and Interpolation Strings
Use Case: Strings with advanced templating and variable interpolation capabilities.
Why UserString: Provides string interface while managing template logic and variable tracking.
from collections import UserString
import re
import json
from typing import Any, Dict, List


class SmartTemplate(UserString):
    """Advanced template string with multiple interpolation styles."""

    # One pattern per style. The lookbehind keeps the brace style from
    # matching the inside of ${var} or %{var}.
    PATTERNS = {
        'dollar': re.compile(r'\$\{([^}]+)\}'),
        'brace': re.compile(r'(?<![$%])\{([^{}]+)\}'),
        'percent': re.compile(r'%\{([^}]+)\}'),
    }

    def __init__(self, template=''):
        super().__init__(str(template))
        self._variables = set()
        self._parse_variables()

    def _parse_variables(self):
        """Extract all variable references from the template."""
        self._variables = {
            name
            for pattern in self.PATTERNS.values()
            for name in pattern.findall(self.data)
        }

    def get_variables(self) -> set:
        """Get all variables used in the template."""
        return self._variables.copy()

    def render(self, variables: Dict[str, Any],
               style: str = 'auto',
               missing_action: str = 'error') -> 'SmartTemplate':
        """Render template with variables.

        Args:
            variables: Dictionary of variable values
            style: 'auto', 'dollar', 'brace', 'percent', or 'all'
            missing_action: 'error', 'ignore', or 'placeholder'
        """
        if style in ('auto', 'all'):
            styles = list(self.PATTERNS)
        elif style in self.PATTERNS:
            styles = [style]
        else:
            raise ValueError(f"Unknown style: {style!r}")
        result = self.data
        for name in styles:
            result = self._substitute(self.PATTERNS[name], result, variables, missing_action)
        return SmartTemplate(result)

    @staticmethod
    def _substitute(pattern, text: str, variables: Dict[str, Any],
                    missing_action: str) -> str:
        """Replace every match of pattern using the variables mapping."""
        def replace_var(match):
            var_name = match.group(1)
            if var_name in variables:
                return str(variables[var_name])
            elif missing_action == 'error':
                raise KeyError(f"Variable '{var_name}' not found")
            elif missing_action == 'ignore':
                return match.group(0)
            else:  # placeholder
                return f"[{var_name}]"
        return pattern.sub(replace_var, text)

    def validate_variables(self, variables: Dict[str, Any]) -> List[str]:
        """Return the names of template variables that were not provided."""
        return sorted(var for var in self._variables if var not in variables)

    def preview(self, variables: Dict[str, Any]) -> str:
        """Preview template rendering with placeholders for missing vars."""
        return str(self.render(variables, missing_action='placeholder'))


class ConfigTemplate(UserString):
    """Template for configuration files with type conversion and validation."""

    TYPE_CONVERTERS = {
        'int': int,
        'float': float,
        'bool': lambda x: str(x).lower() in ('true', '1', 'yes', 'on'),
        'str': str,
        'json': json.loads,
        'list': lambda x: x.split(',') if isinstance(x, str) else list(x)
    }
    # Matches ${var}, ${var:type} and ${var:type:default}
    VAR_PATTERN = re.compile(r'\$\{([^:}]+)(?::([^:}]+))?(?::([^}]+))?\}')

    def __init__(self, template=''):
        super().__init__(str(template))
        self._var_types = {}
        self._var_defaults = {}
        self._parse_typed_variables()

    def _parse_typed_variables(self):
        """Parse variables with type hints: ${var:type:default}."""
        for match in self.VAR_PATTERN.finditer(self.data):
            var_name = match.group(1)
            var_type = match.group(2) or 'str'
            var_default = match.group(3)
            self._var_types[var_name] = var_type
            if var_default is not None:
                self._var_defaults[var_name] = var_default

    def render_config(self, variables: Dict[str, Any]) -> 'ConfigTemplate':
        """Render configuration with type conversion and defaults."""
        # Start from the defaults, then let provided values override them
        final_vars = {**self._var_defaults, **variables}
        # Convert types
        for var_name, value in list(final_vars.items()):
            var_type = self._var_types.get(var_name, 'str')
            if var_type in self.TYPE_CONVERTERS:
                try:
                    final_vars[var_name] = self.TYPE_CONVERTERS[var_type](value)
                except (ValueError, TypeError) as e:
                    raise ValueError(
                        f"Cannot convert '{value}' to {var_type} for variable '{var_name}': {e}"
                    ) from e

        # Render template
        def replace_var(match):
            var_name = match.group(1)
            if var_name in final_vars:
                return str(final_vars[var_name])
            raise KeyError(f"Variable '{var_name}' not found")

        return ConfigTemplate(self.VAR_PATTERN.sub(replace_var, self.data))

    def get_schema(self) -> Dict[str, Dict[str, Any]]:
        """Get schema information for all variables."""
        schema = {}
        for var_name in self._var_types:
            schema[var_name] = {
                'type': self._var_types[var_name],
                'default': self._var_defaults.get(var_name),
                'required': var_name not in self._var_defaults
            }
        return schema


# Usage examples
# Smart template with multiple styles
# ("$${balance}" is a literal dollar sign followed by the ${balance} variable;
# the typed ${var:type:default} form is only understood by ConfigTemplate)
template_text = """
Hello ${name}!
Your account: {account_id}
Status: %{status}
Balance: $${balance}
"""
smart_template = SmartTemplate(template_text)
print(f"Variables found: {smart_template.get_variables()}")
variables = {
    'name': 'Alice',
    'account_id': '12345',
    'status': 'active',
    'balance': '150.75'
}
rendered = smart_template.render(variables)
print(f"Rendered: {rendered}")

# Preview with missing variables
partial_vars = {'name': 'Bob', 'status': 'pending'}
preview = smart_template.preview(partial_vars)
print(f"Preview: {preview}")

# Configuration template
config_template = ConfigTemplate("""
database:
  host: ${db_host:str:localhost}
  port: ${db_port:int:5432}
  ssl: ${db_ssl:bool:true}
server:
  workers: ${workers:int:4}
  debug: ${debug:bool:false}
  allowed_hosts: ${allowed_hosts:list:localhost,127.0.0.1}
""")
print(f"Config schema: {json.dumps(config_template.get_schema(), indent=2)}")
config_vars = {
    'db_host': 'prod-db.example.com',
    'db_port': '3306',
    'workers': '8',
    'allowed_hosts': 'example.com,api.example.com'
}
config = config_template.render_config(config_vars)
print(f"Rendered config:\n{config}")
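For the ${var} style specifically, the standard library's string.Template already implements the substitution rules, including $$ as an escaped dollar sign. A minimal sketch of delegating to it from a UserString subclass (DollarTemplate is a hypothetical name chosen for this example):

```python
from collections import UserString
from string import Template


class DollarTemplate(UserString):
    """Hypothetical wrapper that delegates ${var} substitution to string.Template."""

    def render(self, **variables) -> str:
        # safe_substitute leaves unknown placeholders untouched
        return Template(self.data).safe_substitute(**variables)


greeting = DollarTemplate("Hello ${name}, your balance is $$${balance} (${status})")
print(greeting.render(name="Alice", balance="150.75"))
# Hello Alice, your balance is $150.75 (${status})
```

Template.substitute can be used instead when a missing variable should raise KeyError.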
🚀 Advanced Real-World Applications
1. Multi-Language String with Localization
Use Case: Strings that support multiple languages and automatic localization.
Why UserString: Provides string interface while managing translation logic and locale data.
from collections import UserString
from typing import Any, Dict, List, Optional
import json
import re


class LocalizedString(UserString):
    """String that supports multiple languages and localization."""

    # Class-level translation storage
    _translations: Dict[str, Dict[str, str]] = {}
    _current_locale = 'en'
    _fallback_locale = 'en'

    def __init__(self, key: str, default_text: Optional[str] = None, **format_args):
        self.key = key
        self.format_args = format_args
        self.default_text = default_text or key
        # Get localized text
        localized_text = self._get_localized_text()
        super().__init__(localized_text)

    def _get_localized_text(self) -> str:
        """Get text in current locale."""
        # Try current locale
        if (self._current_locale in self._translations and
                self.key in self._translations[self._current_locale]):
            text = self._translations[self._current_locale][self.key]
        # Try fallback locale
        elif (self._fallback_locale in self._translations and
                self.key in self._translations[self._fallback_locale]):
            text = self._translations[self._fallback_locale][self.key]
        # Use default text
        else:
            text = self.default_text
        # Apply formatting if provided
        if self.format_args:
            try:
                text = text.format(**self.format_args)
            except (KeyError, ValueError):
                # If formatting fails, use unformatted text
                pass
        return text

    @classmethod
    def set_locale(cls, locale: str):
        """Set the current locale."""
        cls._current_locale = locale

    @classmethod
    def load_translations(cls, locale: str, translations: Dict[str, str]):
        """Load translations for a locale."""
        if locale not in cls._translations:
            cls._translations[locale] = {}
        cls._translations[locale].update(translations)

    @classmethod
    def load_from_file(cls, locale: str, filename: str):
        """Load translations from JSON file."""
        try:
            with open(filename, 'r', encoding='utf-8') as f:
                translations = json.load(f)
            cls.load_translations(locale, translations)
        except FileNotFoundError:
            print(f"Translation file not found: {filename}")
        except json.JSONDecodeError:
            print(f"Invalid JSON in translation file: {filename}")

    def translate(self, locale: Optional[str] = None) -> 'LocalizedString':
        """Get translation in a specific locale (defaults to the current one)."""
        if locale is None:
            return LocalizedString(self.key, self.default_text, **self.format_args)
        # Switch the class-level locale temporarily. Assigning to
        # self._current_locale would only create an instance attribute
        # that the newly created instance never sees.
        old_locale = LocalizedString._current_locale
        LocalizedString._current_locale = locale
        try:
            return LocalizedString(self.key, self.default_text, **self.format_args)
        finally:
            LocalizedString._current_locale = old_locale

    def format(self, **kwargs) -> 'LocalizedString':
        """Create new instance with updated format arguments."""
        new_args = {**self.format_args, **kwargs}
        return LocalizedString(self.key, self.default_text, **new_args)

    @classmethod
    def get_available_locales(cls) -> List[str]:
        """Get list of available locales."""
        return list(cls._translations.keys())

    @classmethod
    def get_translation_stats(cls) -> Dict[str, Any]:
        """Get statistics about loaded translations."""
        stats = {}
        for locale, translations in cls._translations.items():
            stats[locale] = {
                'count': len(translations),
                'keys': list(translations.keys())
            }
        return stats


class ValidatedURL(UserString):
    """String that validates and normalizes URLs."""

    # Basic URL pattern (only http and https pass validation)
    URL_PATTERN = re.compile(
        r'^https?://'  # protocol
        r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+[A-Z]{2,6}\.?|'  # domain
        r'localhost|'  # localhost
        r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})'  # IP
        r'(?::\d+)?'  # optional port
        r'(?:/?|[/?]\S+)$', re.IGNORECASE)

    def __init__(self, url: str):
        self.original_url = str(url)
        normalized = self._normalize_url(self.original_url)
        super().__init__(normalized)
        if not self._is_valid_url(normalized):
            raise ValueError(f"Invalid URL: {url}")

    def _normalize_url(self, url: str) -> str:
        """Normalize URL format."""
        url = url.strip()
        # Add protocol if missing
        if not url.startswith(('http://', 'https://', 'ftp://', 'ftps://')):
            url = 'https://' + url
        # Convert to lowercase (except path)
        parts = url.split('/', 3)
        if len(parts) >= 3:
            parts[0] = parts[0].lower()  # protocol
            parts[2] = parts[2].lower()  # domain
            url = '/'.join(parts)
        return url

    def _is_valid_url(self, url: str) -> bool:
        """Validate URL format."""
        return bool(self.URL_PATTERN.match(url))

    @property
    def domain(self) -> str:
        """Extract domain from URL."""
        match = re.match(r'https?://([^/]+)', self.data)
        return match.group(1) if match else ''

    @property
    def protocol(self) -> str:
        """Extract protocol from URL."""
        return self.data.split('://')[0] if '://' in self.data else ''

    @property
    def path(self) -> str:
        """Extract path from URL."""
        parts = self.data.split('/', 3)
        return '/' + parts[3] if len(parts) > 3 else '/'

    def with_path(self, path: str) -> 'ValidatedURL':
        """Create new URL with different path."""
        base_url = f"{self.protocol}://{self.domain}"
        if not path.startswith('/'):
            path = '/' + path
        return ValidatedURL(base_url + path)

    def with_protocol(self, protocol: str) -> 'ValidatedURL':
        """Create new URL with different protocol."""
        # Replace only the leading protocol, not later occurrences in the path
        new_url = self.data.replace(f"{self.protocol}://", f"{protocol}://", 1)
        return ValidatedURL(new_url)
# Usage examples with sample translations
# Load translations
LocalizedString.load_translations('en', {
    'welcome_message': 'Welcome, {name}!',
    'goodbye_message': 'Goodbye, {name}. See you soon!',
    'error_not_found': 'Item not found',
    'status_active': 'Active',
    'status_inactive': 'Inactive'
})
LocalizedString.load_translations('es', {
    'welcome_message': '¡Bienvenido, {name}!',
    'goodbye_message': 'Adiós, {name}. ¡Hasta pronto!',
    'error_not_found': 'Elemento no encontrado',
    'status_active': 'Activo',
    'status_inactive': 'Inactivo'
})
LocalizedString.load_translations('fr', {
    'welcome_message': 'Bienvenue, {name}!',
    'goodbye_message': 'Au revoir, {name}. À bientôt!',
    'error_not_found': 'Élément non trouvé',
    'status_active': 'Actif',
    'status_inactive': 'Inactif'
})

# Use localized strings
welcome = LocalizedString('welcome_message', name='Alice')
print(f"English: {welcome}")
LocalizedString.set_locale('es')
welcome_es = welcome.translate()
print(f"Spanish: {welcome_es}")
LocalizedString.set_locale('fr')
welcome_fr = welcome.translate()
print(f"French: {welcome_fr}")

# Dynamic formatting (the current locale is still 'fr' here)
goodbye = LocalizedString('goodbye_message').format(name='Bob')
print(f"Goodbye in French: {goodbye}")

# URL validation
try:
    url1 = ValidatedURL("example.com/path/to/page")
    print(f"Normalized URL: {url1}")
    print(f"Domain: {url1.domain}")
    print(f"Protocol: {url1.protocol}")
    print(f"Path: {url1.path}")
    # Create variations
    secure_url = url1.with_protocol('https')
    api_url = url1.with_path('/api/v1/users')
    print(f"Secure URL: {secure_url}")
    print(f"API URL: {api_url}")
    # This will raise an error
    invalid_url = ValidatedURL("not-a-valid-url")
except ValueError as e:
    print(f"URL validation error: {e}")

print(f"Available locales: {LocalizedString.get_available_locales()}")
print(f"Translation stats: {LocalizedString.get_translation_stats()}")
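For extracting URL components, the standard library's urllib.parse.urlsplit is an alternative to manual splitting and regular expressions. The sketch below wraps it in a small UserString subclass; ParsedURL and its parts property are hypothetical names for this example.

```python
from collections import UserString
from urllib.parse import urlsplit


class ParsedURL(UserString):
    """Hypothetical URL string that exposes components via urlsplit."""

    @property
    def parts(self):
        # urlsplit returns a named tuple with scheme, netloc, path, query, fragment
        return urlsplit(self.data)


url = ParsedURL("https://Example.COM:8080/path/to/page?q=1")
print(url.parts.scheme)    # https
print(url.parts.hostname)  # example.com (hostname is lowercased)
print(url.parts.port)      # 8080
print(url.parts.path)      # /path/to/page
print(url.parts.query)     # q=1
```

urlsplit parses but does not validate, so a format check such as the one in ValidatedURL is still needed if malformed input must be rejected.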
📊 Performance Considerations
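UserString is implemented in pure Python. Each operation delegates to the underlying str, but it also pays for a Python-level method call, and methods that return a new string allocate a new wrapper instance as well. The cost is a constant factor per call rather than a change in algorithmic complexity, so it is most visible on many small operations. The benchmark below measures it on your own machine.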
import time
from collections import UserString


def benchmark_userstring_vs_str():
    """Compare UserString vs str performance."""
    n = 100000
    text = "Hello, World! " * 10
    print("=== UserString vs str Performance ===")

    # Test creation (perf_counter is a monotonic, high-resolution clock)
    start = time.perf_counter()
    for _ in range(n):
        us = UserString(text)
    userstring_create_time = time.perf_counter() - start

    # str(text) returns the same object, so this is a lower bound
    start = time.perf_counter()
    for _ in range(n):
        s = str(text)
    str_create_time = time.perf_counter() - start

    print(f"UserString creation: {userstring_create_time:.4f}s")
    print(f"str creation: {str_create_time:.4f}s")
    print(f"Overhead: {userstring_create_time/str_create_time:.2f}x")

    # Test operations
    us = UserString(text)
    s = str(text)

    start = time.perf_counter()
    for _ in range(n//10):
        _ = us.upper().lower().strip()
    userstring_ops_time = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(n//10):
        _ = s.upper().lower().strip()
    str_ops_time = time.perf_counter() - start

    print(f"\nUserString operations: {userstring_ops_time:.4f}s")
    print(f"str operations: {str_ops_time:.4f}s")
    print(f"Overhead: {userstring_ops_time/str_ops_time:.2f}x")


benchmark_userstring_vs_str()
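For quick one-off measurements, the standard library's timeit module handles the loop and the timer. This is a small sketch rather than a rigorous benchmark; the repetition count of 20,000 is an arbitrary choice, and absolute numbers will vary by machine and Python version.

```python
import timeit
from collections import UserString

text = "Hello, World! " * 10
us = UserString(text)

# Time the same method on the plain str and on the wrapper
str_time = timeit.timeit(lambda: text.upper(), number=20_000)
us_time = timeit.timeit(lambda: us.upper(), number=20_000)

print(f"str.upper:        {str_time:.4f}s")
print(f"UserString.upper: {us_time:.4f}s")
```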
🎯 When to Use UserString
✅ Ideal Use Cases
- Custom String Classes - When you need to override string behavior
- Input Sanitization - Strings that automatically clean/validate input
- Rich Text/Markup - Strings with formatting or metadata
- Template Systems - Advanced string templating and interpolation
- Localization - Multi-language string support
- URL/Email Validation - Strings that enforce specific formats
- Logging/Audit - Strings that track modifications
❌ When NOT to Use UserString
- Simple String Needs - Use regular str for basic operations
- Performance Critical Code - Every UserString operation adds a Python-level call on top of the str operation
- Memory Constrained - Regular str uses less memory
- No Custom Behavior - If you don't need to override methods
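Another practical limit is interoperability: a UserString is not an instance of str, so APIs that require a real str reject it. The sketch below shows two such cases from the standard library and the explicit conversion that avoids them.

```python
import json
from collections import UserString

us = UserString("report")

print(isinstance(us, str))  # False

try:
    "-".join([us, "2024"])  # str.join requires real str items
except TypeError as exc:
    print(f"join failed: {exc}")

try:
    json.dumps({"name": us})  # json has no encoder for UserString
except TypeError as exc:
    print(f"json failed: {exc}")

# Converting explicitly at the boundary works
print("-".join([str(us), "2024"]))    # report-2024
print(json.dumps({"name": us.data}))  # {"name": "report"}
```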
🔧 Key Industries and Applications
- Web Development - Input sanitization, template rendering, URL handling
- Content Management - Rich text processing, markdown handling
- Internationalization - Multi-language applications, localization
- Security - Input validation, XSS prevention, safe string handling
- Configuration Management - Template-based configuration files
- Documentation - Dynamic content generation, template systems
- API Development - Request/response formatting, data validation
💡 Best Practices
- Override Only What You Need - Don't override methods unnecessarily
- Use Data Attribute - Access self.data for the underlying string
- Maintain String Interface - Ensure methods return appropriate types
- Document Custom Behavior - Clearly explain any custom string behavior
- Consider Immutability - Decide whether your string should be mutable
- Validate Early - Validate input in __init__ when possible
- Handle Unicode - Ensure proper Unicode support for international text
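As an illustration of validating early, the sketch below uses a hypothetical Identifier class. Because inherited UserString methods build their results through the subclass constructor (as in CPython's implementation), the check in __init__ also covers derived values.

```python
from collections import UserString


class Identifier(UserString):
    """Hypothetical string that must be a valid Python identifier."""

    def __init__(self, data):
        text = str(data)
        if not text.isidentifier():
            raise ValueError(f"Not an identifier: {text!r}")
        super().__init__(text)


name = Identifier("user_id")
print(name.upper())  # USER_ID (still an Identifier)

try:
    name + " x"      # concatenation rebuilds via Identifier(...)
except ValueError as exc:
    print(exc)       # Not an identifier: 'user_id x'
```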
UserString is perfect when you need string-like objects with custom behavior while maintaining the familiar string interface. Its wrapper design makes it ideal for input validation, template systems, and creating domain-specific string classes.