Python Cryptographic Hashing Practice Problems & Exercises
Practice: Cryptographic Hashing
Easy
Implement sha256_hex(data: str) -> str that returns the SHA-256 hex digest of a string.
import hashlib

def sha256_hex(data: str) -> str:
    # return the SHA-256 hex digest of data
    pass

result = sha256_hex("123")
print(f"sha256: {result}")
Solution
import hashlib

def sha256_hex(data: str) -> str:
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

result = sha256_hex("123")
print(f"sha256: {result}")
Key points:
- Always encode strings to bytes before hashing. str objects cannot be passed directly to hashlib.
- hexdigest() returns a lowercase hex string (64 chars for SHA-256).
- digest() returns raw bytes, which is useful when you need a compact binary representation.
- SHA-256 always produces a 256-bit (32-byte / 64 hex char) output regardless of input size.
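To make the digest() / hexdigest() distinction concrete, here is a small sketch using only hashlib. The input "123" is reused from the exercise; the variable names are illustrative.

```python
import hashlib

h = hashlib.sha256("123".encode("utf-8"))

raw = h.digest()         # raw bytes
hex_str = h.hexdigest()  # lowercase hex text

print(len(raw))              # 32 bytes
print(len(hex_str))          # 64 hex characters
print(raw.hex() == hex_str)  # the hex string is the same bytes, hex-encoded

# Passing a str instead of bytes is rejected with a TypeError.
try:
    hashlib.sha256("123")
    str_accepted = True
except TypeError:
    str_accepted = False
print(str_accepted)
```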
Expected Output
sha256: a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3
Hints
Hint 1: Use hashlib.sha256(). Call .hexdigest() to get the hex string.
Hint 2: Encode the string to bytes before hashing: data.encode() or data.encode("utf-8").
Compare MD5 and SHA-256: show output lengths and demonstrate that MD5 is faster (but insecure for security use cases).
import hashlib
import time

data = b"benchmark data" * 1000

def bench(algo, data, iterations=10000):
    # return elapsed seconds for hashing data N times
    pass

md5_len = len(hashlib.md5(b"x").hexdigest())
sha256_len = len(hashlib.sha256(b"x").hexdigest())
md5_time = bench("md5", data)
sha256_time = bench("sha256", data)
print(f"MD5 length: {md5_len}")
print(f"SHA256 length: {sha256_len}")
print(f"MD5 is faster: {md5_time < sha256_time}")
Solution
import hashlib
import time

data = b"benchmark data" * 1000

def bench(algo, data, iterations=10000):
    # Time only the hashing loop; a fresh hash object is created on every pass.
    start = time.perf_counter()
    for _ in range(iterations):
        hashlib.new(algo, data).digest()
    return time.perf_counter() - start

md5_len = len(hashlib.md5(b"x").hexdigest())
sha256_len = len(hashlib.sha256(b"x").hexdigest())
md5_time = bench("md5", data)
sha256_time = bench("sha256", data)
print(f"MD5 length: {md5_len}")
print(f"SHA256 length: {sha256_len}")
print(f"MD5 is faster: {md5_time < sha256_time}")
Why MD5 is insecure for passwords:
- MD5 produces a 128-bit output (16 bytes / 32 hex chars), so even an ideal hash of that size offers only about 2^64 work of collision resistance (the birthday bound).
- MD5 is cryptographically broken: collision attacks exist where two different inputs produce the same hash.
- MD5 is fast — which is actually a vulnerability. Fast hashing lets attackers try billions of passwords/second.
- For passwords, you want a slow algorithm: bcrypt, argon2, or PBKDF2 with a high iteration count.
- MD5 is still useful for non-security purposes like checksums and cache keys.
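The "fast is a vulnerability" point can be seen by timing one MD5 call against one PBKDF2 derivation. This is a rough sketch: the iteration count of 100,000 and the fixed all-zero salt are assumptions chosen for the demo only, and absolute timings depend on the machine.

```python
import hashlib
import time

password = b"correct-horse-battery"
salt = b"\x00" * 16  # fixed salt for the timing demo only; use a random salt in practice

start = time.perf_counter()
hashlib.md5(password).digest()
md5_seconds = time.perf_counter() - start

start = time.perf_counter()
derived = hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)
pbkdf2_seconds = time.perf_counter() - start

print(f"one MD5 call:     {md5_seconds:.6f} s")
print(f"one PBKDF2 call:  {pbkdf2_seconds:.6f} s")
print(f"PBKDF2 is slower: {pbkdf2_seconds > md5_seconds}")
```

Every password guess costs an attacker the slower figure, which is the whole purpose of the iteration count.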
Expected Output
MD5 length: 32
SHA256 length: 64
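MD5 is faster: True
(The last line can print False on CPUs with SHA hardware extensions, where SHA-256 may outrun MD5.)
Hints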
Hint 1: Use hashlib.md5() and hashlib.sha256(). Both follow the same API.
Hint 2: Use timeit or time.perf_counter() to measure wall-clock time for a large number of iterations.
Demonstrate that SHA-256 and SHA3-256 produce different outputs for the same input, and explain when to prefer each.
import hashlib

def compare_hash_families(data: str):
    b = data.encode()
    h256 = hashlib.sha256(b).hexdigest()
    h3_256 = hashlib.sha3_256(b).hexdigest()
    print(f"SHA-256: {h256[:24]} (truncated)")
    print(f"SHA3-256: {h3_256[:24]} (truncated)")
    print(f"Same input, different family: {h256 != h3_256}")

compare_hash_families("hello world")
Solution
import hashlib

def compare_hash_families(data: str):
    b = data.encode()
    h256 = hashlib.sha256(b).hexdigest()
    h3_256 = hashlib.sha3_256(b).hexdigest()
    print(f"SHA-256: {h256[:24]} (truncated)")
    print(f"SHA3-256: {h3_256[:24]} (truncated)")
    print(f"Same input, different family: {h256 != h3_256}")

compare_hash_families("hello world")
SHA-2 vs SHA-3 family:
- SHA-2 (SHA-256, SHA-512): Merkle-Damgard construction. Widely deployed, hardware-accelerated on modern CPUs. The industry default.
- SHA-3 (SHA3-256, SHA3-512): Keccak sponge construction. NIST standardized in 2015 as an algorithmic backup in case SHA-2 weaknesses are found.
- Both are secure for current threat models. Prefer SHA-256 for compatibility; prefer SHA3-256 when you want cryptographic independence from SHA-2.
- hashlib.algorithms_guaranteed lists all algorithms available on every Python platform.
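A quick way to check what a given interpreter offers is to inspect the two sets that hashlib exposes. The sketch below assumes Python 3.6 or later, where the SHA-3 functions are part of the guaranteed set.

```python
import hashlib

guaranteed = hashlib.algorithms_guaranteed

print("sha256" in guaranteed)
print("sha3_256" in guaranteed)

# algorithms_available may add platform-specific names on top of the guaranteed set.
print(guaranteed <= hashlib.algorithms_available)

# hashlib.new() selects an algorithm by name and matches the named constructor.
by_name = hashlib.new("sha3_256", b"hello world").hexdigest()
direct = hashlib.sha3_256(b"hello world").hexdigest()
print(by_name == direct)
```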
Expected Output
SHA-256: b94d27b9934d3e08a52e52d7 (truncated)
SHA3-256: <24 hex chars, different from the SHA-256 prefix> (truncated)
Same input, different family: True
Hints
Hint 1: Use hashlib.sha3_256() for SHA3-256. The API is identical to hashlib.sha256().
Hint 2: SHA-3 uses the Keccak sponge construction — completely different internals from SHA-2 family.
Implement sign_message(key, message) and verify_message(key, message, mac) using HMAC-SHA256.
import hmac
import hashlib

def sign_message(key: bytes, message: str) -> str:
    # return hex HMAC-SHA256 of message under key
    pass

def verify_message(key: bytes, message: str, mac: str) -> bool:
    # return True if mac is valid for message under key
    pass

key = b"super-secret-key"
msg = "Transfer $100 to Alice"
mac = sign_message(key, msg)
print(f"HMAC: {mac[:5]}...")
print(f"Verification: {verify_message(key, msg, mac)}")
print(f"Tampered verification: {verify_message(key, 'Transfer $999 to Eve', mac)}")
Solution
import hmac
import hashlib

def sign_message(key: bytes, message: str) -> str:
    return hmac.new(key, message.encode(), digestmod=hashlib.sha256).hexdigest()

def verify_message(key: bytes, message: str, mac: str) -> bool:
    expected = sign_message(key, message)
    return hmac.compare_digest(expected, mac)

key = b"super-secret-key"
msg = "Transfer $100 to Alice"
mac = sign_message(key, msg)
print(f"HMAC: {mac[:5]}...")
print(f"Verification: {verify_message(key, msg, mac)}")
print(f"Tampered verification: {verify_message(key, 'Transfer $999 to Eve', mac)}")
HMAC vs plain hash:
- A plain hash sha256(message) can be forged: anyone can compute it.
- HMAC requires knowledge of the key: HMAC(key, message) = H((key XOR opad) || H((key XOR ipad) || message)).
- The hmac.compare_digest() function performs constant-time comparison to prevent timing attacks.
- Never compare MACs with ==; always use hmac.compare_digest().
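The HMAC formula above can be checked by building it by hand and comparing against the hmac module. This is a sketch for understanding only: manual_hmac_sha256 is a hypothetical helper name, and the block size, key padding, and pad constants (0x36 and 0x5C) follow the standard HMAC definition for SHA-256.

```python
import hashlib
import hmac

def manual_hmac_sha256(key: bytes, message: bytes) -> str:
    block_size = 64  # SHA-256 processes input in 64-byte blocks
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()  # over-long keys are hashed first
    key = key.ljust(block_size, b"\x00")    # then zero-padded to the block size
    ipad = bytes(b ^ 0x36 for b in key)     # key XOR ipad
    opad = bytes(b ^ 0x5C for b in key)     # key XOR opad
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).hexdigest()

key = b"super-secret-key"
msg = b"Transfer $100 to Alice"
library_mac = hmac.new(key, msg, digestmod=hashlib.sha256).hexdigest()
matches = hmac.compare_digest(manual_hmac_sha256(key, msg), library_mac)
print(matches)
```

In real code, call hmac.new() directly; the manual version only shows where the key enters the computation.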
Expected Output
HMAC: 3d70e...
Verification: True
Tampered verification: False
Hints
Hint 1: Use hmac.new(key, msg, digestmod=hashlib.sha256).hexdigest().
Hint 2: Both key and message must be bytes. Encode strings with .encode().
Medium
Implement a secure hash_password(password) and verify_password(password, stored) using PBKDF2-HMAC-SHA256 with a random salt.
import hashlib
import secrets

def hash_password(password: str) -> str:
    # Generate salt, derive key with PBKDF2, return "salt_hex:hash_hex"
    pass

def verify_password(password: str, stored: str) -> bool:
    # Re-derive hash from stored salt and compare safely
    pass

stored = hash_password("correct-horse-battery")
stored2 = hash_password("correct-horse-battery")
print(f"Stored: {stored[:20]}... (salt:hash format)")
print(f"Same password verifies: {verify_password('correct-horse-battery', stored)}")
print(f"Different password fails: {verify_password('wrong-password', stored)}")
print(f"Two hashes for same password differ: {stored != stored2}")
Solution
import hashlib
import secrets
import hmac

def hash_password(password: str) -> str:
    salt = secrets.token_bytes(16)
    key = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        salt,
        iterations=260000,
    )
    return salt.hex() + ":" + key.hex()

def verify_password(password: str, stored: str) -> bool:
    salt_hex, key_hex = stored.split(":", 1)
    salt = bytes.fromhex(salt_hex)
    key = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        salt,
        iterations=260000,
    )
    return hmac.compare_digest(key.hex(), key_hex)

stored = hash_password("correct-horse-battery")
stored2 = hash_password("correct-horse-battery")
print(f"Stored: {stored[:20]}... (salt:hash format)")
print(f"Same password verifies: {verify_password('correct-horse-battery', stored)}")
print(f"Different password fails: {verify_password('wrong-password', stored)}")
print(f"Two hashes for same password differ: {stored != stored2}")
Why each piece matters:
- Salt: A random per-password salt ensures two identical passwords produce different stored hashes, defeating rainbow table attacks.
- PBKDF2 iterations: 260,000 iterations (Django's PBKDF2-SHA256 default as of version 3.2; current OWASP guidance is 600,000) make each hash attempt slow, so an attacker's GPU gets on the order of thousands of tries/second instead of billions.
- compare_digest: Prevents timing attacks where an attacker measures response time to learn prefix matches.
- Production note: Prefer bcrypt (via passlib) or argon2-cffi, which handle versioning and parameter upgrades automatically.
Expected Output
Stored: <first 20 hex chars of the salt>... (salt:hash format)
Same password verifies: True
Different password fails: False
Two hashes for same password differ: True
Hints
Hint 1: Generate a random salt using secrets.token_bytes(16) — never use a fixed or predictable salt.
Hint 2: Use hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations=260000).
Hint 3: Store salt + hash together, separated by a delimiter, so you can re-derive the hash for verification.
Demonstrate the timing difference between naive == comparison and hmac.compare_digest, and implement a safe_compare(a, b) function.
import hmac
import time

TOKEN = "a" * 64  # 64-char token

def safe_compare(a: str, b: str) -> bool:
    # constant-time comparison
    pass

def time_compare(fn, a, b, iterations=100000):
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fn(a, b)
    return (time.perf_counter_ns() - start) // iterations

wrong_first_char = "b" + "a" * 63  # differs at position 0
wrong_last_char = "a" * 63 + "b"   # differs at position 63
t_naive_first = time_compare(lambda a, b: a == b, TOKEN, wrong_first_char)
t_naive_last = time_compare(lambda a, b: a == b, TOKEN, wrong_last_char)
t_safe_first = time_compare(safe_compare, TOKEN, wrong_first_char)
t_safe_last = time_compare(safe_compare, TOKEN, wrong_last_char)
print(f"Unsafe equal (timing leak): {TOKEN == TOKEN}")
print(f"Safe equal (constant time): {safe_compare(TOKEN, TOKEN)}")
print(f"Naive ratio last/first: {t_naive_last / max(t_naive_first, 1):.2f}x")
print(f"Safe ratio last/first: {t_safe_last / max(t_safe_first, 1):.2f}x (should be ~1.0)")
Solution
import hmac
import time

TOKEN = "a" * 64

def safe_compare(a: str, b: str) -> bool:
    return hmac.compare_digest(a, b)

def time_compare(fn, a, b, iterations=100000):
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fn(a, b)
    return (time.perf_counter_ns() - start) // iterations

wrong_first_char = "b" + "a" * 63
wrong_last_char = "a" * 63 + "b"
t_naive_first = time_compare(lambda a, b: a == b, TOKEN, wrong_first_char)
t_naive_last = time_compare(lambda a, b: a == b, TOKEN, wrong_last_char)
t_safe_first = time_compare(safe_compare, TOKEN, wrong_first_char)
t_safe_last = time_compare(safe_compare, TOKEN, wrong_last_char)
print(f"Unsafe equal (timing leak): {TOKEN == TOKEN}")
print(f"Safe equal (constant time): {safe_compare(TOKEN, TOKEN)}")
print(f"Naive ratio last/first: {t_naive_last / max(t_naive_first, 1):.2f}x")
print(f"Safe ratio last/first: {t_safe_last / max(t_safe_first, 1):.2f}x (should be ~1.0)")
The timing attack:
- A naive == comparison can exit as soon as a mismatch is found, so comparing two strings that match in the first 63 characters takes longer than comparing strings that differ at character 0.
- An attacker submitting many guesses can measure response times and learn how many leading characters are correct, effectively guessing a secret one character at a time.
- hmac.compare_digest() XORs all bytes and checks the result, ensuring consistent timing regardless of where the strings differ.
- Always use hmac.compare_digest() for API tokens, session tokens, HMAC tags, and any secret comparison.
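The two comparison strategies described above can be written out in pure Python to show the structural difference. This is an illustration only: both function names are hypothetical, and interpreted Python gives no real constant-time guarantee, so hmac.compare_digest() remains the tool to use.

```python
def early_exit_equal(a: bytes, b: bytes) -> bool:
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # returns at the first mismatch, so run time depends on its position
    return True

def xor_accumulate_equal(a: bytes, b: bytes) -> bool:
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # every byte pair is visited, mismatch or not
    return diff == 0

token = b"a" * 64
wrong_first = b"b" + b"a" * 63
wrong_last = b"a" * 63 + b"b"

print(early_exit_equal(token, wrong_first), xor_accumulate_equal(token, wrong_first))
print(early_exit_equal(token, wrong_last), xor_accumulate_equal(token, wrong_last))
print(early_exit_equal(token, token), xor_accumulate_equal(token, token))
```

Both functions return the same answers; only the amount of work done before returning differs.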
Expected Output
Unsafe equal (timing leak): True
Safe equal (constant time): True
Naive ratio last/first: <varies>x
Safe ratio last/first: <varies>x (should be ~1.0)
Hints
Hint 1: Naive string comparison (`==`) exits early on the first mismatch — attackers can measure this.
Hint 2: hmac.compare_digest() always takes the same time regardless of where strings differ.
Hint 3: Use time.perf_counter_ns() for nanosecond-precision timing.
Build a file integrity checker that creates a SHA-256 manifest and verifies files against it. Use in-memory bytes to simulate files.
import hashlib
import hmac
import json

FILES = {
    "test1.txt": b"Hello, world!",
    "test2.txt": b"Secret config data",
    "test3.txt": b"Public content here",
}

def create_manifest(files: dict) -> dict:
    # return {filename: sha256_hex} mapping
    pass

def verify_manifest(files: dict, manifest: dict) -> None:
    # print a FAILED line for each bad or missing file, or one OK line if all pass
    pass

manifest = create_manifest(FILES)
print(f"Manifest created with {len(manifest)} entries")
verify_manifest(FILES, manifest)

# Simulate tampering
tampered = dict(FILES)
tampered["test2.txt"] = b"TAMPERED CONTENT"
print("Tampered file detected:", end=" ")
verify_manifest(tampered, manifest)
Solution
import hashlib
import hmac

def create_manifest(files: dict) -> dict:
    return {
        name: hashlib.sha256(content).hexdigest()
        for name, content in files.items()
    }

def verify_manifest(files: dict, manifest: dict) -> None:
    all_ok = True
    for name, stored_hash in manifest.items():
        content = files.get(name)
        if content is None:
            print(f"integrity_check FAILED for {name} (missing)")
            all_ok = False
            continue
        actual_hash = hashlib.sha256(content).hexdigest()
        if not hmac.compare_digest(actual_hash, stored_hash):
            print(f"integrity_check FAILED for {name}")
            all_ok = False
    if all_ok:
        print("All files verified OK")

FILES = {
    "test1.txt": b"Hello, world!",
    "test2.txt": b"Secret config data",
    "test3.txt": b"Public content here",
}

manifest = create_manifest(FILES)
print(f"Manifest created with {len(manifest)} entries")
verify_manifest(FILES, manifest)

tampered = dict(FILES)
tampered["test2.txt"] = b"TAMPERED CONTENT"
print("Tampered file detected:", end=" ")
verify_manifest(tampered, manifest)
Real-world applications:
- Package managers (pip, apt) verify downloaded packages against published SHA-256 checksums.
- Docker image layers are content-addressed by SHA-256 digest.
- Git object IDs are SHA-1 hashes of content (transitioning to SHA-256).
- Use hmac.compare_digest() whenever the expected value is secret (a MAC or token). For published checksums timing is not a concern, but using it everywhere is a harmless habit.
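Real downloads are usually too large to hash in one call, so checksum verification typically feeds the hash object in chunks. The sketch below assumes a hypothetical helper sha256_stream and uses io.BytesIO to stand in for an open file, in keeping with the in-memory setup of this exercise.

```python
import hashlib
import io

def sha256_stream(stream, chunk_size: int = 65536) -> str:
    h = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        h.update(chunk)  # update() can be called repeatedly; the result equals hashing all bytes at once
    return h.hexdigest()

payload = b"Hello, world!" * 10_000
streamed = sha256_stream(io.BytesIO(payload), chunk_size=4096)
one_shot = hashlib.sha256(payload).hexdigest()
print(streamed == one_shot)
```

With a real file, the same function would be called on an object opened in binary mode.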
Expected Output
Manifest created with 3 entries
All files verified OK
Tampered file detected: integrity_check FAILED for test2.txt
Hints
Hint 1: Build a dict mapping filename to sha256_hex(content). Store as JSON.
Hint 2: On verify, re-hash each file and compare with stored hash using hmac.compare_digest().
Hint 3: Use a try/except to handle missing files — a deleted file is also a failed integrity check.
Implement a simple content-addressable storage (CAS) system where the storage key is the SHA-256 hash of the content.
import hashlib

class ContentAddressableStore:
    def __init__(self):
        self._store = {}

    def put(self, data: bytes) -> str:
        # store data and return its sha256 hex address
        pass

    def get(self, address: str) -> bytes:
        # retrieve data by address; raise KeyError if not found
        pass

    @property
    def size(self):
        return len(self._store)

cas = ContentAddressableStore()
addr1 = cas.put(b"Hello world")
addr2 = cas.put(b"Goodbye world")
addr3 = cas.put(b"Hello world")  # duplicate
print(f"Stored 3 blobs, {cas.size} unique (dedup saved {3 - cas.size})")
print(f"Retrieved: {cas.get(addr1).decode()}")
print(f"Same content, same address: {addr1 == addr3}")
Solution
import hashlib

class ContentAddressableStore:
    def __init__(self):
        self._store = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._store[address] = data
        return address

    def get(self, address: str) -> bytes:
        if address not in self._store:
            raise KeyError(f"No blob at address {address[:8]}...")
        return self._store[address]

    @property
    def size(self):
        return len(self._store)

cas = ContentAddressableStore()
addr1 = cas.put(b"Hello world")
addr2 = cas.put(b"Goodbye world")
addr3 = cas.put(b"Hello world")
print(f"Stored 3 blobs, {cas.size} unique (dedup saved {3 - cas.size})")
print(f"Retrieved: {cas.get(addr1).decode()}")
print(f"Same content, same address: {addr1 == addr3}")
CAS in the real world:
- Git: Every blob, tree, and commit is stored at its SHA-1 (moving to SHA-256) hash.
- IPFS: Distributed filesystem where every file is addressed by its CID (content identifier).
- Docker: Image layers stored by their SHA-256 digest — identical layers shared across images.
- Benefits: Automatic deduplication, tamper detection (content matches its address), and natural immutability.
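The tamper-detection benefit follows from re-hashing on read: if the bytes no longer hash to their address, something changed. Here is a minimal sketch, assuming a hypothetical VerifyingStore variant of the exercise's class that raises ValueError on a mismatch.

```python
import hashlib

class VerifyingStore:
    def __init__(self):
        self._store = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._store[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._store[address]  # raises KeyError if the address is unknown
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored blob does not match its address")
        return data

store = VerifyingStore()
addr = store.put(b"Hello world")
intact = store.get(addr)

store._store[addr] = b"Hello w0rld"  # simulate on-disk corruption or tampering
try:
    store.get(addr)
    corruption_detected = False
except ValueError:
    corruption_detected = True
print(corruption_detected)
```

The cost is one extra hash per read, which is often acceptable for the integrity guarantee.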
Expected Output
Stored 3 blobs, 2 unique (dedup saved 1)
Retrieved: Hello world
Same content, same address: True
Hints
Hint 1: The key of a content-addressable store is the SHA-256 hash of the content.
Hint 2: Identical content always maps to the same hash — deduplication is automatic.
Hint 3: A CAS put() returns the hash; get(hash) retrieves the content.
Hard
Implement a Merkle tree with root computation and proof generation/verification.
import hashlib
from typing import List, Tuple

def hash_leaf(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def hash_pair(left: str, right: str) -> str:
    return hashlib.sha256((left + right).encode()).hexdigest()

def build_merkle_tree(leaves: List[bytes]) -> Tuple[str, List[List[str]]]:
    # returns (root_hash, levels) where levels[0] = leaf hashes
    pass

def get_proof(levels: List[List[str]], leaf_index: int) -> List[Tuple[str, str]]:
    # returns list of (sibling_hash, "left"/"right") for proof
    pass

def verify_proof(leaf_data: bytes, proof: List[Tuple[str, str]], root: str) -> bool:
    # returns True if proof is valid
    pass

data = [b"tx1", b"tx2", b"tx3", b"tx4", b"tx5", b"tx6", b"tx7", b"tx8"]
root, levels = build_merkle_tree(data)
print(f"Root: {root}")
proof = get_proof(levels, 2)
print(f"Proof for leaf 2 has {len(proof)} nodes")
print(f"Proof verified: {verify_proof(b'tx3', proof, root)}")
print(f"Tampered proof: {verify_proof(b'tx3_tampered', proof, root)}")
Solution
import hashlib
from typing import List, Tuple

def hash_leaf(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def hash_pair(left: str, right: str) -> str:
    return hashlib.sha256((left + right).encode()).hexdigest()

def build_merkle_tree(leaves: List[bytes]) -> Tuple[str, List[List[str]]]:
    level = [hash_leaf(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # duplicate last node if odd
        next_level = [
            hash_pair(level[i], level[i + 1])
            for i in range(0, len(level), 2)
        ]
        levels.append(next_level)
        level = next_level
    return level[0], levels

def get_proof(levels: List[List[str]], leaf_index: int) -> List[Tuple[str, str]]:
    proof = []
    idx = leaf_index
    for level in levels[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        if idx % 2 == 0:
            sibling = level[idx + 1]
            proof.append((sibling, "right"))
        else:
            sibling = level[idx - 1]
            proof.append((sibling, "left"))
        idx //= 2
    return proof

def verify_proof(leaf_data: bytes, proof: List[Tuple[str, str]], root: str) -> bool:
    current = hash_leaf(leaf_data)
    for sibling, position in proof:
        if position == "right":
            current = hash_pair(current, sibling)
        else:
            current = hash_pair(sibling, current)
    return current == root

data = [b"tx1", b"tx2", b"tx3", b"tx4", b"tx5", b"tx6", b"tx7", b"tx8"]
root, levels = build_merkle_tree(data)
print(f"Root: {root}")
proof = get_proof(levels, 2)
print(f"Proof for leaf 2 has {len(proof)} nodes")
print(f"Proof verified: {verify_proof(b'tx3', proof, root)}")
print(f"Tampered proof: {verify_proof(b'tx3_tampered', proof, root)}")
Merkle tree properties:
- Efficient proof: Proving inclusion of one leaf in a tree of N leaves requires only log2(N) hashes.
- Tamper detection: Changing any leaf invalidates its entire path to the root.
- Used in: Bitcoin/Ethereum transaction trees, Certificate Transparency logs, Git's object graph (a Merkle DAG), and IPFS DAGs.
- Odd-count handling: Duplicating the last node is the Bitcoin convention. Other implementations may use a zero-filled sibling.
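Two of the properties above can be checked numerically: the tree depth (and so the proof length) grows as log2(N), and duplicating the last node means a list with an odd count and the same list with its last leaf repeated share a root. The sketch below uses a hypothetical compact helper, merkle_root, built the same way as the exercise's tree.

```python
import hashlib
import math

def merkle_root(leaves):
    """Return (root_hash, depth) using the duplicate-last-node rule."""
    level = [hashlib.sha256(leaf).hexdigest() for leaf in leaves]
    depth = 0
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pad odd levels by repeating the last node
        level = [
            hashlib.sha256((level[i] + level[i + 1]).encode()).hexdigest()
            for i in range(0, len(level), 2)
        ]
        depth += 1
    return level[0], depth

depths = {}
for n in (2, 5, 8, 1000):
    _, depths[n] = merkle_root([f"tx{i}".encode() for i in range(n)])
    print(n, depths[n], math.ceil(math.log2(n)))

# Side effect of duplication: these two different leaf lists produce the same root.
root3, _ = merkle_root([b"a", b"b", b"c"])
root4, _ = merkle_root([b"a", b"b", b"c", b"c"])
print(root3 == root4)
```

The shared root is worth keeping in mind when the leaf count itself matters to the application.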
Expected Output
Root: <64 hex chars>
Proof for leaf 2 has 3 nodes
Proof verified: True
Tampered proof: False
Hints
Hint 1: A Merkle tree hashes pairs of nodes bottom-up. If the count is odd, duplicate the last node.
Hint 2: Each internal node is sha256(left_child_hash + right_child_hash).
Hint 3: A Merkle proof is the sibling hashes along the path from leaf to root.
Hint 4: To verify a proof, re-derive the root hash by combining the target leaf with each proof node.
Implement a versioned password hashing system that supports parameter migration from v1 (100,000 iterations) to v2 (260,000 iterations).
import hashlib
import secrets
import hmac

CURRENT_VERSION = "v2"
ITERATIONS = {"v1": 100_000, "v2": 260_000}

def hash_password(password: str, version: str = CURRENT_VERSION) -> str:
    # format: "version:iterations:salt_hex:hash_hex"
    pass

def verify_password(password: str, stored: str) -> bool:
    pass

def needs_upgrade(stored: str) -> bool:
    pass

v1_hash = hash_password("mypassword", version="v1")
v2_hash = hash_password("mypassword", version="v2")
print("v1 hash created")
print("v2 hash created (higher iterations)")
print(f"v1 verifies: {verify_password('mypassword', v1_hash)}")
print(f"v2 verifies: {verify_password('mypassword', v2_hash)}")
print(f"v1 needs upgrade: {needs_upgrade(v1_hash)}")
print(f"v2 needs upgrade: {needs_upgrade(v2_hash)}")
Solution
import hashlib
import secrets
import hmac

CURRENT_VERSION = "v2"
ITERATIONS = {"v1": 100_000, "v2": 260_000}

def hash_password(password: str, version: str = CURRENT_VERSION) -> str:
    iterations = ITERATIONS[version]
    salt = secrets.token_bytes(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"{version}:{iterations}:{salt.hex()}:{key.hex()}"

def verify_password(password: str, stored: str) -> bool:
    parts = stored.split(":")
    version, iterations_str, salt_hex, key_hex = parts[0], parts[1], parts[2], parts[3]
    iterations = int(iterations_str)
    salt = bytes.fromhex(salt_hex)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(key.hex(), key_hex)

def needs_upgrade(stored: str) -> bool:
    version = stored.split(":")[0]
    return version != CURRENT_VERSION

v1_hash = hash_password("mypassword", version="v1")
v2_hash = hash_password("mypassword", version="v2")
print("v1 hash created")
print("v2 hash created (higher iterations)")
print(f"v1 verifies: {verify_password('mypassword', v1_hash)}")
print(f"v2 verifies: {verify_password('mypassword', v2_hash)}")
print(f"v1 needs upgrade: {needs_upgrade(v1_hash)}")
print(f"v2 needs upgrade: {needs_upgrade(v2_hash)}")
Migration strategy in production:
- On successful login, call needs_upgrade(stored_hash).
- If True, re-hash the plaintext password (which you have at login time) with new parameters.
- Update the stored hash in the database atomically.
- This gives you a rolling upgrade, with no forced password resets required.
- Never store parameters outside the hash string: keeping the parameters with the hash prevents mismatch bugs.
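The migration steps above can be sketched as a login function that re-hashes on success. Several things here are assumptions for illustration: the users dict stands in for a database, login is a hypothetical function name, and the iteration counts are deliberately tiny so the demo runs quickly.

```python
import hashlib
import hmac
import secrets

CURRENT_VERSION = "v2"
ITERATIONS = {"v1": 1_000, "v2": 2_000}  # tiny demo values, not production settings

def hash_password(password, version=CURRENT_VERSION):
    iterations = ITERATIONS[version]
    salt = secrets.token_bytes(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"{version}:{iterations}:{salt.hex()}:{key.hex()}"

def verify_password(password, stored):
    _version, iterations, salt_hex, key_hex = stored.split(":")
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(key.hex(), key_hex)

def needs_upgrade(stored):
    return stored.split(":")[0] != CURRENT_VERSION

# Hypothetical in-memory "database" holding one legacy v1 record.
users = {"alice": hash_password("mypassword", version="v1")}

def login(username, password):
    stored = users.get(username)
    if stored is None or not verify_password(password, stored):
        return False
    if needs_upgrade(stored):
        # The plaintext is only available now, so this is the moment to re-hash.
        users[username] = hash_password(password)
    return True

print(login("alice", "mypassword"))
print(users["alice"].startswith("v2:"))
print(login("alice", "mypassword"))
```

A failed login never touches the stored record, so a wrong password cannot trigger a re-hash.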
Expected Output
v1 hash created
v2 hash created (higher iterations)
v1 verifies: True
v2 verifies: True
v1 needs upgrade: True
v2 needs upgrade: False
Hints
Hint 1: Store the iteration count alongside the salt and hash so you can upgrade parameters later.
Hint 2: A versioned format like "v2:iterations:salt_hex:hash_hex" lets you support multiple versions.
Hint 3: On successful login, check if the stored hash uses old parameters and re-hash transparently.
Implement a signed URL system where URLs expire after a set duration and cannot be tampered with.
import hmac
import hashlib
import time
from urllib.parse import urlencode, parse_qs, urlparse

SECRET_KEY = b"my-signing-secret"

def sign_url(resource: str, ttl_seconds: int = 3600) -> str:
    # return url with ?expires=...&sig=... query params
    pass

def verify_url(url: str) -> bool:
    # return True only if signature valid AND not expired
    pass

url = sign_url("/api/files/report.pdf", ttl_seconds=3600)
print("Signed URL generated")
print(f"Valid URL: {verify_url(url)}")
expired_url = sign_url("/api/files/report.pdf", ttl_seconds=-1)
print(f"Expired URL: {verify_url(expired_url)}")
tampered = url.replace("report.pdf", "admin-data.csv")
print(f"Tampered URL: {verify_url(tampered)}")
Solution
import hmac
import hashlib
import time
from urllib.parse import urlencode, parse_qs

SECRET_KEY = b"my-signing-secret"

def _compute_sig(resource: str, expires: int) -> str:
    message = f"{resource}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, digestmod=hashlib.sha256).hexdigest()

def sign_url(resource: str, ttl_seconds: int = 3600) -> str:
    expires = int(time.time()) + ttl_seconds
    sig = _compute_sig(resource, expires)
    params = urlencode({"expires": expires, "sig": sig})
    return f"{resource}?{params}"

def verify_url(url: str) -> bool:
    if "?" not in url:
        return False
    resource, qs = url.split("?", 1)
    params = parse_qs(qs)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError, IndexError):
        return False
    if time.time() > expires:
        return False
    expected_sig = _compute_sig(resource, expires)
    return hmac.compare_digest(expected_sig, sig)

url = sign_url("/api/files/report.pdf", ttl_seconds=3600)
print("Signed URL generated")
print(f"Valid URL: {verify_url(url)}")
expired_url = sign_url("/api/files/report.pdf", ttl_seconds=-1)
print(f"Expired URL: {verify_url(expired_url)}")
tampered = url.replace("report.pdf", "admin-data.csv")
print(f"Tampered URL: {verify_url(tampered)}")
Design decisions:
- Sign resource + expires together: Signing only the expiry allows reusing a token for a different resource. Always include the full resource path in the signed message.
- Check expiry before HMAC: Fail fast for expired tokens. This is safe because an attacker who edits the expires parameter to a later time still fails the HMAC check that follows.
- compare_digest: Constant-time comparison prevents timing oracle attacks on the signature.
- Real-world use: AWS S3 presigned URLs, Cloudflare signed URLs, and Django's TimestampSigner all follow this same pattern.
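The first design decision is easiest to see side by side. In this sketch, sig_expiry_only is a deliberately flawed hypothetical helper that leaves the resource out of the signed message, and the fixed expires value is an arbitrary example timestamp.

```python
import hashlib
import hmac

SECRET_KEY = b"my-signing-secret"

def sig_expiry_only(expires: int) -> str:
    # Flawed on purpose: the resource path is not part of the signed message.
    return hmac.new(SECRET_KEY, str(expires).encode(), digestmod=hashlib.sha256).hexdigest()

def sig_resource_and_expiry(resource: str, expires: int) -> str:
    message = f"{resource}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, digestmod=hashlib.sha256).hexdigest()

expires = 2_000_000_000  # arbitrary example timestamp
weak_sig = sig_expiry_only(expires)
strong_sig = sig_resource_and_expiry("/api/files/report.pdf", expires)

# The attacker keeps expires and sig but swaps the path to /api/files/admin-data.csv.
weak_accepts_swap = hmac.compare_digest(weak_sig, sig_expiry_only(expires))
strong_accepts_swap = hmac.compare_digest(
    strong_sig, sig_resource_and_expiry("/api/files/admin-data.csv", expires)
)
print(f"expiry-only scheme accepts swapped path: {weak_accepts_swap}")
print(f"resource+expiry scheme accepts swapped path: {strong_accepts_swap}")
```

Because the weak signature never saw the path, the server has nothing to compare the swapped path against.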
Expected Output
Signed URL generated
Valid URL: True
Expired URL: False
Tampered URL: False
Hints
Hint 1: A signed URL embeds the expiry timestamp in the URL itself, then signs the whole thing with HMAC.
Hint 2: The signature covers resource + expiry to prevent substitution attacks (reusing a token for a different resource).
Hint 3: On verify, check expiry first (fast fail), then verify HMAC with compare_digest.
