Python JSON Handling Practice Problems & Exercises
Practice: JSON Handling
Easy
Predict the output. A Python dict is serialized to a JSON string and then deserialized back. Watch the type conversions.
import json
data = {
"name": "Alice",
"age": 30,
"active": True,
"score": None,
}
json_string = json.dumps(data)
print(json_string)
print(type(json_string))
restored = json.loads(json_string)
print(restored["name"])
print(type(restored))
print(restored["active"])
print(restored["score"])
Solution
import json
data = {
"name": "Alice",
"age": 30,
"active": True,
"score": None,
}
json_string = json.dumps(data)
print(json_string)
print(type(json_string))
restored = json.loads(json_string)
print(restored["name"])
print(type(restored))
print(restored["active"])
print(restored["score"])
Output:
{"name": "Alice", "age": 30, "active": true, "score": null}
<class 'str'>
Alice
<class 'dict'>
True
None
How it works: json.dumps() serializes the dict to a JSON string. Notice the automatic type conversions: Python True becomes JSON true (lowercase), and Python None becomes JSON null. json.loads() reverses the process — it parses the JSON string and reconstructs the Python dict, converting true back to True and null back to None.
Key insight: json.dumps() always returns a str. json.loads() always returns a Python object (dict, list, str, int, float, bool, or None) depending on the top-level JSON value. The six JSON types map directly to six Python types: object→dict, array→list, string→str, number→int or float, boolean→bool, null→None.
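The mapping is not perfectly symmetric, and a small sketch makes that concrete. The example below uses a made-up `sample` dict to show two standard-library behaviours: a tuple is written as a JSON array and comes back as a list, and a non-string dict key is coerced to a string.

```python
import json

# Illustrative data: a tuple value and an integer key.
sample = {"point": (1, 2), 7: "seven"}

encoded = json.dumps(sample)
decoded = json.loads(encoded)

print(encoded)                  # {"point": [1, 2], "7": "seven"}
print(type(decoded["point"]))   # <class 'list'>
print(list(decoded.keys()))     # ['point', '7']
print(decoded == sample)        # False: the round trip changed two types
```

If a round trip must preserve tuples or integer keys, the conversion has to be done explicitly after loading.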
Expected Output
{"name": "Alice", "age": 30, "active": true, "score": null}
<class 'str'>
Alice
<class 'dict'>
True
None
Hints
Hint 1: json.dumps() converts a Python object to a JSON string.
Hint 2: json.loads() converts a JSON string back to a Python object.
Hint 3: Python True becomes JSON true, and Python None becomes JSON null.
Predict the output. Compare the length of compact JSON with the formatted version, and observe alphabetical key ordering.
import json
data = {"zebra": 1, "apple": 2, "mango": 3}
compact = json.dumps(data)
print(len(compact))
formatted = json.dumps(data, indent=2, sort_keys=True)
print(formatted)
Solution
import json
data = {"zebra": 1, "apple": 2, "mango": 3}
compact = json.dumps(data)
print(len(compact))
formatted = json.dumps(data, indent=2, sort_keys=True)
print(formatted)
Output:
36
{
  "apple": 2,
  "mango": 3,
  "zebra": 1
}
How it works: The compact string {"zebra": 1, "apple": 2, "mango": 3} is 36 characters long (including braces, quotes, colons, and commas with their default spaces). With indent=2, json.dumps adds newlines and 2-space indentation for each level. With sort_keys=True, keys are reordered alphabetically: apple, mango, zebra.
Key insight: Use indent=2 when writing config files, log files, or API responses intended for human review. Use sort_keys=True when you need deterministic output, for example when hashing JSON for checksums or cache keys. If you combine sort_keys=True with separators=(',', ':') instead of indent, you get sorted compact output, which is a good starting point for canonical JSON representations.
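The checksum use case can be sketched as follows. `canonical_digest` is a hypothetical helper, not a standard-library function, and sorted compact output is only a simple approximation of canonical JSON (it does not normalise number formatting, for instance).

```python
import hashlib
import json

def canonical_digest(obj):
    """Hash a JSON-serializable object via sorted, compact JSON (hypothetical helper)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = {"zebra": 1, "apple": 2}
b = {"apple": 2, "zebra": 1}   # same content, different insertion order

print(json.dumps(a) == json.dumps(b))               # False: key order differs
print(canonical_digest(a) == canonical_digest(b))   # True: order no longer matters
```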
Expected Output
36
{
  "apple": 2,
  "mango": 3,
  "zebra": 1
}
Hints
Hint 1: json.dumps(data, indent=2) adds newlines and indentation for human-readable output.
Hint 2: json.dumps(data, sort_keys=True) sorts dictionary keys alphabetically.
Hint 3: len() on the compact string gives you the character count (equal to the byte count here because the string is pure ASCII).
Predict the output. A config dict is written to a temporary file with json.dump() and read back with json.load().
import json
import tempfile
import os
config = {
"database": {
"host": "localhost",
"port": 5432,
},
"debug": False,
}
with tempfile.NamedTemporaryFile(mode="w", suffix=".json",
                                 encoding="utf-8", delete=False) as f:
    json.dump(config, f, indent=2)
    tmp_path = f.name
with open(tmp_path, "r", encoding="utf-8") as f:
    loaded = json.load(f)
print(loaded["database"]["host"])
print(loaded["database"]["port"])
print(loaded["debug"])
print(type(loaded))
os.unlink(tmp_path)
Solution
import json
import tempfile
import os
config = {
"database": {
"host": "localhost",
"port": 5432,
},
"debug": False,
}
with tempfile.NamedTemporaryFile(mode="w", suffix=".json",
                                 encoding="utf-8", delete=False) as f:
    json.dump(config, f, indent=2)
    tmp_path = f.name
with open(tmp_path, "r", encoding="utf-8") as f:
    loaded = json.load(f)
print(loaded["database"]["host"])
print(loaded["database"]["port"])
print(loaded["debug"])
print(type(loaded))
os.unlink(tmp_path)
Output:
localhost
5432
False
<class 'dict'>
How it works: json.dump(config, f, indent=2) writes the config dict as formatted JSON to the file object f. The nested "database" dict is preserved as a nested JSON object. After writing and closing, json.load(f) reads the file back and reconstructs the full Python dict. The integer 5432 stays as an integer, and False stays as a bool.
Key insight: json.dump() and json.load() are the file-based counterparts of json.dumps() and json.loads(). The difference is just the target: string vs file object. Always specify encoding="utf-8" — it is the standard encoding for JSON (RFC 8259), and omitting it uses the platform default which can differ on Windows.
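To see that the two pairs differ only in their target, the sketch below swaps the real file for an in-memory `io.StringIO` buffer. The `config` dict mirrors the one in the exercise; the buffer is an assumption made to keep the example free of disk access.

```python
import io
import json

config = {"database": {"host": "localhost", "port": 5432}, "debug": False}

buffer = io.StringIO()               # in-memory text file
json.dump(config, buffer, indent=2)  # file-style API

same_text = buffer.getvalue() == json.dumps(config, indent=2)
print(same_text)                     # True: identical text either way

buffer.seek(0)                       # rewind before reading
print(json.load(buffer) == config)   # True
```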
Expected Output
localhost
5432
False
<class 'dict'>
Hints
Hint 1: json.dump(obj, file) writes JSON directly to a file object.
Hint 2: json.load(file) reads and parses JSON from a file object.
Hint 3: Always open JSON files with encoding="utf-8".
Predict the output. Compare the length of compact JSON (no whitespace) versus default JSON (spaces after separators).
import json
data = {"event": "click", "x": 100, "y": 200}
compact = json.dumps(data, separators=(",", ":"))
default = json.dumps(data)
# Compact is always shorter than or equal to default
print(len(compact) < len(default))
print(len(compact))
print(len(default))
Solution
import json
data = {"event": "click", "x": 100, "y": 200}
compact = json.dumps(data, separators=(",", ":"))
default = json.dumps(data)
print(len(compact) < len(default))
print(len(compact))
print(len(default))
Output:
True
33
38
How it works: The default separators are (", ", ": "): a comma followed by a space between items, and a colon followed by a space between key and value. This produces {"event": "click", "x": 100, "y": 200} (38 characters). With separators=(",", ":"), all whitespace is removed: {"event":"click","x":100,"y":200} (33 characters).
Key insight: The 5-character saving here (about 13%) adds up in high-throughput systems. A service emitting 100,000 events per second at 200 bytes each produces about 20 MB/s, so trimming even 10% of that saves roughly 2 MB/s; the exact ratio depends on how many separators each payload contains. Use compact separators for network APIs, message queues, and high-throughput logging. Use indent=2 for config files and debugging output that humans will read.
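Because the saving depends on the shape of the data, it is worth measuring on a representative payload rather than assuming a fixed percentage. The sketch below does that with a hypothetical `size_report` helper and a made-up batch of events.

```python
import json

def size_report(obj):
    """Return (default_bytes, compact_bytes, fraction_saved) for obj (hypothetical helper)."""
    default = json.dumps(obj).encode("utf-8")
    compact = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    saved = 1 - len(compact) / len(default)
    return len(default), len(compact), saved

# Made-up batch of events; real payloads will give different ratios.
events = [{"event": "click", "x": i, "y": i * 2} for i in range(100)]

default_size, compact_size, saved = size_report(events)
print(default_size, compact_size, f"{saved:.1%}")
```

The difference between the two sizes equals the number of separators in the document, since each one loses exactly one space.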
Expected Output
True
33
38
Hints
Hint 1: The default separators are (", ", ": ") — notice the spaces after comma and colon.
Hint 2: separators=(",", ":") removes all whitespace from the JSON output.
Hint 3: Compact output is always smaller or equal in size to the default output.
Medium
Fix the serialization errors. The four checks verify that each value was converted correctly before the call to json.dumps().
import json
from datetime import datetime
from decimal import Decimal
import uuid
# Each of these would raise TypeError if passed directly to json.dumps
ts = datetime(2024, 1, 15, 14, 30, 0)
balance = Decimal("99.99")
user_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
tags = {"python", "backend"}
data = {
"timestamp": ts.isoformat(),
"balance": str(balance),
"user_id": str(user_id),
"tags": sorted(tags),
}
result = json.dumps(data)
loaded = json.loads(result)
print(loaded["timestamp"] == "2024-01-15T14:30:00")
print(loaded["balance"] == "99.99")
print(loaded["user_id"] == "12345678-1234-5678-1234-567812345678")
print(isinstance(loaded["tags"], list))
Solution
import json
from datetime import datetime
from decimal import Decimal
import uuid
ts = datetime(2024, 1, 15, 14, 30, 0)
balance = Decimal("99.99")
user_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
tags = {"python", "backend"}
data = {
"timestamp": ts.isoformat(),
"balance": str(balance),
"user_id": str(user_id),
"tags": sorted(tags),
}
result = json.dumps(data)
loaded = json.loads(result)
print(loaded["timestamp"] == "2024-01-15T14:30:00")
print(loaded["balance"] == "99.99")
print(loaded["user_id"] == "12345678-1234-5678-1234-567812345678")
print(isinstance(loaded["tags"], list))
Output:
True
True
True
True
How it works: By default the json module serializes only dict, list, tuple, str, int, float, bool, and None. Everything else raises TypeError. Manual conversion before serialization is the simplest fix: datetime.isoformat() produces an ISO 8601 string, str(Decimal(...)) preserves exact decimal representation, str(uuid_obj) produces the standard hyphenated UUID string, and sorted(set) converts a set to a deterministically-ordered list.
Key insight: Converting Decimal("99.99") to float would introduce a floating-point representation error: the nearest binary double is approximately 99.98999999999999488, not exactly 99.99. Always use str() for Decimal values in financial data. For uuid and datetime, the string representation is canonical and widely understood across languages.
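The precision argument can be checked directly. The sketch below compares the float route with the str route, and also shows the standard `parse_float` argument of `json.loads`, which helps when a producer has already emitted a bare JSON number; the `balance` value is taken from the exercise.

```python
import json
from decimal import Decimal

balance = Decimal("99.99")

# Going through float stores the nearest binary value, which is not exactly 99.99.
print(Decimal(float(balance)) == balance)                    # False

# Going through str keeps every digit.
as_text = json.dumps({"balance": str(balance)})
print(Decimal(json.loads(as_text)["balance"]) == balance)    # True

# For a bare JSON number, parse_float=Decimal skips the float step entirely.
parsed = json.loads('{"balance": 99.99}', parse_float=Decimal)
print(parsed["balance"] == balance)                          # True
```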
Expected Output
True
True
True
True
Hints
Hint 1: datetime objects must be converted to strings: datetime.now().isoformat().
Hint 2: Decimal objects can be converted to float (loses precision) or str (preserves precision).
Hint 3: uuid.UUID objects convert to str via str(uuid_obj).
Hint 4: set objects must be converted to list before serialization.
Implement a custom encoder that handles datetime, Decimal, and set objects without manual pre-conversion.
import json
from datetime import datetime
from decimal import Decimal
class AppEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        if isinstance(obj, Decimal):
            return str(obj)
        if isinstance(obj, set):
            return sorted(obj)
        return super().default(obj)
data = {
"ts": datetime(2024, 6, 1, 12, 0, 0),
"price": Decimal("49.99"),
"roles": {"admin", "user"},
"count": 42,
}
result = json.dumps(data, cls=AppEncoder)
loaded = json.loads(result)
print(loaded["ts"] == "2024-06-01T12:00:00")
print(loaded["price"] == "49.99")
print(isinstance(loaded["roles"], list))
print(loaded["count"] == 42)
Solution
import json
from datetime import datetime
from decimal import Decimal
class AppEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        if isinstance(obj, Decimal):
            return str(obj)
        if isinstance(obj, set):
            return sorted(obj)
        return super().default(obj)
data = {
"ts": datetime(2024, 6, 1, 12, 0, 0),
"price": Decimal("49.99"),
"roles": {"admin", "user"},
"count": 42,
}
result = json.dumps(data, cls=AppEncoder)
loaded = json.loads(result)
print(loaded["ts"] == "2024-06-01T12:00:00")
print(loaded["price"] == "49.99")
print(isinstance(loaded["roles"], list))
print(loaded["count"] == 42)
Output:
True
True
True
True
How it works: json.JSONEncoder.default() is called only when the standard encoder encounters a type it cannot handle (anything other than dict, list, tuple, str, int, float, bool, None). For datetime, we return isoformat(). For Decimal, we return str(). For set, we return a sorted list for deterministic output. Standard types like int (count=42) pass through without hitting default() at all.
Key insight: The cls=AppEncoder pattern suits systematic handling of special types across an entire application: define it once, use it everywhere. The alternative, default=some_callable, works well for simple cases but leaves you passing the same callable to every call. The class approach lets you compose encoders via inheritance; Django's DjangoJSONEncoder and the encoder in Django REST Framework are both json.JSONEncoder subclasses.
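Composition via inheritance might look like the sketch below. The class names `DateTimeEncoder` and `MoneyEncoder` are illustrative and do not come from the lesson; each layer handles one type and defers the rest to its parent, which ends in the base class raising TypeError.

```python
import json
from datetime import datetime
from decimal import Decimal

class DateTimeEncoder(json.JSONEncoder):
    """Base layer: knows only about datetime (illustrative name)."""
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)

class MoneyEncoder(DateTimeEncoder):
    """Adds Decimal support and defers everything else to the parent."""
    def default(self, obj):
        if isinstance(obj, Decimal):
            return str(obj)
        return super().default(obj)

data = {"ts": datetime(2024, 6, 1, 12, 0, 0), "price": Decimal("49.99")}

via_cls = json.dumps(data, cls=MoneyEncoder)
print(via_cls)   # {"ts": "2024-06-01T12:00:00", "price": "49.99"}

# default= accepts any callable, so a bound method works as well.
via_default = json.dumps(data, default=MoneyEncoder().default)
print(via_default == via_cls)   # True
```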
Expected Output
True
True
True
True
Hints
Hint 1: Subclass json.JSONEncoder and override the default() method.
Hint 2: The default() method is called for every object the standard encoder cannot handle.
Hint 3: Pass the custom encoder class to json.dumps() via the cls= parameter.
Hint 4: Always call super().default(obj) at the end for unknown types — it raises TypeError.
Implement an object_hook that restores datetime and Decimal objects using the __type__ marker pattern from the lesson.
import json
from datetime import datetime
from decimal import Decimal
def encode_extended(obj):
    if isinstance(obj, datetime):
        return {"__type__": "datetime", "value": obj.isoformat()}
    if isinstance(obj, Decimal):
        return {"__type__": "decimal", "value": str(obj)}
    raise TypeError(f"Not serializable: {type(obj).__name__}")
def decode_extended(obj):
    if "__type__" not in obj:
        return obj
    t = obj["__type__"]
    v = obj["value"]
    if t == "datetime":
        return datetime.fromisoformat(v)
    if t == "decimal":
        return Decimal(v)
    return obj
original = {
"event": "purchase",
"timestamp": datetime(2024, 3, 15, 10, 0, 0),
"amount": Decimal("199.99"),
}
encoded = json.dumps(original, default=encode_extended)
restored = json.loads(encoded, object_hook=decode_extended)
print(restored["event"] == "purchase")
print(type(restored["timestamp"]) is datetime)
print(restored["timestamp"] == datetime(2024, 3, 15, 10, 0, 0))
print(restored["amount"] == Decimal("199.99"))
Solution
import json
from datetime import datetime
from decimal import Decimal
def encode_extended(obj):
    if isinstance(obj, datetime):
        return {"__type__": "datetime", "value": obj.isoformat()}
    if isinstance(obj, Decimal):
        return {"__type__": "decimal", "value": str(obj)}
    raise TypeError(f"Not serializable: {type(obj).__name__}")
def decode_extended(obj):
    if "__type__" not in obj:
        return obj
    t = obj["__type__"]
    v = obj["value"]
    if t == "datetime":
        return datetime.fromisoformat(v)
    if t == "decimal":
        return Decimal(v)
    return obj
original = {
"event": "purchase",
"timestamp": datetime(2024, 3, 15, 10, 0, 0),
"amount": Decimal("199.99"),
}
encoded = json.dumps(original, default=encode_extended)
restored = json.loads(encoded, object_hook=decode_extended)
print(restored["event"] == "purchase")
print(type(restored["timestamp"]) is datetime)
print(restored["timestamp"] == datetime(2024, 3, 15, 10, 0, 0))
print(restored["amount"] == Decimal("199.99"))
Output:
True
True
True
True
How it works: The encoder wraps special types into a tagged dict: {"__type__": "datetime", "value": "2024-03-15T10:00:00"}. The decoder checks every parsed JSON object for the __type__ key. If it finds one, it reconstructs the original Python type; if not, it returns the dict unchanged (which is why "event": "purchase" survives as-is). This pattern achieves true round-trip fidelity — the restored object is equal to the original.
Key insight: object_hook is called for every JSON object in the document, from the innermost objects outward. This is why the __type__ marker pattern works — the tagged dicts are processed by object_hook before the outer dict that contains them. The outer dict then receives the already-restored Python objects as values. Without this sentinel pattern, you cannot distinguish between a regular dict and one that represents a special type.
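The inner-to-outer call order is easy to observe by recording what the hook receives. In the sketch below, `recording_hook` and the nested sample document are illustrative; the hook returns each dict unchanged and only logs its keys.

```python
import json

seen = []

def recording_hook(obj):
    """Record the keys of each dict as the parser hands it over (illustrative)."""
    seen.append(sorted(obj))
    return obj

json.loads('{"outer": {"inner": {"leaf": 1}}, "flag": true}',
           object_hook=recording_hook)

print(seen)   # [['leaf'], ['inner'], ['flag', 'outer']]
```

The deepest object arrives first and the top-level object last, which is what allows a tagged dict to be replaced before its parent is built.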
Expected Output
True
True
True
True
Hints
Hint 1: object_hook is called on every JSON object (dict) after it is parsed.
Hint 2: Use a __type__ sentinel key to distinguish special objects from regular dicts.
Hint 3: datetime.fromisoformat() parses ISO 8601 strings back to datetime objects.
Hint 4: Decimal(string) reconstructs Decimal from a string without precision loss.
Predict the output. A safe parser catches JSONDecodeError and returns a fallback. Then verify the error attributes on known-bad inputs.
import json
def safe_parse(text):
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return "parse_error"
# Verify JSONDecodeError is a subclass of ValueError
print(issubclass(json.JSONDecodeError, ValueError))
# Check error attributes on invalid JSON
try:
    json.loads("{'key': 'value'}")  # single quotes are not valid JSON
except json.JSONDecodeError as e:
    print(e.lineno == 1)
    print(e.colno == 2)
# Common invalid inputs all return "parse_error"
print(safe_parse("{'key': 'value'}")) # single quotes
print(safe_parse('{"key": undefined}')) # undefined is JS, not JSON
print(safe_parse('{"x": 1,}')) # trailing comma
Solution
import json
def safe_parse(text):
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return "parse_error"
print(issubclass(json.JSONDecodeError, ValueError))
try:
    json.loads("{'key': 'value'}")
except json.JSONDecodeError as e:
    print(e.lineno == 1)
    print(e.colno == 2)
print(safe_parse("{'key': 'value'}"))
print(safe_parse('{"key": undefined}'))
print(safe_parse('{"x": 1,}'))
Output:
True
True
True
parse_error
parse_error
parse_error
How it works:
- json.JSONDecodeError is a subclass of ValueError, so older code that catches ValueError will still catch JSON parse errors.
- On "{'key': 'value'}", the parser encounters a single quote ' at line 1, column 2 (after the opening brace), where it expects a double-quoted property name. The .lineno and .colno attributes let you report exactly where in a large document the parse failed.
- All three invalid inputs fail for different reasons: single quotes are not valid JSON string delimiters; undefined is a JavaScript-only value that does not exist in JSON; trailing commas after the last element are not allowed in JSON (unlike JavaScript/Python).
Key insight: In production, always catch json.JSONDecodeError (not Exception) when parsing external data — from APIs, message queues, or uploaded files. Log the .msg, .lineno, and a safe excerpt of .doc for debugging. Never log the full response body when it may contain PII or credentials. The json.JSONDecodeError attributes make it easy to build informative error messages without exposing sensitive data.
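One way to build such a message is sketched below. `describe_json_error` and its `context` width are hypothetical choices, not part of the json module; the attributes it reads (.msg, .doc, .pos, .lineno, .colno) are the documented ones on json.JSONDecodeError.

```python
import json

def describe_json_error(text, context=10):
    """Return a short description of a parse failure, or None if text is valid.

    Hypothetical helper: the name and the `context` width are illustrative.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        start = max(exc.pos - context, 0)
        excerpt = exc.doc[start:exc.pos + context]   # a small window, not the whole body
        return f"{exc.msg} (line {exc.lineno}, column {exc.colno}); near {excerpt!r}"
    return None

print(describe_json_error('{"x": 1}'))            # None
print(describe_json_error("{'key': 'value'}"))
```

Even a short excerpt can contain sensitive values, so the window size should be chosen with the data in mind.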
Expected Output
True
True
True
parse_error
parse_error
parse_error
Hints
Hint 1: json.JSONDecodeError is a subclass of ValueError.
Hint 2: It exposes .lineno, .colno, and .msg attributes describing the exact failure location.
Hint 3: Single quotes, trailing commas, and undefined are common causes — none are valid JSON.
Hard
Verify the behavior of ensure_ascii. Show that ASCII-escaped and direct-Unicode outputs are semantically identical, and that ensure_ascii=False produces smaller output for non-ASCII data.
import json
data = {
"greeting": "こんにちは", # Japanese: Hello
"currency": "€100",
"check": "✓",
}
ascii_out = json.dumps(data)
unicode_out = json.dumps(data, ensure_ascii=False)
# ensure_ascii=True escapes all non-ASCII to \uXXXX
print(r"\u3053" in ascii_out)
# ensure_ascii=False writes characters directly
print("こんにちは" in unicode_out)
# Both parse to the same Python dict
print(json.loads(ascii_out) == json.loads(unicode_out))
# Direct Unicode output is shorter for non-ASCII data
print(len(unicode_out) < len(ascii_out))
Solution
import json
data = {
"greeting": "こんにちは",
"currency": "€100",
"check": "✓",
}
ascii_out = json.dumps(data)
unicode_out = json.dumps(data, ensure_ascii=False)
print(r"\u3053" in ascii_out)
print("こんにちは" in unicode_out)
print(json.loads(ascii_out) == json.loads(unicode_out))
print(len(unicode_out) < len(ascii_out))
Output:
True
True
True
True
How it works: With ensure_ascii=True (the default), every non-ASCII character is replaced with a \uXXXX escape sequence: 6 characters for each code point in the Basic Multilingual Plane, and 12 (a surrogate pair) for code points above U+FFFF, such as most emoji. The Japanese character こ (U+3053) becomes the 6-character sequence \u3053. The full greeting こんにちは (5 characters) becomes 30 ASCII characters.
With ensure_ascii=False, the characters are kept as-is in the returned str; they become UTF-8 bytes only when the string is encoded or written to a file opened with encoding="utf-8". Both outputs are valid JSON: any compliant JSON parser handles \uXXXX escapes and direct Unicode. Parsing either one produces the identical Python dict.
Key insight: Use ensure_ascii=False whenever you are working with multilingual data and writing to UTF-8 files or HTTP responses. The output is shorter, human-readable, and debuggable. The only reason to use ensure_ascii=True (the default) is if you need to guarantee the output contains only 7-bit ASCII characters — for example, embedding JSON in an ASCII-only context. Always pair ensure_ascii=False with encoding="utf-8" on the file handle.
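The size difference can be measured per character. The sketch below uses a hypothetical `escaped_and_direct` helper on one Japanese character and one emoji, the latter chosen because it lies outside the Basic Multilingual Plane and is escaped as a surrogate pair.

```python
import json

def escaped_and_direct(text):
    """Return (ascii_escaped, direct) JSON encodings of a string (illustrative helper)."""
    return json.dumps(text), json.dumps(text, ensure_ascii=False)

for sample in ["こ", "😀"]:
    escaped, direct = escaped_and_direct(sample)
    # Character counts of both forms, then the UTF-8 byte size of the direct form.
    print(len(escaped), len(direct), len(direct.encode("utf-8")))
# 8 3 5    one \uXXXX escape plus two quotes; 3 UTF-8 bytes plus two quotes
# 14 3 6   a surrogate pair plus two quotes; 4 UTF-8 bytes plus two quotes
```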
Expected Output
True
True
True
True
Hints
Hint 1: By default, json.dumps escapes all non-ASCII characters to \uXXXX sequences.
Hint 2: ensure_ascii=False writes Unicode characters directly — valid JSON, smaller output.
Hint 3: Both outputs parse to identical Python objects — they are semantically equivalent.
Hint 4: Always pair ensure_ascii=False with encoding="utf-8" when writing to a file.
Implement to_dict() and from_dict() on an Event class so that instances survive a full JSON round-trip with correct types restored.
import json
from datetime import datetime
class Event:
    def __init__(self, name, occurred_at, severity):
        self.name = name
        self.occurred_at = occurred_at  # datetime
        self.severity = severity  # int
    def to_dict(self):
        return {
            "name": self.name,
            "occurred_at": self.occurred_at.isoformat(),
            "severity": self.severity,
        }
    @classmethod
    def from_dict(cls, data):
        return cls(
            name=data["name"],
            occurred_at=datetime.fromisoformat(data["occurred_at"]),
            severity=data["severity"],
        )
original = Event("deploy", datetime(2024, 5, 20, 9, 0, 0), 2)
json_str = json.dumps(original.to_dict())
restored = Event.from_dict(json.loads(json_str))
print(restored.name == "deploy")
print(type(restored.occurred_at) is datetime)
print(restored.occurred_at == datetime(2024, 5, 20, 9, 0, 0))
print(restored.severity == 2)
Solution
import json
from datetime import datetime
class Event:
    def __init__(self, name, occurred_at, severity):
        self.name = name
        self.occurred_at = occurred_at
        self.severity = severity
    def to_dict(self):
        return {
            "name": self.name,
            "occurred_at": self.occurred_at.isoformat(),
            "severity": self.severity,
        }
    @classmethod
    def from_dict(cls, data):
        return cls(
            name=data["name"],
            occurred_at=datetime.fromisoformat(data["occurred_at"]),
            severity=data["severity"],
        )
original = Event("deploy", datetime(2024, 5, 20, 9, 0, 0), 2)
json_str = json.dumps(original.to_dict())
restored = Event.from_dict(json.loads(json_str))
print(restored.name == "deploy")
print(type(restored.occurred_at) is datetime)
print(restored.occurred_at == datetime(2024, 5, 20, 9, 0, 0))
print(restored.severity == 2)
Output:
True
True
True
True
How it works: to_dict() converts the object to a plain Python dict that contains only JSON-serializable types. The datetime field is serialized as an ISO 8601 string. json.dumps(event.to_dict()) then serializes that dict without needing a custom encoder. On the way back, json.loads() produces a plain dict, and Event.from_dict() reconstructs the typed object — calling datetime.fromisoformat() to restore the datetime from its string form.
Key insight: The to_dict() / from_dict() pattern is a clean way to add serialization to domain objects. It is explicit, testable, and does not couple your objects to the JSON module. Compare to __dict__ serialization: that approach is simpler but does not handle typed fields (like datetime) and exposes all attributes including private ones. For dataclasses, dataclasses.asdict() gives you the starting dict for encoding, but it leaves datetime values unconverted, so you still convert them yourself and write a custom from_dict for decoding.
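A dataclass version of the same idea might look like this. It is a sketch that reuses the Event fields from the exercise; the to_dict() and from_dict() methods are hand-written here, since the dataclasses module does not generate them.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime

@dataclass
class Event:
    name: str
    occurred_at: datetime
    severity: int

    def to_dict(self):
        raw = asdict(self)   # occurred_at is still a datetime at this point
        raw["occurred_at"] = self.occurred_at.isoformat()
        return raw

    @classmethod
    def from_dict(cls, data):
        return cls(
            name=data["name"],
            occurred_at=datetime.fromisoformat(data["occurred_at"]),
            severity=data["severity"],
        )

original = Event("deploy", datetime(2024, 5, 20, 9, 0, 0), 2)
restored = Event.from_dict(json.loads(json.dumps(original.to_dict())))
print(restored == original)   # True: dataclasses compare field by field
```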
Expected Output
True
True
True
True
Hints
Hint 1: Add a to_dict() method that converts the object to a JSON-serializable dict.
Hint 2: Add a classmethod from_dict() that reconstructs the object from a dict.
Hint 3: datetime fields must be serialized as strings (isoformat) and deserialized back (fromisoformat).
Hint 4: The from_dict classmethod should reconstruct any typed fields from their serialized form.
Build a complete JSON pipeline with a custom encoder, a restoring decoder, and safe error handling. Verify a full round-trip and a graceful failure path.
import json
from datetime import datetime
from decimal import Decimal
import uuid
class PipelineEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return {"__type__": "datetime", "value": obj.isoformat()}
        if isinstance(obj, Decimal):
            return {"__type__": "decimal", "value": str(obj)}
        if isinstance(obj, uuid.UUID):
            return str(obj)
        return super().default(obj)
def pipeline_decoder(obj):
    if "__type__" not in obj:
        return obj
    t = obj["__type__"]
    v = obj["value"]
    if t == "datetime":
        return datetime.fromisoformat(v)
    if t == "decimal":
        return Decimal(v)
    return obj
def safe_serialize(data):
    return json.dumps(data, cls=PipelineEncoder)
def safe_deserialize(text):
    try:
        return json.loads(text, object_hook=pipeline_decoder)
    except json.JSONDecodeError:
        return None
# Build a realistic payload
event_id = uuid.UUID("aaaabbbb-cccc-dddd-eeee-ffffaaaabbbb")
payload = {
"event_id": event_id,
"occurred_at": datetime(2024, 9, 1, 8, 30, 0),
"amount": Decimal("1250.75"),
"status": "processed",
}
encoded = safe_serialize(payload)
restored = safe_deserialize(encoded)
print(restored["status"] == "processed")
print(type(restored["occurred_at"]) is datetime)
print(restored["occurred_at"] == datetime(2024, 9, 1, 8, 30, 0))
print(restored["amount"] == Decimal("1250.75"))
# Verify graceful failure on malformed JSON
bad_result = safe_deserialize("not valid JSON {{{")
print(bad_result is None)
Solution
import json
from datetime import datetime
from decimal import Decimal
import uuid
class PipelineEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return {"__type__": "datetime", "value": obj.isoformat()}
        if isinstance(obj, Decimal):
            return {"__type__": "decimal", "value": str(obj)}
        if isinstance(obj, uuid.UUID):
            return str(obj)
        return super().default(obj)
def pipeline_decoder(obj):
    if "__type__" not in obj:
        return obj
    t = obj["__type__"]
    v = obj["value"]
    if t == "datetime":
        return datetime.fromisoformat(v)
    if t == "decimal":
        return Decimal(v)
    return obj
def safe_serialize(data):
    return json.dumps(data, cls=PipelineEncoder)
def safe_deserialize(text):
    try:
        return json.loads(text, object_hook=pipeline_decoder)
    except json.JSONDecodeError:
        return None
event_id = uuid.UUID("aaaabbbb-cccc-dddd-eeee-ffffaaaabbbb")
payload = {
"event_id": event_id,
"occurred_at": datetime(2024, 9, 1, 8, 30, 0),
"amount": Decimal("1250.75"),
"status": "processed",
}
encoded = safe_serialize(payload)
restored = safe_deserialize(encoded)
print(restored["status"] == "processed")
print(type(restored["occurred_at"]) is datetime)
print(restored["occurred_at"] == datetime(2024, 9, 1, 8, 30, 0))
print(restored["amount"] == Decimal("1250.75"))
bad_result = safe_deserialize("not valid JSON {{{")
print(bad_result is None)
Output:
True
True
True
True
True
How it works:
- PipelineEncoder.default() handles three non-serializable types. datetime and Decimal are wrapped in __type__ marker dicts so they can be restored on deserialization. uuid.UUID is converted to its string form directly; no restoration is needed, since the UUID string is usable as-is in most systems.
- pipeline_decoder() acts as the object_hook. Every parsed JSON object passes through it. It checks for the __type__ marker and reconstructs datetime or Decimal accordingly. Plain dicts (like the outer payload dict) pass through untouched.
- safe_deserialize() wraps json.loads() in a try/except. When "not valid JSON {{{" is passed, json.JSONDecodeError is caught and None is returned, a clean signal to the caller that parsing failed.
Key insight: This pattern (a paired encoder and decoder with __type__ markers, wrapped in safe serialization functions) is a solid foundation for a production JSON pipeline. It achieves four goals at once: type safety (no precision loss for Decimal), round-trip fidelity for the tagged types (datetime survives serialization as a datetime, not a string, whereas the UUID deliberately comes back as a str), graceful degradation (malformed input returns None, not a crash), and centralization (all serialization logic lives in one place). Frameworks such as Django REST Framework and Pydantic address the same concerns, although they rely on declared schemas rather than in-band __type__ markers.
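Two edge cases are worth considering when adapting the pipeline: the valid JSON document null also deserializes to None, and a well-formed document with a bad tagged value fails inside the hook rather than in the parser. The sketch below is one possible hardening, not the lesson's implementation; `tagged_decoder` and `try_deserialize` are hypothetical names, and the (ok, value) return shape is an assumption.

```python
import json
from datetime import datetime
from decimal import Decimal, InvalidOperation

def tagged_decoder(obj):
    """object_hook that restores datetime and Decimal from __type__ markers."""
    marker = obj.get("__type__")
    if marker == "datetime":
        return datetime.fromisoformat(obj["value"])
    if marker == "decimal":
        return Decimal(obj["value"])
    return obj

def try_deserialize(text):
    """Return (True, value) on success or (False, None) on any decoding failure."""
    try:
        return True, json.loads(text, object_hook=tagged_decoder)
    except (ValueError, KeyError, TypeError, InvalidOperation):
        # ValueError covers json.JSONDecodeError and malformed ISO strings;
        # KeyError covers a marker without "value"; TypeError covers a
        # non-string "value"; InvalidOperation covers malformed decimals.
        return False, None

print(try_deserialize("null"))                                      # (True, None)
print(try_deserialize("not valid JSON {{{"))                        # (False, None)
print(try_deserialize('{"__type__": "decimal", "value": "abc"}'))   # (False, None)
print(try_deserialize('{"__type__": "datetime"}'))                  # (False, None)
```

Returning a pair keeps a successful null distinguishable from a failure; raising a dedicated exception would be an equally reasonable design.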
Expected Output
True
True
True
True
True
Hints
Hint 1: Combine a custom JSONEncoder subclass for encoding with an object_hook for decoding.
Hint 2: The encoder should handle datetime, Decimal, and uuid.UUID types.
Hint 3: The decoder should restore datetime and Decimal using the __type__ marker.
Hint 4: Wrap the load step in a try/except to handle malformed input gracefully.
Hint 5: A None return from the safe loader signals a parse failure to the caller.
