01 Module 03: Computer Use Agents
How AI agents see, understand, and interact with graphical interfaces - browsers, desktops, and GUIs - using vision models and action executors.

02 Computer Use Architecture
How Anthropic's Computer Use API works - the screenshot-action loop, the three tools, coordinate systems, and building a working computer use agent with Docker.

03 Browser Agents
Building practical browser agents using Playwright and LLMs - DOM manipulation, visual navigation, session management, anti-bot handling, and complete Python implementation.

04 GUI Automation with Vision
Vision-based GUI automation for desktop applications - coordinate grounding, UI element detection, OCR integration, state tracking, and building a desktop automation agent.

05 Web Scraping Agents
Agent-based web scraping - handling dynamic JavaScript rendering, login flows, multi-page pagination, structured data extraction, and anti-detection techniques.

06 Safety and Sandboxing
Safety architecture for computer use agents - threat models, prompt injection, Docker sandboxing, action confirmation gates, logging, and anomaly detection.

07 Benchmarks: WebArena and OSWorld
Understanding computer use agent benchmarks - WebArena, OSWorld, ScreenSpot, Mind2Web. Current SOTA results, what the numbers mean, and how to evaluate your own agent.