One doc tagged with "transformers"

Vision Transformers (ViT)

How Vision Transformers apply self-attention to image patches: architecture, patch embeddings, positional encoding, DeiT, Swin Transformer, fine-tuning strategies, and production trade-offs against CNNs.