GitHub Portfolio

What makes an AI portfolio stand out -- READMEs, project selection, code quality.

Reading time: ~25 min | Interview relevance: High | Roles: All AI/ML roles

The Real Interview Moment

Picture this. You are forty minutes into a technical interview at a Series B AI startup. The conversation has gone well. You have answered questions about transformer architectures and distributed training. Then the interviewer says:

"Let's pull up your GitHub. Walk me through one of your projects."

She opens your profile. Within ninety seconds, before you even start talking, she has formed an opinion. She sees the pinned repositories, scans a README, glances at your contribution graph, and notices whether your last commit was three days ago or nine months ago.

This moment happens more often than you think. A 2023 Stack Overflow survey found that over 75% of hiring managers look at a candidate's GitHub at some point during the evaluation process. For AI/ML roles specifically, where the gap between "took an online course" and "can build production systems" is vast, your GitHub is the single most powerful signal you control outside of the interview room itself.

:::tip Your GitHub is a living portfolio
Unlike a resume that gets a six-second scan, your GitHub gets explored. Interviewers click into repos, read code, check commit history, and examine READMEs. Every detail counts.
:::

This guide covers everything you need to build a GitHub portfolio that makes interviewers want to hire you -- from selecting the right projects to writing READMEs that sell your work, from code quality signals to maintaining your portfolio over time.

What Recruiters and Interviewers Actually Look At

Not everyone evaluates GitHub the same way. Understanding what different evaluators focus on helps you prioritize.

Recruiter Screen (30 seconds)

Recruiters are non-technical or semi-technical. They look at surface signals:

Signal	What They Notice
Profile photo and bio	Professional presence, relevant keywords
Pinned repositories	Titles and short descriptions
Contribution graph	Is it green? Recent activity?
Star counts	Social proof (even a few stars help)
README quality	Does the top repo look polished?

Hiring Manager Review (2-5 minutes)

Hiring managers dig one level deeper:

Signal	What They Notice
Project relevance	Do these repos match the job description?
README depth	Architecture, results, technical decisions
Code organization	Is the project structured like production code?
Commit history	Meaningful messages vs. "fix stuff"
Recency	Active in last 3 months?

Technical Interviewer Deep Dive (10-30 minutes)

This is where it gets serious. Technical interviewers will:

Read your code line by line in at least one file
Check for testing -- any tests at all are a strong positive signal
Look at your git history -- do you make small, logical commits?
Examine your dependencies -- did you pick sensible libraries?
Search for anti-patterns -- hardcoded secrets, no error handling, spaghetti imports
Ask you to explain decisions -- "Why did you use FAISS instead of Pinecone here?"

:::warning The 90-second rule
Research from technical recruiting firms suggests that most evaluators form a strong initial impression within 90 seconds of opening your GitHub profile. Your pinned repos and their READMEs are your storefront. Treat them accordingly.
:::

The 3-Project Portfolio Strategy

You do not need twenty repositories. You need three excellent ones. Each serves a distinct purpose.

Project 1: ML Depth

This project demonstrates that you understand machine learning deeply. It goes beyond calling model.fit().

Characteristics:

Custom model implementation or significant modification of existing architectures
Rigorous evaluation with multiple metrics, baselines, and ablation studies
Thoughtful data processing pipeline
Clear documentation of experimental results

Examples:

A fine-tuned LLM with custom evaluation harness and LoRA adapter analysis
An object detection system with custom anchor box calculations and mAP evaluation
A recommendation engine comparing collaborative filtering, content-based, and hybrid approaches

Project 2: Engineering Quality

This project demonstrates that you can build software, not just train models. It shows you can take ML from a notebook to a system.

Characteristics:

Clean project structure with separation of concerns
API or service layer (FastAPI, Flask, gRPC)
Proper dependency management and Docker containerization
CI/CD pipeline with tests
Monitoring or logging infrastructure

Examples:

A RAG system with a FastAPI backend, vector store, and evaluation pipeline
A real-time inference service with batching, caching, and graceful degradation
An ML pipeline orchestrated with Airflow or Prefect, including data validation

Project 3: Domain Interest

This project shows you care about a specific problem space. It signals genuine curiosity and the ability to go deep on a domain.

Characteristics:

Solves a real problem (not a Kaggle competition rehash)
Includes domain context in the README
Shows you can acquire and work with non-trivial data
Demonstrates end-to-end thinking from problem to solution

Examples:

A clinical NLP system that extracts medical entities from discharge summaries
A satellite imagery pipeline that detects deforestation patterns
A financial document analyzer that parses SEC filings and extracts risk factors

:::tip The power of three
Three pinned repos is the sweet spot. Fewer looks thin. More creates decision fatigue. Three gives the interviewer a clear narrative: "This person understands ML, writes production code, and cares about real problems."
:::

Project Selection by Role

Different roles have different expectations. Tailor your three projects to the role you are targeting.

Machine Learning Engineer (MLE)

Project Slot	What to Build	Key Signals
ML Depth	Custom training pipeline with distributed training or mixed-precision	Model architecture knowledge, training optimization
Engineering	Model serving microservice with A/B testing and monitoring	System design, latency awareness
Domain	End-to-end ML product (search, recommendation, fraud detection)	Business impact, full-stack ML

AI Engineer

Project Slot	What to Build	Key Signals
ML Depth	RAG system with custom chunking, retrieval evaluation, and re-ranking	LLM application architecture
Engineering	Agent framework with tool use, memory, and structured outputs	API integration, prompt engineering at scale
Domain	Production chatbot or copilot with guardrails and evaluation	User-facing AI, safety awareness

MLOps Engineer

Project Slot	What to Build	Key Signals
ML Depth	Feature store implementation with online/offline serving	Data engineering for ML
Engineering	End-to-end ML pipeline with CI/CD, model registry, and rollback	Infrastructure as code, automation
Domain	Model monitoring dashboard with drift detection and alerting	Observability, production ML

Data Scientist

Project Slot	What to Build	Key Signals
ML Depth	Causal inference study or A/B test analysis framework	Statistical rigor, experimental design
Engineering	Interactive dashboard with Streamlit/Gradio and automated reporting	Communication, stakeholder-facing tools
Domain	Deep analysis of a real dataset with actionable insights	Storytelling, domain expertise

Research Engineer

Project Slot	What to Build	Key Signals
ML Depth	Paper reproduction with ablation studies and extensions	Paper reading, implementation skill
Engineering	Experiment tracking framework with reproducible configs	Research infrastructure
Domain	Novel application of a recent technique to a new problem	Creativity, research taste

Data Engineer

Project Slot	What to Build	Key Signals
ML Depth	Feature engineering pipeline with real-time and batch paths	ML-aware data engineering
Engineering	Data lakehouse or streaming pipeline with quality checks	Distributed systems, data quality
Domain	ETL pipeline for a specific data domain (healthcare, finance, IoT)	Domain data expertise

15+ Project Ideas That Stand Out

These are not tutorials. Each requires genuine problem-solving and produces a portfolio piece that interviewers remember.

LLM and AI Engineering

Multi-model routing system -- Build a service that routes prompts to different LLMs (GPT-4, Claude, Llama) based on complexity, cost, and latency constraints. Include a scoring mechanism and cost tracking dashboard.
RAG evaluation framework -- Create a comprehensive evaluation harness for RAG systems. Test chunking strategies, embedding models, retrieval methods, and generation quality. Publish results as a benchmark.
LLM-powered code reviewer -- Build a GitHub Action that uses an LLM to review pull requests, focusing on bugs, security issues, and style. Include structured output parsing and configurable rules.
Conversational agent with persistent memory -- Implement a chatbot with hierarchical memory (short-term buffer, long-term vector store, entity memory). Show how memory improves response quality over time.

Classical ML and Data Science

Real-time anomaly detection engine -- Stream processing pipeline (Kafka or Redis Streams) that detects anomalies in time-series data using multiple methods (isolation forest, autoencoders, statistical). Include a live dashboard.
Causal impact analyzer -- Tool that estimates the causal effect of interventions (marketing campaigns, feature launches) using difference-in-differences, synthetic control, and Bayesian structural time series.
AutoML pipeline with explainability -- Build an automated ML pipeline that not only finds the best model but generates SHAP explanations, partial dependence plots, and a human-readable report.

MLOps and Infrastructure

Model A/B testing platform -- Infrastructure for running A/B tests on ML models in production. Traffic splitting, metric collection, statistical significance testing, and automated rollback.
ML pipeline with data contracts -- End-to-end pipeline where each stage has explicit data contracts (schemas, quality checks, SLAs). Include automated alerting when contracts are violated.
GPU cluster scheduler -- A simplified job scheduler for ML training jobs on a GPU cluster. Implement priority queuing, preemption, and resource tracking.

Computer Vision

Document understanding pipeline -- OCR plus layout analysis plus information extraction from complex documents (invoices, research papers, forms). Include evaluation on a custom dataset.
Video anomaly detection -- System that processes surveillance or dashcam video to detect unusual events. Include temporal modeling and a review interface.

NLP and Information Retrieval

Multi-language semantic search -- Search engine that works across languages using multilingual embeddings. Include evaluation with NDCG/MRR metrics and a query analysis tool.
Structured data extraction from unstructured text -- Pipeline that extracts entities, relations, and events from news articles or scientific papers into a knowledge graph. Include a graph visualization.

Full-Stack AI

AI-powered data labeling tool -- A labeling interface where an ML model provides suggestions, humans correct them, and the model improves through active learning. Track annotation speed and model accuracy over time.
Personalized content recommender -- Recommendation system with a web UI, real-time feature computation, and A/B testing framework. Show how recommendations improve with more user interaction data.
Intelligent document Q&A system -- Upload PDFs and ask questions. But go beyond basic RAG: implement table extraction, figure understanding, cross-document reasoning, and citation with page numbers.

:::danger Avoid these common project choices

Titanic/Iris/MNIST classifiers -- Every beginner has these. They show nothing.
Tutorial follow-alongs -- If your code matches a YouTube tutorial line for line, interviewers will notice.
Kaggle competition notebooks -- These optimize for leaderboard position, not engineering quality. If you must include one, rewrite it as a proper project with clean code and a real README.
"Awesome" list repos -- Curating links is not building software.
:::

The Anatomy of a Strong README

Your README is the most important file in your repository. It is the landing page, the pitch, and the documentation all in one.

Section 1: The Hook

The first three lines determine whether someone keeps reading.

# Semantic Router: Intelligent LLM Request Routing

Route LLM prompts to the optimal model based on complexity, cost, and latency.
Reduces API costs by 40% while maintaining response quality within 2% of GPT-4.

![Demo](docs/demo.gif)

What makes this work:

Clear name that describes what it does
One-sentence summary of the value proposition
A quantified result that makes people pay attention
A visual (GIF, screenshot, or diagram) immediately

Section 2: Architecture Diagram

Show how the system fits together. This signals that you think in systems, not just scripts.

[Diagram: Semantic Router architecture]

:::tip Use Mermaid or ASCII
GitHub renders Mermaid diagrams natively. If you prefer portability, ASCII diagrams work everywhere. Either way, include a visual representation of your system architecture.
:::

Section 3: Results and Demo

Show, do not tell. This section proves your project works.

## Results

### Routing Accuracy
| Model          | Accuracy  | Avg Latency | Monthly Cost (10K req) |
|----------------|-----------|-------------|------------------------|
| Always GPT-4   | 94.2%     | 2.3s        | $450                   |
| Always GPT-3.5 | 78.1%     | 0.8s        | $45                    |
| **Our Router** | **92.8%** | **1.1s**    | **$180**               |

### Live Demo
Try it: [semantic-router-demo.railway.app](https://example.com)

### Screenshots
![Dashboard](docs/screenshots/dashboard.png)
![Analytics](docs/screenshots/analytics.png)

Section 4: Quick Start

Make it trivially easy to run your project. Friction kills interest.

## Quick Start

### Prerequisites
- Python 3.10+
- Docker (optional, for containerized deployment)

### Installation

```bash
git clone https://github.com/yourname/semantic-router.git
cd semantic-router
pip install -e ".[dev]"
```

### Run

```bash
# Start the API server
uvicorn src.api.main:app --reload

# Or use Docker
docker compose up
```

### Configuration

Copy the example environment file and add your API keys:

```bash
cp .env.example .env
# Edit .env with your LLM API keys
```
Section 5: Technical Decisions and Tradeoffs

This is the section that separates portfolio projects from professional work. Interviewers love reading your reasoning.

## Technical Decisions

### Why DistilBERT for complexity classification?
We need sub-50ms classification latency to avoid adding overhead to the
routing decision. DistilBERT achieves 97% of BERT's accuracy on our
complexity dataset while running 2.5x faster. We considered a simple
regex-based heuristic but found it missed nuanced cases (e.g., simple
questions about complex topics).

### Why not use embeddings for routing?
We tested cosine similarity against a bank of "complex" vs "simple"
example prompts. It worked for obvious cases but failed on edge cases
where topic complexity differs from linguistic complexity. A trained
classifier gives us more control over the decision boundary.

### Cost optimization strategy
We use a two-phase approach:
1. **Classification phase**: Determine prompt complexity (simple/medium/complex)
2. **Optimization phase**: Given the complexity tier, select the cheapest
   model that meets the latency SLA

This decoupling lets us update pricing without retraining the classifier.
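
Stepping outside the sample README: the optimization phase it describes (given a complexity tier, pick the cheapest model that meets the latency SLA) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of any real project: `ModelOption`, `select_model`, `CATALOG`, the model names, and every cost and latency figure are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    """One candidate model. All figures used below are made-up placeholders."""

    name: str
    cost_per_1k_requests: float  # USD, hypothetical
    p95_latency_s: float  # seconds, hypothetical
    max_tier: int  # highest complexity tier this model is trusted with


# Complexity tiers as produced by the (separate) classification phase.
TIERS = {"simple": 0, "medium": 1, "complex": 2}


def select_model(
    tier: str,
    latency_sla_s: float,
    options: list[ModelOption],
) -> ModelOption:
    """Return the cheapest model that handles the tier and meets the SLA."""
    level = TIERS[tier]
    eligible = [
        option
        for option in options
        if option.max_tier >= level and option.p95_latency_s <= latency_sla_s
    ]
    if not eligible:
        raise LookupError(f"no model handles tier {tier!r} within {latency_sla_s}s")
    return min(eligible, key=lambda option: option.cost_per_1k_requests)


# Hypothetical catalog; editing prices here does not touch the classifier.
CATALOG = [
    ModelOption("small-model", 4.5, 0.8, max_tier=0),
    ModelOption("mid-model", 18.0, 1.1, max_tier=1),
    ModelOption("large-model", 45.0, 2.3, max_tier=2),
]

if __name__ == "__main__":
    print(select_model("simple", 2.0, CATALOG).name)
    print(select_model("complex", 3.0, CATALOG).name)
```

Because pricing lives only in the catalog, updating a cost changes the selection without retraining anything, which is the decoupling the sample README claims.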

Section 6: Project Structure

Show that your code is organized.

[Directory tree: Semantic Router project structure]

Complete README Template

Here is a full template you can adapt:

# Project Name: One-Line Description

Brief paragraph (2-3 sentences) explaining what this does, why it
matters, and the key result or metric.

![Demo or Architecture Diagram](docs/hero-image.png)

## Highlights

- Bullet point: key feature or result with a number
- Bullet point: technology choice that matters
- Bullet point: something that makes this unique

## Architecture

[System diagram here]

## Results

[Table of metrics, screenshots, or link to live demo]

## Quick Start

### Prerequisites
[Minimal list]

### Installation
[3 commands or fewer]

### Run
[1-2 commands]

## Technical Decisions

### Decision 1: Why X over Y?
[2-3 sentences explaining reasoning and tradeoffs]

### Decision 2: How we handle Z
[2-3 sentences]

## Project Structure

[Tree diagram]

## Development

### Testing

```bash
pytest tests/ -v
```

### Linting

```bash
ruff check src/
mypy src/
```

## Future Work

- Planned improvement 1
- Planned improvement 2

## License

MIT

Code Quality Signals

Your README gets people in the door. Your code determines whether they stay.

Project Structure

Interviewers pattern-match against professional projects. Use a structure they recognize.

![ML Project Structure](/img/diagrams/break-into-ai/03-resume-portfolio/ml-project-structure.svg)

:::warning Keep notebooks out of src/
Notebooks are fine for exploration, but core logic should live in `.py` files. If your entire project is a single Jupyter notebook, interviewers will assume you cannot write production code. Extract functions and classes into modules and import them in notebooks.
:::

Clean Code and Type Hints

Write code that reads like documentation.

Bad:

def process(d, t=0.5):
    r = []
    for i in d:
        s = model.predict(i['text'])
        if s > t:
            r.append({'id': i['id'], 'score': s, 'label': 'positive'})
        else:
            r.append({'id': i['id'], 'score': s, 'label': 'negative'})
    return r

Good:

from dataclasses import dataclass


@dataclass
class SentimentResult:
    """Result of sentiment analysis for a single document."""
    document_id: str
    score: float
    label: str


def classify_sentiment(
    documents: list[dict[str, str]],
    threshold: float = 0.5,
) -> list[SentimentResult]:
    """Classify sentiment for a batch of documents.

    Args:
        documents: List of dicts with 'id' and 'text' keys.
        threshold: Score above which a document is classified as positive.

    Returns:
        List of SentimentResult objects with scores and labels.
    """
    results = []
    for doc in documents:
        score = model.predict(doc["text"])
        label = "positive" if score > threshold else "negative"
        results.append(
            SentimentResult(
                document_id=doc["id"],
                score=score,
                label=label,
            )
        )
    return results

Key signals interviewers look for:

Type hints on function signatures
Docstrings that explain parameters, return values, and behavior
Meaningful variable names (not d, r, s, i)
Dataclasses or Pydantic models instead of raw dicts
Single responsibility -- each function does one thing
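
One way to push the single-responsibility point further is to pass the model in as a parameter instead of relying on a module-level `model`. The sketch below is one possible arrangement, not a prescribed pattern: `SentimentModel` and `KeywordStubModel` are hypothetical names introduced only for illustration, and the stub's scoring rule is deliberately trivial.

```python
from dataclasses import dataclass
from typing import Protocol


class SentimentModel(Protocol):
    """Hypothetical interface: anything exposing predict(text) -> float."""

    def predict(self, text: str) -> float: ...


@dataclass(frozen=True)
class SentimentResult:
    """Result of sentiment analysis for a single document."""

    document_id: str
    score: float
    label: str


def classify_sentiment(
    documents: list[dict[str, str]],
    model: SentimentModel,
    threshold: float = 0.5,
) -> list[SentimentResult]:
    """Classify documents with whatever model the caller supplies."""
    results = []
    for doc in documents:
        score = model.predict(doc["text"])
        label = "positive" if score > threshold else "negative"
        results.append(SentimentResult(doc["id"], score, label))
    return results


class KeywordStubModel:
    """Stand-in model for tests: high score if the text mentions 'great'."""

    def predict(self, text: str) -> float:
        return 0.9 if "great" in text.lower() else 0.1


if __name__ == "__main__":
    docs = [
        {"id": "a", "text": "Great product"},
        {"id": "b", "text": "Broke in a day"},
    ]
    for result in classify_sentiment(docs, KeywordStubModel()):
        print(result)
```

With the model injected, the function can be unit-tested without loading real weights, which keeps the test suite fast.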

Testing

Having any tests at all puts you ahead of most portfolio projects. You do not need 100% coverage, but you should show that you know how to test ML code.

# tests/unit/test_metrics.py
import pytest
from src.evaluation.metrics import precision_at_k, ndcg_at_k


class TestPrecisionAtK:
    """Tests for precision@k metric."""

    def test_perfect_ranking(self):
        """All relevant items ranked first should give precision of 1.0."""
        relevant = {1, 2, 3}
        ranked = [1, 2, 3, 4, 5]
        assert precision_at_k(ranked, relevant, k=3) == 1.0

    def test_no_relevant_items(self):
        """No relevant items in top-k should give precision of 0.0."""
        relevant = {6, 7, 8}
        ranked = [1, 2, 3, 4, 5]
        assert precision_at_k(ranked, relevant, k=3) == 0.0

    def test_partial_relevant(self):
        """Two out of three relevant should give precision of 2/3."""
        relevant = {1, 3}
        ranked = [1, 2, 3, 4, 5]
        assert precision_at_k(ranked, relevant, k=3) == pytest.approx(2 / 3)

    def test_k_larger_than_list(self):
        """When k exceeds list length, use available items."""
        relevant = {1, 2}
        ranked = [1, 2]
        assert precision_at_k(ranked, relevant, k=5) == 1.0


class TestNDCGAtK:
    """Tests for NDCG@k metric."""

    def test_perfect_ranking_gives_ndcg_one(self):
        relevance_scores = {1: 3, 2: 2, 3: 1}
        ranked = [1, 2, 3]
        assert ndcg_at_k(ranked, relevance_scores, k=3) == pytest.approx(1.0)

    def test_reversed_ranking(self):
        relevance_scores = {1: 3, 2: 2, 3: 1}
        ranked = [3, 2, 1]
        result = ndcg_at_k(ranked, relevance_scores, k=3)
        assert result < 1.0
        assert result > 0.0

What to test in ML projects:

Component	What to Test	Example
Data processing	Input/output shapes, edge cases, null handling	`test_tokenizer_handles_empty_string`
Metrics	Known inputs produce expected outputs	`test_f1_score_with_perfect_predictions`
Model I/O	Input/output dimensions, dtype correctness	`test_model_output_shape_matches_num_classes`
API endpoints	Request/response contracts, error handling	`test_predict_endpoint_returns_valid_json`
Config loading	Default values, validation, override behavior	`test_config_raises_on_missing_api_key`
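
The test file above imports `precision_at_k` and `ndcg_at_k` without showing them. A minimal sketch of what `src/evaluation/metrics.py` might contain follows. The behavior is inferred from the tests; the linear-gain NDCG formula (relevance divided by log2 of position + 1) is an assumption, since other variants such as exponential gain are also common.

```python
import math


def precision_at_k(ranked: list[int], relevant: set[int], k: int) -> float:
    """Fraction of the top-k ranked items that are relevant.

    If fewer than k items are available, only the available items are
    counted, matching the `test_k_larger_than_list` case.
    """
    top_k = ranked[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for item in top_k if item in relevant)
    return hits / len(top_k)


def _dcg(gains: list[float]) -> float:
    """Discounted cumulative gain with a log2 position discount."""
    return sum(gain / math.log2(pos + 2) for pos, gain in enumerate(gains))


def ndcg_at_k(ranked: list[int], relevance_scores: dict[int, float], k: int) -> float:
    """DCG of the top-k ranking divided by the DCG of the ideal ordering."""
    gains = [relevance_scores.get(item, 0.0) for item in ranked[:k]]
    ideal = sorted(relevance_scores.values(), reverse=True)[:k]
    ideal_dcg = _dcg(ideal)
    return _dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    print(precision_at_k([1, 2, 3, 4, 5], {1, 3}, k=3))
    print(ndcg_at_k([3, 2, 1], {1: 3, 2: 2, 3: 1}, k=3))
```

Keeping metrics as small pure functions like these is what makes the known-input, known-output tests in the table straightforward to write.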

CI/CD with GitHub Actions

A .github/workflows/ci.yml file shows you understand software engineering beyond your local machine.

# .github/workflows/ci.yml
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12"]

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          pip install -e ".[dev]"

      - name: Lint
        run: |
          ruff check src/ tests/
          ruff format --check src/ tests/

      - name: Type check
        run: |
          mypy src/

      - name: Test
        run: |
          pytest tests/ -v --tb=short

:::tip Start with ruff
If you add only one tool, make it ruff. It handles both linting and formatting, is extremely fast, and replaces flake8 + isort + black. Add it to your CI pipeline and run it locally with a pre-commit hook.
:::

Dependency Management

Use pyproject.toml for modern Python projects. It consolidates your project metadata, dependencies, and tool configuration in one file.

# pyproject.toml
[project]
name = "semantic-router"
version = "0.1.0"
description = "Intelligent LLM request routing"
requires-python = ">=3.10"
dependencies = [
    "fastapi>=0.104.0",
    "uvicorn>=0.24.0",
    "transformers>=4.36.0",
    "torch>=2.1.0",
    "pydantic>=2.5.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=7.4.0",
    "ruff>=0.1.0",
    "mypy>=1.7.0",
    "pre-commit>=3.6.0",
]

[tool.ruff]
target-version = "py310"
line-length = 88

[tool.ruff.lint]
select = ["E", "F", "I", "N", "UP", "B"]

[tool.mypy]
python_version = "3.10"
strict = true

[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-v --tb=short"

Avoid these dependency anti-patterns:

requirements.txt with unconstrained versions (a bare torch rather than torch>=2.1.0 or an exact pin such as torch==2.1.0)
No lockfile or reproducibility mechanism
Dependencies installed globally instead of in a virtual environment
Mixing pip, conda, and poetry in the same project without explanation

Profile README

The profile README lives in a special public repository whose name matches your GitHub username (e.g., yourname/yourname). The README.md in that repository appears at the top of your profile page and is your chance to make a first impression.

What to Include

# Hi, I'm [Your Name]

ML Engineer focused on LLM systems and real-time inference.
Currently building [current project or role].

## What I'm Working On

- **[Semantic Router](link)** -- Intelligent LLM request routing
  that reduces API costs by 40%
- **[ML Pipeline Framework](link)** -- Production ML pipelines
  with data contracts and automated monitoring
- **[Clinical NLP](link)** -- Entity extraction from medical
  documents using fine-tuned transformer models

## Skills

**ML/AI**: PyTorch, Transformers, LangChain, RAG, Fine-tuning
**Engineering**: FastAPI, Docker, Kubernetes, GitHub Actions
**Data**: PostgreSQL, Redis, Pinecone, Spark

## Writing

I write about ML engineering at [your blog/newsletter].
Recent posts:
- [How We Reduced LLM Latency by 60%](link)
- [A Practical Guide to RAG Evaluation](link)

## Connect

[LinkedIn](link) | [Twitter](link) | [Email](mailto:you@example.com)

What NOT to Include

Animated GIFs that slow down page load
Walls of badges that say nothing meaningful
"Visitor count" widgets
Long lists of every technology you have ever touched
Auto-generated GitHub stats cards (they look the same for everyone)

:::tip Keep it scannable
Your profile README should be readable in under 15 seconds. Three pinned projects, a one-line bio, and a few links. That is all you need.
:::

Contribution Graph and Open Source

The Green Graph

Your contribution graph is a heatmap of your coding activity. While it should not be gamed, consistency matters.

What a good contribution graph signals:

You code regularly, not just in bursts before job applications
You are actively building and learning
You maintain your projects after the initial push

What evaluators understand:

Private repo contributions can show as green squares, but only if you enable private contributions in your profile settings
Gaps are normal (vacations, day jobs, life)
Intensity matters less than consistency

Open Source Contributions

Contributing to established open source projects is the strongest signal on GitHub. Even small contributions count.

High-impact contribution types:

Contribution Type	Signal	Difficulty
Bug fix with test	You can read unfamiliar code and improve it	Medium
Documentation improvement	You care about user experience	Low
New feature (accepted)	You can work within existing architectures	High
Issue with reproduction steps	You can debug systematically	Low
Review comments on PRs	You can evaluate others' code	Medium

Where to contribute for AI/ML:

Hugging Face Transformers -- Always has "good first issue" tags
LangChain / LlamaIndex -- Fast-moving, many contribution opportunities
scikit-learn -- High bar but extremely impressive on a resume
FastAPI -- Great if you build ML APIs
PyTorch / TensorFlow -- Even documentation fixes are notable
MLflow / DVC / Weights & Biases -- MLOps tooling is underserved

:::tip The documentation shortcut
Documentation contributions are underrated. Fixing unclear docs in a major project like PyTorch or scikit-learn shows you understand the library deeply enough to explain it better. Start here if open source feels intimidating.
:::

What NOT to Put on GitHub

Your GitHub should be curated, not comprehensive. Remove or archive anything that weakens your portfolio.

Remove or Archive Immediately

Tutorial follow-alongs: Repos named udemy-python-course or fastai-lesson-3 tell interviewers you can follow instructions, not solve problems. Delete them or make them private.

Half-finished projects: A repo with three commits, no README, and a last update from 18 months ago is worse than no repo at all. It signals you do not finish what you start. Either complete it or archive it.

Messy Jupyter notebooks: A single notebook with 200 cells, no markdown headers, outputs left in, and variable names like df2_final_v3 is a red flag. If you have exploratory notebooks, clean them up or keep them private.

Forked repos you never modified: Forking a popular repo and never committing to it clutters your profile. Remove forks you are not actively contributing to.

Repos with credentials or API keys: Even if you have since rotated the keys, a repo with hardcoded secrets in the git history signals poor security practices. Use .env files, .gitignore, and environment variables.
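
As a concrete alternative to hardcoding, a small helper can read each secret from the environment and fail loudly when it is missing. This is a minimal sketch using only the standard library; `load_secret`, `MissingSecretError`, and the `LLM_API_KEY` variable name are hypothetical and chosen only for illustration.

```python
import os
from collections.abc import Mapping
from typing import Optional


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def load_secret(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Read a secret from environment variables instead of source code.

    `env` defaults to os.environ; passing a plain dict makes the function
    easy to test without touching the real environment.
    """
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise MissingSecretError(
            f"{name} is not set; export it or add it to your untracked .env file"
        )
    return value


if __name__ == "__main__":
    try:
        api_key = load_secret("LLM_API_KEY")  # hypothetical variable name
        print("key loaded, length:", len(api_key))
    except MissingSecretError as err:
        print("configuration error:", err)
```

Pair this with a `.gitignore` entry for `.env` and a committed `.env.example` that lists variable names without values.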

Make Private (But Keep)

Course assignments (you might reference them later)
Personal configuration dotfiles (unless they are exceptionally well-organized)
Experimental scratch repos where you test ideas
Work-related code that should not be public

:::danger Audit your existing repos now
Before your next job application, go through every public repository on your profile. For each one, ask: "If an interviewer clicked on this, would it help or hurt my chances?" Be ruthless. Archive anything that does not help.
:::

Maintaining Your Portfolio

A portfolio is not a one-time project. It needs ongoing maintenance to stay effective.

The Monthly Review (30 minutes)

Set a monthly calendar reminder to review your GitHub:

Update pinned repos -- Are they still your best work? Should you swap one?
Check for staleness -- Any repo with no commits in 6+ months looks abandoned. Either add improvements or archive it.
Review READMEs -- Are links still working? Are screenshots current? Is the "Quick Start" still accurate?
Clean up -- Archive completed experiments. Delete branches that were merged. Close stale issues.
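
The staleness check in the list above is easy to automate. The sketch below assumes you already have each repository's last push date in hand (for example, from the `pushed_at` timestamp the GitHub REST API reports for a repository). `find_stale_repos`, the repository names, and the dates are hypothetical, and the 180-day default is just one reading of "6+ months".

```python
from datetime import date, timedelta


def find_stale_repos(
    last_push: dict[str, date],
    today: date,
    max_age_days: int = 180,
) -> list[str]:
    """Return names of repos whose last push is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, pushed in last_push.items() if pushed < cutoff)


if __name__ == "__main__":
    # Hypothetical snapshot of a profile's public repositories.
    repos = {
        "semantic-router": date(2024, 5, 20),
        "clinical-nlp": date(2024, 2, 10),
        "old-experiment": date(2023, 9, 1),
    }
    for name in find_stale_repos(repos, today=date(2024, 6, 1)):
        print(f"{name}: add improvements or archive it")
```

Running something like this during the monthly review turns the check into a short list of repos to improve or archive.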

Keeping Projects Fresh

You do not need to build new projects constantly. Improving existing ones is often more impressive.

Ways to refresh a project without starting over:

Improvement	Time Investment	Impact
Add a test suite	2-4 hours	High
Set up CI/CD	1-2 hours	High
Improve the README with results	1-2 hours	High
Add type hints throughout	2-3 hours	Medium
Migrate to pyproject.toml	30 minutes	Medium
Add Docker support	1-2 hours	Medium
Write a blog post about the project	3-5 hours	High
Add a Makefile for common commands	30 minutes	Low
Respond to issues (even your own)	15 minutes	Low

The Version Strategy

When you significantly improve a project, consider tagging a release. Releases appear on the repository page and signal active development to anyone who opens the repo.

git tag -a v1.0.0 -m "Initial release with core routing functionality"
git push origin v1.0.0

Then create a GitHub Release with release notes summarizing what the project does and key metrics.

Portfolio Review Checklist

Use this checklist before applying to jobs. Every item you check off strengthens your portfolio.

Profile Level

Professional profile photo and display name
Bio includes current focus area and target role keywords
Profile README exists and is concise
Three repos are pinned, each serving a distinct purpose (ML depth, engineering, domain)
Contribution graph shows activity in the last 3 months
No embarrassing public repos (tutorials, messy code, credentials)

Per-Repository (For Each Pinned Repo)

README:

Hook: clear title, one-line description, key metric or result
Architecture diagram (ASCII, Mermaid, or image)
Results section with metrics, screenshots, or demo link
Quick Start that works in three commands or fewer
Technical decisions section explaining at least two tradeoffs
Project structure overview

Code Quality:

Organized project structure (src/, tests/, configs/)
Type hints on all function signatures
Docstrings on public functions and classes
No hardcoded secrets, paths, or credentials
Meaningful variable and function names
Consistent code style (enforced by a linter)

Engineering Practices:

At least 5 meaningful tests (unit or integration)
CI/CD pipeline (GitHub Actions) that runs linting and tests
pyproject.toml or requirements.txt with pinned versions
Dockerfile or docker-compose.yml for reproducibility
Makefile or similar for common commands
.gitignore that excludes data files, caches, and environment files

Git Hygiene:

Meaningful commit messages (not "fix" or "update")
Logical commits (one change per commit, not massive dumps)
No large binary files in git history
Main branch is stable and passing CI

Interview Readiness

You can explain every technical decision in each pinned repo
You can discuss tradeoffs you considered and why you chose your approach
You can identify at least two things you would improve in each project
You have a 2-minute verbal walkthrough prepared for each project
You can draw the architecture diagram from memory

Putting It All Together

Your GitHub portfolio is the one artifact in your job search that compounds over time. A strong resume gets you a phone screen. A strong GitHub gets you through the technical evaluation.

The formula is straightforward:

Choose three projects that map to the role you want
Build them with production-quality code -- structure, types, tests, CI
Write READMEs that sell your work -- hook, architecture, results, decisions
Maintain them -- monthly reviews, incremental improvements, fresh commits
Curate ruthlessly -- archive anything that does not strengthen your narrative

Start today. Pick one of the project ideas from this guide, create the repository, set up the project structure, and write the README before you write a single line of model code. The README-first approach forces you to think about what you are building and why before you get lost in implementation details.

Your future interviewer is going to open your GitHub. Make sure what they find makes them want to work with you.

Key Takeaways

Principle	Action
Quality over quantity	3 excellent repos beat 20 mediocre ones
README is your landing page	Write it first, update it often
Code like a professional	Types, tests, CI, clean structure
Tailor to your target role	MLE, AI Engineer, and MLOps roles each look for different signals
Curate ruthlessly	Archive or delete anything that weakens your profile
Maintain consistently	Monthly reviews, incremental improvements
Prepare to discuss	Every line of code is fair game in an interview
