GitHub Portfolio
What makes an AI portfolio stand out -- READMEs, project selection, code quality.
Reading time: ~25 min | Interview relevance: High | Roles: All AI/ML roles
The Real Interview Moment
Picture this. You are forty minutes into a technical interview at a Series B AI startup. The conversation has gone well. You have answered questions about transformer architectures and distributed training. Then the interviewer says:
"Let's pull up your GitHub. Walk me through one of your projects."
She opens your profile. In the next ninety seconds, before you even start talking, she has already formed an opinion. She sees the pinned repositories, scans a README, glances at your contribution graph, and notices whether your last commit was three days ago or nine months ago.
This moment happens more often than you think. A 2023 Stack Overflow survey found that over 75% of hiring managers look at a candidate's GitHub at some point during the evaluation process. For AI/ML roles specifically, where the gap between "took an online course" and "can build production systems" is vast, your GitHub is the single most powerful signal you control outside of the interview room itself.
:::tip Your GitHub is a living portfolio
Unlike a resume that gets a six-second scan, your GitHub gets explored. Interviewers click into repos, read code, check commit history, and examine READMEs. Every detail counts.
:::
This guide covers everything you need to build a GitHub portfolio that makes interviewers want to hire you -- from selecting the right projects to writing READMEs that sell your work, from code quality signals to maintaining your portfolio over time.
What Recruiters and Interviewers Actually Look At
Not everyone evaluates GitHub the same way. Understanding what different evaluators focus on helps you prioritize.
Recruiter Screen (30 seconds)
Recruiters are usually non-technical or semi-technical, so they look at surface signals:
| Signal | What They Notice |
|---|---|
| Profile photo and bio | Professional presence, relevant keywords |
| Pinned repositories | Titles and short descriptions |
| Contribution graph | Is it green? Recent activity? |
| Star counts | Social proof (even a few stars help) |
| README quality | Does the top repo look polished? |
Hiring Manager Review (2-5 minutes)
Hiring managers dig one level deeper:
| Signal | What They Notice |
|---|---|
| Project relevance | Do these repos match the job description? |
| README depth | Architecture, results, technical decisions |
| Code organization | Is the project structured like production code? |
| Commit history | Meaningful messages vs. "fix stuff" |
| Recency | Active in last 3 months? |
Technical Interviewer Deep Dive (10-30 minutes)
This is where it gets serious. Technical interviewers will:
- Read your code line by line in at least one file
- Check for testing -- any tests at all is a strong positive signal
- Look at your git history -- do you make small, logical commits?
- Examine your dependencies -- did you pick sensible libraries?
- Search for anti-patterns -- hardcoded secrets, no error handling, spaghetti imports
- Ask you to explain decisions -- "Why did you use FAISS instead of Pinecone here?"
:::warning The 90-second rule
Research from technical recruiting firms suggests that most evaluators form a strong initial impression within 90 seconds of opening your GitHub profile. Your pinned repos and their READMEs are your storefront. Treat them accordingly.
:::
The 3-Project Portfolio Strategy
You do not need twenty repositories. You need three excellent ones. Each serves a distinct purpose.
Project 1: ML Depth
This project demonstrates that you understand machine learning deeply. It goes beyond calling model.fit().
Characteristics:
- Custom model implementation or significant modification of existing architectures
- Rigorous evaluation with multiple metrics, baselines, and ablation studies
- Thoughtful data processing pipeline
- Clear documentation of experimental results
Examples:
- A fine-tuned LLM with custom evaluation harness and LoRA adapter analysis
- An object detection system with custom anchor box calculations and mAP evaluation
- A recommendation engine comparing collaborative filtering, content-based, and hybrid approaches
Project 2: Engineering Quality
This project demonstrates that you can build software, not just train models. It shows you can take ML from a notebook to a system.
Characteristics:
- Clean project structure with separation of concerns
- API or service layer (FastAPI, Flask, gRPC)
- Proper dependency management and Docker containerization
- CI/CD pipeline with tests
- Monitoring or logging infrastructure
Examples:
- A RAG system with a FastAPI backend, vector store, and evaluation pipeline
- A real-time inference service with batching, caching, and graceful degradation
- An ML pipeline orchestrated with Airflow or Prefect, including data validation
Project 3: Domain Interest
This project shows you care about a specific problem space. It signals genuine curiosity and the ability to go deep on a domain.
Characteristics:
- Solves a real problem (not a Kaggle competition rehash)
- Includes domain context in the README
- Shows you can acquire and work with non-trivial data
- Demonstrates end-to-end thinking from problem to solution
Examples:
- A clinical NLP system that extracts medical entities from discharge summaries
- A satellite imagery pipeline that detects deforestation patterns
- A financial document analyzer that parses SEC filings and extracts risk factors
:::tip The power of three
Three pinned repos is the sweet spot. Fewer looks thin. More creates decision fatigue. Three gives the interviewer a clear narrative: "This person understands ML, writes production code, and cares about real problems."
:::
Project Selection by Role
Different roles have different expectations. Tailor your three projects to the role you are targeting.
Machine Learning Engineer (MLE)
| Project Slot | What to Build | Key Signals |
|---|---|---|
| ML Depth | Custom training pipeline with distributed training or mixed-precision | Model architecture knowledge, training optimization |
| Engineering | Model serving microservice with A/B testing and monitoring | System design, latency awareness |
| Domain | End-to-end ML product (search, recommendation, fraud detection) | Business impact, full-stack ML |
AI Engineer
| Project Slot | What to Build | Key Signals |
|---|---|---|
| ML Depth | RAG system with custom chunking, retrieval evaluation, and re-ranking | LLM application architecture |
| Engineering | Agent framework with tool use, memory, and structured outputs | API integration, prompt engineering at scale |
| Domain | Production chatbot or copilot with guardrails and evaluation | User-facing AI, safety awareness |
MLOps Engineer
| Project Slot | What to Build | Key Signals |
|---|---|---|
| ML Depth | Feature store implementation with online/offline serving | Data engineering for ML |
| Engineering | End-to-end ML pipeline with CI/CD, model registry, and rollback | Infrastructure as code, automation |
| Domain | Model monitoring dashboard with drift detection and alerting | Observability, production ML |
Data Scientist
| Project Slot | What to Build | Key Signals |
|---|---|---|
| ML Depth | Causal inference study or A/B test analysis framework | Statistical rigor, experimental design |
| Engineering | Interactive dashboard with Streamlit/Gradio and automated reporting | Communication, stakeholder-facing tools |
| Domain | Deep analysis of a real dataset with actionable insights | Storytelling, domain expertise |
Research Engineer
| Project Slot | What to Build | Key Signals |
|---|---|---|
| ML Depth | Paper reproduction with ablation studies and extensions | Paper reading, implementation skill |
| Engineering | Experiment tracking framework with reproducible configs | Research infrastructure |
| Domain | Novel application of a recent technique to a new problem | Creativity, research taste |
Data Engineer
| Project Slot | What to Build | Key Signals |
|---|---|---|
| ML Depth | Feature engineering pipeline with real-time and batch paths | ML-aware data engineering |
| Engineering | Data lakehouse or streaming pipeline with quality checks | Distributed systems, data quality |
| Domain | ETL pipeline for a specific data domain (healthcare, finance, IoT) | Domain data expertise |
15+ Project Ideas That Stand Out
These are not tutorials. Each requires genuine problem-solving and produces a portfolio piece that interviewers remember.
LLM and AI Engineering
- Multi-model routing system -- Build a service that routes prompts to different LLMs (GPT-4, Claude, Llama) based on complexity, cost, and latency constraints. Include a scoring mechanism and cost tracking dashboard.
- RAG evaluation framework -- Create a comprehensive evaluation harness for RAG systems. Test chunking strategies, embedding models, retrieval methods, and generation quality. Publish results as a benchmark.
- LLM-powered code reviewer -- Build a GitHub Action that uses an LLM to review pull requests, focusing on bugs, security issues, and style. Include structured output parsing and configurable rules.
- Conversational agent with persistent memory -- Implement a chatbot with hierarchical memory (short-term buffer, long-term vector store, entity memory). Show how memory improves response quality over time.
Classical ML and Data Science
- Real-time anomaly detection engine -- Stream processing pipeline (Kafka or Redis Streams) that detects anomalies in time-series data using multiple methods (isolation forest, autoencoders, statistical). Include a live dashboard.
- Causal impact analyzer -- Tool that estimates the causal effect of interventions (marketing campaigns, feature launches) using difference-in-differences, synthetic control, and Bayesian structural time series.
- AutoML pipeline with explainability -- Build an automated ML pipeline that not only finds the best model but generates SHAP explanations, partial dependence plots, and a human-readable report.
MLOps and Infrastructure
- Model A/B testing platform -- Infrastructure for running A/B tests on ML models in production. Traffic splitting, metric collection, statistical significance testing, and automated rollback.
- ML pipeline with data contracts -- End-to-end pipeline where each stage has explicit data contracts (schemas, quality checks, SLAs). Include automated alerting when contracts are violated.
- GPU cluster scheduler -- A simplified job scheduler for ML training jobs on a GPU cluster. Implement priority queuing, preemption, and resource tracking.
Computer Vision
- Document understanding pipeline -- OCR plus layout analysis plus information extraction from complex documents (invoices, research papers, forms). Include evaluation on a custom dataset.
- Video anomaly detection -- System that processes surveillance or dashcam video to detect unusual events. Include temporal modeling and a review interface.
NLP and Information Retrieval
- Multi-language semantic search -- Search engine that works across languages using multilingual embeddings. Include evaluation with NDCG/MRR metrics and a query analysis tool.
- Structured data extraction from unstructured text -- Pipeline that extracts entities, relations, and events from news articles or scientific papers into a knowledge graph. Include a graph visualization.
Full-Stack AI
- AI-powered data labeling tool -- A labeling interface where an ML model provides suggestions, humans correct them, and the model improves through active learning. Track annotation speed and model accuracy over time.
- Personalized content recommender -- Recommendation system with a web UI, real-time feature computation, and A/B testing framework. Show how recommendations improve with more user interaction data.
- Intelligent document Q&A system -- Upload PDFs and ask questions. But go beyond basic RAG: implement table extraction, figure understanding, cross-document reasoning, and citation with page numbers.
:::danger Avoid these common project choices
- Titanic/Iris/MNIST classifiers -- Every beginner has these. They show nothing.
- Tutorial follow-alongs -- If your code matches a YouTube tutorial line for line, interviewers will notice.
- Kaggle competition notebooks -- These optimize for leaderboard position, not engineering quality. If you must include one, rewrite it as a proper project with clean code and a real README.
- "Awesome" list repos -- Curating links is not building software.
:::
The Anatomy of a Strong README
Your README is the most important file in your repository. It is the landing page, the pitch, and the documentation all in one.
Section 1: The Hook
The first three lines determine whether someone keeps reading.
```markdown
# Semantic Router: Intelligent LLM Request Routing
Route LLM prompts to the optimal model based on complexity, cost, and latency.
Reduces API costs by 40% while maintaining response quality within 2% of GPT-4.

[Demo GIF, screenshot, or diagram]
```
What makes this work:
- Clear name that describes what it does
- One-sentence summary of the value proposition
- A quantified result that makes people pay attention
- A visual (GIF, screenshot, or diagram) immediately
Section 2: Architecture Diagram
Show how the system fits together. This signals that you think in systems, not just scripts.
:::tip Use Mermaid or ASCII
GitHub renders Mermaid diagrams natively. If you prefer portability, ASCII diagrams work everywhere. Either way, include a visual representation of your system architecture.
:::
Section 3: Results and Demo
Show, do not tell. This section proves your project works.
```markdown
## Results
### Routing Accuracy
| Model | Accuracy | Avg Latency | Monthly Cost (10K req) |
|---|---|---|---|
| Always GPT-4 | 94.2% | 2.3s | $450 |
| Always GPT-3.5 | 78.1% | 0.8s | $45 |
| **Our Router** | **92.8%** | **1.1s** | **$180** |
### Live Demo
Try it: [semantic-router-demo.railway.app](https://example.com)
### Screenshots
[Screenshots here]
```
Section 4: Quick Start
Make it trivially easy to run your project. Friction kills interest.
````markdown
## Quick Start
### Prerequisites
- Python 3.10+
- Docker (optional, for containerized deployment)
### Installation
```bash
git clone https://github.com/yourname/semantic-router.git
cd semantic-router
pip install -e ".[dev]"
```
### Run
```bash
# Start the API server
uvicorn src.api.main:app --reload
# Or use Docker
docker compose up
```
### Configuration
Copy the example environment file and add your API keys:
```bash
cp .env.example .env
# Edit .env with your LLM API keys
```
````
### Section 5: Technical Decisions and Tradeoffs
This is the section that separates portfolio projects from professional work. Interviewers love reading your reasoning.
```markdown
## Technical Decisions
### Why DistilBERT for complexity classification?
We need sub-50ms classification latency to avoid adding overhead to the
routing decision. DistilBERT achieves 97% of BERT's accuracy on our
complexity dataset while running 2.5x faster. We considered a simple
regex-based heuristic but found it missed nuanced cases (e.g., simple
questions about complex topics).
### Why not use embeddings for routing?
We tested cosine similarity against a bank of "complex" vs "simple"
example prompts. It worked for obvious cases but failed on edge cases
where topic complexity differs from linguistic complexity. A trained
classifier gives us more control over the decision boundary.
### Cost optimization strategy
We use a two-phase approach:
1. **Classification phase**: Determine prompt complexity (simple/medium/complex)
2. **Optimization phase**: Given the complexity tier, select the cheapest
   model that meets the latency SLA

This decoupling lets us update pricing without retraining the classifier.
```
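The optimization phase described in that sample README can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the router's actual code: `ModelOption`, `select_model`, the `max_tier` field, and the catalog entries are all hypothetical, and the costs and latencies are loosely derived from the sample results table.

```python
from dataclasses import dataclass

# Complexity tiers in increasing order of difficulty.
TIERS = ("simple", "medium", "complex")


@dataclass(frozen=True)
class ModelOption:
    """One candidate model in a hypothetical routing catalog."""

    name: str
    max_tier: str  # hardest tier this model is trusted with (assumed field)
    cost_per_request: float  # USD, illustrative
    avg_latency_s: float  # seconds, illustrative


def select_model(
    tier: str, options: list[ModelOption], latency_sla_s: float
) -> ModelOption:
    """Return the cheapest model that can handle `tier` within the latency SLA."""
    required = TIERS.index(tier)
    eligible = [
        m
        for m in options
        if TIERS.index(m.max_tier) >= required and m.avg_latency_s <= latency_sla_s
    ]
    if not eligible:
        raise ValueError(f"No model satisfies tier={tier!r} within {latency_sla_s}s")
    return min(eligible, key=lambda m: m.cost_per_request)


# Pricing lives in data, so it can change without retraining the classifier.
CATALOG = [
    ModelOption("small-model", "simple", 0.0045, 0.8),
    ModelOption("large-model", "complex", 0.045, 2.3),
]

print(select_model("simple", CATALOG, latency_sla_s=3.0).name)
print(select_model("complex", CATALOG, latency_sla_s=3.0).name)
```

Because the classifier only emits a tier and the catalog is plain data, a price change is a one-line edit to `CATALOG`. An interviewer can follow that kind of decoupling at a glance.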
Section 6: Project Structure
Show that your code is organized.
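One way to produce the tree for this section is to generate it rather than type it by hand. The sketch below creates a small layout in a temporary directory and renders it as an ASCII tree you could paste into a README. The layout itself is an assumption pieced together from paths mentioned elsewhere in this guide (`src/api/main.py`, `src/evaluation/metrics.py`, `tests/unit/test_metrics.py`, `.github/workflows/ci.yml`). `LAYOUT`, `scaffold`, and `render_tree` are hypothetical names.

```python
import tempfile
from pathlib import Path

# Assumed layout; adapt the names to your own project.
LAYOUT = [
    ".github/workflows/ci.yml",
    "configs/default.yaml",
    "notebooks/exploration.ipynb",
    "src/api/main.py",
    "src/evaluation/metrics.py",
    "tests/unit/test_metrics.py",
    ".env.example",
    "Dockerfile",
    "Makefile",
    "pyproject.toml",
    "README.md",
]


def scaffold(root: Path) -> None:
    """Create empty placeholder files for every path in LAYOUT."""
    for relative in LAYOUT:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()


def render_tree(root: Path, prefix: str = "") -> list[str]:
    """Return ASCII tree lines for everything under `root`, directories first."""
    entries = sorted(root.iterdir(), key=lambda p: (p.is_file(), p.name))
    lines = []
    for index, entry in enumerate(entries):
        last = index == len(entries) - 1
        lines.append(f"{prefix}{'`-- ' if last else '|-- '}{entry.name}")
        if entry.is_dir():
            lines.extend(render_tree(entry, prefix + ("    " if last else "|   ")))
    return lines


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "semantic-router"
    scaffold(project)
    tree_lines = render_tree(project)

print("semantic-router/")
print("\n".join(tree_lines))
```

Running `render_tree` against your real repository root (and skipping directories such as `.git`) keeps the README's tree in sync with the code.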
Complete README Template
Here is a full template you can adapt:
````markdown
# Project Name: One-Line Description
Brief paragraph (2-3 sentences) explaining what this does, why it
matters, and the key result or metric.

[Demo GIF or screenshot here]

## Highlights
- Bullet point: key feature or result with a number
- Bullet point: technology choice that matters
- Bullet point: something that makes this unique
## Architecture
[System diagram here]
## Results
[Table of metrics, screenshots, or link to live demo]
## Quick Start
### Prerequisites
[Minimal list]
### Installation
[3 commands or fewer]
### Run
[1-2 commands]
## Technical Decisions
### Decision 1: Why X over Y?
[2-3 sentences explaining reasoning and tradeoffs]
### Decision 2: How we handle Z
[2-3 sentences]
## Project Structure
[Tree diagram]
## Development
### Testing
```bash
pytest tests/ -v
```
### Linting
```bash
ruff check src/
mypy src/
```
## Future Work
- Planned improvement 1
- Planned improvement 2
## License
MIT
````
## Code Quality Signals
Your README gets people in the door. Your code determines whether they stay.
### Project Structure
Interviewers pattern-match against professional projects. Use a structure they recognize.

:::warning Keep notebooks out of src/
Notebooks are fine for exploration, but core logic should live in `.py` files. If your entire project is a single Jupyter notebook, interviewers will assume you cannot write production code. Extract functions and classes into modules and import them in notebooks.
:::
### Clean Code and Type Hints
Write code that reads like documentation.
**Bad:**
```python
def process(d, t=0.5):
    r = []
    for i in d:
        s = model.predict(i['text'])
        if s > t:
            r.append({'id': i['id'], 'score': s, 'label': 'positive'})
        else:
            r.append({'id': i['id'], 'score': s, 'label': 'negative'})
    return r
```
**Good:**
```python
from dataclasses import dataclass


@dataclass
class SentimentResult:
    """Result of sentiment analysis for a single document."""

    document_id: str
    score: float
    label: str


# `model` is assumed to be a classifier loaded elsewhere that exposes
# predict(text) -> float.
def classify_sentiment(
    documents: list[dict[str, str]],
    threshold: float = 0.5,
) -> list[SentimentResult]:
    """Classify sentiment for a batch of documents.

    Args:
        documents: List of dicts with 'id' and 'text' keys.
        threshold: Score above which a document is classified as positive.

    Returns:
        List of SentimentResult objects with scores and labels.
    """
    results = []
    for doc in documents:
        score = model.predict(doc["text"])
        label = "positive" if score > threshold else "negative"
        results.append(
            SentimentResult(
                document_id=doc["id"],
                score=score,
                label=label,
            )
        )
    return results
```
Key signals interviewers look for:
- Type hints on function signatures
- Docstrings that explain parameters, return values, and behavior
- Meaningful variable names (not `d`, `r`, `s`, `i`)
- Dataclasses or Pydantic models instead of raw dicts
- Single responsibility -- each function does one thing
Testing
Having any tests at all puts you ahead of 80% of portfolio projects. You do not need 100% coverage, but show that you know how to test ML code.
```python
# tests/unit/test_metrics.py
import pytest

from src.evaluation.metrics import precision_at_k, ndcg_at_k


class TestPrecisionAtK:
    """Tests for precision@k metric."""

    def test_perfect_ranking(self):
        """All relevant items ranked first should give precision of 1.0."""
        relevant = {1, 2, 3}
        ranked = [1, 2, 3, 4, 5]
        assert precision_at_k(ranked, relevant, k=3) == 1.0

    def test_no_relevant_items(self):
        """No relevant items in top-k should give precision of 0.0."""
        relevant = {6, 7, 8}
        ranked = [1, 2, 3, 4, 5]
        assert precision_at_k(ranked, relevant, k=3) == 0.0

    def test_partial_relevant(self):
        """Two out of three relevant should give precision of 2/3."""
        relevant = {1, 3}
        ranked = [1, 2, 3, 4, 5]
        assert precision_at_k(ranked, relevant, k=3) == pytest.approx(2 / 3)

    def test_k_larger_than_list(self):
        """When k exceeds list length, use available items."""
        relevant = {1, 2}
        ranked = [1, 2]
        assert precision_at_k(ranked, relevant, k=5) == 1.0


class TestNDCGAtK:
    """Tests for NDCG@k metric."""

    def test_perfect_ranking_gives_ndcg_one(self):
        relevance_scores = {1: 3, 2: 2, 3: 1}
        ranked = [1, 2, 3]
        assert ndcg_at_k(ranked, relevance_scores, k=3) == pytest.approx(1.0)

    def test_reversed_ranking(self):
        relevance_scores = {1: 3, 2: 2, 3: 1}
        ranked = [3, 2, 1]
        result = ndcg_at_k(ranked, relevance_scores, k=3)
        assert result < 1.0
        assert result > 0.0
```
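For context, here is one way the two functions under test could be written so that the tests above pass. Treat it as a sketch under assumptions, not the contents of any real `src/evaluation/metrics.py`. Dividing by the number of items actually inspected when `k` exceeds the list length is a choice inferred from `test_k_larger_than_list`, and the linear-gain DCG with a log2 discount is only one common variant. `dcg_at_k` is a hypothetical helper.

```python
import math


def precision_at_k(ranked: list[int], relevant: set[int], k: int) -> float:
    """Fraction of the top-k ranked items that are relevant."""
    top_k = ranked[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for item in top_k if item in relevant)
    # Divide by the items actually inspected, so a short list is not penalized.
    return hits / len(top_k)


def dcg_at_k(gains: list[float], k: int) -> float:
    """Discounted cumulative gain with a log2 position discount."""
    return sum(gain / math.log2(pos + 2) for pos, gain in enumerate(gains[:k]))


def ndcg_at_k(ranked: list[int], relevance_scores: dict[int, float], k: int) -> float:
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    gains = [relevance_scores.get(item, 0.0) for item in ranked]
    ideal = sorted(relevance_scores.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0
```

Small pure functions like these are the easiest part of an ML project to test, because known inputs have exact expected outputs.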
What to test in ML projects:
| Component | What to Test | Example |
|---|---|---|
| Data processing | Input/output shapes, edge cases, null handling | test_tokenizer_handles_empty_string |
| Metrics | Known inputs produce expected outputs | test_f1_score_with_perfect_predictions |
| Model I/O | Input/output dimensions, dtype correctness | test_model_output_shape_matches_num_classes |
| API endpoints | Request/response contracts, error handling | test_predict_endpoint_returns_valid_json |
| Config loading | Default values, validation, override behavior | test_config_raises_on_missing_api_key |
CI/CD with GitHub Actions
A .github/workflows/ci.yml file shows you understand software engineering beyond your local machine.
```yaml
# .github/workflows/ci.yml
name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12"]
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          pip install -e ".[dev]"
      - name: Lint
        run: |
          ruff check src/ tests/
          ruff format --check src/ tests/
      - name: Type check
        run: |
          mypy src/
      - name: Test
        run: |
          pytest tests/ -v --tb=short
```
:::tip Start with ruff
If you add only one tool, make it ruff. It handles both linting and formatting, is extremely fast, and replaces flake8 + isort + black. Add it to your CI pipeline and run it locally with a pre-commit hook.
:::
Dependency Management
Use pyproject.toml for modern Python projects. It consolidates your project metadata, dependencies, and tool configuration in one file.
```toml
# pyproject.toml
[project]
name = "semantic-router"
version = "0.1.0"
description = "Intelligent LLM request routing"
requires-python = ">=3.10"
dependencies = [
    "fastapi>=0.104.0",
    "uvicorn>=0.24.0",
    "transformers>=4.36.0",
    "torch>=2.1.0",
    "pydantic>=2.5.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=7.4.0",
    "ruff>=0.1.0",
    "mypy>=1.7.0",
    "pre-commit>=3.6.0",
]

[tool.ruff]
target-version = "py310"
line-length = 88

[tool.ruff.lint]
select = ["E", "F", "I", "N", "UP", "B"]

[tool.mypy]
python_version = "3.10"
strict = true

[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-v --tb=short"
```
Avoid these dependency anti-patterns:
- `requirements.txt` with unpinned versions (`torch`, not `torch>=2.1.0`)
- No lockfile or reproducibility mechanism
- Dependencies installed globally instead of in a virtual environment
- Mixing `pip`, `conda`, and `poetry` in the same project without explanation
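The first anti-pattern is easy to check mechanically. The sketch below flags requirement lines that carry no version constraint at all. `find_unconstrained` is a hypothetical helper, and its simple regular expression is an assumption that deliberately ignores extras, environment markers, and URL requirements.

```python
import re

# A bare requirement name with nothing after it (no ==, >=, ~=, etc.).
BARE_REQUIREMENT = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")


def find_unconstrained(requirements_text: str) -> list[str]:
    """Return package names listed without any version specifier."""
    bare = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and BARE_REQUIREMENT.match(line):
            bare.append(line)
    return bare


sample = """
torch
fastapi>=0.104.0
pydantic==2.5.0  # exact pin
uvicorn
"""
print(find_unconstrained(sample))
```

A check like this could run in CI or a pre-commit hook, though a lockfile remains the more robust route to reproducibility.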
Profile README
The profile README lives in a special public repository whose name matches your GitHub username (e.g., `yourname/yourname`). GitHub displays that repository's README.md at the top of your profile page, which makes it your chance to make a first impression.
What to Include
```markdown
# Hi, I'm [Your Name]
ML Engineer focused on LLM systems and real-time inference.
Currently building [current project or role].
## What I'm Working On
- **[Semantic Router](link)** -- Intelligent LLM request routing
  that reduces API costs by 40%
- **[ML Pipeline Framework](link)** -- Production ML pipelines
  with data contracts and automated monitoring
- **[Clinical NLP](link)** -- Entity extraction from medical
  documents using fine-tuned transformer models
## Skills
**ML/AI**: PyTorch, Transformers, LangChain, RAG, Fine-tuning
**Engineering**: FastAPI, Docker, Kubernetes, GitHub Actions
**Data**: PostgreSQL, Redis, Pinecone, Spark
## Writing
I write about ML engineering at [your blog/newsletter].
Recent posts:
- [How We Reduced LLM Latency by 60%](link)
- [A Practical Guide to RAG Evaluation](link)
## Connect
```
What NOT to Include
- Animated GIFs that slow down page load
- Walls of badges that say nothing meaningful
- "Visitor count" widgets
- Long lists of every technology you have ever touched
- Auto-generated GitHub stats cards (they look the same for everyone)
:::tip Keep it scannable
Your profile README should be readable in under 15 seconds. Three pinned projects, a one-line bio, and a few links. That is all you need.
:::
Contribution Graph and Open Source
The Green Graph
Your contribution graph is a heatmap of your coding activity. While it should not be gamed, consistency matters.
What a good contribution graph signals:
- You code regularly, not just in bursts before job applications
- You are actively building and learning
- You maintain your projects after the initial push
What evaluators understand:
- Private work may not appear: private repo contributions show as green squares only if you enable that option in your profile settings
- Gaps are normal (vacations, day jobs, life)
- Intensity matters less than consistency
Open Source Contributions
Contributing to established open source projects is the strongest signal on GitHub. Even small contributions count.
High-impact contribution types:
| Contribution Type | Signal | Difficulty |
|---|---|---|
| Bug fix with test | You can read unfamiliar code and improve it | Medium |
| Documentation improvement | You care about user experience | Low |
| New feature (accepted) | You can work within existing architectures | High |
| Issue with reproduction steps | You can debug systematically | Low |
| Review comments on PRs | You can evaluate others' code | Medium |
Where to contribute for AI/ML:
- Hugging Face Transformers -- Always has "good first issue" tags
- LangChain / LlamaIndex -- Fast-moving, many contribution opportunities
- scikit-learn -- High bar but extremely impressive on a resume
- FastAPI -- Great if you build ML APIs
- PyTorch / TensorFlow -- Even documentation fixes are notable
- MLflow / DVC / Weights & Biases -- MLOps tooling is underserved
:::tip The documentation shortcut
Documentation contributions are underrated. Fixing unclear docs in a major project like PyTorch or scikit-learn shows you understand the library deeply enough to explain it better. Start here if open source feels intimidating.
:::
What NOT to Put on GitHub
Your GitHub should be curated, not comprehensive. Remove or archive anything that weakens your portfolio.
Remove or Archive Immediately
Tutorial follow-alongs: Repos named `udemy-python-course` or `fastai-lesson-3` tell interviewers you can follow instructions, not solve problems. Delete them or make them private.
Half-finished projects: A repo with three commits, no README, and a last update from 18 months ago is worse than no repo at all. It signals you do not finish what you start. Either complete it or archive it.
Messy Jupyter notebooks: A single notebook with 200 cells, no markdown headers, outputs left in, and variable names like `df2_final_v3` is a red flag. If you have exploratory notebooks, clean them up or keep them private.
Forked repos you never modified: Forking a popular repo and never committing to it clutters your profile. Remove forks you are not actively contributing to.
Repos with credentials or API keys: Even if you have since rotated the keys, a repo with hardcoded secrets in the git history signals poor security practices. Use `.env` files, `.gitignore`, and environment variables.
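A minimal sketch of the environment-variable approach, assuming a hypothetical variable name `LLM_API_KEY` and hypothetical helpers `load_api_key` and `MissingSecretError`. The idea is that the key never appears in source control, and a missing key fails loudly at startup instead of deep inside a request.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def load_api_key(name: str = "LLM_API_KEY") -> str:
    """Read an API key from the environment instead of hardcoding it."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise MissingSecretError(
            f"{name} is not set; copy .env.example to .env and fill it in"
        )
    return value


# Demonstration with a throwaway value; real keys come from your shell
# or from a .env file loaded by your tooling.
os.environ["LLM_API_KEY"] = "dummy-key-for-demo"
print(load_api_key())
```

A loader like this is also trivial to unit test, which covers the "config loading" row in the testing table earlier in this guide.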
Make Private (But Keep)
- Course assignments (you might reference them later)
- Personal configuration dotfiles (unless they are exceptionally well-organized)
- Experimental scratch repos where you test ideas
- Work-related code that should not be public
:::danger Audit your existing repos now
Before your next job application, go through every public repository on your profile. For each one, ask: "If an interviewer clicked on this, would it help or hurt my chances?" Be ruthless. Archive anything that does not help.
:::
Maintaining Your Portfolio
A portfolio is not a one-time project. It needs ongoing maintenance to stay effective.
The Monthly Review (30 minutes)
Set a monthly calendar reminder to review your GitHub:
- Update pinned repos -- Are they still your best work? Should you swap one?
- Check for staleness -- Any repo with no commits in 6+ months looks abandoned. Either add improvements or archive it.
- Review READMEs -- Are links still working? Are screenshots current? Is the "Quick Start" still accurate?
- Clean up -- Archive completed experiments. Delete branches that were merged. Close stale issues.
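The staleness check in that list is simple enough to script. A sketch, assuming you already have each repo's last commit date (for example from `git log` or the GitHub API). `find_stale_repos` and the sample data are hypothetical, and six months is approximated as 183 days.

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(days=183)  # roughly six months


def find_stale_repos(last_commits: dict[str, date], today: date) -> list[str]:
    """Return repo names whose last commit is older than STALE_AFTER."""
    return sorted(
        name for name, last in last_commits.items() if today - last > STALE_AFTER
    )


repos = {
    "semantic-router": date(2024, 5, 20),
    "clinical-nlp": date(2023, 9, 1),
}
print(find_stale_repos(repos, today=date(2024, 6, 1)))
```

Anything the function returns is a candidate for either a small improvement or archiving.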
Keeping Projects Fresh
You do not need to build new projects constantly. Improving existing ones is often more impressive.
Ways to refresh a project without starting over:
| Improvement | Time Investment | Impact |
|---|---|---|
| Add a test suite | 2-4 hours | High |
| Set up CI/CD | 1-2 hours | High |
| Improve the README with results | 1-2 hours | High |
| Add type hints throughout | 2-3 hours | Medium |
| Migrate to pyproject.toml | 30 minutes | Medium |
| Add Docker support | 1-2 hours | Medium |
| Write a blog post about the project | 3-5 hours | High |
| Add a Makefile for common commands | 30 minutes | Low |
| Respond to issues (even your own) | 15 minutes | Low |
The Version Strategy
When you significantly improve a project, consider tagging a release. Releases appear on the repository's main page and signal active development.
```bash
git tag -a v1.0.0 -m "Initial release with core routing functionality"
git push origin v1.0.0
```
Then create a GitHub Release with release notes summarizing what the project does and key metrics.
Portfolio Review Checklist
Use this checklist before applying to jobs. Every item you check off strengthens your portfolio.
Profile Level
- Professional profile photo and display name
- Bio includes current focus area and target role keywords
- Profile README exists and is concise
- Three repos are pinned, each serving a distinct purpose (ML depth, engineering, domain)
- Contribution graph shows activity in the last 3 months
- No embarrassing public repos (tutorials, messy code, credentials)
Per-Repository (For Each Pinned Repo)
README:
- Hook: clear title, one-line description, key metric or result
- Architecture diagram (ASCII, Mermaid, or image)
- Results section with metrics, screenshots, or demo link
- Quick Start that works in three commands or fewer
- Technical decisions section explaining at least two tradeoffs
- Project structure overview
Code Quality:
- Organized project structure (src/, tests/, configs/)
- Type hints on all function signatures
- Docstrings on public functions and classes
- No hardcoded secrets, paths, or credentials
- Meaningful variable and function names
- Consistent code style (enforced by a linter)
Engineering Practices:
- At least 5 meaningful tests (unit or integration)
- CI/CD pipeline (GitHub Actions) that runs linting and tests
- pyproject.toml or requirements.txt with pinned versions
- Dockerfile or docker-compose.yml for reproducibility
- Makefile or similar for common commands
- .gitignore that excludes data files, caches, and environment files
Git Hygiene:
- Meaningful commit messages (not "fix" or "update")
- Logical commits (one change per commit, not massive dumps)
- No large binary files in git history
- Main branch is stable and passing CI
Interview Readiness
- You can explain every technical decision in each pinned repo
- You can discuss tradeoffs you considered and why you chose your approach
- You can identify at least two things you would improve in each project
- You have a 2-minute verbal walkthrough prepared for each project
- You can draw the architecture diagram from memory
Putting It All Together
Your GitHub portfolio is the one artifact in your job search that compounds over time. A strong resume gets you a phone screen. A strong GitHub gets you through the technical evaluation.
The formula is straightforward:
- Choose three projects that map to the role you want
- Build them with production-quality code -- structure, types, tests, CI
- Write READMEs that sell your work -- hook, architecture, results, decisions
- Maintain them -- monthly reviews, incremental improvements, fresh commits
- Curate ruthlessly -- archive anything that does not strengthen your narrative
Start today. Pick one of the project ideas from this guide, create the repository, set up the project structure, and write the README before you write a single line of model code. The README-first approach forces you to think about what you are building and why before you get lost in implementation details.
Your future interviewer is going to open your GitHub. Make sure what they find makes them want to work with you.
Key Takeaways
| Principle | Action |
|---|---|
| Quality over quantity | 3 excellent repos beat 20 mediocre ones |
| README is your landing page | Write it first, update it often |
| Code like a professional | Types, tests, CI, clean structure |
| Tailor to your target role | MLE, AI Eng, MLOps all want different signals |
| Curate ruthlessly | Archive or delete anything that weakens your profile |
| Maintain consistently | Monthly reviews, incremental improvements |
| Prepare to discuss | Every line of code is fair game in an interview |
