This project is due on Wednesday, May 6, 2026, by 11:59 PM.
Final Presentations: April 28-29 during class time. A Review Board of senior clinical and technical leaders will attend to provide feedback—simulating a real deployment review.
Overview
The final project is the capstone of this course. You’ll build a complete clinical AI system and write a comprehensive field guide that could enable someone else to deploy and monitor it safely.
This project integrates everything you’ve learned:
- Medical data handling (imaging, text, or structured)
- Machine learning or deep learning modeling
- Evaluation with clinically meaningful metrics
- Governance, monitoring, and documentation
Deliverables:
- Technical artifact (working model/pipeline)
- Comprehensive field guide (6-10 pages)
- Final presentation (15-20 minutes)
Project Tracks
Choose ONE track. You may continue from your midterm project (even in a different track) or start fresh.
Track 1: Medical Imaging
Build an image classification, segmentation, or detection system.
Example Projects:
- Skin lesion classification with uncertainty quantification
- Organ segmentation from CT with automated QA
- Chest X-ray multi-label classification with explainability
- Retinal image screening with referral recommendations
Must Include:
- Deep learning model (CNN, U-Net, or similar)
- Transfer learning or appropriate training strategy
- Grad-CAM or another interpretability method (see the sketch after this list)
- Subgroup analysis (e.g., by image source, patient demographics if available)
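As one concrete illustration of the interpretability requirement, here is a minimal Grad-CAM sketch in plain PyTorch. It assumes a torchvision ResNet-18 fine-tuned for a two-class lesion task; the hooked layer, class count, and input preprocessing are placeholders for your own setup, not a prescribed design.

```python
# Minimal Grad-CAM sketch (assumes torchvision ResNet-18, two classes).
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = torch.nn.Linear(model.fc.in_features, 2)  # e.g., benign vs. malignant
model.eval()

activations, gradients = {}, {}

def save_activation(module, inp, out):
    activations["value"] = out.detach()

def save_gradient(module, grad_in, grad_out):
    gradients["value"] = grad_out[0].detach()

# Hook the last convolutional block; for ResNet-18 this is layer4.
model.layer4.register_forward_hook(save_activation)
model.layer4.register_full_backward_hook(save_gradient)

def grad_cam(image_tensor, class_idx):
    """Return a heatmap (H x W, values in [0, 1]) for one normalized 3xHxW image."""
    logits = model(image_tensor.unsqueeze(0))            # (1, num_classes)
    model.zero_grad()
    logits[0, class_idx].backward()                       # gradients w.r.t. target class
    acts = activations["value"][0]                        # (C, h, w)
    grads = gradients["value"][0]                         # (C, h, w)
    weights = grads.mean(dim=(1, 2))                      # global-average-pooled gradients
    cam = F.relu((weights[:, None, None] * acts).sum(0))  # weighted activation map
    cam = F.interpolate(cam[None, None], size=image_tensor.shape[1:],
                        mode="bilinear", align_corners=False)[0, 0]
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```

Overlaying the returned heatmap on the input image (e.g., with matplotlib) produces the saliency figures expected in the field guide; packaged libraries such as pytorch-grad-cam are equally acceptable.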
Track 2: Clinical NLP
Build a system that extracts, classifies, or summarizes clinical text.
Example Projects:
- Named entity extraction + UMLS linking pipeline
- Clinical note classification (e.g., identify high-risk patients)
- Discharge summary auto-generation/summarization
- Medication extraction and interaction checking
Must Include:
- Either a traditional NLP pipeline (e.g., scispaCy) or an LLM-based approach (see the sketch after this list)
- Comparison of at least two methods
- Evaluation on held-out test set
- Error analysis with clinical implications
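For the traditional-NLP option, a scispaCy entity extraction plus UMLS linking pipeline can serve as one of your two compared methods. The sketch below follows the scispaCy documentation and assumes the en_core_sci_sm model is installed; note that the UMLS linker downloads a large knowledge base on first use.

```python
# Minimal scispaCy NER + UMLS linking sketch.
# Assumes scispacy and the en_core_sci_sm model are installed per the scispaCy docs.
import spacy
from scispacy.linking import EntityLinker  # registers the "scispacy_linker" pipe

nlp = spacy.load("en_core_sci_sm")
nlp.add_pipe("scispacy_linker",
             config={"resolve_abbreviations": True, "linker_name": "umls"})
linker = nlp.get_pipe("scispacy_linker")

note = "Patient with type 2 diabetes mellitus started on metformin 500 mg daily."
doc = nlp(note)

for ent in doc.ents:
    # kb_ents holds (CUI, similarity score) candidates for each extracted mention.
    for cui, score in ent._.kb_ents[:1]:  # keep the top candidate
        concept = linker.kb.cui_to_entity[cui]
        print(f"{ent.text!r:35} -> {cui} ({concept.canonical_name}, score={score:.2f})")
```

An LLM-based extractor prompted to emit the same (mention, concept) pairs, or even a rule-based baseline, would supply the second method for the required comparison; evaluate both on the same held-out notes.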
Track 3: Structured Data Prediction
Build a predictive model from tabular clinical data.
Example Projects:
- 30-day readmission prediction
- Sepsis early warning (like the HW7 scenario, but with a real model)
- Length of stay prediction
- Treatment response prediction
Must Include:
- Multiple model comparison (e.g., logistic regression vs. XGBoost vs. a neural net); see the comparison and calibration sketch after this list
- SHAP or similar interpretability
- Calibration analysis
- Fairness analysis across demographic groups
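To show the shape of the required comparison and calibration analysis, here is a minimal scikit-learn sketch. The synthetic data and the use of HistGradientBoostingClassifier as a stand-in for XGBoost are illustrative only; swap in your own cohort and whichever models you compare.

```python
# Minimal model-comparison + calibration sketch (synthetic stand-in data).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import roc_auc_score, brier_score_loss
from sklearn.calibration import calibration_curve

# Placeholder for your clinical table (e.g., readmission features and labels).
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.9], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "gradient_boosting": HistGradientBoostingClassifier(random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    probs = model.predict_proba(X_test)[:, 1]
    auc = roc_auc_score(y_test, probs)          # discrimination
    brier = brier_score_loss(y_test, probs)     # overall calibration error
    # Reliability curve: observed event rate vs. mean predicted risk per bin.
    frac_pos, mean_pred = calibration_curve(y_test, probs, n_bins=10)
    print(f"{name}: AUROC={auc:.3f}  Brier={brier:.3f}")
```

Plotting frac_pos against mean_pred (with the diagonal as reference) gives the calibration figure for the field guide; SHAP values and per-group metrics for the fairness analysis layer on top of the same fitted models.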
Track 4: Custom Project (Requires Approval)
Have a different idea? Propose it!
To get approval:
- Submit a 1-page proposal by April 8 including:
  - Problem statement and clinical motivation
  - Data source and access plan
  - Proposed methods
  - How it relates to course themes
- Meet with instructor to discuss
The Field Guide (40% of grade)
The field guide is the heart of this project. It should be a document that a busy community clinic could use to decide whether to adopt your model and how to operate it safely.
Required Sections
- Executive Summary (0.5 page)
  - What does this tool do?
  - Who should use it and when?
  - Key performance metrics (1-2 sentences)
- Clinical Context (1 page)
  - What problem does this solve?
  - Current standard of care
  - How would this tool change clinical workflow?
- Technical Description (1-2 pages)
  - Data requirements
  - Model architecture (high-level)
  - Input/output specification
  - Computational requirements
- Performance Evaluation (1-2 pages)
  - Metrics with confidence intervals (see the bootstrap sketch after this list)
  - Subgroup performance (if applicable)
  - Comparison to baseline or alternative approaches
  - Known failure modes
- Governance Framework (1-2 pages)
  - Acceptance testing protocol for new sites
  - Monitoring plan (what to track, thresholds, cadence)
  - Escalation protocol (who to contact, when to pause)
  - Human-in-the-loop requirements
- Limitations & Risks (0.5-1 page)
  - Known limitations
  - Populations where performance may degrade
  - Potential for misuse
  - What this tool should NOT be used for
- Appendix (as needed)
  - Detailed results tables
  - Additional visualizations
  - Technical implementation notes
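For the metrics-with-confidence-intervals and subgroup items above, a nonparametric bootstrap is usually sufficient. Below is a minimal sketch; y_true, y_prob, and site are placeholders for your own held-out labels, predicted probabilities, and a subgroup variable.

```python
# Minimal percentile-bootstrap sketch for AUROC with a 95% CI.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def bootstrap_auroc(y_true, y_prob, n_boot=2000, alpha=0.05):
    """Return (point estimate, (lower, upper)) for AUROC."""
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))  # resample with replacement
        if len(np.unique(y_true[idx])) < 2:              # skip resamples with one class
            continue
        stats.append(roc_auc_score(y_true[idx], y_prob[idx]))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return roc_auc_score(y_true, y_prob), (lo, hi)

# Overall, then per subgroup (e.g., by site or demographic group):
# print(bootstrap_auroc(y_true, y_prob))
# for g in np.unique(site):
#     print(g, bootstrap_auroc(y_true[site == g], y_prob[site == g]))
```

Report the same pattern for whichever metric is most clinically relevant (sensitivity at the operating threshold, PPV, etc.), not only AUROC.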
Technical Artifact (35% of grade)
Your code should be:
- Reproducible — Someone else can run it
- Documented — README explains how to use it
- Organized — Clear structure, not a single giant notebook
Repository Structure
final-project/
├── README.md # How to run everything
├── requirements.txt # Dependencies
├── data/ # Data or instructions to obtain it
│ └── README.md
├── src/ # Source code
│ ├── data_loading.py
│ ├── model.py
│ ├── train.py
│ └── evaluate.py
├── notebooks/ # Exploration, visualization
│ └── analysis.ipynb
├── outputs/ # Results, figures
│ └── figures/
├── field_guide.pdf # Your field guide
└── slides.pdf # Presentation slides
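As one way to make the repository reproducible end to end, src/evaluate.py can be a small command-line entry point that reads saved predictions and writes metrics to outputs/. The sketch below is illustrative; the column names and output path are assumptions, not requirements.

```python
# src/evaluate.py (sketch): compute metrics from a saved predictions file.
import argparse
import json
from pathlib import Path

import pandas as pd
from sklearn.metrics import average_precision_score, roc_auc_score

def main():
    parser = argparse.ArgumentParser(description="Evaluate saved predictions.")
    parser.add_argument("--predictions", type=Path, required=True,
                        help="CSV with columns: id, y_true, y_prob")
    parser.add_argument("--out", type=Path, default=Path("outputs/metrics.json"))
    args = parser.parse_args()

    df = pd.read_csv(args.predictions)
    metrics = {
        "n": int(len(df)),
        "auroc": float(roc_auc_score(df["y_true"], df["y_prob"])),
        "auprc": float(average_precision_score(df["y_true"], df["y_prob"])),
    }
    args.out.parent.mkdir(parents=True, exist_ok=True)
    args.out.write_text(json.dumps(metrics, indent=2))
    print(json.dumps(metrics, indent=2))

if __name__ == "__main__":
    main()
```

Running `python src/evaluate.py --predictions outputs/predictions.csv` from the repository root should then regenerate the headline numbers cited in your field guide.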
Presentation (25% of grade)
15-20 minute presentation + 5 minutes Q&A.
The Review Board
Your final presentation simulates a real clinical AI deployment review. A Review Board of senior leaders will attend, including:
- Clinical informaticists
- Department leadership
- Technical experts from industry/Penn
They will ask questions as if you were proposing to deploy this tool at their institution. This is intentional—bridging technical depth with clinical context and leadership communication is the whole point.
Structure:
- The Problem (3 min) — What clinical problem? Why does it matter?
- The Data (2 min) — What data? Key characteristics and limitations?
- The Approach (4 min) — What did you build? Key technical decisions?
- Results (4 min) — How well does it work? Honest assessment.
- Field Guide Highlights (3 min) — Governance, monitoring, limitations
- Lessons Learned (2 min) — What would you do differently?
- Q&A with Review Board (5 min)
Presentations are April 28-29 during class.
Grading Rubric
Technical Artifact (35 points)
| Component | Points |
|---|---|
| Data pipeline (loading, preprocessing, splits) | 7 |
| Model implementation (appropriate for task) | 8 |
| Training/evaluation code (runs, documented) | 7 |
| Results (metrics, visualizations) | 7 |
| Code quality (readable, organized, reproducible) | 6 |
Field Guide (40 points)
| Section | Points |
|---|---|
| Executive summary | 4 |
| Clinical context | 6 |
| Technical description | 6 |
| Performance evaluation | 8 |
| Governance framework | 10 |
| Limitations & risks | 6 |
Presentation (25 points)
| Component | Points |
|---|---|
| Clarity and organization | 8 |
| Technical depth | 7 |
| Honest assessment of limitations | 5 |
| Q&A responses | 5 |
Timeline
| Date | Milestone |
|---|---|
| Apr 1 | Project released, start planning |
| Apr 8 | Custom project proposals due |
| Apr 15 | CHECKPOINT: Field Guide Outline due (see below) |
| Apr 22 | Recommended: main experiments complete |
| Apr 28-29 | Final Presentations |
| May 6 | Everything due by 11:59 PM |
Required Checkpoint: Field Guide Outline (Apr 15)
Submit a 1-page outline of your field guide via Canvas. It is required but ungraded (pass/fail, no points); the goal is to get you thinking about governance early, not at the last minute.
Your outline should include:
- Project title and track
- One-sentence problem statement
- Intended clinical use (who uses it, when, for what)
- Bullet points for each field guide section (what you plan to cover)
- Known unknowns (what you’re still figuring out)
This checkpoint exists because last year’s best projects were the ones that started the field guide early. Don’t treat documentation as an afterthought.
FAQ
Can I work in a team?
Yes, teams of 2 are allowed and are expected to do proportionally more work. Indicate each member's contributions in your README.
Can I use my own data from research?
Yes, with approval. Make sure you can share enough for us to evaluate your work.
What if my model doesn’t work well?
That’s okay! Analyze why. A thoughtful analysis of failure is more valuable than inflated metrics.
How much should the field guide assume about the reader?
Assume the reader is a clinician with basic technical literacy — they know what “sensitivity” means but don’t know PyTorch.
Resources
Tips
- Start from HW7 — Your governance artifacts are a template for the field guide
- The field guide is not an afterthought — Budget significant time for it
- Get feedback early — Share drafts with classmates, come to office hours
- Be honest about limitations — We’re grading your analysis, not your AUC
- Practice your presentation — 15 minutes goes fast