This project is due on Wednesday, May 6, 2026 before 11:59PM.
Final Presentations: Wednesday, May 6 (afternoon) at the RadOnc.AI Lab. A Review Panel of invited clinical and technical guests will attend to provide feedback, giving you a taste of a real deployment review. Strict 12-minute talk + 3-minute Q&A — hard cutoff. See Presentation for details.
Overview
The final project is the capstone of this course. You’ll build a complete clinical AI system and write a comprehensive field guide that could enable someone else to deploy and monitor it safely.
This project integrates everything you’ve learned:
- Medical data handling (imaging, text, or structured)
- Machine learning or deep learning modeling
- Evaluation with clinically meaningful metrics
- Governance, monitoring, and documentation
Deliverables:
- Technical artifact (working model/pipeline)
- Comprehensive field guide (6-10 pages)
- Final presentation (12 minutes + 3 min Q&A, hard cutoff)
Project Tracks
Choose ONE track. You may continue from your midterm project (different track) or start fresh.
Track 1: Medical Imaging
Build an image classification, segmentation, or detection system.
Example Projects:
- Skin lesion classification with uncertainty quantification
- Organ segmentation from CT with automated QA
- Chest X-ray multi-label classification with explainability
- Retinal image screening with referral recommendations
Must Include:
- Deep learning model (CNN, U-Net, or similar)
- Transfer learning or appropriate training strategy
- Grad-CAM or other interpretability method (see the sketch after this list)
- Subgroup analysis (e.g., by image source, patient demographics if available)
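If you take this track, a minimal PyTorch sketch of the transfer-learning and Grad-CAM requirements might look like the following. It assumes torchvision is installed and uses a random tensor as a stand-in for a real, preprocessed image batch; treat it as a starting point, not a prescribed implementation.

```python
# Minimal sketch: transfer learning + Grad-CAM for a 2-class imaging task.
# Assumes torchvision; the random tensor stands in for your preprocessed images.
import torch
import torch.nn as nn
from torchvision import models

# Transfer learning: start from ImageNet weights and replace the classifier head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)   # e.g., benign vs. malignant
model.eval()

# Grad-CAM: capture activations and gradients at the last convolutional block.
activations, gradients = {}, {}
model.layer4.register_forward_hook(
    lambda mod, inp, out: activations.update(value=out.detach()))
model.layer4.register_full_backward_hook(
    lambda mod, gin, gout: gradients.update(value=gout[0].detach()))

images = torch.randn(1, 3, 224, 224)   # stand-in for a real image batch
logits = model(images)
logits[:, 1].sum().backward()          # gradient of the positive-class logit

# Weight each feature map by its average gradient, combine, keep positive evidence.
weights = gradients["value"].mean(dim=(2, 3), keepdim=True)
cam = torch.relu((weights * activations["value"]).sum(dim=1))
cam = cam / (cam.max() + 1e-8)         # coarse heat map, normalized to [0, 1]
```

Upsampling `cam` to the input resolution and overlaying it on the image gives the usual Grad-CAM heat map; packages such as pytorch-grad-cam wrap this bookkeeping if you prefer not to manage the hooks yourself.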
Track 2: Clinical NLP
Build a system that extracts, classifies, or summarizes clinical text.
Example Projects:
- Named entity extraction + UMLS linking pipeline
- Clinical note classification (e.g., identify high-risk patients)
- Discharge summary auto-generation/summarization
- Medication extraction and interaction checking
Must Include:
- Either a traditional NLP pipeline (e.g., scispaCy) or an LLM-based approach; a scispaCy sketch follows this list
- Comparison of at least two methods
- Evaluation on held-out test set
- Error analysis with clinical implications
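For the traditional-NLP arm of the required comparison, a minimal scispaCy plus UMLS-linking sketch could look like the following. It assumes scispacy, the en_core_sci_sm model, and the UMLS linker data are installed; pipe and config names can vary slightly across scispaCy versions, so verify against the version you use. Running an LLM-based extractor over the same held-out notes would give you the second method to compare.

```python
# Minimal sketch: entity extraction with scispaCy plus UMLS linking.
# Assumes scispacy and the en_core_sci_sm model are installed.
import spacy
import scispacy  # noqa: F401  (registers the scispacy_linker pipe)

nlp = spacy.load("en_core_sci_sm")
nlp.add_pipe("scispacy_linker",
             config={"resolve_abbreviations": True, "linker_name": "umls"})
linker = nlp.get_pipe("scispacy_linker")

note = "Patient with type 2 diabetes mellitus started on metformin 500 mg daily."
doc = nlp(note)

for ent in doc.ents:
    # kb_ents holds (CUI, score) candidates from the UMLS linker.
    for cui, score in ent._.kb_ents[:1]:
        concept = linker.kb.cui_to_entity[cui]
        print(ent.text, cui, round(score, 2), concept.canonical_name)
```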
Track 3: Structured Data Prediction
Build a predictive model from tabular clinical data.
Example Projects:
- 30-day readmission prediction
- Sepsis early warning (like the HW7 scenario, but with a real model)
- Length of stay prediction
- Treatment response prediction
Must Include:
- Multiple model comparison (e.g., logistic regression vs. XGBoost vs. neural net); see the sketch after this list
- SHAP or similar interpretability
- Calibration analysis
- Fairness analysis across demographic groups
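As a rough starting point for the model-comparison, calibration, and interpretability requirements, here is a minimal sketch using scikit-learn, XGBoost, and SHAP. The synthetic dataset is only a stand-in for your real cohort, and repeating the same metrics within demographic subgroups is how the fairness analysis would plug in.

```python
# Minimal sketch: two-model comparison with discrimination, calibration, and SHAP.
# Assumes scikit-learn, xgboost, and shap; synthetic data stands in for your cohort.
import numpy as np
import shap
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, brier_score_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "xgboost": xgb.XGBClassifier(n_estimators=200, max_depth=3),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    p = model.predict_proba(X_te)[:, 1]
    # AUROC for discrimination; Brier score as a simple calibration summary.
    print(name, "AUROC:", round(roc_auc_score(y_te, p), 3),
          "Brier:", round(brier_score_loss(y_te, p), 3))

# SHAP values for the tree model: per-feature contributions to each prediction.
explainer = shap.TreeExplainer(models["xgboost"])
shap_values = explainer.shap_values(X_te)
print("Mean |SHAP| per feature:", np.abs(shap_values).mean(axis=0).round(3)[:5])
```

A reliability curve (e.g., sklearn.calibration.calibration_curve) alongside the Brier score gives a fuller calibration picture than either alone.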
Track 4: Custom Project (Requires Approval)
Have a different idea? Propose it!
To get approval:
- Submit a 1-page proposal by April 8 including:
  - Problem statement and clinical motivation
  - Data source and access plan
  - Proposed methods
  - How it relates to course themes
- Meet with the instructor to discuss it
The Field Guide (40% of grade)
The field guide is the heart of this project. It should be a document that a busy community clinic could use to decide whether to adopt your model and how to operate it safely.
Required Sections
- Executive Summary (0.5 page)
  - What does this tool do?
  - Who should use it and when?
  - Key performance metrics (1-2 sentences)
- Clinical Context (1 page)
  - What problem does this solve?
  - Current standard of care
  - How would this tool change clinical workflow?
- Technical Description (1-2 pages)
  - Data requirements
  - Model architecture (high-level)
  - Input/output specification
  - Computational requirements
- Performance Evaluation (1-2 pages)
  - Metrics with confidence intervals (a minimal bootstrap sketch follows this list)
  - Subgroup performance (if applicable)
  - Comparison to baseline or alternative approaches
  - Known failure modes
- Governance Framework (1-2 pages)
  - Acceptance testing protocol for new sites
  - Monitoring plan (what to track, thresholds, cadence)
  - Escalation protocol (who to contact, when to pause)
  - Human-in-the-loop requirements
- Limitations & Risks (0.5-1 page)
  - Known limitations
  - Populations where performance may degrade
  - Potential for misuse
  - What this tool should NOT be used for
- Appendix (as needed)
  - Detailed results tables
  - Additional visualizations
  - Technical implementation notes
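For the metrics-with-confidence-intervals requirement above, a case-level bootstrap is usually the simplest defensible approach. The sketch below assumes NumPy and scikit-learn and uses placeholder labels and scores where your held-out predictions would go.

```python
# Minimal sketch: nonparametric bootstrap 95% CI for a test-set metric (here AUROC).
# Assumes numpy and scikit-learn; replace the placeholders with your held-out data.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=500)                          # placeholder labels
y_prob = np.clip(0.35 * y_true + rng.normal(0.4, 0.2, 500), 0, 1)  # placeholder scores

def bootstrap_ci(metric_fn, y_true, y_prob, n_boot=2000, alpha=0.05):
    """Resample cases with replacement; return (lower, upper) percentile bounds."""
    stats = []
    n = len(y_true)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        if len(np.unique(y_true[idx])) < 2:    # skip resamples with one class only
            continue
        stats.append(metric_fn(y_true[idx], y_prob[idx]))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

point = roc_auc_score(y_true, y_prob)
lo, hi = bootstrap_ci(roc_auc_score, y_true, y_prob)
print(f"AUROC {point:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```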
Technical Artifact (35% of grade)
Your code should be:
- Reproducible — Someone else can run it (a seed-fixing sketch follows this list)
- Documented — README explains how to use it
- Organized — Clear structure, not a single giant notebook
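Part of reproducibility is controlling randomness. A minimal seed-fixing sketch follows, assuming NumPy and PyTorch; adapt it to whichever libraries you actually use.

```python
# Minimal sketch: fix random seeds so reruns produce comparable results.
# Assumes NumPy and PyTorch; add analogous calls for other libraries you use.
import random

import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)             # no-op on CPU-only machines
    # Stricter determinism on GPU, at some speed cost:
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(42)
```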
Repository Structure
final-project/
├── README.md            # How to run everything
├── requirements.txt     # Dependencies
├── data/                # Data or instructions to obtain it
│   └── README.md
├── src/                 # Source code
│   ├── data_loading.py
│   ├── model.py
│   ├── train.py
│   └── evaluate.py
├── notebooks/           # Exploration, visualization
│   └── analysis.ipynb
├── outputs/             # Results, figures
│   └── figures/
├── field_guide.pdf      # Your field guide
└── slides.pdf           # Presentation slides
Presentation (25% of grade)
12-minute talk + 3 minutes Q&A. Hard cutoff. Presentations are Wednesday, May 6 in the afternoon at the RadOnc.AI Lab.
⏱ Strict Time Limits — New This Year
Last year’s presentations consistently ran long and ate into later groups’ time. This year the cutoff is strict, and this is by design: a 12-minute slot is the standard at real clinical AI venues (MICCAI, RSNA, AMIA). Learning to land the headline in 12 minutes is part of the field guide mindset.
- A visible countdown timer will be projected for every talk.
- Verbal cues at 11:00 (“1 min”), 11:30 (“30 sec”), and 12:00 (“time”).
- After 12:00 you will be cut off mid-slide.
- Dock: 1 point (out of 25) per 30 seconds over, rounded up. Going 45 seconds over costs 2 points.
- Your repo, field guide, and report carry the depth. The talk is the headline — point the Review Panel to the appendix for anything that doesn’t fit.
The Review Panel
Your final presentation gives you a taste of a real clinical AI deployment review. A Review Panel of invited guests will attend, which may include:
- Clinical informaticists
- Department colleagues
- Technical collaborators from Penn and partner groups
They’ll ask questions as if you were proposing to deploy this tool at their institution. This is intentional—bridging technical depth with clinical context and clear communication is the whole point.
Suggested Structure (12 min total)
- The Problem (2 min) — What clinical problem? Why does it matter?
- The Data & Approach (3 min) — What data, what did you build, key technical decisions?
- Results (3 min) — How well does it work? Honest assessment.
- Field Guide Highlights (2 min) — Governance, monitoring, limitations
- Lessons Learned & Close (2 min) — What would you do differently?
- Q&A with Review Panel (3 min)
Cut methods detail aggressively — put the walkthrough in your repo/report. Reviewers will read the field guide; they need you to tell them the story, not the spec.
Grading Rubric
Technical Artifact (35 points)
| Component | Points |
|---|---|
| Data pipeline (loading, preprocessing, splits) | 7 |
| Model implementation (appropriate for task) | 8 |
| Training/evaluation code (runs, documented) | 7 |
| Results (metrics, visualizations) | 7 |
| Code quality (readable, organized, reproducible) | 6 |
Field Guide (40 points)
| Section | Points |
|---|---|
| Executive summary | 4 |
| Clinical context | 6 |
| Technical description | 6 |
| Performance evaluation | 8 |
| Governance framework | 10 |
| Limitations & risks | 6 |
Presentation (25 points)
| Component | Points |
|---|---|
| Clarity and organization | 8 |
| Technical depth | 7 |
| Honest assessment of limitations | 5 |
| Q&A responses | 5 |
Timeline
| Date | Milestone |
|---|---|
| Apr 1 | Project released, start planning |
| Apr 8 | Custom project proposals due |
| Apr 15 | CHECKPOINT: Field Guide Outline Due (see below) |
| Apr 22 | Recommended: Main experiments complete |
| Apr 28 | Final project prep day — in-class group work + instructor check-ins |
| Apr 29 | Final project prep day — last in-class working session |
| May 6 (PM) | Final Presentations @ RadOnc.AI Lab (12 + 3, hard cutoff) |
| May 6, 11:59 PM | Everything due (technical artifact + field guide) |
Required Checkpoint: Field Guide Outline (Apr 15)
Submit a 1-page outline of your field guide via Canvas. It is required but graded pass/fail (no points); the goal is to get you thinking about governance early, not at the last minute.
Your outline should include:
- Project title and track
- One-sentence problem statement
- Intended clinical use (who uses it, when, for what)
- Bullet points for each field guide section (what you plan to cover)
- Known unknowns (what you’re still figuring out)
This checkpoint exists because last year’s best projects were the ones that started the field guide early. Don’t treat documentation as an afterthought.
FAQ
Can I work in a team?
Yes, teams of 2 are allowed and are expected to do proportionally more work. Indicate each member's contributions in your README.
Can I use my own data from research?
Yes, with approval. Make sure you can share enough for us to evaluate your work.
What if my model doesn’t work well?
That’s okay! Analyze why. A thoughtful analysis of failure is more valuable than inflated metrics.
How much should the field guide assume about the reader?
Assume the reader is a clinician with basic technical literacy — they know what “sensitivity” means but don’t know PyTorch.
Resources
Tips
- Start from HW7 — Your governance artifacts are a template for the field guide
- The field guide is not an afterthought — Budget significant time for it
- Get feedback early — Share drafts with classmates, come to office hours
- Be honest about limitations — We’re grading your analysis, not your AUC
- Practice your presentation with a timer — 12 minutes goes fast, and the cutoff is strict