Residency Exam

Author: Mihai Nan

Hard

Problem Description

🩺 Residency Exam 🤖📘

Every year, thousands of medical graduates prepare for the most difficult moment of their careers: the Residency Exam. For months, future doctors memorize, review, and solve hundreds of multiple-choice questions.

But this year, the Central Committee decided to introduce a major innovation: an automated platform that checks and evaluates answers using machine learning.
Unfortunately, the system prototype started producing errors, and the committee needs your help to fix it.

You have been given two files:

train.csv – official questions, each with its correct answer
test.csv – new questions for which you must predict the correct option (some of these will be selected for the actual residency exam 😅)

Your goal is to rebuild the automatic correction mechanism.

📊 Dataset

Each row represents a multiple-choice exam question:

SampleID – unique identifier of the question
Question – the question text
Option0, Option1, Option2, Option3 – the four answer choices
Answer – the index (0–3) of the correct option; present only in train.csv
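
As a minimal sketch of reading this layout, the snippet below parses rows with Python's standard `csv` module. It assumes the files are comma-separated with a header row that uses exactly the column names listed above; the helper name `read_questions` and the two-line sample are hypothetical and stand in for the real train.csv.

```python
import csv
import io

OPTION_COLUMNS = ["Option0", "Option1", "Option2", "Option3"]


def read_questions(handle):
    """Parse question rows from an open CSV file object.

    Answer is converted to int when present and set to None otherwise,
    so the same helper works for train.csv and test.csv.
    """
    rows = []
    for record in csv.DictReader(handle):
        answer = record.get("Answer")
        rows.append({
            "SampleID": record["SampleID"],
            "Question": record["Question"],
            "Options": [record[col] for col in OPTION_COLUMNS],
            "Answer": int(answer) if answer not in (None, "") else None,
        })
    return rows


# Hypothetical sample standing in for train.csv (assumed comma-separated).
sample = io.StringIO(
    "SampleID,Question,Option0,Option1,Option2,Option3,Answer\n"
    "q1,Placeholder question?,A,B,C,D,2\n"
)
train_rows = read_questions(sample)
```

For the real files, the same function can be called as `read_questions(open("train.csv", newline="", encoding="utf-8"))`; the encoding is also an assumption.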

📝 Task (100 points)

Build a machine-learning model capable of predicting, for each question in test.csv, which of the four options (0–3) is the correct answer.

Your model will be evaluated using accuracy:

Accuracy ≥ 70% → 100 points
Accuracy ≤ 25% → 0 points
Accuracies between 25% and 70% are scored proportionally, on a linear scale from 0 to 100 points.
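
The scoring rule can be sketched as a linear interpolation between the two thresholds. This is one reading of "proportionally", not the platform's confirmed formula; the function name `points_from_accuracy` is hypothetical.

```python
def points_from_accuracy(accuracy, low=0.25, high=0.70, max_points=100.0):
    """Map an accuracy in [0, 1] to points, assuming linear scaling between thresholds."""
    if accuracy <= low:
        return 0.0
    if accuracy >= high:
        return max_points
    # Fraction of the way from the lower to the upper threshold.
    return max_points * (accuracy - low) / (high - low)
```

Under this assumption, an accuracy halfway between the thresholds (47.5%) would earn about half of the points.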

Any method is allowed: classic ML algorithms, embeddings, language models, medical BERT, etc.
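
Whatever method is chosen, one common way to frame the task is to score each (question, option) pair and pick the highest-scoring option. The sketch below shows only that framing. The token-overlap scorer is a deliberately naive placeholder, not a recommended model, and the names `overlap_score` and `predict_option` are hypothetical; in practice `score_fn` would be replaced by an embedding similarity or a language-model score.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(question, option):
    """Toy scorer: number of tokens shared by the question and the option."""
    return len(tokenize(question) & tokenize(option))


def predict_option(question, options, score_fn=overlap_score):
    """Return the index of the option with the highest score (first one on ties)."""
    scores = [score_fn(question, option) for option in options]
    return max(range(len(options)), key=scores.__getitem__)
```

Keeping the scorer as a parameter means the argmax logic and the submission code stay unchanged when the placeholder is swapped for a stronger model.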

📄 Submission Format

The submission.csv file must contain one row for each question in the test set.

The first line should be the header:

DatapointID, PredictedAnswer

where:

DatapointID – the SampleID from the test set
PredictedAnswer – an integer from 0 to 3 (the index of the predicted correct option)

Example (SampleID = `84f328d3-fca4-422d-8fb2-19d55eb31503`):

84f328d3-fca4-422d-8fb2-19d55eb31503, 2
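
A minimal sketch of producing such a file with Python's `csv` module follows. The helper name `write_submission` is hypothetical. Note one assumption: the header and example above show a space after the comma, while this sketch writes standard CSV without it, on the guess that the grader parses either form; adjust the output if the platform rejects it.

```python
import csv
import io


def write_submission(handle, predictions):
    """Write (SampleID, predicted option) pairs as submission rows."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["DatapointID", "PredictedAnswer"])
    for sample_id, answer in predictions:
        # Guard against values outside the four valid options.
        if answer not in (0, 1, 2, 3):
            raise ValueError(f"invalid answer {answer!r} for {sample_id}")
        writer.writerow([sample_id, answer])


# In-memory demonstration using the SampleID from the example above.
buffer = io.StringIO()
write_submission(buffer, [("84f328d3-fca4-422d-8fb2-19d55eb31503", 2)])
submission_text = buffer.getvalue()
```

To write the actual file, pass `open("submission.csv", "w", newline="", encoding="utf-8")` as `handle`.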
