Movie score prediction
Author: Mihai Nan
Difficulty: Medium
Problem Description
🎬 Movie score prediction 🍿
In the world of streaming, services like Netflix rely on data to understand which movies and series users will appreciate. You are a consultant for a start-up that wants to predict a movie’s score based on its metadata.
You have access to a dataset of Netflix movies and series:
- train.csv – movies and series with known scores
- test.csv – new movies and series for which you must predict the score
📊 Dataset
Each row represents one title (a movie or a series):
- SampleID – unique identifier of the title
- Title – the name of the movie or series
- Type – content type (e.g., SHOW)
- Description – short description of the movie/series
- Year – release year
- Score – only in train.csv, numeric value representing the movie’s score (e.g., critics’ rating)
Your goal is to predict Score for each title in test.csv.
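As a starting point, the files can be read with Python's standard `csv` module. The sketch below assumes the CSV headers match the column names listed above exactly; the helper name `read_rows` and the sample row are hypothetical and only serve as illustration.

```python
import csv
import io

# Column names as listed in the dataset description (assumed to match the CSV headers).
FEATURE_COLUMNS = ["SampleID", "Title", "Type", "Description", "Year"]


def read_rows(file_obj, require_score):
    """Read all rows from an open CSV file and check the expected columns exist.

    require_score=True is meant for train.csv, False for test.csv.
    """
    reader = csv.DictReader(file_obj)
    expected = set(FEATURE_COLUMNS)
    if require_score:
        expected.add("Score")
    missing = expected - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)


# Made-up one-row file standing in for train.csv.
sample = io.StringIO(
    "SampleID,Title,Type,Description,Year,Score\n"
    "1,Example Title,SHOW,An example description.,2020,7.5\n"
)
train_rows = read_rows(sample, require_score=True)
```

With the real data, the same helper could be called as `read_rows(open("train.csv", newline="", encoding="utf-8"), require_score=True)`; the encoding is an assumption.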
📝 Task (100 points)
Build a machine learning model capable of predicting the numeric value Score for each title in test.csv, using the available columns (Title, Type, Description, Year).
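Before training a full model, it can help to have a simple reference point. The sketch below is not the intended solution, only a baseline: it predicts the median training Score for each Type, falling back to the overall median for unseen types. The median is chosen because it is the constant prediction that minimizes mean absolute error. The function names, the label `OTHER`, and the sample scores are made up for illustration.

```python
import statistics
from collections import defaultdict


def fit_median_baseline(train_rows):
    """Compute the median Score per Type and the overall median.

    train_rows: iterable of dicts with 'Type' and 'Score' keys.
    """
    scores_by_type = defaultdict(list)
    all_scores = []
    for row in train_rows:
        score = float(row["Score"])
        scores_by_type[row["Type"]].append(score)
        all_scores.append(score)
    type_medians = {t: statistics.median(s) for t, s in scores_by_type.items()}
    return type_medians, statistics.median(all_scores)


def predict_median_baseline(rows, type_medians, global_median):
    """Predict the per-Type median, or the overall median for an unseen Type."""
    return [type_medians.get(row["Type"], global_median) for row in rows]


# Made-up data; 'OTHER' is a hypothetical second Type label.
toy_train = [
    {"Type": "SHOW", "Score": "7.0"},
    {"Type": "SHOW", "Score": "8.0"},
    {"Type": "SHOW", "Score": "9.0"},
    {"Type": "OTHER", "Score": "5.0"},
    {"Type": "OTHER", "Score": "6.0"},
]
toy_test = [{"Type": "SHOW"}, {"Type": "OTHER"}, {"Type": "UNSEEN"}]

medians, overall = fit_median_baseline(toy_train)
toy_predictions = predict_median_baseline(toy_test, medians, overall)
```

A model that uses Title, Description, and Year should beat this baseline on held-out data; if it does not, the extra features are probably not being used effectively.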
🧮 Evaluation
- The main metric is MAE (Mean Absolute Error), the average of |predicted Score − true Score| over all titles:
  - MAE ≤ 0.65 → 100 points
  - MAE ≥ 2.0 → 0 points
  - MAE values between 0.65 and 2.0 receive proportional scoring.
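To estimate a score locally, MAE can be computed on a held-out part of train.csv. The sketch below also maps MAE to points, assuming that "proportional" means linear interpolation between the two thresholds; the statement does not spell out the exact formula, so `points_from_mae` is an approximation and both function names are hypothetical.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def points_from_mae(mae, best=0.65, worst=2.0):
    """Map MAE to points, ASSUMING linear interpolation between the thresholds."""
    if mae <= best:
        return 100.0
    if mae >= worst:
        return 0.0
    return 100.0 * (worst - mae) / (worst - best)


# Made-up validation example: absolute errors are 1.0 and 2.0, so MAE is 1.5.
example_mae = mean_absolute_error([7.0, 5.0], [6.0, 7.0])
example_points = points_from_mae(example_mae)
```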
📄 Submission File Format
The submission.csv file must contain one row for each title in the test set:
SampleID, Score
where:
- SampleID – the movie identifier from the test set
- Score – the predicted numeric score
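A minimal sketch of producing the file with the standard `csv` module follows. It assumes the header row is `SampleID,Score` with no space after the comma and that a plain `\n` line ending is accepted; the function names and sample values are hypothetical.

```python
import csv
import io


def format_submission(sample_ids, scores):
    """Return the submission CSV content as a string, one row per test title."""
    if len(sample_ids) != len(scores):
        raise ValueError("sample_ids and scores must have the same length")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["SampleID", "Score"])  # assumed header spelling
    for sample_id, score in zip(sample_ids, scores):
        writer.writerow([sample_id, score])
    return buffer.getvalue()


def write_submission(sample_ids, scores, path="submission.csv"):
    """Write the formatted submission to disk."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        f.write(format_submission(sample_ids, scores))


# Made-up identifiers and predictions.
submission_text = format_submission([1, 2], [7.5, 6.0])
```

It is worth checking that the number of rows equals the number of rows in test.csv before submitting.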