Competition: ONIA Winter Warmup Challenge 2025

The National Rabbit Exhibition

Author: Mihai Nan

Difficulty: Medium
Problem Description


In a December month filled with large, quiet snowflakes, when the village still smelled of burnt wood and freshly lifted hay from the barns, the gates of the National Rabbit Exhibition opened in the heart of the country.
Inside the heated halls, glowing under warm yellow lights, breeders from all corners of the land brought with hope their most beautiful and best-groomed animals.

Specimens from three major rabbit breeds (each with their own well-established appearances and habits) were once again gathered in the same place.

Each animal received a tag with an ID, and in a thick leather-bound register, all signs that distinguished it from the others were carefully written down:

sex, weight, ear length, whether the ears were lopped or not, the color of the fur, age, type and quality of the fur, body shape, whether it had a dewlap or not, as well as overall health.

This was the organizers’ way of keeping a complete record of every furry soul that entered the hall.

Subtask 1 (20 points)

The organizers want to know how many of the female rabbits brought to the exhibition had lopped ears and wore the noble shade of havana.
The result must be determined based on the testing dataset (test_data.csv).

Nothing complicated — you just need to perform a careful search through the records to provide this answer.
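This careful search amounts to a conjunction of three filters over the records. A minimal sketch in pandas follows; the column names (sex, ear_type, color) and category labels here are assumptions standing in for whatever headers test_data.csv actually uses:

```python
import pandas as pd

# Toy records with hypothetical column names -- the real headers in
# test_data.csv may differ and should be checked before filtering.
df = pd.DataFrame({
    "sex":      ["F", "M", "F", "F"],
    "ear_type": ["lopped", "erect", "lopped", "erect"],
    "color":    ["havana", "havana", "havana", "havana"],
})

# Count the females with lopped ears and havana-colored fur.
count = int(((df["sex"] == "F")
             & (df["ear_type"] == "lopped")
             & (df["color"] == "havana")).sum())
print(count)  # 2 for this toy frame
```

In practice you would replace the toy frame with `pd.read_csv("test_data.csv")` and verify the exact spelling of the category values first.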


Subtask 2 (40 points)

On a cold morning, with their breath still fogging in the air, the organizers discovered that an important page had gone missing from the official register.
The breed of each rabbit, so carefully recorded in previous years, could no longer be found anywhere.
It was certain that specimens from three different breeds had been brought, but the signs that directly identified them had vanished.

Those who knew rabbits well were asked to patiently observe all specimens and identify three natural groups using only the recorded traits, judging by similarities, differences, clear boundaries, and areas where the animals seemed close or distant in appearance.
The result must be determined based on the testing dataset (test_data.csv).
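Recovering three groups from the noted traits is an unsupervised clustering task. One common approach is k-means with k = 3, sketched below on synthetic numeric data; the real pipeline would load test_data.csv, encode the categorical traits numerically, and scale the features. The blob data and model choice here are illustrative assumptions, not the official solution:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Three synthetic "breeds" as well-separated blobs of two numeric traits.
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.3, size=(20, 2)),
    rng.normal(loc=(3.0, 0.0), scale=0.3, size=(20, 2)),
    rng.normal(loc=(0.0, 3.0), scale=0.3, size=(20, 2)),
])

# Scale first so no single trait dominates the distance computation.
X_scaled = StandardScaler().fit_transform(X)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)
print(sorted(set(labels)))  # three distinct cluster labels
```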

Evaluation Metric

To assess how well the divisions into breeds match reality, the Adjusted Rand Index (ARI) is used. This metric measures the similarity between two partitions, adjusted for random chance.

The ARI coefficient between two partitions is defined as:

ARI = (RI − E[RI]) / (max(RI) − E[RI])

where:

  • RI is the Rand Index between the two partitions,
  • E[RI] is the expected Rand Index for random partitions,
  • max(RI) is the maximum value the Rand Index can attain.

If the resulting number is close to 1, it means the breed separation was done skillfully and the boundaries between breed characteristics were correctly identified. The exhibition organizers are very strict and award maximum points for this subtask only if the evaluation metric equals 1.
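ARI can be computed with scikit-learn's adjusted_rand_score. A useful property for this subtask: the metric is invariant to how the clusters are named, so a perfect partition scores 1.0 even if your cluster labels do not match the original breed labels:

```python
from sklearn.metrics import adjusted_rand_score

true_breeds = [0, 0, 1, 1, 2, 2]
predicted   = [2, 2, 0, 0, 1, 1]  # the same partition under different labels

ari = adjusted_rand_score(true_breeds, predicted)
print(ari)  # 1.0 -- ARI ignores the naming of the clusters
```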


Subtask 3 (40 points)

That December, the number of participants was so large that the judges, despite their best efforts, could not score all the animals.
Only some of them received a Judging Score, between 0 and 100.

  • The score is a number between 0 and 100, where:
    • 0 means the specimen was disqualified,
    • 100 means the specimen could be declared a champion.

For the remaining rabbits, still unscored, a way had to be found to predict the score they would have received.
The goal was to create a system capable of understanding the relationships between the already scored rabbits and those without a score, so that the estimates would be as close as possible to what the judges would have decided.
The smaller the differences, the more skillful the system is considered — and the more helpful it becomes for people overwhelmed by the number of animals.

Your task is to develop an automatic score prediction system for the specimens included in test_data.csv.
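This is a supervised regression problem: train on the rabbits that already have a Judging Score, then predict for the unscored ones. A minimal sketch with a random-forest regressor on synthetic stand-in features follows; the model choice and the toy data are assumptions, not an official baseline:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins: 4 numeric traits per rabbit, toy linear "true" scores.
X_scored = rng.random((100, 4))             # rabbits the judges did score
y_scored = X_scored @ np.array([40.0, 30.0, 20.0, 10.0])
X_unscored = rng.random((10, 4))            # rabbits still awaiting a score

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_scored, y_scored)
preds = np.clip(model.predict(X_unscored), 0.0, 100.0)  # keep inside [0, 100]
print(preds.shape)  # one prediction per unscored rabbit
```

Clipping to [0, 100] is a small safeguard so no predicted score falls outside the valid judging range.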

Evaluation Metric

To evaluate model performance, we use MSE (Mean Squared Error) — the mean squared difference between true and predicted scores.

Formula:

MSE = (1/n) · Σ_{i=1}^{n} (y_i − ŷ_i)²

where:

  • y_i is the true score of specimen i,
  • ŷ_i is the score predicted by the model,
  • n is the total number of specimens in the test set.

Smaller MSE values indicate better performance. The closer the predictions are to the real values, the smaller the error.
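The metric written out as a small helper, with a worked example (squared errors 100, 25, and 0, averaged over three specimens):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between true and predicted scores."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

result = mse([90, 50, 70], [80, 55, 70])
print(result)  # (100 + 25 + 0) / 3, about 41.67
```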

To convert the obtained score into points, we use a simple rule based on two thresholds:

  • If MSE < min → participant receives 40p (maximum score).
  • If MSE > max → participant receives 0p (minimum score).
  • If MSE is between min and max → the score is proportional, decreasing linearly from 40p to 0p as the error increases.
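The three-case rule above can be expressed as one short function. The threshold values lo and hi used in the example call are made-up placeholders, since the actual min/max thresholds are not published in the statement:

```python
def points_from_mse(mse_value, lo, hi, max_points=40.0):
    """Map an MSE to points: full marks below lo, zero above hi,
    and a linear decrease in between."""
    if mse_value < lo:
        return max_points
    if mse_value > hi:
        return 0.0
    return max_points * (hi - mse_value) / (hi - lo)

# Placeholder thresholds lo=10, hi=30 for illustration only.
print(points_from_mse(15.0, 10.0, 30.0))  # 40 * (30-15)/(30-10) = 30.0
```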

Output Format

The final result must be a CSV file named output.csv, containing exactly 3 columns:

  • subtaskID – the subtask number (1, 2, 3)
  • datapointID – referring to the ID column from the dataset
  • answer – the corresponding answer for that datapoint and subtask

Note: For Subtask 1, which requires a single answer for the entire test dataset, output only one line, with datapointID equal to 1.
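A sketch of assembling the three answer types into the required CSV shape; every answer value below is a toy placeholder, and the real rows would come from your own solutions to the three subtasks:

```python
import io
import pandas as pd

# Toy placeholder answers for each subtask.
rows = [{"subtaskID": 1, "datapointID": 1, "answer": 7}]       # single count
rows += [{"subtaskID": 2, "datapointID": i, "answer": c}
         for i, c in enumerate([0, 2, 1], start=1)]            # cluster labels
rows += [{"subtaskID": 3, "datapointID": i, "answer": s}
         for i, s in enumerate([88.5, 61.0, 73.2], start=1)]   # predicted scores

out = pd.DataFrame(rows, columns=["subtaskID", "datapointID", "answer"])
buf = io.StringIO()
out.to_csv(buf, index=False)  # in practice: out.to_csv("output.csv", index=False)
print(buf.getvalue().splitlines()[0])  # header: subtaskID,datapointID,answer
```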

We kindly ask you to get involved and help the organizers of the National Rabbit Exhibition. Theirs is a story about passion and performance, about understanding among breeders, about similarities and differences between breeds, and about the effort to see the animal world with the patience that only a good breeder can have in the days before the winter holidays.

Submit Solution
Upload output file and optionally source code for evaluation.
