The archive of drawn words

Author: Mihai Nan

Medium


Problem Description

✍️ The Archive of Drawn Words 🖼️

✨ A Visual Recognition Challenge

In a secret digital library known as The Archive of Drawn Words, thousands of images of handwritten words are stored, each accompanied by a small illustration suggesting its meaning.

Unfortunately, a mysterious glitch affected the indexing system, and the labels of many images were lost.
Now, only a Master of Computer Vision can restore order and meaning to these images.

Example

Each image contains:

a handwritten word, in black ink on a white background
a small illustration representing the object or concept of that word

The style is realistic, clean, and clear, but the word must be recognized exclusively from the image.

🗂 Provided Data

You have access to the following files:

📁 `train.csv`

Contains labeled examples for training:

image_path – path to the image
label – the handwritten word in the image
All images in train.csv are located in the output_dataset/train/ directory.

📁 `test.csv`

Contains:

image_path – path to the image

⚠️ The label column is missing and must be predicted by your model.

All images in test.csv are located in the output_dataset/test/ directory.
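As a minimal sketch of reading these files, the helper below loads a CSV into a list of dictionaries and checks that the expected columns are present. The function name `read_rows` and its `required_columns` parameter are hypothetical, and the sketch assumes the files are plain comma-separated text with a header row, as the column descriptions above suggest.

```python
import csv


def read_rows(csv_path, required_columns=("image_path",)):
    """Read a CSV file into a list of dicts, one per row.

    Assumption: the file has a header row naming its columns
    (image_path, plus label for train.csv).
    """
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        # Fail early if a column described in the statement is absent.
        missing = set(required_columns) - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"{csv_path} is missing columns: {sorted(missing)}")
        return [dict(row) for row in reader]
```

For example, `read_rows("train.csv", required_columns=("image_path", "label"))` would return the labeled training rows, while `read_rows("test.csv")` would return rows that carry only `image_path`.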

🧠 Words the Model Must Recognize

Each image shows exactly one of the following 20 words, and your model must recognize which one:

apple, banana, cat, dog, elephant,
flower, house, moon, sun, tree,
violin, lion, kite, boat, star,
fish, pencil, cake, book, umbrella

Each image belongs to a single class.
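Most classifiers work with integer class indices rather than strings, so a label mapping is usually the first step. The sketch below is one possible encoding: the word list is copied from the statement, while the names `CLASSES`, `LABEL_TO_INDEX`, `INDEX_TO_LABEL`, and `encode_labels` are hypothetical, and the index order is an arbitrary choice.

```python
# The 20 class names, copied from the problem statement.
CLASSES = [
    "apple", "banana", "cat", "dog", "elephant",
    "flower", "house", "moon", "sun", "tree",
    "violin", "lion", "kite", "boat", "star",
    "fish", "pencil", "cake", "book", "umbrella",
]

# Hypothetical lookup tables; any fixed ordering would work equally well.
LABEL_TO_INDEX = {name: i for i, name in enumerate(CLASSES)}
INDEX_TO_LABEL = {i: name for name, i in LABEL_TO_INDEX.items()}


def encode_labels(labels):
    """Map label strings to integer indices, rejecting unknown labels."""
    unknown = sorted(set(labels) - set(CLASSES))
    if unknown:
        raise ValueError(f"unexpected labels: {unknown}")
    return [LABEL_TO_INDEX[label] for label in labels]
```

Keeping `INDEX_TO_LABEL` alongside the forward mapping makes it straightforward to turn predicted indices back into the words required in the submission file.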

🎯 Your Task

Build an image classification model that predicts the correct handwritten word for each image in test.csv.

📤 Submission File Format (`submission.csv`)

At the end, generate a submission.csv file with the following structure:

image_path,label
output_dataset/test/000001.png,banana
output_dataset/test/000002.png,cat
output_dataset/test/000003.png,tree
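A file in this layout can be produced with Python's standard `csv` module. The following is a sketch only: the function name `write_submission` is hypothetical, and it assumes the predictions are already available as a list of label strings in the same order as the image paths.

```python
import csv


def write_submission(image_paths, predicted_labels, out_path="submission.csv"):
    """Write image_path,label rows in the format shown above."""
    if len(image_paths) != len(predicted_labels):
        raise ValueError("image_paths and predicted_labels differ in length")
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(["image_path", "label"])  # header row
        writer.writerows(zip(image_paths, predicted_labels))
```

Calling `write_submission(["output_dataset/test/000001.png"], ["banana"])` would create a two-line file: the header followed by one prediction row.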

📊 Evaluation

The model's performance will be evaluated using accuracy:

accuracy = (number_of_correct_predictions / total_number_of_predictions)

🏅 Scoring System:

accuracy ≥ 98% → 100 points
accuracy ≤ 20% → 0 points
20% < accuracy < 98% → scaled linearly between 0 and 100 points
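The accuracy formula and the scoring rule can be sketched as below. The function names are hypothetical, and the interpolation assumes the linear scale runs from 0 points at 20% accuracy to 100 points at 98% accuracy, which is the natural reading of the thresholds but is not spelled out in the statement.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def score_from_accuracy(acc, low=0.20, high=0.98):
    """Map accuracy to points.

    Assumption: points grow linearly from 0 at `low` to 100 at `high`.
    """
    if acc >= high:
        return 100.0
    if acc <= low:
        return 0.0
    return 100.0 * (acc - low) / (high - low)
```

Under this assumption, an accuracy of 59% sits halfway between the two thresholds and would earn about 50 points.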

🏆 Become the Master of the Archive of Drawn Words!

Every letter counts. Every line of ink hides a meaning.

Prove that your model can see, understand, and recognize the words in the images, restoring order to the Archive of Drawn Words! ✨📚
