The Archive of Drawn Words

Author: Mihai Nan

Medium
Problem Description

✍️ The Archive of Drawn Words 🖼️

A Visual Recognition Challenge

In a secret digital library known as The Archive of Drawn Words, thousands of images of handwritten words are stored, each accompanied by a small illustration suggesting its meaning.

Unfortunately, a mysterious glitch affected the indexing system, and the labels of many images were lost.
Now, only a Master of Computer Vision can restore order and meaning to these images.

Examples

Each image contains:

  • a handwritten word, in black ink on a white background
  • a small illustration representing the object or concept of that word

The style is realistic, clean, and clear, but the word must be recognized exclusively from the image.


🗂 Provided Data

You have access to the following files:

📁 train.csv

Contains labeled examples for training:

  • image_path – path to the image
  • label – the handwritten word in the image

All images in train.csv are located in the output_dataset/train/ directory.

📁 test.csv

Contains:

  • image_path – path to the image

⚠️ The label column is missing and must be predicted by your model.

All images in test.csv are located in the output_dataset/test/ directory.
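
As a rough illustration of how the two files could be read, here is a minimal sketch using only Python's standard csv module. The helper name read_rows and the sample training file name are hypothetical; the only things taken from the description above are the column names image_path and label.

```python
import csv
import io
from typing import List, Optional, Tuple


def read_rows(csv_file) -> List[Tuple[str, Optional[str]]]:
    """Return (image_path, label) pairs; label is None when the column is absent."""
    reader = csv.DictReader(csv_file)
    return [(row["image_path"], row.get("label")) for row in reader]


# In-memory stand-in for train.csv (the file name 000001.png is an assumption).
sample_train = io.StringIO(
    "image_path,label\n"
    "output_dataset/train/000001.png,banana\n"
)
train_rows = read_rows(sample_train)

# In-memory stand-in for test.csv, which has no label column.
sample_test = io.StringIO(
    "image_path\n"
    "output_dataset/test/000001.png\n"
)
test_rows = read_rows(sample_test)
```

With the real files, the same helper would be called on an opened file, for example with open("train.csv", newline="") as f: read_rows(f), assuming the CSV sits in the working directory.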


🧠 Words the Model Must Recognize

Each image shows exactly one of the following 20 words, and your model must recognize which one:

apple, banana, cat, dog, elephant,
flower, house, moon, sun, tree,
violin, lion, kite, boat, star,
fish, pencil, cake, book, umbrella

Each image belongs to a single class.
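
Most classifiers work with integer class indices rather than strings, so a label mapping is usually needed. The sketch below is one possible way to build it; the names CLASSES, encode, and decode are hypothetical, and sorting the words alphabetically is just a convention chosen here to make the indices reproducible.

```python
# The 20 class names, copied from the list above.
WORDS = [
    "apple", "banana", "cat", "dog", "elephant",
    "flower", "house", "moon", "sun", "tree",
    "violin", "lion", "kite", "boat", "star",
    "fish", "pencil", "cake", "book", "umbrella",
]

# Assumption: alphabetical order defines the class index.
CLASSES = sorted(WORDS)
LABEL_TO_INDEX = {word: i for i, word in enumerate(CLASSES)}
INDEX_TO_LABEL = {i: word for word, i in LABEL_TO_INDEX.items()}


def encode(label: str) -> int:
    """Map a word to its class index, ignoring case and surrounding spaces."""
    return LABEL_TO_INDEX[label.strip().lower()]


def decode(index: int) -> str:
    """Map a class index back to its word."""
    return INDEX_TO_LABEL[index]
```

Keeping decode as the exact inverse of encode ensures that predicted indices are written back to the submission as one of the 20 allowed words.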


🎯 Your Task

Build an image classification model that predicts the correct handwritten word for each image in test.csv.


📤 Submission File Format (submission.csv)

At the end, generate a submission.csv file with the following structure:

image_path,label
output_dataset/test/000001.png,banana
output_dataset/test/000002.png,cat
output_dataset/test/000003.png,tree
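
A minimal sketch of producing a file in this format with Python's standard csv module is shown below. The function name write_submission is hypothetical, and the predictions in the example are placeholders rather than real model output.

```python
import csv
import io
from typing import Sequence


def write_submission(image_paths: Sequence[str], labels: Sequence[str], out_file) -> None:
    """Write the image_path,label header followed by one row per prediction."""
    if len(image_paths) != len(labels):
        raise ValueError("image_paths and labels must have the same length")
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["image_path", "label"])
    writer.writerows(zip(image_paths, labels))


# Example with an in-memory buffer; placeholder predictions only.
buffer = io.StringIO()
write_submission(
    ["output_dataset/test/000001.png", "output_dataset/test/000002.png"],
    ["banana", "cat"],
    buffer,
)
submission_text = buffer.getvalue()
```

To write the actual file, the same function could be called inside with open("submission.csv", "w", newline="") as f, keeping the image_path values exactly as they appear in test.csv.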

📊 Evaluation

The model's performance will be evaluated using accuracy:

accuracy = (number_of_correct_predictions / total_number_of_predictions)
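
For checking a model on a held-out part of train.csv before submitting, the formula above can be computed directly. This is a small sketch with a hypothetical function name; it is not the platform's evaluation code.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one prediction is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Three of the four placeholder predictions match.
example = accuracy(["cat", "dog", "sun", "moon"], ["cat", "dog", "sun", "star"])
```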

🏅 Scoring System:

  • accuracy ≥ 98% → 100 points
  • accuracy ≤ 20% → 0 points
  • otherwise → linearly scaled between 0 and 100
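
The scoring rule can be sketched as a small function. This assumes the linear scaling runs from 0 points at 20% accuracy to 100 points at 98% accuracy, which is the natural reading of the thresholds above but is not spelled out by the problem statement. The function name points is hypothetical.

```python
LOW = 0.20   # accuracy at or below this earns 0 points
HIGH = 0.98  # accuracy at or above this earns 100 points


def points(accuracy: float) -> float:
    """Map an accuracy in [0, 1] to a score in [0, 100] by linear interpolation."""
    scaled = 100.0 * (accuracy - LOW) / (HIGH - LOW)
    # Clamp so that values outside the two thresholds stay at 0 or 100.
    return max(0.0, min(100.0, scaled))
```

Under this assumption, an accuracy of 59% sits halfway between the two thresholds and would earn about 50 points.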

🏆 Become the Master of the Archive of Drawn Words!

Every letter counts. Every line of ink hides a meaning.

Prove that your model can see, understand, and recognize the words in the images, restoring order to the Archive of Drawn Words! ✨📚
