Author: Mihai Nan
The goal is to build a classification model that determines the type of wine based on its chemical characteristics.
Each sample is characterized by 13 numerical attributes that describe chemical and physical properties of the wine, and the label (target) indicates the wine variety from which it originates.
This type of problem belongs to the multi-class classification category.
The attributes are:
- alcohol
- malic_acid
- ash
- alcalinity_of_ash
- magnesium
- total_phenols
- flavanoids
- nonflavanoid_phenols
- proanthocyanins
- color_intensity
- hue
- od280/od315_of_diluted_wines
- proline

The dataset comes from the original UCI Machine Learning Repository collection:
https://archive.ics.uci.edu/ml/datasets/Wine
train.csv
Contains all 13 feature columns plus the column:
- target – represents the wine class (variety). Possible values: 1, 2 and 3.

Example:
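| | SampleID | alcohol | malic_acid | ... | od280/od315_of_diluted_wines | proline | target |
|---|---|---|---|---|---|---|---|
| 0 | 37 | 13.28 | 1.64 | ... | 2.78 | 880.0 | 0 |
| 1 | 31 | 13.73 | 1.50 | ... | 2.71 | 1285.0 | 0 |
| 2 | 27 | 13.39 | 1.77 | ... | 3.22 | 1195.0 | 0 |
| 3 | 13 | 13.75 | 1.73 | ... | 2.90 | 1320.0 | 0 |
| 4 | 149 | 13.32 | 3.24 | ... | 1.62 | 650.0 | 2 |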
test.csv
Contains the same columns as train.csv without target, but includes SampleID.
Example:
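| | SampleID | alcohol | malic_acid | ... | hue | od280/od315_of_diluted_wines | proline |
|---|---|---|---|---|---|---|---|
| 0 | 11 | 14.10 | 2.16 | ... | 1.25 | 3.17 | 1510.0 |
| 1 | 135 | 12.51 | 1.24 | ... | 0.75 | 1.51 | 650.0 |
| 2 | 29 | 13.87 | 1.90 | ... | 1.25 | 3.40 | 915.0 |
| 3 | 122 | 11.56 | 2.05 | ... | 0.93 | 3.69 | 465.0 |
| 4 | 63 | 13.67 | 1.25 | ... | 1.23 | 2.46 | 630.0 |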
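
As a starting point, the two files can be read and a very simple baseline fitted using only the Python standard library. The sketch below is not the organizers' method: it swaps in a nearest-centroid classifier on standardized features purely for illustration, and the helper names (`read_csv`, `feature_names`, `fit_centroids`, `predict`) are hypothetical. It assumes comma-separated files with a header row containing `SampleID`, the feature columns and (for training data) `target`.

```python
import csv
import math


def read_csv(path):
    """Return the rows of a CSV file as a list of dicts (all values are strings)."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def feature_names(rows):
    """Treat every column except the ID and the label as a feature."""
    return [c for c in rows[0] if c not in ("SampleID", "target")]


def fit_centroids(rows, names):
    """Compute per-feature mean/std and one standardized centroid per class."""
    n = len(rows)
    cols = [[float(r[c]) for r in rows] for c in names]
    means = [sum(col) / n for col in cols]
    # A constant column would give std 0; fall back to 1.0 to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
            for col, m in zip(cols, means)]
    sums, counts = {}, {}
    for r in rows:
        z = [(float(r[c]) - m) / s for c, m, s in zip(names, means, stds)]
        acc = sums.setdefault(r["target"], [0.0] * len(names))
        for j, v in enumerate(z):
            acc[j] += v
        counts[r["target"]] = counts.get(r["target"], 0) + 1
    centroids = {k: [v / counts[k] for v in acc] for k, acc in sums.items()}
    return means, stds, centroids


def predict(row, names, means, stds, centroids):
    """Return the class label whose centroid is closest in standardized space."""
    z = [(float(row[c]) - m) / s for c, m, s in zip(names, means, stds)]

    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(z, centroids[label]))

    return min(centroids, key=sq_dist)
```

A typical use would be `train = read_csv("train.csv")`, `names = feature_names(train)`, `model = fit_centroids(train, names)`, then `predict(row, names, *model)` for each row of `read_csv("test.csv")`. Labels are passed through unchanged as the strings found in the `target` column, so no assumption is made about how classes are encoded.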
The output file (submission.csv) must contain exactly two columns:
- SampleID
- label – the label predicted by the model (1, 2 or 3)

Example:
| SampleID | label |
|---|---|
| 1 | 2 |
| 2 | 1 |
| 3 | 3 |
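
The two-column layout above can be produced with the standard `csv` module. This is a minimal sketch: the function names `format_submission` and `write_submission` are illustrative, and the comma delimiter is an assumption, since the statement only shows the columns as a table.

```python
import csv
import io


def format_submission(sample_ids, labels):
    """Return the submission as CSV text with a SampleID,label header."""
    if len(sample_ids) != len(labels):
        raise ValueError("sample_ids and labels must have the same length")
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["SampleID", "label"])
    writer.writerows(zip(sample_ids, labels))
    return buf.getvalue()


def write_submission(sample_ids, labels, path="submission.csv"):
    """Write the formatted submission to disk."""
    with open(path, "w", newline="") as f:
        f.write(format_submission(sample_ids, labels))
```

Keeping the formatting separate from the file write makes it easy to check the exact header and row count before submitting.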
Model evaluation will be performed using the following metric: macro-averaged F1 score (Macro F1).
This metric is suitable for multi-class classification because it gives equal weight to each class, regardless of the number of examples in each.
General formula:
$$\text{Macro-F1} = \frac{1}{C} \sum_{i=1}^{C} \text{F1}_i$$
where C is the number of classes, and F1_i is the F1 score for class i.
The final score is expressed as a percentage (0–100), rounded to two decimal places.
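
For local validation, the score can be approximated without any dependencies. The sketch below assumes the metric is the macro-averaged F1 (the unweighted mean of per-class F1 scores) and that the set of classes is every label appearing in either the true or the predicted labels; the official scorer may define the class set differently. The function name `macro_f1` is illustrative.

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1 as a percentage, rounded to two decimals."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Assumption: score every class seen in either list.
    classes = set(y_true) | set(y_pred)
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined here as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return round(100 * sum(f1_scores) / len(f1_scores), 2)
```

For example, with true labels `[1, 1, 2, 2, 3, 3]` and predictions `[1, 1, 2, 3, 3, 3]`, the per-class F1 scores are 1, 2/3 and 0.8, so the function returns 82.22.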