Author: Mihai Nan
We are given a dataset containing sequences of human movements, each composed of multiple frames (FrameNumber).
Each frame includes the 3D positions of 25 body joints, represented by the coordinates:
J1X, J1Y, J1Z, ..., J25X, J25Y, J25Z.
Each frame is associated with a sequence identifier (IDSample) and two labels:
- Action — the action performed
- Camera — the camera that recorded the sequence

To better understand the spatial distribution of the 25 joints:
For each unique IDSample, determine how many frames are available in the test dataset (test_data.csv).
This analysis helps understand the temporal-spatial distribution of the data for each action.
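The per-sequence frame count described above can be sketched with a pandas `groupby`. This is a minimal illustration, assuming the test CSV has one row per frame with an `IDSample` column as described:

```python
import pandas as pd

def frames_per_sample(df: pd.DataFrame) -> pd.Series:
    """Number of frames recorded for each sequence identifier (IDSample)."""
    return df.groupby("IDSample").size()

# Typical usage (path as given in the statement):
# counts = frames_per_sample(pd.read_csv("test_data.csv"))
```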
Using the training data, train a classification model capable of recognizing the action performed in a sequence of frames.
Apply the model to the test data and, for each unique IDSample, predict the most probable action.
The performance of the model will be evaluated using accuracy, defined as:

accuracy = (number of IDSamples whose action is predicted correctly) / (total number of IDSamples in the test set)
To obtain full points, the achieved accuracy must be at least 0.965.
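One possible baseline for this subtask, not a prescribed solution: aggregate each sequence's frame-level joint coordinates into fixed-size features (mean and standard deviation per coordinate) and fit a random forest. The column names follow the `J1X…J25Z` scheme from the statement; the helper names and the choice of classifier are assumptions:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# 75 coordinate columns: J1X, J1Y, J1Z, ..., J25X, J25Y, J25Z
JOINT_COLS = [f"J{j}{ax}" for j in range(1, 26) for ax in "XYZ"]

def aggregate_sequences(df: pd.DataFrame) -> pd.DataFrame:
    """One feature row per IDSample: mean and std of every joint coordinate."""
    g = df.groupby("IDSample")[JOINT_COLS]
    feats = pd.concat(
        [g.mean().add_suffix("_mean"), g.std().add_suffix("_std")], axis=1
    )
    return feats.fillna(0.0)

def train_action_model(train_df: pd.DataFrame, labels: pd.Series):
    """Fit a classifier on per-sequence features.

    `labels` is assumed to be indexed by IDSample with the Action as value.
    """
    X = aggregate_sequences(train_df)
    y = labels.loc[X.index]
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X.values, y.values)
    return model
```

Reaching the 0.965 target will likely require richer features (e.g. velocities or normalized joint positions) than this sketch provides.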
Using the training data, train a model capable of predicting which camera recorded a given sequence of frames.
Apply the model to the test set and specify, for each IDSample, the camera predicted as most likely.
The model performance is again evaluated using accuracy, defined as:

accuracy = (number of IDSamples whose camera is predicted correctly) / (total number of IDSamples in the test set)
To obtain full points, the achieved accuracy must be at least 0.8.
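Applying a trained model to the test set and reporting one label per IDSample can be sketched as below. The aggregation (per-sequence mean of the joint columns) is an assumption and must match whatever aggregation the model was trained on:

```python
import pandas as pd

JOINT_COLS = [f"J{j}{ax}" for j in range(1, 26) for ax in "XYZ"]

def predict_per_sample(model, test_df: pd.DataFrame) -> pd.Series:
    """One predicted label per IDSample, using per-sequence mean features.

    `model` is any fitted estimator with a scikit-learn style predict();
    it is assumed to have been trained on the same aggregation.
    """
    X = test_df.groupby("IDSample")[JOINT_COLS].mean()
    return pd.Series(model.predict(X.values), index=X.index)
```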
The final result must be a CSV file named output.csv, containing exactly 3 columns:
| subtaskID | datapointID | answer |
|---|---|---|
| 1 | IDSample from test_data.csv | corresponding answer for that subtask |
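The required file can be assembled by stacking both subtasks' predictions. Note the table above only shows the row pattern for subtask 1; the mapping used here (subtaskID 1 = action, subtaskID 2 = camera) is an assumption:

```python
import pandas as pd

def write_output(action_preds: pd.Series, camera_preds: pd.Series,
                 path: str = "output.csv") -> pd.DataFrame:
    """Write the 3-column output.csv required by the statement.

    Each Series is assumed to be indexed by IDSample.
    subtaskID 1 = action, 2 = camera (assumed mapping).
    """
    frames = []
    for subtask_id, preds in [(1, action_preds), (2, camera_preds)]:
        frames.append(pd.DataFrame({
            "subtaskID": subtask_id,
            "datapointID": preds.index,
            "answer": preds.values,
        }))
    out = pd.concat(frames, ignore_index=True)
    out.to_csv(path, index=False)
    return out
```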