Solar Energy Production Prediction
Author: Mihai Nan
Problem Description
The goal is to build a regression model that predicts the daily electrical energy production (kWh) of a solar panel, based on meteorological conditions and installation characteristics.
Each sample represents a production day and is characterized by several numerical attributes, such as light intensity, air temperature, wind speed, and others.
The target label (energy_output) represents the total energy generated on that day.
This is a univariate regression problem: the model predicts a single continuous target from several input features.
🔹 Features
- solar_irradiance – average solar radiation (W/m²)
- temperature – average air temperature (°C)
- humidity – relative humidity (%)
- wind_speed – average wind speed (m/s)
- cloud_cover – average cloud cover (%)
- panel_angle – panel tilt angle (°)
- panel_efficiency – panel efficiency (%)
📘 Input File Structure
train.csv
Contains all feature columns plus the energy_output column, which represents the target value.
Example:
| SampleID | solar_irradiance | temperature | humidity | wind_speed | cloud_cover | panel_angle | panel_efficiency | energy_output |
|---|---|---|---|---|---|---|---|---|
| 1 | 750.5 | 25.2 | 40.0 | 3.5 | 10 | 30 | 18.5 | 42.3 |
| 2 | 610.0 | 22.1 | 55.0 | 2.0 | 50 | 25 | 17.0 | 28.7 |
test.csv
Contains the same columns as train.csv, including SampleID, but without energy_output.
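The file layout described here can be parsed with a few lines of standard-library Python. The sketch below is only illustrative: it assumes the files are comma-separated with a header row and that SampleID is an integer, neither of which the statement confirms. The helper name `load_samples` is hypothetical.

```python
import csv
import io

# Column names as listed in the problem statement.
FEATURES = [
    "solar_irradiance", "temperature", "humidity", "wind_speed",
    "cloud_cover", "panel_angle", "panel_efficiency",
]
TARGET = "energy_output"


def load_samples(handle):
    """Parse an open CSV handle into (ids, feature_rows, targets).

    targets is None when the file has no energy_output column,
    which is the expected case for test.csv.
    Assumes a comma-separated file with a header row (an assumption).
    """
    reader = csv.DictReader(handle)
    has_target = TARGET in (reader.fieldnames or [])
    ids, rows, targets = [], [], []
    for record in reader:
        ids.append(int(record["SampleID"]))
        rows.append([float(record[name]) for name in FEATURES])
        if has_target:
            targets.append(float(record[TARGET]))
    return ids, rows, (targets if has_target else None)


# The two example rows from the train.csv table, written as CSV text.
example = io.StringIO(
    "SampleID,solar_irradiance,temperature,humidity,wind_speed,"
    "cloud_cover,panel_angle,panel_efficiency,energy_output\n"
    "1,750.5,25.2,40.0,3.5,10,30,18.5,42.3\n"
    "2,610.0,22.1,55.0,2.0,50,25,17.0,28.7\n"
)
ids, X, y = load_samples(example)
```

For the real files, the same function can be called as `load_samples(open("train.csv", newline=""))`.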
📤 Submission
The output file (submission.csv) must contain exactly two columns:
- SampleID
- energy_output – the value predicted by the model (float, with 2 decimal places)
Example:
| SampleID | energy_output |
|---|---|
| 1 | 41.75 |
| 2 | 29.10 |
| 3 | 35.80 |
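One possible way to produce a file in this shape is sketched below. It assumes a comma-separated file with a header row, which the statement does not spell out, and the helper name `write_submission` is hypothetical. Predictions are formatted with two decimal places, as the format requires.

```python
import csv
import io


def write_submission(handle, sample_ids, predictions):
    """Write SampleID/energy_output rows, formatting predictions to 2 decimals."""
    if len(sample_ids) != len(predictions):
        raise ValueError("sample_ids and predictions must have the same length")
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["SampleID", "energy_output"])
    for sample_id, value in zip(sample_ids, predictions):
        # f"{value:.2f}" keeps trailing zeros, e.g. 29.1 becomes "29.10".
        writer.writerow([sample_id, f"{value:.2f}"])


# In-memory demonstration using the values from the example table.
buffer = io.StringIO()
write_submission(buffer, [1, 2, 3], [41.75, 29.1, 35.8])
submission_text = buffer.getvalue()
```

To write the actual file, pass `open("submission.csv", "w", newline="")` as the handle.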
⚙️ Evaluation
Model evaluation will be performed using Root Mean Squared Error (RMSE):

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2}$$

where $N$ is the number of examples in the test set, $y_i$ is the true value, and $\hat{y}_i$ is the value predicted by the model.
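For local validation, the metric can be computed directly. The sketch below assumes the standard RMSE definition (the square root of the mean squared difference between true and predicted values). The function name `rmse` is illustrative, and the official scorer may differ in details such as rounding.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Targets from the train.csv example against two made-up predictions.
score = rmse([42.3, 28.7], [41.75, 29.10])
```

A perfect prediction gives an RMSE of 0. Larger errors are penalized quadratically before the square root is taken.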
The final score will be scaled to the range 0 to 100, so that a lower RMSE yields a higher score.
📊 Source
The data used for this problem is synthetically generated.