Prompt - Task¶
Construye un modelo LSTM vanilla en TensorFlow/Keras para predecir el número de usuarios diarios de un sitio web, carga el dataset desde 'https://github.com/marsgr6/r-scripts/blob/master/data/viz_data/visitas_por_dia_web_cienciadedatos.csv' usando pandas con parse_dates en la columna 'date' y dayfirst=True, usa la columna 'users' como variable objetivo, realiza análisis exploratorio mostrando la serie de tiempo completa, crea secuencias temporales con lookback window de 30 días, divide los datos respetando el orden temporal (80% de las primeras observaciones para entrenamiento, 20% de las últimas observaciones para prueba), normaliza los datos con MinMaxScaler, construye una arquitectura LSTM con capas [LSTM(50 unidades, return_sequences=False), Dropout(0.2), Dense(1)], compila con optimizer Adam y loss MSE, entrena el modelo con early stopping y validation_split=0.1, realiza predicciones en el conjunto de prueba, desnormaliza las predicciones y valores reales, calcula métricas de evaluación (MSE, RMSE, MAE, MAPE), y genera tres gráficos: (1) serie de tiempo completa mostrando train/test split con línea vertical divisoria, (2) zoom del período de prueba con valores observados vs predichos incluyendo fechas en el eje x, y (3) curvas de pérdida durante el entrenamiento (training y validation loss).
- Nota: Es posible que tenga que insistir a Gemini en la ejecución de las taras. Dividir el prompt en diferentes elementos puede funcionar mejor.
Reasoning: Import pandas, define the URL, load the data using pd.read_csv with date parsing, and display the head of the dataframe to verify.
import pandas as pd
url = 'https://github.com/marsgr6/r-scripts/raw/master/data/viz_data/visitas_por_dia_web_cienciadedatos.csv'
df = pd.read_csv(url, parse_dates=['date'], dayfirst=True)
display(df.head())
/tmp/ipython-input-2322786964.py:4: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format. df = pd.read_csv(url, parse_dates=['date'], dayfirst=True)
| date | users | |
|---|---|---|
| 0 | 2020-07-01 | 2324 |
| 1 | 2020-07-02 | 2201 |
| 2 | 2020-07-03 | 2146 |
| 3 | 2020-07-04 | 1666 |
| 4 | 2020-07-05 | 1433 |
Reasoning: Plot the 'users' column against the 'date' column to visualize the time series data and its behavior.
import matplotlib.pyplot as plt
plt.figure(figsize=(12, 6))
plt.plot(df['date'], df['users'])
plt.xlabel('Date')
plt.ylabel('Number of Users')
plt.title('Daily Website Users Over Time')
plt.show()
Reasoning: Import MinMaxScaler, select the target variable, normalize it, and create sequences using a helper function.
from sklearn.preprocessing import MinMaxScaler
import numpy as np
# Select the 'users' column and reshape
target_data = df['users'].values.reshape(-1, 1)
# Initialize and fit the scaler
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(target_data)
# Define the sequence creation function
def create_sequences(data, lookback):
X, y = [], []
for i in range(len(data) - lookback):
X.append(data[i:(i + lookback), 0])
y.append(data[i + lookback, 0])
return np.array(X), np.array(y)
# Create sequences with a lookback window of 30
lookback_window = 7
X, y = create_sequences(scaled_data, lookback_window)
print(f"Shape of input sequences (X): {X.shape}")
print(f"Shape of output values (y): {y.shape}")
Shape of input sequences (X): (414, 7) Shape of output values (y): (414,)
Reasoning: Determine the split index and divide the data into training and testing sets based on the 80/20 ratio while maintaining temporal order.
# Determine the split index (80% for training, 20% for testing)
split_index = int(len(X) * 0.8)
# Split the data into training and testing sets
X_train, X_test = X[:split_index], X[split_index:]
y_train, y_test = y[:split_index], y[split_index:]
# Print the shapes of the training and testing sets
print(f"Shape of X_train: {X_train.shape}")
print(f"Shape of y_train: {y_train.shape}")
print(f"Shape of X_test: {X_test.shape}")
print(f"Shape of y_test: {y_test.shape}")
Shape of X_train: (331, 7) Shape of y_train: (331,) Shape of X_test: (83, 7) Shape of y_test: (83,)
Reasoning: Import necessary libraries from TensorFlow and Keras, define the sequential model, add the LSTM layer with 50 units and return_sequences=False, add a Dropout layer with a rate of 0.2, and finally add a Dense layer with 1 unit for the output.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
# Build the LSTM model
model = Sequential()
model.add(LSTM(units=50, return_sequences=False, input_shape=(X_train.shape[1], 1)))
model.add(Dropout(0.2))
model.add(Dense(units=1))
# Print the model summary
model.summary()
/usr/local/lib/python3.12/dist-packages/keras/src/layers/rnn/rnn.py:199: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(**kwargs)
Model: "sequential_1"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ lstm_1 (LSTM) │ (None, 50) │ 10,400 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout_1 (Dropout) │ (None, 50) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 1) │ 51 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 10,451 (40.82 KB)
Trainable params: 10,451 (40.82 KB)
Non-trainable params: 0 (0.00 B)
Reasoning: Compile the model using the Adam optimizer and Mean Squared Error (MSE) as the loss function.
# Compile the model
model.compile(optimizer='adam', loss='mse')
# Print a confirmation message
print("Model compiled successfully!")
Model compiled successfully!
Reasoning: Import EarlyStopping, define the early stopping callback, and fit the model to the training data with the specified parameters.
from tensorflow.keras.callbacks import EarlyStopping
# Define the Early Stopping callback
early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
# Train the model
history = model.fit(
X_train, y_train,
epochs=100, # You can increase the number of epochs, Early Stopping will stop training when necessary
batch_size=32,
validation_split=0.1,
callbacks=[early_stopping]
)
print("Model training finished.")
Epoch 1/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 4s 54ms/step - loss: 0.0850 - val_loss: 0.0234 Epoch 2/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0158 - val_loss: 0.0138 Epoch 3/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0160 - val_loss: 0.0094 Epoch 4/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - loss: 0.0140 - val_loss: 0.0126 Epoch 5/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0124 - val_loss: 0.0084 Epoch 6/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0129 - val_loss: 0.0087 Epoch 7/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0099 - val_loss: 0.0094 Epoch 8/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0105 - val_loss: 0.0087 Epoch 9/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0117 - val_loss: 0.0085 Epoch 10/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0102 - val_loss: 0.0085 Epoch 11/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0098 - val_loss: 0.0089 Epoch 12/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0096 - val_loss: 0.0085 Epoch 13/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0102 - val_loss: 0.0085 Epoch 14/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0113 - val_loss: 0.0095 Epoch 15/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - loss: 0.0096 - val_loss: 0.0083 Epoch 16/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0106 - val_loss: 0.0088 Epoch 17/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0095 - val_loss: 0.0084 Epoch 18/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0111 - val_loss: 0.0083 Epoch 19/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0107 - val_loss: 0.0085 Epoch 20/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0093 - val_loss: 0.0082 Epoch 21/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step - loss: 0.0113 - val_loss: 0.0083 Epoch 22/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0111 - val_loss: 0.0082 Epoch 23/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0089 - val_loss: 0.0082 Epoch 24/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0099 - val_loss: 0.0083 Epoch 25/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0111 - val_loss: 0.0080 Epoch 26/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0099 - val_loss: 0.0085 Epoch 27/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0099 - val_loss: 0.0081 Epoch 28/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0110 - val_loss: 0.0083 Epoch 29/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - loss: 0.0106 - val_loss: 0.0080 Epoch 30/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0116 - val_loss: 0.0085 Epoch 31/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0107 - val_loss: 0.0079 Epoch 32/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0089 - val_loss: 0.0080 Epoch 33/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0100 - val_loss: 0.0080 Epoch 34/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0087 - val_loss: 0.0081 Epoch 35/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0083 - val_loss: 0.0077 Epoch 36/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0109 - val_loss: 0.0077 Epoch 37/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0090 - val_loss: 0.0077 Epoch 38/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0100 - val_loss: 0.0076 Epoch 39/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0101 - val_loss: 0.0075 Epoch 40/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0094 - val_loss: 0.0076 Epoch 41/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0105 - val_loss: 0.0075 Epoch 42/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0100 - val_loss: 0.0083 Epoch 43/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0100 - val_loss: 0.0074 Epoch 44/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - loss: 0.0090 - val_loss: 0.0074 Epoch 45/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0095 - val_loss: 0.0072 Epoch 46/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0082 - val_loss: 0.0071 Epoch 47/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0087 - val_loss: 0.0070 Epoch 48/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0088 - val_loss: 0.0071 Epoch 49/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0091 - val_loss: 0.0070 Epoch 50/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0100 - val_loss: 0.0082 Epoch 51/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0108 - val_loss: 0.0078 Epoch 52/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0097 - val_loss: 0.0071 Epoch 53/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0080 - val_loss: 0.0067 Epoch 54/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0095 - val_loss: 0.0066 Epoch 55/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0087 - val_loss: 0.0081 Epoch 56/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0090 - val_loss: 0.0065 Epoch 57/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0081 - val_loss: 0.0065 Epoch 58/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - loss: 0.0098 - val_loss: 0.0064 Epoch 59/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0078 - val_loss: 0.0062 Epoch 60/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - loss: 0.0089 - val_loss: 0.0061 Epoch 61/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0090 - val_loss: 0.0061 Epoch 62/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0080 - val_loss: 0.0085 Epoch 63/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0108 - val_loss: 0.0064 Epoch 64/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0091 - val_loss: 0.0071 Epoch 65/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - loss: 0.0089 - val_loss: 0.0065 Epoch 66/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0082 - val_loss: 0.0063 Epoch 67/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0093 - val_loss: 0.0063 Epoch 68/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0087 - val_loss: 0.0062 Epoch 69/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0083 - val_loss: 0.0067 Epoch 70/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 20ms/step - loss: 0.0079 - val_loss: 0.0058 Epoch 71/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 23ms/step - loss: 0.0091 - val_loss: 0.0057 Epoch 72/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step - loss: 0.0089 - val_loss: 0.0053 Epoch 73/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 19ms/step - loss: 0.0089 - val_loss: 0.0058 Epoch 74/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step - loss: 0.0077 - val_loss: 0.0054 Epoch 75/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step - loss: 0.0078 - val_loss: 0.0052 Epoch 76/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 22ms/step - loss: 0.0068 - val_loss: 0.0052 Epoch 77/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step - loss: 0.0078 - val_loss: 0.0052 Epoch 78/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 22ms/step - loss: 0.0068 - val_loss: 0.0063 Epoch 79/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 20ms/step - loss: 0.0091 - val_loss: 0.0048 Epoch 80/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 22ms/step - loss: 0.0093 - val_loss: 0.0049 Epoch 81/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0076 - val_loss: 0.0049 Epoch 82/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0082 - val_loss: 0.0051 Epoch 83/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0063 - val_loss: 0.0046 Epoch 84/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0074 - val_loss: 0.0045 Epoch 85/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0081 - val_loss: 0.0050 Epoch 86/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0090 - val_loss: 0.0045 Epoch 87/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0068 - val_loss: 0.0041 Epoch 88/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - loss: 0.0077 - val_loss: 0.0046 Epoch 89/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - loss: 0.0078 - val_loss: 0.0049 Epoch 90/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0067 - val_loss: 0.0042 Epoch 91/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0074 - val_loss: 0.0039 Epoch 92/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0070 - val_loss: 0.0038 Epoch 93/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0064 - val_loss: 0.0044 Epoch 94/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0073 - val_loss: 0.0037 Epoch 95/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - loss: 0.0067 - val_loss: 0.0037 Epoch 96/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0065 - val_loss: 0.0040 Epoch 97/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - loss: 0.0059 - val_loss: 0.0035 Epoch 98/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0066 - val_loss: 0.0037 Epoch 99/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0067 - val_loss: 0.0033 Epoch 100/100 10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0065 - val_loss: 0.0034 Model training finished.
Reasoning: Plot the training and validation loss from the model's training history to visualize the learning process.
# Plot the training and validation loss
plt.figure(figsize=(12, 6))
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss During Training')
plt.xlabel('Epoch')
plt.ylabel('Loss (MSE)')
plt.legend()
plt.show()
# Make predictions on the test set
predictions = model.predict(X_test)
print("Predictions on the test set have been generated.")
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 82ms/step Predictions on the test set have been generated.
# Inverse transform the predictions and actual values
predictions = scaler.inverse_transform(predictions)
y_test_actual = scaler.inverse_transform(y_test.reshape(-1, 1))
print("Predictions and actual values have been denormalized.")
Predictions and actual values have been denormalized.
Reasoning: Use the fitted scaler to inverse transform the scaled predictions and the actual y_test values to their original scale.
Reasoning: Use the trained model to predict the 'users' values on the testing set (X_test).
Reasoning: Import necessary metrics from sklearn.metrics and calculate Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) using the denormalized actual values and predictions.
from sklearn.metrics import mean_squared_error, mean_absolute_error
import numpy as np
# Calculate evaluation metrics
mse = mean_squared_error(y_test_actual, predictions)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_test_actual, predictions)
mape = np.mean(np.abs((y_test_actual - predictions) / y_test_actual)) * 100
print(f"Mean Squared Error (MSE): {mse:.2f}")
print(f"Root Mean Squared Error (RMSE): {rmse:.2f}")
print(f"Mean Absolute Error (MAE): {mae:.2f}")
print(f"Mean Absolute Percentage Error (MAPE): {mape:.2f}%")
Mean Squared Error (MSE): 150132.61 Root Mean Squared Error (RMSE): 387.47 Mean Absolute Error (MAE): 273.05 Mean Absolute Percentage Error (MAPE): 28.09%
Visualizar resultados¶
Subtask:¶
Generar los tres gráficos solicitados: (1) serie de tiempo completa mostrando train/test split con línea vertical divisoria, (2) zoom del período de prueba con valores observados vs predichos incluyendo fechas en el eje x, y (3) curvas de pérdida durante el entrenamiento (training y validation loss).
Reasoning: Plot the full time series, highlighting the training and testing periods with different colors and adding a vertical line to indicate the split point.
import matplotlib.pyplot as plt
# Get the dates for the test set
test_dates = df['date'][lookback_window + split_index:].reset_index(drop=True)
plt.figure(figsize=(15, 7))
plt.plot(df['date'], df['users'], label='Full Time Series')
plt.plot(df['date'][:lookback_window + split_index], df['users'][:lookback_window + split_index], label='Training Data', color='blue')
plt.plot(df['date'][lookback_window + split_index:], df['users'][lookback_window + split_index:], label='Testing Data', color='orange')
plt.plot(test_dates, predictions, label='Predicted Users', color='red', linestyle='--')
# Add a vertical line to show the split point
split_date = df['date'][lookback_window + split_index -1]
plt.axvline(split_date, color='red', linestyle='--', label='Train/Test Split')
plt.xlabel('Date')
plt.ylabel('Number of Users')
plt.title('Daily Website Users: Train and Test Split')
plt.legend()
plt.show()
Reasoning: Plot the denormalized actual values against the denormalized predictions for the test set, using the test dates on the x-axis for better context.
plt.figure(figsize=(15, 7))
plt.plot(test_dates, y_test_actual, label='Observed Users', color='blue')
plt.plot(test_dates, predictions, label='Predicted Users', color='orange', linestyle='--')
plt.xlabel('Date')
plt.ylabel('Number of Users')
plt.title('Observed vs Predicted Daily Website Users (Test Set)')
plt.legend()
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()