ML Load predictor, external temperature components #3310

springfall2008 merged 22 commits into main
Conversation
Pull request overview
Adds an ML-based load forecasting feature (NumPy MLP) plus an external temperature forecast component, wiring both into Predbat’s forecasting pipeline and UI/docs.
Changes:
- Introduces the ML load predictor plus the load_ml component and publishes new HA sensors for forecast/stats.
- Adds the temperature component using Open-Meteo forecasts, and uses it as an ML feature and for charting (a minimal fetch sketch follows below).
- Updates the fetch pipeline, web charts, and mkdocs documentation/navigation.
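For context on the temperature component, below is a minimal sketch of pulling an hourly temperature forecast from the public Open-Meteo API. The endpoint and query parameters are Open-Meteo's documented ones, but the function name and return shape are illustrative assumptions rather than the PR's actual temperature.py code.

```python
# Minimal sketch, not the PR's implementation: fetch an hourly external
# temperature forecast from the public Open-Meteo API. Function name and
# return shape are illustrative assumptions.
import requests


def fetch_temperature_forecast(latitude, longitude, days=2):
    url = "https://api.open-meteo.com/v1/forecast"
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "hourly": "temperature_2m",
        "forecast_days": days,
    }
    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()
    data = response.json()
    # Pair each ISO timestamp with its forecast temperature in degrees C
    return dict(zip(data["hourly"]["time"], data["hourly"]["temperature_2m"]))
```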
Reviewed changes
Copilot reviewed 17 out of 21 changed files in this pull request and generated 27 comments.
Summary per file:
| File | Description |
|---|---|
| mkdocs.yml | Adds new ML load prediction page to docs nav |
| docs/load-ml.md | New ML load prediction documentation |
| docs/components.md | Documents new temperature and load_ml components |
| coverage/debug_predict.py | Debug helper for ML predictor behavior |
| coverage/debug_model.py | Debug helper for inspecting model weights/normalization |
| coverage/analyze_periods.py | Debug helper for analyzing dataset periods |
| coverage/analyze_data.py | Debug helper for basic dataset stats |
| apps/predbat/web.py | Adds LoadML charts and ML+PV+temperature visualization |
| apps/predbat/utils.py | Extends prune_today() for offset/future-window pruning |
| apps/predbat/unit_test.py | Registers new ML + temperature tests |
| apps/predbat/tests/test_temperature.py | New temperature component tests |
| apps/predbat/tests/test_minute_data_import_export.py | Updates call signature for minute_data_import_export() |
| apps/predbat/tests/test_load_ml.py | Comprehensive ML predictor + component tests |
| apps/predbat/temperature.py | New Open-Meteo temperature forecast component |
| apps/predbat/predbat.py | Version bump; adds temperature.py to PREDBAT_FILES |
| apps/predbat/load_predictor.py | New NumPy-only MLP predictor (train/predict/persist) |
| apps/predbat/load_ml_component.py | New component wrapper for data fetch/train/publish predictions |
| apps/predbat/fetch.py | Adds ML forecast integration; changes minute_data_import_export() signature |
| apps/predbat/config.py | Adds load_ml_enable schema entry |
| apps/predbat/components.py | Registers new temperature and load_ml components |
| .cspell/custom-dictionary-workspace.txt | Adds ML-related words for spellchecker |
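For orientation on the load_predictor.py entry above, here is a compact sketch of what a NumPy-only MLP forward pass can look like. The `(y_pred, activations, pre_activations)` return shape mirrors the training-loop excerpt reviewed below, but the layer structure, ReLU choice, and function name are assumptions, not the PR's implementation.

```python
import numpy as np


# Sketch only: a minimal MLP forward pass in plain NumPy. ReLU hidden layers
# and a linear output layer are assumed; this is not the PR's actual code.
def forward(X, weights, biases):
    activations = [X]      # post-activation outputs, starting with the input
    pre_activations = []   # z = a @ W + b for each layer (needed by backprop)
    a = X
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b
        pre_activations.append(z)
        if i < len(weights) - 1:
            a = np.maximum(z, 0.0)   # ReLU on hidden layers
        else:
            a = z                    # linear output for regression
        activations.append(a)
    return a, activations, pre_activations
```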
```python
        except Exception as e:
            self.log("Error: ML Component: Failed to fetch load data: {}".format(e))
            print("Error: ML Component: Failed to fetch load data: {}".format(e))
            import traceback

            self.log("Error: ML Component: {}".format(traceback.format_exc()))
            return None, 0, 0, None, None
```
This exception handler logs via self.log(...) but also prints directly to stdout. Predbat uses self.log() for logging; print() can bypass HA/AppDaemon logging and is easy to miss. Remove the print() and rely on self.log() (the traceback is already logged).
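A sketch of the handler from the excerpt above with the stray print() dropped, keeping the same log messages and return signature:

```python
        except Exception as e:
            # Rely on Predbat's logger only; print() would bypass HA/AppDaemon logging
            self.log("Error: ML Component: Failed to fetch load data: {}".format(e))
            import traceback

            self.log("Error: ML Component: {}".format(traceback.format_exc()))
            return None, 0, 0, None, None
```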
apps/predbat/load_predictor.py
Outdated
```python
                batch_weights = weights_shuffled[batch_start:batch_end]

                # Forward pass
                y_pred, activations, pre_activations = self._forward(X_batch)

                # Apply sample weights to loss (approximate by weighting gradient)
                weighted_y_batch = y_batch * batch_weights.reshape(-1, 1)
                weighted_y_pred = y_pred * batch_weights.reshape(-1, 1)

                batch_loss = mse_loss(y_batch, y_pred)
                epoch_loss += batch_loss
                num_batches += 1

                # Backward pass
                weight_grads, bias_grads = self._backward(y_batch, activations, pre_activations)
```
Sample weights are computed/shuffled (train_weights, batch_weights) and even used to create weighted_y_batch/weighted_y_pred, but those weighted tensors are never used in the loss or gradient calculation. As a result, the intended time-decay weighting has no effect. Either incorporate batch_weights into the loss/gradient (e.g., weight the per-sample loss or scale delta), or remove the unused weighting code to avoid a false sense of correctness.
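One way to make the weighting take effect, sketched under the assumption that the backward pass scales a per-sample output error (standard MSE backprop); the helper below is hypothetical, not code from the PR:

```python
import numpy as np


# Sketch only: apply per-sample time-decay weights to the MSE loss and to the
# output-layer error before backprop, instead of computing unused weighted
# copies of y_batch / y_pred.
def weighted_mse_and_delta(y_batch, y_pred, batch_weights):
    w = batch_weights.reshape(-1, 1)
    errors = y_pred - y_batch
    # Weighted mean-squared error: heavier samples contribute more to the loss
    loss = float(np.mean(w * errors**2))
    # Scale the per-sample error so the gradient carries the same weighting
    delta = w * errors
    return loss, delta
```

The training loop would then feed the weighted delta into the backward pass (or equivalently scale the resulting gradients) so that recent samples genuinely dominate the update.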
```python
        # Validation uses most recent data (minute 0 to validation_holdout)
        # Training uses ALL data (minute 0 to end_minute), including validation period
        validation_end = validation_holdout_hours * 60
```
The dataset builder explicitly includes the validation period in the training set (Training uses ALL data ... including validation period). This makes the reported validation MAE overly optimistic and can cause the component to accept models that don’t generalize. Hold out the validation window from training (or rename this metric to something like “fit_mae”) so the gating threshold reflects true out-of-sample performance.
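A minimal sketch of holding the validation window out of training, assuming samples can be indexed by their age in minutes as the excerpt's comments suggest; the function and argument names are illustrative, not the PR's:

```python
import numpy as np


def split_train_validation(X, y, sample_minutes, validation_holdout_hours):
    """Exclude the most recent window from training so validation MAE is out-of-sample.

    sample_minutes[i] is assumed to be the age of sample i in minutes (0 = most recent);
    the name and indexing convention come from the diff comments, not the actual code.
    """
    validation_end = validation_holdout_hours * 60
    val_mask = sample_minutes < validation_end
    X_train, y_train = X[~val_mask], y[~val_mask]
    X_val, y_val = X[val_mask], y[val_mask]
    return X_train, y_train, X_val, y_val
```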
apps/predbat/utils.py
Outdated
```diff
@@ -54,18 +54,19 @@ def prune_today(data, now_utc, midnight_utc, prune=True, group=15, prune_future=
             timekey = datetime.strptime(key, TIME_FORMAT_SECONDS)
         else:
             timekey = datetime.strptime(key, TIME_FORMAT)
-        if last_time and (timekey - last_time).seconds < group * 60:
+        if last_time and (timekey - last_time).total_seconds() < group * 60:
             continue
-        if intermediate and last_time and ((timekey - last_time).seconds > group * 60):
+        if intermediate and last_time and ((timekey - last_time).total_seconds() > group * 60):
             # Large gap, introduce intermediate data point
             seconds_gap = int((timekey - last_time).total_seconds())
             for i in range(1, seconds_gap // int(group * 60)):
-                new_time = last_time + timedelta(seconds=i * group * 60)
-                results[new_time.strftime(TIME_FORMAT)] = prev_value
+                new_time = last_time + timedelta(seconds=i * group * 60) + timedelta(minutes=offset_minutes)
+                results[new_time.isoformat()] = prev_value
         if not prune or (timekey > midnight_utc):
-            if prune_future and (timekey > now_utc):
+            if prune_future and (timekey > (now_utc + timedelta(days=prune_future_days))):
                 continue
-            results[key] = data[key]
+            new_time = timekey + timedelta(minutes=offset_minutes)
+            results[new_time.isoformat()] = data[key]
             last_time = timekey
```
prune_today() now rewrites all timestamp keys using datetime.isoformat(), which changes key formatting and breaks callers/tests that rely on keys being preserved (e.g., apps/predbat/tests/test_prune_today.py asserts original keys are present). Consider preserving the original key strings when no shifting is requested, and when shifting/creating intermediate points, format new keys using the existing TIME_FORMAT / TIME_FORMAT_SECONDS conventions rather than isoformat() to avoid inconsistent timestamp formats across the app.
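A sketch of the suggested key handling, assuming offset_minutes == 0 means no shifting and TIME_FORMAT is the app's existing strftime pattern; store_entry() is a hypothetical helper used only to keep the example self-contained:

```python
from datetime import timedelta


# Sketch only: keep the caller's original key string when no shift is requested,
# and format shifted keys with the app's existing TIME_FORMAT rather than isoformat().
def store_entry(results, key, timekey, value, offset_minutes, time_format):
    if offset_minutes:
        new_time = timekey + timedelta(minutes=offset_minutes)
        results[new_time.strftime(time_format)] = value
    else:
        results[key] = value  # preserve the original key string unchanged
```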
apps/predbat/fetch.py
Outdated
```python
        if self.get_arg("load_ml_enable", False) and self.get_arg("load_ml_source", False):
            load_ml_forecast = self.fetch_ml_load_forecast(self.now_utc)
            self.load_forecast_only = True  # Use only ML forecast for load if enabled
```
fetch_sensor_data() sets self.load_forecast_only = True whenever ML is enabled/configured, even if fetch_ml_load_forecast() returns an empty dict (e.g., ML entity missing/unavailable). That will skip historical filtering/adjustment paths gated by not self.load_forecast_only, potentially degrading planning. Only set load_forecast_only when a non-empty ML forecast was successfully loaded.
Suggested change:

```diff
-        if self.get_arg("load_ml_enable", False) and self.get_arg("load_ml_source", False):
-            load_ml_forecast = self.fetch_ml_load_forecast(self.now_utc)
-            self.load_forecast_only = True  # Use only ML forecast for load if enabled
+        # Default to using historical-based load forecast unless a valid ML forecast is loaded
+        self.load_forecast_only = False
+        if self.get_arg("load_ml_enable", False) and self.get_arg("load_ml_source", False):
+            load_ml_forecast = self.fetch_ml_load_forecast(self.now_utc)
+            # Only rely solely on ML forecast if we actually received one
+            if load_ml_forecast:
+                self.load_forecast_only = True  # Use only ML forecast for load if enabled and available
```
```python
    Test training on real load_minutes_debug.json data and generate comparison chart
    """
    import json
    import os
```
This import of module os is redundant, as it was previously imported on line 15.
Suggested change:

```diff
-    import os
```
```python
from datetime import datetime, timezone


class MockTemperatureAPI(TemperatureAPI):
```
This class does not call ComponentBase.__init__ during initialization (MockTemperatureAPI.__init__ may be missing a call to a base-class __init__).
```python
            temperature_url=temperature_url
        )

    def log(self, message):
```
This method is shadowed by attribute log in superclass ComponentBase.
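Both mock issues above (the missing base-class __init__ call and the shadowed log method) can be addressed together; a sketch, with the import path and ComponentBase behaviour assumed from the review comments rather than read from the code:

```python
from temperature import TemperatureAPI  # import path assumed for illustration


class MockTemperatureAPI(TemperatureAPI):
    def __init__(self, *args, **kwargs):
        # Chain up so any state TemperatureAPI/ComponentBase set up exists on the mock
        super().__init__(*args, **kwargs)
        self.messages = []
        # ComponentBase exposes log as an instance attribute, so override it by
        # assignment; a plain "def log(...)" method would be shadowed by that attribute
        self.log = self._capture_log

    def _capture_log(self, message):
        self.messages.append(message)
```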
```python
            plt.savefig(chart_path, dpi=150, bbox_inches="tight")
            print(f" Chart saved to {chart_path}")
            break
        except:
```
Except block directly handles BaseException.
Suggested change:

```diff
-        except:
+        except Exception:
```
apps/predbat/load_ml_component.py
Outdated
```python
        self.load_minutes_now = load_minutes_now
        self.data_ready = True
        self.last_data_fetch = self.now_utc
        pv_data = pv_data
```
This assignment assigns a variable to itself.
Suggested change:

```diff
-        pv_data = pv_data
```