Releases: vergauwenthomas/MetObs_toolkit
v1.0.2
What's Changed
- v1.0.0 (Stable version, startpoint of backwards compatibility) by @vergauwenthomas in #604
- Bugfix nat in buddy check by @vergauwenthomas in #614
- Dependency audit by @vergauwenthomas in #615
Full Changelog: v0.4.7...v1.0.2
v1.0.0
What's Changed
- Gap filling fix by @ADRIE-A3 in #589
- Fix filter to modeldata bug by @vergauwenthomas in #592
- simple pd plot functionality by @vergauwenthomas in #595
- Add min_value and max_value parameters to gap-filling methods to prevent unphysical data by @Copilot in #598
- Geemap fix by @vergauwenthomas in #601
- Priority groups by @vergauwenthomas in #602
- Buddy check with multiple safetynets by @vergauwenthomas in #603
- Argument name and format consistency by @vergauwenthomas in #606
- Configclass by @vergauwenthomas in #607
- figure default settings in Settings class by @vergauwenthomas in #608
Full Changelog: v0.4.7...v1.0.0
v0.4.7
Summary
This release contains a set of improvements, bug fixes, API refinements and tests. Key themes: gap-filling refactor and robustness improvements, new plotting helpers (pandas-backed), better modeldata filtering and selection, GEE authentication/test helpers, logging hardening, improved distance-matrix/buddy-check logic, and added min/max constraints to gap-fills.
Version bump: 0.4.6 → 0.4.7
Highlights
Gap handling refactor
A gap overview API (gap_overview_df / gap_status_overview_df) providing concise, one-row-per-gap summaries at SensorData / Station / Dataset levels.
Defaults and validation behaviour changed: gap-size checks were added, and the new parameter max_gap_duration_to_fill controls whether a gap may be filled (defaults were adjusted to make the behaviour more intuitive).
New gap statuses and logic: a "partially successful gapfill" status is introduced; gap flagging logic updated to treat partially successful gaps more intuitively for sequential gapfilling.
Many gapfill methods refactored to accept/propagate max_gap_duration_to_fill and optional min_value / max_value constraints.
Gap-filling value constraints
New support for min_value and max_value in core gap-fill paths (raw, debiased, diurnal debiased, weighted diurnal).
Filled values can be clipped to prevent unphysical results; tests added for these constraints.
Internal fill functions were updated to accept min/max (e.g. fill_regular_debias, fill_with_diurnal_debias, fill_with_weighted_diurnal_debias).
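As an illustration of the min_value / max_value constraint, here is a minimal sketch of clipping filled values with pandas. The helper name clip_filled_values and the sample data are hypothetical; the toolkit's internal fill functions may apply the bounds differently.

```python
import pandas as pd


def clip_filled_values(filled, min_value=None, max_value=None):
    """Clip gap-filled values to [min_value, max_value].

    Hypothetical helper for illustration only; a bound set to None is not applied.
    """
    return filled.clip(lower=min_value, upper=max_value)


# Example: relative humidity (%) produced by a debiased fill can leave 0..100.
filled = pd.Series([-5.0, 12.0, 140.0], name="humidity")
clipped = clip_filled_values(filled, min_value=0.0, max_value=100.0)
print(clipped.tolist())
```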
Model-data selection & plotting
New helper filter_modeldatadf for robust filtering of the modeldata DataFrame by obstype, modelname, modelvariable; used internally by plotting functions.
Station/Dataset plotting improvements:
modeldata_name and modeldata_kwargs added to make_plot to select specific modeldata series for plotting.
A new parameter modeltype adds the ability to select a different model data "type" than the obstype if needed (defaults to obstype).
New convenient pandas-backed plotting helpers:
ModelTimeSeries.pd_plot (wrapper around pandas.Series.plot for model timeseries)
SensorData.pd_plot (wrapper around pandas.Series.plot for sensordata, with label filtering support)
Plotting internals refactored to expose these simpler pd-plot entrypoints.
Tests and baselines for the new pd plots and modeldata plotting added.
New and improved utilities
convert_to_numeric_series added (and integrated into dataset and sensordata import paths) to handle values that use comma as decimal separator.
Timestamp and xarray conversion fixes: timedelta and timestamp attrs serialized in xarray conversions; improved netCDF engine handling (netcdf4 selected by default unless overridden) to avoid Unicode issues.
New dev/test tooling files added for GEE: a script to test GEE authentication environment (deployment/test_gee_auth.py) and updates to CI/dev pipeline scripts.
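The comma-as-decimal handling can be sketched with plain pandas as follows. This is not the toolkit's convert_to_numeric_series; the helper name below is hypothetical and the sketch assumes the input contains no thousands separators.

```python
import pandas as pd


def to_numeric_with_comma_decimal(series):
    """Convert text values such as "12,5" to floats.

    Hypothetical stand-in for illustration; unparsable entries become NaN.
    """
    as_text = series.astype(str).str.replace(",", ".", regex=False)
    return pd.to_numeric(as_text, errors="coerce")


raw = pd.Series(["12,5", "13.1", "n/a"])
values = to_numeric_with_comma_decimal(raw)
print(values.tolist())
```

A direct .astype(float) call would raise on "12,5", which is the kind of failure a dedicated conversion step avoids.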
GEE and geemap
GEE initialization/auth flow improved: try default initialization first; if that fails, fall back to authenticate. Added handling/tests for known EarthEngine/gee changes in the test pipeline.
Dependency pinning: earthengine-api pinned to <=1.6.11 due to compatibility with geemap 0.35.3.
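The initialize-first, authenticate-on-failure flow can be sketched generically. The function below is an assumption-based illustration, not the toolkit's connect_to_gee: with earthengine-api the two callables would presumably be ee.Initialize and ee.Authenticate, and retrying initialization after authenticating is an assumption of this sketch. Stand-in callables are used so the example runs without credentials.

```python
def connect_with_fallback(initialize, authenticate):
    """Try initialize(); on failure, authenticate() and initialize() again."""
    try:
        initialize()
        return "initialized"
    except Exception:
        authenticate()
        initialize()
        return "authenticated"


# Stand-in callables simulating a machine without stored credentials.
state = {"has_credentials": False}


def fake_initialize():
    if not state["has_credentials"]:
        raise RuntimeError("no credentials")


def fake_authenticate():
    state["has_credentials"] = True


first_result = connect_with_fallback(fake_initialize, fake_authenticate)
second_result = connect_with_fallback(fake_initialize, fake_authenticate)
print(first_result, second_result)
```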
Logging improvements
Logging module now avoids creating duplicate FileHandlers / StreamHandlers. Existing handlers are checked for duplicate filepath/level before adding new handlers.
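A minimal sketch of the duplicate-handler check, using only the standard logging module. The helper name add_stream_handler_once is hypothetical, and the sketch covers StreamHandlers only; a FileHandler variant would additionally compare the file path (handler.baseFilename).

```python
import logging


def add_stream_handler_once(logger, level=logging.INFO):
    """Attach a StreamHandler only if none with the same level exists."""
    for handler in logger.handlers:
        # type(...) is used so that FileHandler (a StreamHandler subclass) does not match.
        if type(handler) is logging.StreamHandler and handler.level == level:
            return handler
    handler = logging.StreamHandler()
    handler.setLevel(level)
    logger.addHandler(handler)
    return handler


logger = logging.getLogger("handler_demo")
first = add_stream_handler_once(logger)
second = add_stream_handler_once(logger)  # reuses the existing handler
print(len(logger.handlers))
```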
Buddy check & distance matrix
Buddy-check fixes: duplicate messages in the buddy-check loop are now joined, messaging is improved, and new tests cover edge cases.
Distance matrix now uses BallTree with haversine metric for better performance and correctness at scale. A separate helper generate_distance_matrix was added.
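For reference, the haversine metric itself can be sketched with the standard library. This is a brute-force pairwise computation, not the BallTree approach used in the release and not the toolkit's generate_distance_matrix; the Earth radius value and the sample coordinates are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_matrix_km(coords):
    """Full pairwise distance matrix as nested lists (O(n^2) comparisons)."""
    return [[haversine_km(*a, *b) for b in coords] for a in coords]


stations = [(51.05, 3.72), (50.85, 4.35), (51.22, 4.40)]  # (lat, lon), illustrative
matrix = distance_matrix_km(stations)
print(round(matrix[0][1], 1))
```

The pairwise loop grows quadratically with the number of stations, which is the cost a tree-based neighbour search is meant to reduce.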
Docs, examples and tests
Bug fixes (representative)
Fixed gap-filling logic edge cases and gap-size validation (avoid filling overly large gaps by default).
Handled unicode / netcdf engine issues when saving netCDF (default to netcdf4).
Fixed bug in filtering of model data frame used for plotting and selection.
Fixed buddy-check duplicate message/iteration bug and added tests that reproduce triggers.
Fixed handling of comma-as-decimal when importing datasets.
Fixed geemap-related test and notebook display issues (closing figures after comparison).
API / Behaviour changes (important for users)
Station.modeldata: function/return types and usage were adjusted. Model data selection APIs were improved; a helper filter_modeldatadf was added to reliably extract model rows from the model datadf. Check your code if you iterate over station.modeldata or used its type expectations.
New parameters and renamed arguments:
Most gapfill and interpolation methods changed from "max_consec_fill" (count-based) to "max_gap_duration_to_fill" (duration-based, independent of dt resolution). Defaults changed (common defaults set to 3h for interpolation and 12h for model-based fills).
Many Dataset/Station/SensorData gapfill methods now accept optional min_value and max_value arguments (to constrain filled values).
Dataset/Station/SensorData now expose gap_overview_df methods (returning a compact per-gap summary).
ModelTimeSeries.pd_plot and SensorData.pd_plot now exist as convenience wrappers.
GEE: connect_to_gee flow attempts initialization first, and authenticates only if necessary. Tests added to check local credential presence.
Migration guide (suggested)
If you previously used max_consec_fill:
Replace usages with max_gap_duration_to_fill; pass a pandas Timedelta or string like "3h" (e.g. max_gap_duration_to_fill="3h" or pd.Timedelta("3h")).
Example: dataset.interpolate_gaps(..., max_gap_duration_to_fill="3h")
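To show why a duration-based limit is independent of the time resolution, here is a small pandas sketch. Both helpers are hypothetical, and how the toolkit measures a gap's duration (for example, whether the bounding records are included) is not specified here.

```python
import pandas as pd


def gap_is_fillable(gap_start, gap_end, max_gap_duration_to_fill="3h"):
    """Return True if the gap is no longer than the allowed duration."""
    limit = pd.Timedelta(max_gap_duration_to_fill)
    duration = pd.Timestamp(gap_end) - pd.Timestamp(gap_start)
    return duration <= limit


def equivalent_record_count(max_gap_duration_to_fill, resolution):
    """Number of records that a duration limit spans at a given resolution."""
    return int(pd.Timedelta(max_gap_duration_to_fill) / pd.Timedelta(resolution))


short_gap = gap_is_fillable("2024-01-01 00:00", "2024-01-01 02:00")
long_gap = gap_is_fillable("2024-01-01 00:00", "2024-01-01 06:00")
records_at_10min = equivalent_record_count("3h", "10min")
records_at_1h = equivalent_record_count("3h", "1h")
print(short_gap, long_gap, records_at_10min, records_at_1h)
```

The same "3h" limit corresponds to a different record count at each resolution, which is why a count-based max_consec_fill value cannot be carried over one-to-one.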
To limit filled values:
Pass min_value and/or max_value to fill_gaps_with_raw_modeldata, fill_gaps_with_debiased_modeldata, fill_gaps_with_diurnal_debiased_modeldata, fill_gaps_with_weighted_diurnal_debiased_modeldata.
For plotting:
Use the new pd_plot helpers for quick plots: my_modeltimeseries.pd_plot(...) and my_sensordata.pd_plot(show_labels=["ok"], **kwargs).
To choose specific model data series in make_plot, use modeldata_name or modeldata_kwargs.
For selecting modeldata rows from the combined DataFrame:
Use filter_modeldatadf(modeldatadf, trgobstype, modelname, modelvariable) to robustly get the intended subset.
If you relied on the old Dataset/Station.gaps API for "singular_gaps", switch to gap_overview_df/gap_status_overview_df for single-row-per-gap summaries.
Dependency notes
earthengine-api: pinned to <= 1.6.11 due to geemap compatibility (geemap 0.35.3).
geemap >= 0.35.3 required.
Minor updates across docs/testing tooling.
Developer / internal notes
Contributors (from commit co-authors)
Thomas Vergauwen
Leon Adriaensen (@ADRIE-A3)
Copilot / automated/code-assist contributions mentioned in commit history
v0.4.6
Release Notes - MetObs_toolkit v0.4.6
Note: v0.4.5 does not exist. It was skipped because of an installation bug on Python 3.10; PyPI restrictions forced me to skip that release.
Release Highlights:
This release delivers enhancements, bug fixes, and improved robustness for the MetObs_toolkit. It focuses on better data handling, new plotting functionalities and fixes for various edge-cases.
🚀 New Features & Enhancements
- Data Import Robustness:
  - Added support for comma as a decimal symbol when importing data.
  - Introduced convert_to_numeric_series for safer numeric conversions, replacing direct .astype calls.
- Plotting Improvements:
  - The make_plot() method of the stations class now supports:
    - modeldata_name variable for easier model series selection.
    - modeldata_kwargs to select specific modeldata series.
    - New modeltype parameter to plot different types of modeldata independently from obstype.
- Site Metadata Enrichment:
  - Added lcz (Local Climate Zone) and altitude as attributes of site.
  - These are now included in the API documentation.
- Quality Control (QC) Improvements:
  - Enhanced buddy check:
    - More informative error messages with iteration reference.
    - Fix for duplicate messages by joining them.
    - Added tests for relevant edge cases.
🐛 Bug Fixes
- Fixed bug in test baselines and ensured correct location for baseline data.
- Fixed bug where altitude being NaN could cause processing errors.
- Fixed bugs in tests and improved test coverage.
- Addressed Sphinx warnings in the documentation.
- Resolved several grammar errors in code comments.
🧪 Testing & Maintenance
- Added and improved tests for plotting and QC edge-cases.
- Updated test baselines for more robust regression checking.
- Black formatting and code style improvements across multiple modules.
🔢 Versioning
- Version set to v0.4.6.
New Contributors
Full Changelog: v0.4.4...v0.4.6
v0.4.4
Release name: v0.4.4 Tag: v0.4.4 Compare: changes since v0.4.3
Highlights
Data IO and formats
Parquet reader support added. (#557)
New to_parquet and to_csv methods for Dataset and Station classes. (#556)
CF-compliant netCDF serialization for xarray Datasets with nested attributes. (#558)
Model data improvements
ModelTimeseries unit conversion handling and ModelObstype renaming for clearer semantics and consistency. (#543, #545)
Robustness and correctness
Fix for NaTType error in frequency estimation when variable list is empty. (#562) — thanks to @ADRIE-A3 for reporting (#561).
Safer gapfilling invocation by checking stations for obstype when GF is called on Dataset. (#566)
Standardize runtime warnings by converting them to structured logging. (#565)
Improved QC error handling on Dataset. (#560)
Developer experience and docs
Human-readable repr methods for main classes to aid debugging and inspection. (#568)
README updated to include conda install instructions and badge. (#555)
Potential behavior changes
Renamed/standardized “ModelObstype” naming and unit-conversion handling for model time series. Downstream user code referencing the old name or implicit conversions may need to adapt. (#543, #545)
Closed issues addressed in this release window
error importing data, NaTType in frequency for empty variable list — reported by @ADRIE-A3, fixed via (#562). (#561)
template_build_prompt() to accept arguments — opened by @pratiman-91. (#551)
Use of pint for units and conversion — opened by @pratiman-91. (#549)
Update docs to latest version — opened by @pratiman-91. (#547)
Update Repo About information — opened by @pratiman-91. (#548)
Contributors (thank you!)
Code contributions:
@vergauwenthomas (#543, #545, #560, #566)
@pratiman-91 (#555, #557)
@Copilot (app/bot) (#556, #558, #562, #565, #568)
Issue reporters:
@ADRIE-A3 (#561)
@pratiman-91 (#547, #548, #549, #551)
Included pull requests (since v0.4.3)
#543 — Modeltimeseries unit conv handling and modelobstype renaming. (@vergauwenthomas)
#545 — Modeltimeseries unit conv. (@vergauwenthomas)
#555 — Update README.md to include conda install and badge. (@pratiman-91)
#556 — Add to_parquet and to_csv methods for Dataset and Station classes. (@Copilot)
#557 — Parquet reader. (@pratiman-91)
#558 — Implement CF-compliant netCDF serialization for xarray Datasets with nested attributes. (@Copilot)
#560 — Qc on dataset error handling. (@vergauwenthomas)
#562 — Fix NaTType error in frequency estimation for empty variable lists. (@Copilot)
#565 — Standardize warning formatting by converting operational warnings to logging. (@Copilot)
#566 — Check stations for obstype when GF is called on Dataset. (@vergauwenthomas)
#568 — Implement human-readable repr methods for all main classes. (@Copilot)
v0.4.3
What's Changed
- v0.4.0 by @vergauwenthomas in #501
- Lcz safetynet by @vergauwenthomas in #505
- Gee extraction of dataset where some stations have no coordinates by @vergauwenthomas in #513
- Min three timestamps for freq est by @vergauwenthomas in #512
- Patches and bugfixes for v0.4.1 by @vergauwenthomas in #518
- Add_functionallity in all classes by @vergauwenthomas in #516
- Minor_patches by @vergauwenthomas in #519
- Add functionallity by @vergauwenthomas in #520
- Add_sensordata-functionality by @vergauwenthomas in #523
- Export_to_xarray by @vergauwenthomas in #525
- Sturcture_to_PyPA by @vergauwenthomas in #526
- Add comprehensive instructions for AI agent, code style, notebook, and testing by @vergauwenthomas in #530
- implement the seamask fix for LCZ by @vergauwenthomas in #531
- Add model data tests and update workflow to include new test file by @vergauwenthomas in #533
- code improvements by @vergauwenthomas in #536
- Modeldata extend by @vergauwenthomas in #539
Full Changelog: v0.4.0...v0.4.3
v0.4.0
What's Changed
- include a 'metadata-only' way of the Dataset. by @vergauwenthomas in #472
- small patches for v0.3.0 by @vergauwenthomas in #471
- publish action fix by @vergauwenthomas in #476
- include argument to specify colors in timeseriesplot with colorby=names by @vergauwenthomas in #479
- station specific style kwargs for timeseries plots by @vergauwenthomas in #482
- limit cartopy version so py39 is still valid by @vergauwenthomas in #484
- Version changed to "0.3.0a" & Fix threshold issue by @Im-Arth1307 in #481
- Add reference to gapfilling paper of Amber by @vergauwenthomas in #486
- Feat/chaining filling methods by @NoahVerhoeven in #492
- Workflows for v0.4 by @vergauwenthomas in #493
New Contributors
- @NoahVerhoeven made their first contribution in #492
Full Changelog: v0.3.0...v0.4.0
v0.4.0a
What's Changed
- include a 'metadata-only' way of the Dataset. by @vergauwenthomas in #472
- small patches for v0.3.0 by @vergauwenthomas in #471
- publish action fix by @vergauwenthomas in #476
- include argument to specify colors in timeseriesplot with colorby=names by @vergauwenthomas in #479
- station specific style kwargs for timeseries plots by @vergauwenthomas in #482
- limit cartopy version so py39 is still valid by @vergauwenthomas in #484
- Version changed to "0.3.0a" & Fix threshold issue by @Im-Arth1307 in #481
- Add reference to gapfilling paper of Amber by @vergauwenthomas in #486
- Feat/chaining filling methods by @NoahVerhoeven in #492
- Workflows for v0.4 by @vergauwenthomas in #493
New Contributors
- @Im-Arth1307 made their first contribution in #481
- @NoahVerhoeven made their first contribution in #492
Full Changelog: v0.3.0...v0.4.0a
v0.3.0
The following parts have undergone major revisions:
- Gaps: Missing observations no longer exist as a separate category. Everything that is missing is considered a gap.
- Gap filling: Multiple methods of varying complexity for filling gaps with modeldata.
- Template: Templates are now stored as JSON files and handled by a dedicated class.
- Modeldata: Modeldata has a specific class for static and dynamic datasets.
- Documentation: The API now has examples for all user-accessible functions and methods.
What's Changed
- Split Dataset over multiple thematic modules by @vergauwenthomas in #466
- Gap refactor v0.3.0 by @vergauwenthomas in #467
Full Changelog: v0.2.1...v0.3.0
v0.2.1
Templates are now handled by the Template() class, and JSON files are used to store templates.
What's Changed
- Template builder fixes by @vergauwenthomas in #462
- Template class by @vergauwenthomas in #463
- update template in doc api by @vergauwenthomas in #465
Full Changelog: v0.2.0...v0.2.1