CI/CD infrastructure and testing improvements#18
CI/CD infrastructure and testing improvements#18bernalde wants to merge 4 commits intoLyoHUB:mainfrom
Conversation
CI/CD Workflows: - Update GitHub Actions workflows (tests.yml, pr-tests.yml, slow-tests.yml) with conda environment setup, caching, and parallel test execution Build/Config: - Add pytest.ini with markers, parallel execution, and strict mode - Add requirements.txt and requirements-dev.txt - Add run_local_ci.sh for local CI reproduction - Update .gitignore for coverage artifacts and test data Test Infrastructure: - Update conftest.py with shared fixtures - Add 8 new test modules: test_calculators, test_calc_unknownRp_coverage, test_coverage_gaps, test_opt_Pch_Tsh_coverage, test_opt_Pch_coverage, test_optimizer, test_regression, test_web_interface - Fix existing tests: test_calc_knownRp, test_freezing, test_functions, test_main, test_opt_Pch, test_opt_Pch_Tsh, test_opt_Tsh Source Bugfixes: - Fix mTorr conversion in calc_knownRp.py - Fix fill_output indexing in functions.py - Fix hasattr check in opt_Tsh.py - Clean up comments in constant.py, opt_Pch_Tsh.py Test Data: - Add reference_optimizer.csv for regression testing
bernalde
left a comment
There was a problem hiding this comment.
Comprehensive PR Review: CI/CD infrastructure and testing improvements
I tested the full suite locally (212 passed, 1 failed, 3 skipped) on Python 3.13.5 with NumPy 2.2. The one failure (test_opt_pch_reference) is a floating-point tolerance issue addressed in the inline comments below.
Overall this is a strong contribution that adds significant test infrastructure. The issues below range from bugs to minor style suggestions.
Summary of findings
Dependencies (2)
pytest-mockis missing fromrequirements-dev.txt-- multiple tests use themockerfixture and will fail on a clean installnp.trapezoidrequires NumPy >= 2.0, butrequirements.txtallowsnumpy>=1.24.0
Bugs in tests (3)
3. Several assertions compare column 6 (percent dried, 0-100 scale) against 0.95/0.99 instead of 95.0/99.0, making them trivially true (test_optimizer.py:268, test_opt_Pch_Tsh_coverage.py:487,508)
4. test_opt_pch_reference tolerance of 1e-6 hr is too tight for optimizer comparison -- fails on current code
5. test_higher_pressure_dries_faster in test_calculators.py computes time_low/time_high but never compares them
Bugs in scripts (1)
6. run_local_ci.sh: set -e makes the failure branch (lines 61-69) unreachable dead code
CI/CD improvements (4)
7. Redundant triple pip install in all CI workflows -- should be a single pip install -e ".[dev]"
8. Notebook tests excluded from PR CI but included in main-branch CI (tests.yml) -- a regression gap
9. Stale "PR #8" comment in pyomo test sections
10. run_local_ci.sh uses grep -oP (GNU-only; fails on macOS)
Code quality (5)
11. calc_knownRp.py should use constant.Torr_to_mTorr instead of hardcoded 1000.0
12. Duplicate Ineq_Constraints tests in both test_functions.py and test_coverage_gaps.py
13. Several copy-pasted docstrings that don't match the actual test (noted by @parkyr in prior review as well)
14. Relative paths for test data files (fragile when running from non-root directories; conftest.py already provides a reference_data_path fixture)
15. pytest.ini minversion = 3.8 is too low -- should match pytest>=7.4.0 from requirements-dev.txt
16. Leftover debug print statement in test_functions.py line 502 -- should be removed
See inline comments for details.
bernalde
left a comment
There was a problem hiding this comment.
Review Summary
I ran the full test suite (excluding notebook and pyomo markers) in the lyopronto conda environment with Python 3.13: 213 passed, 3 skipped, 0 failures — consistent with the PR description. However, several test assertions are vacuously true and would not catch real regressions (details in inline comments).
Source bugfixes assessment
calc_knownRp.py:66— mTorr conversion: Correct fix. The early-return path was missing the Torr→mTorr multiplication that the normal output path applies viaPch * constant.Torr_to_mTorrincalc_step(functions.py:389).functions.py—RampInterpolatordt_setpt indexing: Correct fix. Thehas_init/dt_idxlogic properly distinguishes the init vs. no-init paths, and clampingholdtimeto 0 is better than the previousramptime > holdtimecheck.functions.py—fill_outputindexing: Correct fix.interp_points[mask, :][0]correctly selects the first matching row.opt_Pch_Tsh.pyandopt_Tsh.py: Comment-style changes only. No functional change.
High-priority items
| Priority | File | Issue |
|---|---|---|
| Bug | test_regression.py |
>= 0.99 on percent column (0-100 scale) should be >= 99.0 |
| Config | pytest.ini + pyproject.toml |
Dual pytest config; pytest.ini silently overrides pyproject.toml |
| CI | All workflows | Triple pip install should be one pip install -e '.[dev]' |
| Bug | test_opt_Pch.py:68 |
Equipment capability formula inconsistent with test_opt_Pch_coverage.py:180 |
| Tests | test_optimizer.py / test_opt_Tsh.py |
Near-identical duplicate files |
Issues outside diff context (cannot attach inline)
tests/test_functions.py:317 — Duplicate test class:
The TestIneqConstraints class (lines 317-381) is an exact duplicate of TestFunctionsCoverageGaps in test_coverage_gaps.py lines 10-77. Same function, same parameter values, same assertions. One should be removed.
tests/test_opt_Pch.py:68 — Equipment capability formula:
The check computes actual_cap = eq_cap['a'] + eq_cap['b'] * Pch (total capacity) and compares against per-vial dmdt. But test_opt_Pch_coverage.py:180 divides by nVial. Looking at Ineq_Constraints in functions.py (a + b*Pch - nVial*dmdt), the per-vial capacity is (a + b*Pch) / nVial, so test_opt_Pch_coverage.py is correct and this one is wrong. Same inconsistency in test_opt_Pch_Tsh.py:128 vs test_opt_Pch_Tsh_coverage.py:184.
Test results
213 passed, 3 skipped, 21 deselected, 2067 warnings in 428s
The 2067 warnings are almost entirely calc_unknownRp.py:71 "No sublimation" warnings flooding during test_unknown_rp_different_initial_pressure. The --disable-warnings in pytest.ini masks this in normal runs.
Skipped tests:
test_design_space_shelf_temp_ramp_down— explicit@pytest.mark.skiptest_optimizer_example_script_runs(x2) — skipped by condition
(Please disregard the "Test comment N" reviews above — those were individual validation attempts before the final submission.)
Source fixes: - calc_knownRp.py: use constant.Torr_to_mTorr instead of literal 1000.0 in early-return path, with regression test Test fixes: - Fix percent-dried scale bug: 0.99 -> 99.0 across test_regression.py (5), test_optimizer.py (1), test_opt_Pch_Tsh_coverage.py (2+2), test_opt_Pch_coverage.py (2) - test_coverage_gaps.py: remove duplicate TestIneqConstraints class - test_functions.py: add value assertions to TestIneqConstraints - test_regression.py: skip column 0 (time) in stability test - test_calculators.py: add meaningful assertion in test_higher_pressure_dries_faster - test_web_interface.py, test_calc_unknownRp_coverage.py: use reference_data_path fixture instead of relative paths - test_calc_knownRp.py: add regression test for mTorr conversion Config fixes: - pyproject.toml: remove duplicate [tool.pytest.ini_options] (pytest.ini takes precedence) - pytest.ini: update minversion from 3.8 to 7.4 - requirements-dev.txt: add pytest-mock (matches pyproject.toml [dev]) CI fixes: - pr-tests.yml, tests.yml, slow-tests.yml: consolidate triple pip install into single pip install -e ".[dev]" - Remove hardcoded PR #8 reference from pyomo fallback messages Script fixes: - run_local_ci.sh: remove emojis, replace grep -oP with portable -oE, remove set -e/dead-code if-block, remove hardcoded -n 8, use pip install -e ".[dev]" Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ebase Standardize documentation convention for units: use square brackets (e.g., [cm], [Torr], [degC]) instead of bare units or "in unit" style. This follows the convention already established in opt_Tsh.py, opt_Pch_Tsh.py, and constant.py. Updated files: functions.py, freezing.py, calc_knownRp.py, calc_unknownRp.py, opt_Pch.py, design_space.py, high_level.py, and corresponding test files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Bump numpy lower bound to >=2.0.0 (np.trapezoid requires NumPy 2.0+) - Update Python requirement to >=3.9 (NumPy 2.0 minimum) - Remove Python 3.8 classifier - Relax flaky optimizer test tolerance from 1e-6 to 1e-3 hr - Fix copy-paste docstring in test_opt_Tsh.py test_short_time Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
This PR adds CI/CD infrastructure and comprehensive testing to LyoPRONTO. It is a cleaned-up replacement for PR #6, trimmed to only CI/CD, testing, and essential source bugfixes (~30 files instead of ~141).
Changes
CI/CD Workflows (3 modified)
tests.yml,pr-tests.yml,slow-tests.yml) with conda environment setup, caching, and parallel test executionBuild/Config (4 new, 1 modified)
pytest.ini— markers, parallel execution, strict moderequirements.txt/requirements-dev.txt— dependency filesrun_local_ci.sh— local CI reproduction script.gitignore— coverage artifacts and test data exclusionsTest Infrastructure (8 new, 8 modified)
test_calculators,test_calc_unknownRp_coverage,test_coverage_gaps,test_opt_Pch_Tsh_coverage,test_opt_Pch_coverage,test_optimizer,test_regression,test_web_interfaceconftest.pywith shared fixturesSource Bugfixes (5 modified)
calc_knownRp.pyfill_outputindexing infunctions.pyhasattrcheck inopt_Tsh.pyconstant.py,opt_Pch_Tsh.pyTest Data (1 new)
reference_optimizer.csvfor regression testingTest Results
lyoprontoconda environment with Python 3.13Relation to PR #6
This replaces PR #6 which had grown to 141 files including benchmarks, documentation, examples, AI config, and generated artifacts. This PR contains only the CI/CD and testing subset. The excluded content (docs, benchmarks, examples) will be submitted in separate focused PRs.