Interval POMDPs, POSMG rewards, POSMG constraints#70
Interval POMDPs, POSMG rewards, POSMG constraints#70PurpleDragon64 wants to merge 150 commits intorandriu:masterfrom
Conversation
… initial_memory setting to POSMG; updated POSMG synthesis to display game iterations; added new test POSMG model
TheGreatfpmK
left a comment
There was a problem hiding this comment.
- Remove the pomdp/sketches from models as these are used for policy tree experiments which is not part of this PR.
- Add some tests if possible
| type = line.removeprefix(cls.TYPE_PREFIX).removesuffix('\n') | ||
| return type | ||
| raise ValueError | ||
| if cls.INTERVAL_BEGINNING in line: |
There was a problem hiding this comment.
I don't like that we are going through to whole file if it's not an interval model but I don't know how to make this better
There was a problem hiding this comment.
The problem is that not all transitions in the model have to be specified as intervals for the model to be an interval pomdp. In other words: if there is at least one transition with intervals, it is an interval model. So you cannot be sure, until you check the whole file. See e.g. model models/ipomdp/simple1/sketch.templ where action 1 is not defined with intervals but action 0 is.
| if updated is not None: explicit_quotient = updated | ||
| if not payntbind.synthesis.assertChoiceLabelingIsCanonic(explicit_quotient.nondeterministic_choice_indices,explicit_quotient.choice_labeling,False): | ||
| logger.warning("WARNING: choice labeling for the quotient is not canonic") | ||
| # TEMPORARY FIX |
There was a problem hiding this comment.
Would this be difficult to implement?
paynt/quotient/ipomdp.py
Outdated
| self.ipomdp = ipomdp | ||
| self.specification = specification | ||
|
|
||
| logger.debug(f'ipomdp has {max(self.ipomdp.observations)+1} observations') |
There was a problem hiding this comment.
Why is it max() here?
There was a problem hiding this comment.
The goal is to print the number of distinct observations in the model. It was now improved to not require observations to be continuous sequence of numbers starting with 0.
| synthesis_timer.stop() | ||
| time = synthesis_timer.time | ||
| logger.info(f'synthesis completed, value: {round(value, 6)}, time: {round(time, 2)} s') | ||
| # better summary?? use Statistic class? (specification, game iterations, ...) |
There was a problem hiding this comment.
This would be nice to be more in line with other synthesizers
There was a problem hiding this comment.
Improved by adding statistics for ipomdp
| import paynt.quotient.pomdp | ||
| import paynt.synthesizer.synthesizer_ar | ||
|
|
||
| class SynthesizerPomdpOneByOne(paynt.synthesizer.synthesizer.Synthesizer): |
There was a problem hiding this comment.
Do we need this one-be-one synthesis in master? I would probably remove this, I understand it was used for experiments but this is not something anyone will use in PAYNT
There was a problem hiding this comment.
And if we want to keep it then I would look into how to make use of the one-by-one synthesizer that's already in PAYNT
There was a problem hiding this comment.
Resolved by removing POMDP family related code.
paynt/quotient/ipomdp.py
Outdated
| # the new states will have new action representing combinations of lower and upper bound of the interval | ||
| # new states (and their actions) will be at the end of the matrix | ||
| # IDEA use p1state,p2state,choice(action),destination,transition,probability or originalState,newState,row,column,entry,value? | ||
| # IDEA return just new state count instead of new states |
There was a problem hiding this comment.
Clean up the comments so it's only the important stuff, if something is potential TODO add it to where it belongs
There was a problem hiding this comment.
removed ideas, kept description
| posmg = payntbind.synthesis.posmg_from_smg(smg, observations) | ||
| logger.debug(f'constructed game abstraction having {posmg.nr_states} states and {posmg.nr_choices} choices.') | ||
|
|
||
| return posmg |
There was a problem hiding this comment.
So the whole purpose of this class and all the function inside is to create the POSMG right?
There was a problem hiding this comment.
Yes. And the class also stores the specification.
There was a problem hiding this comment.
Pull Request Overview
This PR adds end-to-end support for interval POMDPs by extending the parser, quotient, POSMG reward handling, and synthesizer, along with corresponding tests.
- Introduce
IpomdpQuotientfor interval POMDP abstraction into POSMG games. - Enhance
PosmgManagerto construct and propagate SMG reward models. - Update DRN parser to recognize and build interval models, and wire through a new
SynthesizerIpomdp. - Add a comprehensive test suite for IPOMDP abstraction and synthesis.
Reviewed Changes
Copilot reviewed 126 out of 136 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/ipomdp/test_ipomdp.py | New tests validating IPOMDP abstraction and solving |
| paynt/quotient/ipomdp.py | Implements IPOMDP-to-POSMG quotient logic |
| paynt/parser/drn_parser.py | Detects IPOMDP sketches and builds interval models |
| payntbind/src/synthesis/posmg/PosmgManager.cpp | Adds constructRewardModel and integrates reward models |
| paynt/synthesizer/synthesizer_ipomdp.py | New synthesizer class for IPOMDP using POSMG abstraction |
| paynt/synthesizer/policy_tree.py | Refactors game-stat logging into log_game_stats |
Comments suppressed due to low confidence (1)
paynt/parser/drn_parser.py:56
- [nitpick] Using
typeshadows the built-in. Consider renaming this variable tomodel_typeor similar.
type = line.removeprefix(cls.TYPE_PREFIX).removesuffix('\n')
paynt/synthesizer/policy_tree.py
Outdated
|
|
||
|
|
||
| def log_game_stats(self, states, game_solver): | ||
| self.stat.iteration_game(states) |
There was a problem hiding this comment.
[nitpick] The game_solver parameter is unused in log_game_stats. Either remove it or use it for logging.
| self.stat.iteration_game(states) | |
| self.stat.iteration_game(states) | |
| logger.info("Game solver stats: solution value = {}, solution state-to-action mapping size = {}".format( | |
| game_solver.solution_value, len(game_solver.solution_state_to_player1_action))) |
There was a problem hiding this comment.
I said this in my review as well. Please resolve!
There was a problem hiding this comment.
Resolved by removing POMDP family related code.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
No description provided.