26 create data pipeline for standard wrf hydro scenario creation#362
26 create data pipeline for standard wrf hydro scenario creation#362Martin20494 wants to merge 10 commits intomasterfrom
Conversation
- Add more packages into the environment.yml - Add necessary python modules from WRF-Hydro for preprocessing step - Create a script to run all python modules
- Add three files including three classes: one is to generate domain files, one is to generate routing stack, and the other one is to generate forcing data from ERA5. The final class to run WRF-Hydro simulation will be added later.
- Add more files: * WRFHydroSimulation.py is to run simulations * simulatedResultManipulation.py is to flip and reproject streamflow. This file is potentially for manipulating more results * RUN_all.py is to run whole process to run WRF-Hydro simulation
| # A Window path where geogrid folder, create_soilproperties.R, create_wrfinput.R, | ||
| # CHANPARM.TBL, GENPARM.TBL, HYDRO.TBL, MPTABLE.TBL, and SOILPARM.TBL are located | ||
| # [CHANGE INTO DIGITAL TWIN PATH - THIS SHOULD BE FIXED] | ||
| self.window_geogrid_path = r"C:\Users\mng42\wrf_wps\wrf_hydro_inputs\simulation_mataura_50m_002\geogrid" |
There was a problem hiding this comment.
This folder should be accessible to the digital twin in some way
- run_all.py has not been adjusted yet - Silent pylint (needs checking): * wrfhydro_simulation.py: #pylint: disable=too-many-instance-attributes, invalid-name * domain_files.py: #pylint: disable=too-many-instance-attributes * forcing_dataset.py: #pylint: disable=differing-param-doc, differing-type-doc, wrong-import-position - forcing_dataset.py needs to fix longitude - Add more packages in enviroment
wrfhydro/example/forcing_dataset.py
Outdated
| os.environ["ESMFMKFILE"] = r"C:\Users\mng42\AppData\Local\anaconda3\envs\wrfhydro_005\Library\lib\esmf.mk" | ||
| from datetime import datetime, timedelta |
There was a problem hiding this comment.
Luke to come back and help remove this environ check
| Definition: | ||
| Init function to state common arguments | ||
| References: |
There was a problem hiding this comment.
Match the style from the main codebase (numpy style docs)
LukeParky
left a comment
There was a problem hiding this comment.
Really good code, I think we should create another issue about using WSL/Linux in a more native way. My main points of improvement I want to see before merging are around using the type system more where it can benefit you and help others understand.
Also there are still linting errors and differences between the documentation style.
Would you like to handle the documentation style in a different PR?
Feel free to comment on my suggestions if you need more info or don't agree with any.
|
|
||
| return selected_as_rectangle_2193 | ||
|
|
||
| return selected_as_rectangle_2193 |
There was a problem hiding this comment.
Looks like this might be an accidental change, set it back.
Adding a suggestion that you can click on to revert it if it's accidental.
| return selected_as_rectangle_2193 | |
| return selected_as_rectangle_2193 |
| - conda-forge | ||
| - defaults | ||
| dependencies: | ||
| - python==3.11 # 11 for StrEnum |
There was a problem hiding this comment.
Lots of dependencies, so I'll come back and check these. Do you think this can be merged with the main environment.yml so we only have one?
| - cdsapi # For WRF-Hydro - download ERA5 files | ||
| - xesmf # For WRF-Hydro - regrid | ||
| - esmpy # For WRF-Hydro - regrid | ||
| - esmf # For WRF-Hydro - regrid | ||
| - xarray # For WRF-Hydro - handle geospatial array data | ||
| - netCDF4 # For WRF-Hydro - handle geospatial array data | ||
| - rioxarray # For WRF-Hydro - handle geospatial array data | ||
| - matplotlib # For WRF-Hydro - plot results (MAYBE) | ||
| - numpy # For WRF-Hydro - handle array data | ||
| - pandas=1.5.3 # For WRF-Hydro - handle tabular data | ||
| - hdf5 # For WRF-Hydro - handle geospatial array data | ||
| - gdal=3.6.2 # For WRF-Hydro - handle geospatial array data |
There was a problem hiding this comment.
note for self: come back and check dependencies
There was a problem hiding this comment.
Do we need an example directory? Can this all be moved into other folders depending on their use? And then it will evolve to be the final production code
| @@ -0,0 +1,367 @@ | |||
| """Runs to generate domain files which are inputs for WRF-Hydro""" | |||
| # pylint: disable=too-many-instance-attributes | |||
There was a problem hiding this comment.
This should only be used on the class it's needed for, not the whole file
| CHRTOUT_DOMAIN: int = 1, | ||
| CHANOBS_DOMAIN: int = 1, | ||
| CHRTOUT_GRID: int = 1, | ||
| LSMOUT_DOMAIN: int = 1, | ||
| RTOUT_DOMAIN: int = 1, | ||
| output_gw: int = 1, | ||
| outlake: int = 0, | ||
| frxst_pts_out: int = 1 | ||
| ) -> None: |
There was a problem hiding this comment.
Can these parameters be bool, so that we can use the rich python typing structure, and only convert to int when you're writing the file?
| content = f"""&HYDRO_nlist | ||
| !!!! ---------------------- SYSTEM COUPLING ----------------------- !!!! | ||
|
|
||
| ! Specify what is being coupled: 1=HRLDAS (offline Noah-LSM), 2=WRF, 3=NASA/LIS, 4=CLM | ||
| sys_cpl = 1 | ||
|
|
||
| !!!! ------------------- MODEL INPUT DATA FILES ------------------- !!!! | ||
|
|
There was a problem hiding this comment.
I think a lot of these large content files could be stored as string template files that are read from disk, what are your thoughts?
| for each_file_path in selected_all_files_paths: | ||
| shutil.copy2( | ||
| str(Path(each_file_path)), | ||
| str(Path(self.domain_path) / Path(each_file_path).name) |
There was a problem hiding this comment.
See recommendations earlier about using Path without converting to str
| Arguments: | ||
| Already defined above. |
There was a problem hiding this comment.
| Arguments: | |
| Already defined above. |
No use having that line
There was a problem hiding this comment.
Looks repeated in other places too
| Already defined above. | ||
| """ | ||
| # Define command notes and log name | ||
| command_notes = r"mpirun -np 2 ./wrf_hydro_NoahMP.exe >> run.log 2>&1" |
There was a problem hiding this comment.
Have you experimented with higher values of -np? I think it is the multithreading number of processes variable
DESCRIPTION OF PR:
Develop automatic process to simulate streamflow data using WRF-Hydro open source.
Developer Checklist
Reviewer Checklist