Remove the InSAR workflow intermediate files #65
base: develop
…into rm_scratch
Thanks @xhuang-jpl for this PR. Would you please remind me how this PR reduces storage requirements when running InSAR? And would you please document your tests?
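One way to document the storage effect asked about here is to record the size of the scratch folder after each workflow step, with and without removal enabled. The following is a minimal sketch under that assumption; `dir_size_bytes` is a hypothetical helper, not part of isce3, and it only uses the standard library.

```python
import pathlib


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path.

    Returns 0 if the path does not exist (for example, after it was removed).
    """
    root = pathlib.Path(path)
    if not root.exists():
        return 0
    # rglob("*") walks the tree recursively; directories themselves are skipped.
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())
```

Logging this value for `scratch_path` after each step in both configurations would give a simple before/after comparison of peak scratch usage.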
Tyler-g-hudson
left a comment
LGTM! A couple of small nitpicks, nothing that should block merging
| t_elapsed = time.time() - t_all | ||
| info_channel.log(f"successfully ran prepare_insar_hdf5 in {t_elapsed:.3f} seconds") |
What does this do? These lines seem unrelated to the PR
oberonia78
left a comment
Thanks @xhuang-jpl, but I found a bug in the code. Would you be able to fix it?
| for workflow_name in ['rubbersheet_offsets','ionosphere', 'geo2rdr']: | ||
| workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}") | ||
| if workflow_scratch_path.exists() and intermediate_files_removal_flag: | ||
| shutil.rmtree(workflow_scratch_path) | ||
| info_channel.log(f"removed the {workflow_scratch_path} folder") |
| for workflow_name in ['rubbersheet_offsets','ionosphere', 'geo2rdr']: | |
| workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}") | |
| if workflow_scratch_path.exists() and intermediate_files_removal_flag: | |
| shutil.rmtree(workflow_scratch_path) | |
| info_channel.log(f"removed the {workflow_scratch_path} folder") | |
| for workflow_name in ['rubbersheet_offsets', 'geo2rdr']: | |
| workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}") | |
| if workflow_scratch_path.exists() and intermediate_files_removal_flag: | |
| shutil.rmtree(workflow_scratch_path) | |
| info_channel.log(f"removed the {workflow_scratch_path} folder") |
The ionosphere directory needs to be kept until geocoding when the ionosphere method is main_diff_ms_band, because the ionosphere output in the scratch directory is geocoded and stored in the GUNW product.
FYI, I got the following error:
Traceback (most recent call last):
File "/scratch/jungkyoj/tool/isce3_cuda/src/isce/python/packages/nisar/workflows/insar.py", line 188, in <module>
run(insar_runcfg.cfg, out_paths, persist.run_steps)
File "/scratch/jungkyoj/tool/isce3_cuda/src/isce/python/packages/nisar/workflows/insar.py", line 131, in run
geocode_insar.run(cfg, out_paths['RUNW'], out_paths['GUNW'], InputProduct.RUNW)
File "/scratch/jungkyoj/tool/isce3_cuda/install/packages/nisar/workflows/geocode_insar.py", line 67, in run
cpu_run(cfg, input_hdf5, output_hdf5, input_product_type)
File "/scratch/jungkyoj/tool/isce3_cuda/install/packages/nisar/workflows/geocode_insar.py", line 691, in cpu_run
cpu_geocode_rasters(geocode_obj, geo_datasets, desired,
File "/scratch/jungkyoj/tool/isce3_cuda/install/packages/nisar/workflows/geocode_insar.py", line 501, in cpu_geocode_rasters
get_raster_lists(geo_datasets, desired, freq, pol_list, input_hdf5,
File "/scratch/jungkyoj/tool/isce3_cuda/install/packages/nisar/workflows/geocode_insar.py", line 444, in get_raster_lists
raster, path = get_ds_input_output(
^^^^^^^^^^^^^^^^^^^^
File "/scratch/jungkyoj/tool/isce3_cuda/install/packages/nisar/workflows/geocode_insar.py", line 173, in get_ds_input_output
input_raster = isce3.io.Raster(input_raster_str)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Error in file /scratch/jungkyoj/tool/isce3_cuda/src/isce/cxx/isce3/io/Raster.cpp, line 28, function isce3::io::Raster::Raster(const std::string&, GDALAccess): failed to create GDAL dataset from file 'HDF5:scratch_dir3/ionosphere/main_diff_ms_band/RUNW.h5://science/LSAR/RUNW/swaths/frequencyB/interferogram/HH/ionospherePhaseScreen'
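The constraint described in this comment (keep `ionosphere` until geocoding has read it) can be sketched as a small helper with an explicit keep-list. This is a minimal illustration, not the code in this PR: `remove_scratch_dirs` and its `keep` parameter are hypothetical names, and only `pathlib` and `shutil` from the standard library are used.

```python
import pathlib
import shutil


def remove_scratch_dirs(scratch_path, workflow_names, enabled=True, keep=()):
    """Delete the named subfolders of scratch_path and return the removed paths.

    Hypothetical helper: folders listed in `keep` are skipped so that a later
    step (for example, geocode reading the ionosphere output) can still use them.
    """
    removed = []
    if not enabled:
        return removed
    for name in workflow_names:
        if name in keep:
            continue  # still needed by a later step
        path = pathlib.Path(scratch_path) / name
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(path)
    return removed
```

Calling it with `keep=("ionosphere",)` before `geocode_insar.run(...)`, and again for `["ionosphere"]` afterwards, would leave the ionosphere scratch output in place for the geocode step.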
| baseline.run(cfg, out_paths) | ||
|
|
||
| # Remove the 'bandpass','baseline' scratch folders | ||
| for workflow_name in ['bandpass','baseline']: |
| for workflow_name in ['bandpass','baseline']: | |
| for workflow_name in ['bandpass', 'baseline']: |
nit
|
|
||
| # Remove the 'bandpass','baseline' scratch folders | ||
| for workflow_name in ['bandpass','baseline']: | ||
| workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}") |
| workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}") | |
| workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}") |
nitpick
| baseline.run(cfg, out_paths) | ||
|
|
||
| # Remove the 'bandpass','baseline' scratch folders | ||
| for workflow_name in ['bandpass','baseline']: |
| for workflow_name in ['bandpass','baseline']: | |
| for workflow_name in ['bandpass', 'baseline', 'ionosphere']: |
| troposphere.run(cfg, out_paths['GUNW']) | ||
|
|
||
| # Remove the troposphere scratch folder | ||
| tropo_scratch_path = pathlib.Path(f"{scratch_path}/weather_model_files") |
| tropo_scratch_path = pathlib.Path(f"{scratch_path}/weather_model_files") | |
| tropo_scratch_path = pathlib.Path(f"{scratch_path}/weather_model_files") |
nitpick
| geocode_insar.run(cfg, out_paths['ROFF'], out_paths['GOFF'], InputProduct.ROFF) | ||
|
|
||
| # Remove the geocode scratch folder | ||
| geocode_scratch_path = pathlib.Path(f"{scratch_path}/geocode_corrections") |
| geocode_scratch_path = pathlib.Path(f"{scratch_path}/geocode_corrections") | |
| geocode_scratch_path = pathlib.Path(f"{scratch_path}/geocode_corrections") |
nitpick
|
|
||
| # Remove the 'rubbersheet_offsets','ionosphere', 'geo2rdr' scratch folders | ||
| for workflow_name in ['rubbersheet_offsets','ionosphere', 'geo2rdr']: | ||
| workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}") |
| workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}") | |
| workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}") |
nitpick
| unwrap.run(cfg, out_paths['RIFG'], out_paths['RUNW']) | ||
|
|
||
| # Remove the 'fine_resample_slc','crossmul', 'coarse_resample_slc', 'unwrap' scratch folders | ||
| for workflow_name in ['fine_resample_slc','coarse_resample_slc', |
| for workflow_name in ['fine_resample_slc','coarse_resample_slc', | |
| for workflow_name in ['fine_resample_slc', 'coarse_resample_slc', |
nitpick
| resample_slc_v2.run(cfg, 'fine') | ||
|
|
||
| # Remove the coarse resample scratch folder | ||
| coarse_resample_scratch_path = pathlib.Path(f"{scratch_path}/coarse_resample_slc") |
| coarse_resample_scratch_path = pathlib.Path(f"{scratch_path}/coarse_resample_slc") | |
| coarse_resample_scratch_path = pathlib.Path(f"{scratch_path}/coarse_resample_slc") |
nitpick
Thank you @Tyler-g-hudson and @oberonia78, I have addressed your comments. Please take one more look.
Still looks good to me, pending testing and final approval by @oberonia78.
@oberonia78 please take another look at this.
@oberonia78 If you are OK with this change, would you mind approving it? Thanks
oberonia78
left a comment
LGTM
This PR removes the InSAR workflow intermediate files under the scratch folder. The outputs and dependencies of each module can be found here.
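Given a table of which step reads which scratch folder, the point at which each folder becomes safe to delete can be derived mechanically: it is the last step that reads it. The sketch below illustrates that idea only. `cleanup_plan` is a hypothetical function, and the entries in `EXAMPLE_STEPS` are placeholder assumptions rather than the actual module dependencies, except that geocode reading the `ionosphere` folder follows the review discussion.

```python
from collections import defaultdict


def cleanup_plan(steps):
    """Map each step name to the scratch folders deletable once it finishes.

    `steps` is an ordered list of (step_name, folders_read) pairs.
    """
    last_reader = {}
    for step_name, folders_read in steps:
        for folder in folders_read:
            # Later steps overwrite earlier ones, leaving the final reader.
            last_reader[folder] = step_name
    plan = defaultdict(list)
    for folder, step_name in last_reader.items():
        plan[step_name].append(folder)
    return dict(plan)


# Placeholder dependency table (assumed for illustration, not taken from isce3).
EXAMPLE_STEPS = [
    ("crossmul", ["fine_resample_slc"]),
    ("ionosphere", ["fine_resample_slc"]),
    ("unwrap", ["crossmul"]),
    ("geocode", ["ionosphere"]),
]

plan = cleanup_plan(EXAMPLE_STEPS)
```

With this placeholder table, `fine_resample_slc` is scheduled for removal after the ionosphere step rather than after crossmul, and the `ionosphere` folder only after geocode.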