@xhuang-jpl
Contributor

This PR removes the InSAR workflow's intermediate files from the scratch folder. The outputs and dependencies of each module can be found here.
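
For readers skimming the thread, the cleanup pattern quoted in the review comments further down (check that a scratch subfolder exists, remove it if the removal flag is set, log the removal) could be factored into one helper. The following is only a minimal sketch under stated assumptions: `remove_scratch_dirs` and its `enabled` parameter are hypothetical names, and the standard `logging` module stands in for the workflow's `info_channel`. It is not the code merged in this PR.

```python
import logging
import pathlib
import shutil

# Assumption: a plain stdlib logger replaces the workflow's info_channel.
logger = logging.getLogger("insar_scratch_cleanup")


def remove_scratch_dirs(scratch_path, workflow_names, enabled=True):
    """Remove the named subfolders of scratch_path and return the removed paths.

    Hypothetical helper: `enabled` plays the role of the
    intermediate_files_removal_flag quoted in the review comments.
    """
    removed = []
    if not enabled:
        return removed
    for name in workflow_names:
        workflow_scratch_path = pathlib.Path(scratch_path) / name
        # Folders that were never created (or already removed) are skipped.
        if workflow_scratch_path.is_dir():
            shutil.rmtree(workflow_scratch_path)
            logger.info("removed the %s folder", workflow_scratch_path)
            removed.append(workflow_scratch_path)
    return removed
```

A call such as `remove_scratch_dirs(scratch_path, ['rubbersheet_offsets', 'geo2rdr'], enabled=flag)` would then replace each repeated loop, which keeps the list of folder names as the only thing that differs between call sites.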

Xiaodong Huang added 22 commits September 19, 2023 20:40
@hfattahi
Contributor

hfattahi commented Aug 7, 2025

Thanks @xhuang-jpl for this PR. Would you please remind me how this PR reduces storage requirements when running InSAR? Would you also please document your tests?
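
One way the storage question could be documented is to record the size of each scratch subfolder before and after a run with the removal flag on. The sketch below is an illustration only: `dir_size_bytes` and `scratch_usage` are hypothetical helpers, not part of the workflow, and the folder names passed in are whatever the tester chooses to measure.

```python
import pathlib


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path (0 if missing)."""
    root = pathlib.Path(path)
    if not root.exists():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def scratch_usage(scratch_path, workflow_names):
    """Map each scratch subfolder name to its current size in bytes."""
    base = pathlib.Path(scratch_path)
    return {name: dir_size_bytes(base / name) for name in workflow_names}
```

Calling `scratch_usage(scratch_path, names)` once with the flag off and once with it on gives a per-folder comparison that can be pasted into the PR as test evidence.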

@hfattahi hfattahi added this to the R05.00.1 milestone Oct 30, 2025
Contributor

@Tyler-g-hudson Tyler-g-hudson left a comment

LGTM! A couple of small nitpicks, nothing that should block merging.

Comment on lines +68 to +69
t_elapsed = time.time() - t_all
info_channel.log(f"successfully ran prepare_insar_hdf5 in {t_elapsed:.3f} seconds")

What does this do? These lines seem unrelated to the PR
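
For context on the quoted lines: they follow a common elapsed-time logging pattern, where a start timestamp is taken before the work and the difference is logged afterwards. The sketch below illustrates that pattern only. `run_timed` is a hypothetical wrapper, `print` stands in for `info_channel.log`, and it is an assumption that `t_all` in the quoted code was set with `time.time()` at the start of the function. Whether the lines belong in this PR is a separate question for the author.

```python
import time


def run_timed(step, log=print):
    """Run step(), log how long it took, and return (result, elapsed_seconds)."""
    t_start = time.time()  # assumed counterpart of t_all in the quoted code
    result = step()
    t_elapsed = time.time() - t_start
    log(f"successfully ran {step.__name__} in {t_elapsed:.3f} seconds")
    return result, t_elapsed
```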

Contributor

@oberonia78 oberonia78 left a comment

Thanks @xhuang-jpl, but I found a bug in the code. Would you be able to fix it?

Comment on lines 120 to 124
for workflow_name in ['rubbersheet_offsets','ionosphere', 'geo2rdr']:
    workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}")
    if workflow_scratch_path.exists() and intermediate_files_removal_flag:
        shutil.rmtree(workflow_scratch_path)
        info_channel.log(f"removed the {workflow_scratch_path} folder")

Suggested change
- for workflow_name in ['rubbersheet_offsets','ionosphere', 'geo2rdr']:
-     workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}")
-     if workflow_scratch_path.exists() and intermediate_files_removal_flag:
-         shutil.rmtree(workflow_scratch_path)
-         info_channel.log(f"removed the {workflow_scratch_path} folder")
+ for workflow_name in ['rubbersheet_offsets', 'geo2rdr']:
+     workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}")
+     if workflow_scratch_path.exists() and intermediate_files_removal_flag:
+         shutil.rmtree(workflow_scratch_path)
+         info_channel.log(f"removed the {workflow_scratch_path} folder")

The ionosphere directory needs to be kept until geocoding when the ionosphere method is main_diff_ms_band, because the ionosphere data in the scratch directory is geocoded and stored in the GUNW.

FYI, I got this error:

Traceback (most recent call last):
  File "/scratch/jungkyoj/tool/isce3_cuda/src/isce/python/packages/nisar/workflows/insar.py", line 188, in <module>
    run(insar_runcfg.cfg, out_paths, persist.run_steps)
  File "/scratch/jungkyoj/tool/isce3_cuda/src/isce/python/packages/nisar/workflows/insar.py", line 131, in run
    geocode_insar.run(cfg, out_paths['RUNW'], out_paths['GUNW'], InputProduct.RUNW)
  File "/scratch/jungkyoj/tool/isce3_cuda/install/packages/nisar/workflows/geocode_insar.py", line 67, in run
    cpu_run(cfg, input_hdf5, output_hdf5, input_product_type)
  File "/scratch/jungkyoj/tool/isce3_cuda/install/packages/nisar/workflows/geocode_insar.py", line 691, in cpu_run
    cpu_geocode_rasters(geocode_obj, geo_datasets, desired,
  File "/scratch/jungkyoj/tool/isce3_cuda/install/packages/nisar/workflows/geocode_insar.py", line 501, in cpu_geocode_rasters
    get_raster_lists(geo_datasets, desired, freq, pol_list, input_hdf5,
  File "/scratch/jungkyoj/tool/isce3_cuda/install/packages/nisar/workflows/geocode_insar.py", line 444, in get_raster_lists
    raster, path = get_ds_input_output(
                   ^^^^^^^^^^^^^^^^^^^^
  File "/scratch/jungkyoj/tool/isce3_cuda/install/packages/nisar/workflows/geocode_insar.py", line 173, in get_ds_input_output
    input_raster = isce3.io.Raster(input_raster_str)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Error in file /scratch/jungkyoj/tool/isce3_cuda/src/isce/cxx/isce3/io/Raster.cpp, line 28, function isce3::io::Raster::Raster(const std::string&, GDALAccess): failed to create GDAL dataset from file 'HDF5:scratch_dir3/ionosphere/main_diff_ms_band/RUNW.h5://science/LSAR/RUNW/swaths/frequencyB/interferogram/HH/ionospherePhaseScreen'
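
The failure above comes from removing a scratch folder before its last reader has run. One way to reason about this is to compute, for each folder, the last step that reads it and schedule removal only after that step. The sketch below is a simplified illustration: `removal_schedule` is a hypothetical function, and the step names and reader sets are placeholders chosen to mirror this comment, not the actual dependency table of the workflow.

```python
def removal_schedule(step_order, readers):
    """Map each step to the scratch folders that are safe to remove right after it.

    step_order: list of step names in execution order.
    readers: dict mapping a scratch folder name to the set of steps that read it.
    """
    position = {step: i for i, step in enumerate(step_order)}
    schedule = {step: [] for step in step_order}
    for folder, steps in readers.items():
        # A folder can only go once its last reader has finished.
        last_step = max(steps, key=position.__getitem__)
        schedule[last_step].append(folder)
    return schedule


# Placeholder step order and readers (assumptions for illustration only).
step_order = ["step_before_geocode", "geocode", "baseline"]
readers = {
    "rubbersheet_offsets": {"step_before_geocode"},
    "geo2rdr": {"step_before_geocode"},
    "ionosphere": {"step_before_geocode", "geocode"},
}
schedule = removal_schedule(step_order, readers)
```

With these placeholder inputs, `ionosphere` is scheduled after `geocode` rather than before it. The schedule gives the earliest safe point, so removing a folder at any later step is equally safe.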

baseline.run(cfg, out_paths)

# Remove the 'bandpass','baseline' scratch folders
for workflow_name in ['bandpass','baseline']:

Suggested change
- for workflow_name in ['bandpass','baseline']:
+ for workflow_name in ['bandpass', 'baseline']:

nit


# Remove the 'bandpass','baseline' scratch folders
for workflow_name in ['bandpass','baseline']:
    workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}")

Suggested change
workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}")
workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}")

nitpick

baseline.run(cfg, out_paths)

# Remove the 'bandpass','baseline' scratch folders
for workflow_name in ['bandpass','baseline']:

Suggested change
- for workflow_name in ['bandpass','baseline']:
+ for workflow_name in ['bandpass', 'baseline', 'ionosphere']:

troposphere.run(cfg, out_paths['GUNW'])

# Remove the troposphere scratch folder
tropo_scratch_path = pathlib.Path(f"{scratch_path}/weather_model_files")

Suggested change
tropo_scratch_path = pathlib.Path(f"{scratch_path}/weather_model_files")
tropo_scratch_path = pathlib.Path(f"{scratch_path}/weather_model_files")

nitpick

geocode_insar.run(cfg, out_paths['ROFF'], out_paths['GOFF'], InputProduct.ROFF)

# Remove the geocode scratch folder
geocode_scratch_path = pathlib.Path(f"{scratch_path}/geocode_corrections")

Suggested change
geocode_scratch_path = pathlib.Path(f"{scratch_path}/geocode_corrections")
geocode_scratch_path = pathlib.Path(f"{scratch_path}/geocode_corrections")

nitpick


# Remove the 'rubbersheet_offsets','ionosphere', 'geo2rdr' scratch folders
for workflow_name in ['rubbersheet_offsets','ionosphere', 'geo2rdr']:
    workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}")

Suggested change
workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}")
workflow_scratch_path = pathlib.Path(f"{scratch_path}/{workflow_name}")

nitpick

unwrap.run(cfg, out_paths['RIFG'], out_paths['RUNW'])

# Remove the 'fine_resample_slc','crossmul', 'coarse_resample_slc', 'unwrap' scratch folders
for workflow_name in ['fine_resample_slc','coarse_resample_slc',

Suggested change
- for workflow_name in ['fine_resample_slc','coarse_resample_slc',
+ for workflow_name in ['fine_resample_slc', 'coarse_resample_slc',

nitpick

resample_slc_v2.run(cfg, 'fine')

# Remove the coarse resample scratch folder
coarse_resample_scratch_path = pathlib.Path(f"{scratch_path}/coarse_resample_slc")

Suggested change
coarse_resample_scratch_path = pathlib.Path(f"{scratch_path}/coarse_resample_slc")
coarse_resample_scratch_path = pathlib.Path(f"{scratch_path}/coarse_resample_slc")

nitpick

Xiaodong Huang added 2 commits November 3, 2025 18:02
@xhuang-jpl
Contributor Author

Thank you @Tyler-g-hudson and @oberonia78, I have addressed your comments. Please take one more look.

@xhuang-jpl xhuang-jpl removed the request for review from vbrancat November 3, 2025 18:06
@Tyler-g-hudson
Contributor

Still looks good to me, pending testing and final approval by @oberonia78

@hfattahi hfattahi modified the milestones: R05.00.1, R05.00.3 Nov 11, 2025
@hfattahi
Contributor

@oberonia78 please take another look at this.

@hfattahi hfattahi removed this from the R05.00.3 milestone Nov 14, 2025
@xhuang-jpl
Contributor Author

@oberonia78 If you are ok with this change, would you mind approving it? Thanks

Contributor

@oberonia78 oberonia78 left a comment

LGTM

@hfattahi hfattahi added this to the R05.01.0 milestone Jan 8, 2026
4 participants