Conversation
This program can be used to download GEDI files using a text file of URLs provided by Earthdata Search. Combined with the gediFinder.py program, one should be able to gather data from a large bounding box via Earthdata Search in a timely manner and without a lot of computer storage space.
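A minimal sketch of reading such a URL list; the function name and the file format (one URL per line, as Earthdata Search exports) are assumptions, not the program's actual API:

```python
def read_url_list(path: str) -> list[str]:
    """Parse a text file of download URLs, one per line, skipping blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]
```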
crich011 left a comment
I made a handful of comments: one structural comment about the order of downloading vs. checking the output format, but mostly about docs.
I think this all looks good. Once you take a look at those, add trailing spaces to some of the argparse help strings (see, e.g., L355 of process_l2a, which should get a space at the end of that string), and make any modifications to the docstrings, I think we should merge it and get started on writing tests to port it to the earthshot repo.
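The trailing-space issue usually comes from implicit concatenation of adjacent string literals in a help= value; a minimal illustration (the strings here are hypothetical, not the actual help text):

```python
# Adjacent string literals are concatenated at compile time, so a missing
# trailing space on the first piece fuses the words across the line break.
broken = ("File type for the output."   # no trailing space
          "Defaults to csv.")
fixed = ("File type for the output. "   # trailing space added
         "Defaults to csv.")
print(broken)  # File type for the output.Defaults to csv.
print(fixed)   # File type for the output. Defaults to csv.
```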
a = np.sort(a, axis=1)
count = (~np.isnan(a)).sum(axis=1)  # count number of non-nans in row
groups = np.unique(count)  # returns sorted unique values
groups = groups[groups > 0]  # only returns groups with at least 1 non-nan value\n",
Suggested change:
- groups = groups[groups > 0]  # only returns groups with at least 1 non-nan value\n",
+ groups = groups[groups > 0]  # only returns groups with at least 1 non-nan value
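For context, the snippet above groups rows by how many valid values they contain; a self-contained sketch of that pattern with made-up data:

```python
import numpy as np

a = np.array([[3.0, np.nan, 1.0],
              [np.nan, np.nan, np.nan],
              [2.0, 5.0, 4.0]])

a = np.sort(a, axis=1)               # np.sort pushes NaNs to the end of each row
count = (~np.isnan(a)).sum(axis=1)   # non-NaN count per row -> [2, 0, 3]
groups = np.unique(count)            # sorted unique counts -> [0, 2, 3]
groups = groups[groups > 0]          # keep only groups with valid data -> [2, 3]
```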
Parameters
----------
file_dir : str
The docstring here can be updated to remove the file_dir kwarg, since the function now takes full paths instead of a directory path and filenames.
if args.filetype.lower() == "csv":
    filename = os.path.join(args.dir, args.outfile + ".csv")
    print(f"Writing to file {filename}")
    df.to_csv(filename, index=False)
elif args.filetype.lower() == "parquet":
    filename = os.path.join(args.dir, args.outfile + ".parquet.gzip")
    print(f"Writing to file {filename}")
    df.to_parquet(filename, compression="gzip")
elif args.filetype.lower() == "geojson":
    filename = os.path.join(args.dir, args.outfile + ".geojson")
    print(f"Writing to file {filename}")
    df_to_geojson(df, filename)
else:
    raise ValueError(
        f"Received unsupported file type {args.filetype}. Please provide one of: csv, parquet, or GeoJSON."
    )
It might make more sense to do the file-extension handling before the downloading and unpacking, in case a user provides the wrong file extension (or has a typo); that avoids the risk of losing all that downloaded and unpacked work. Another option would be to store the output of the intermediate steps, but just checking up front seems simpler to me.
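One way to do that up-front check is to validate the extension at argument-parsing time, before any network work begins; a sketch under assumptions (the option name and the supported set are illustrative, not the PR's actual interface):

```python
import argparse

# Hypothetical module-level constant listing the supported output formats.
SUPPORTED_FILETYPES = {"csv", "parquet", "geojson"}

def valid_filetype(value: str) -> str:
    """argparse `type` callable: fail fast on an unsupported extension."""
    filetype = value.lower()
    if filetype not in SUPPORTED_FILETYPES:
        raise argparse.ArgumentTypeError(
            f"Received unsupported file type {value}. "
            "Please provide one of: csv, parquet, or GeoJSON."
        )
    return filetype

parser = argparse.ArgumentParser()
# Registering the validator as `type` runs the check at parse time,
# i.e. before any downloading or unpacking starts.
parser.add_argument("--filetype", type=valid_filetype, default="csv")
args = parser.parse_args(["--filetype", "Parquet"])
```

With this in place, the write-out branch at the end of the script can assume args.filetype is already one of the supported, lowercased values.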
    None
    """
    filepath = os.path.join(dir, "granuleData.zip")
    r = requests.get(url, stream=True)
I think the way this does the checking and prints a descriptive error to the user is useful, especially since we want this function to return a bool rather than throw the exception outright.
But, for future reference, a helpful tip someone taught me: in most cases, raise_for_status (i.e. r.raise_for_status() here) does the sensible thing you'd want when checking a response. It raises an HTTPError if the response came back with an error status code, and returns None otherwise. Again, I think that is not actually what we want here, but it's worth having in your toolbox for the times when it is what you want to do.
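For reference, a hedged sketch of combining raise_for_status with the bool-returning contract this function wants; the function name, timeout, and messages are assumptions, not the PR's actual code:

```python
import requests

def download_zip(url: str, dest: str) -> bool:
    """Return True on a successful download, False otherwise."""
    try:
        r = requests.get(url, stream=True, timeout=30)
        r.raise_for_status()  # raises requests.HTTPError on a 4xx/5xx status
    except requests.RequestException as exc:
        # RequestException also covers timeouts and connection failures.
        print(f"Download failed for {url}: {exc}")
        return False
    with open(dest, "wb") as f:
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)
    return True
```

Catching requests.RequestException keeps the descriptive-message-plus-bool behavior while letting raise_for_status do the status-code checking.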
Downloads and unpacks the zip file from the provided url
Returns
-------
None
Suggested change:
- None
+ bool
+     True indicates a successful download. False indicates that the download was unsuccessful.