Skip to content

Conversation

@mhabedan
Copy link
Contributor

Hi @GraemeWatt,

Is there a specific reason why Client.find() returns a string if the ids keyword is given? I see why that is helpful when calling find from the command line but if using the python interface, it seems unnecessary. In particular, if the list entries are first joined into a string by ' '.join(find_results) only to be broken up into a list again in user's scripts. (I have one where exactly this is the case.)

Insofar, if there are no principal objections, this PR resolves the issue by adding a format argument to Client.find(). To leave the default behaviour of find() unchanged, the default format remains str. It can also be set to list, set or tuple now, though, if desired.

Matching tests have been added, the readme updated and the version bumped.

@GraemeWatt GraemeWatt requested a review from Copilot October 21, 2025 12:20
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a format parameter to the Client.find() method, allowing users to specify the return type (str, list, set, or tuple) when retrieving IDs. Previously, IDs were always returned as a space-separated string, requiring users to manually split them into lists. The default behavior remains unchanged (returning a string) for backward compatibility.

Key Changes:

  • Added format parameter to find() method with support for str, list, set, and tuple return types
  • Updated _build_urls() to accept strings in addition to lists/tuples for id_list
  • Bumped version from 0.3.0 to 0.3.1

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
hepdata_cli/api.py Added format parameter to find() method with logic to return results in different formats; updated _build_urls() to handle string input; fixed typo in assertion message
tests/test_search.py Updated test parameters and assertions to validate the new format parameter functionality
tests/test_download.py Added tests for different input formats (str, list, set, tuple) and new integration test combining find() with download()
README.md Updated documentation to explain the new format parameter and showed example usage with format=list
hepdata_cli/version.py Bumped version to 0.3.1

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Member

@GraemeWatt GraemeWatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I agree that it is useful to add the format option without changing the default behaviour, which comes from the arxiv-cli package that originally inspired hepdata-cli. However, the arxiv-cli package has not been maintained for 7 years and is now difficult to install. In a recent commit 3e8a9df I replaced its use in Example 6 with arxiv.py, but then I had to insert an additional line id_list = id_list.split() for it to work. The new format option makes it simpler. I'd be happy to merge after you address my two comments (and the two comments from Copilot).

elif format in (set, tuple):
return format(find_results)
else:
raise TypeError(f"Cannot return results in specfied format: {format}. Allowed formats are: {str}, {list}, {set}, {tuple}.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is not covered by the tests, resulting in a -0.4% decrease in test coverage for the PR. Please add a test that raises the TypeError exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Fixed in 3de6810.

from click.testing import CliRunner

from hepdata_cli.api import Client
from hepdata_cli.api import Client, MAX_MATCHES, MATCHES_PER_PAGE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why import MAX_MATCHES and MATCHES_PER_PAGE here? They're not used in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, that was left over from an earlier version of the tests. Removed now.

@GraemeWatt
Copy link
Member

@codecov-ai-reviewer review

@codecov-ai

This comment has been minimized.

Comment on lines 88 to 89
else:
raise TypeError(f"Cannot return results in specfied format: {format}. Allowed formats are: {str}, {list}, {set}, {tuple}.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in error message: 'specfied' should be 'specified'. This is a user-facing error message that will appear unprofessional with the typo.

Did we get this right? 👍 / 👎 to inform future reviews.

mhabedan and others added 3 commits January 22, 2026 17:01
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@mhabedan
Copy link
Contributor Author

Sorry for the long wait. I think I have addressed all issues now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants