Skip to content

Conversation

@francisbarton
Copy link
Collaborator

Closes #35 and #34

Allowing the file argument to read_azure_parquet() etc to be a partial match was just going to cause trouble. This removes this option and ensures that using a full file path (as returned in the "name" column of AzureStor::list_blobs() for example) should always work, as long the file extension is as expected.

@francisbarton francisbarton linked an issue Sep 3, 2025 that may be closed by this pull request
@francisbarton francisbarton self-assigned this Sep 3, 2025
@francisbarton francisbarton marked this pull request as draft September 4, 2025 13:00
@francisbarton francisbarton marked this pull request as ready for review September 4, 2025 17:06
#' @param ext The standard file extension for the file type, e.g. "json"
#' @keywords internal
check_blob_exists <- function(container, file, file_ext, info, path = "") {
check_blob_exists <- function(container, file, ext, info, path) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the need for separating file and ext? does this all work if you have a file with no extension? example in this repo: ./LICENSE

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good question. Currently this function is only designed to work within the four main read functions (for parquet, json, csv and rds), which should all have standard file extensions. So the case of no extension shouldn't arise. The value of this approach is that the user can just supply file = "default" instead of file = "default.parquet" to read_azure_parquet() if they like, which is going to be useful in reskit I think. But perhaps I am over-complicating things and risking problems down the line due to overly tangly code.

Copy link
Collaborator Author

@francisbarton francisbarton Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking to push this through as it is, but your point is noted and maybe I will some day revert to just using the full filename as the argument (this can be easily handled in the read_azure_* functions).

❗Try to surface error message from Azure in list_container_names()
@francisbarton francisbarton merged commit ccbda5e into main Sep 11, 2025
1 check passed
@francisbarton francisbarton deleted the issue35 branch September 11, 2025 12:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Trying to enable filename matching with regex in check_blob_exists() is causing problems download_azure_blob() doesn't really need to exist

3 participants