This package is used to access the Datalake store itself. Looking at the reference, we note that the API is built on WebHDFS.

To be clear, the goal of this package is to get things working for Azure Datalake Store. If we end up being able to support generic WebHDFS as well, that will be a bonus.

WebHDFS seems to use a lot of redirects. However, this Microsoft blog suggests that we may not have to follow them. If so, I don't know if this would be a "feature" of Azure, of httr, or maybe of curl.
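One way to investigate is to turn off curl's automatic redirect-following and inspect the response directly. This is a minimal sketch, assuming a placeholder account name and path (the URL shape follows the WebHDFS convention; a real request would also need an authorization header):

```r
library("httr")

# Hypothetical account/path, for illustration only
url <- "https://example.azuredatalakestore.net/webhdfs/v1/tmp?op=LISTSTATUS"

# Ask curl *not* to follow redirects, so we can see whether
# Azure actually issues a 307 (as stock WebHDFS does)
resp <- GET(url, config(followlocation = FALSE))

status_code(resp)            # 307 would indicate a redirect
headers(resp)[["location"]]  # where the redirect points, if any
```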
## adls

Operations to support (a usage sketch follows this list):
- [x] S3 object to hold the base URL and the token
- [x] Make a directory
- [x] List a directory
- [x] Rename a file/folder in a directory
- [x] Delete a file/directory
- [x] Upload a file to a directory
- [x] Read a file from a directory
- [x] Append to a file in a directory
- [ ] Get file status (file/directory) (consider returning a data frame, reusing the format from `list_status`)
- [ ] Get content summary (directory)
- [ ] Concatenate files
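To make the intent concrete, here is a hedged sketch of how the completed operations might compose. The function names and signatures (`adls()`, `adls_mkdirs()`, `adls_file_upload()`, `adls_list_status()`) are illustrative assumptions, not the settled API:

```r
library("AzureDatalakeStore")

# Hypothetical API, for illustration; names and arguments may differ
adls_store <- adls(
  base_url = "https://example.azuredatalakestore.net",
  token = token  # an OAuth token obtained elsewhere, e.g. via httr
)

adls_mkdirs(adls_store, "tmp/demo")                   # make a directory
adls_file_upload(adls_store, "local.csv", "tmp/demo") # upload a file
adls_list_status(adls_store, "tmp/demo")              # list the directory
```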
For file uploads, it may be easiest to provide a filepath (even a temporary one). This makes us dependent on a filesystem, which I would rather avoid (on shinyapps.io, for example).

Following readr, `file` could be:

- Near future: a path to a local file, wrapped with `httr::upload_file()` (see the sketch after this list).
- Farther future: connections, and files that begin with `http://`, `https://`, etc.
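For reference, httr can already handle both styles. This is a minimal sketch against a placeholder URL (a real request would also need an authorization header), showing a path-based upload via `upload_file()` next to an in-memory upload that avoids the filesystem entirely:

```r
library("httr")

# Placeholder URL, for illustration only
url <- "https://example.azuredatalakestore.net/webhdfs/v1/tmp/demo.csv?op=CREATE"

# Path-based: httr streams the file from disk
resp1 <- PUT(url, body = upload_file("local.csv", type = "text/csv"))

# In-memory: send a raw vector as the body, no filesystem needed
payload <- charToRaw("a,b\n1,2\n")
resp2 <- PUT(url, body = payload)
```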
How about an `adls_is_empty()`, to see if a path returns anything?
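A minimal sketch of how such a helper might work, assuming `adls_list_status()` returns a data frame of entries (or `NULL`); both names come from the wish-list above, not a finished API:

```r
# Hypothetical helper: TRUE if the path contains nothing
adls_is_empty <- function(adls_store, path) {
  status <- adls_list_status(adls_store, path)  # assumed: data frame or NULL
  is.null(status) || nrow(status) == 0L
}
```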
Type-stability for functions that return file-status objects, thinking of `list_status` and `get_file_status`. Should they return empty data frames or `NULL` when there is nothing to report? For now: `NULL`.
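To make the trade-off concrete, here is an illustrative sketch; the column names mimic WebHDFS `FileStatus` fields but are assumptions about the eventual schema:

```r
# Hypothetical internal helper that builds a file-status data frame
as_status_df <- function(entries) {
  if (length(entries) == 0L) {
    return(NULL)  # chosen behavior: NULL when there is nothing to report
    # type-stable alternative: a zero-row data frame with the same columns
  }
  data.frame(
    pathSuffix = vapply(entries, `[[`, character(1), "pathSuffix"),
    type       = vapply(entries, `[[`, character(1), "type"),
    length     = vapply(entries, `[[`, double(1), "length"),
    stringsAsFactors = FALSE
  )
}
```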
AzureDatalakeStore is not yet available on CRAN. You may install it from GitHub:

```r
# install.packages("devtools")
devtools::install_github("ijlyttle/AzureDatalakeStore")
```
If you encounter a clear bug, please file a minimal reproducible example on GitHub.
This package draws inspiration from Forest Fang’s rwebhdfs package.
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.