Commit 20bd514

Merge pull request #139 from IQSS/dev
CRAN v0.3.15
2 parents b11b135 + c885fff commit 20bd514

12 files changed

Lines changed: 61 additions & 58 deletions

.Rbuildignore

Lines changed: 1 addition & 0 deletions
@@ -25,3 +25,4 @@ tests/.*_ghaction.R
 ^\.github$
 rhub-checks
 /Untitled.+\.R$
+^CRAN-SUBMISSION$

README.Rmd

Lines changed: 5 additions & 5 deletions
@@ -99,10 +99,10 @@ nlsw_tsv <-
 )
 ```

-Now, Dataverse often translates rectangular data into an ingested, or "archival" version, which is application-neutral and easily-readable. `read_dataframe_*()` defaults to taking this ingested version rather than using the original, through the argument `original = FALSE`.
-
-This default is safe because you may not have the proprietary software that was originally used. On the other hand, the data may have lost information in the process of the ingestation.
+**The `original` argument:** Dataverse often translates rectangular data into an ingested, or "archival" version, which is application-neutral and easily-readable. `read_dataframe_*()` defaults to taking this ingested version rather than using the original, through the argument `original = FALSE`.
+This default is safe because you may not have the proprietary software that was originally used.

+On the other hand, the data may have lost information in the process of the ingestion.
 Instead, to read the same file but its original version, specify `original = TRUE` and set an `.f` argument. In this case, we know that `nlsw88.tab` is a Stata `.dta` dataset, so we will use the `haven::read_dta` function.

 ```{r get_dataframe_by_name_original}
@@ -120,7 +120,6 @@ Note that even though the file prefix is ".tab", we use `haven::read_dta`.

 Of course, when the dataset is not ingested (such as a Rds file), users would always need to specify an `.f` argument for the specific file.

-
 Note the difference between `nls_tsv` and `nls_original`. `nls_original` preserves the data attributes like value labels, whereas `nls_tsv` has dropped this or left this in file metadata.

 ```{r}
@@ -132,6 +131,7 @@ attr(nlsw_original$race, "labels") # original dta has value labels
 ```


+**Caching**: When the dataset to be downloaded is large, downloading the dataset from the internet can be time consuming, and users want to run the download only once in a script they run multiple times. As of version 0.3.15, our package will cache the download data if the user specifies which version of the Dataverse dataset they download from. See the `version` argument in the help page.

 ### Data Upload and Archiving

@@ -208,7 +208,7 @@ Functions related to user management and permissions are currently not exported

 Dataverse clients in other programming languages include [pyDataverse](https://pydataverse.readthedocs.io/en/latest/) for Python and the [Java client](https://github.com/IQSS/dataverse-client-java). For more information, see [the Dataverse API page](https://guides.dataverse.org/en/5.5/api/client-libraries.html#r).

-Users interested in downloading metadata from archives other than Dataverse may be interested in Kurt Hornik's [OAIHarvester](https://cran.r-project.org/package=OAIHarvester) and Scott Chamberlain's [oai](https://cran.r-project.org/package=oai), which offer metadata download from any web repository that is compliant with the [Open Archives Initiative](https://www.openarchives.org:443/) standards. Additionally, [rdryad](https://cran.r-project.org/package=rdryad) uses OAIHarvester to interface with [Dryad](https://datadryad.org/stash). The [rfigshare](https://cran.r-project.org/package=rfigshare) package works in a similar spirit to **dataverse** with <https://figshare.com/>.
+Users interested in downloading metadata from archives other than Dataverse may be interested in Kurt Hornik's [OAIHarvester](https://cran.r-project.org/package=OAIHarvester) and Scott Chamberlain's [oai](https://cran.r-project.org/package=oai), which offer metadata download from any web repository that is compliant with the [Open Archives Initiative](https://www.openarchives.org:443/) standards. Additionally, [rdryad](https://cran.r-project.org/package=rdryad) uses OAIHarvester to interface with [Dryad](https://datadryad.org/). The [rfigshare](https://cran.r-project.org/package=rfigshare) package works in a similar spirit to **dataverse** with <https://figshare.com/>.


 ### More Information

README.md

Lines changed: 19 additions & 13 deletions
@@ -137,18 +137,17 @@ nlsw_tsv <-
 )
 ```

-Now, Dataverse often translates rectangular data into an ingested, or
-“archival” version, which is application-neutral and easily-readable.
-`read_dataframe_*()` defaults to taking this ingested version rather
-than using the original, through the argument `original = FALSE`.
-
-This default is safe because you may not have the proprietary software
-that was originally used. On the other hand, the data may have lost
-information in the process of the ingestation.
-
-Instead, to read the same file but its original version, specify
-`original = TRUE` and set an `.f` argument. In this case, we know that
-`nlsw88.tab` is a Stata `.dta` dataset, so we will use the
+**The `original` argument:** Dataverse often translates rectangular data
+into an ingested, or “archival” version, which is application-neutral
+and easily-readable. `read_dataframe_*()` defaults to taking this
+ingested version rather than using the original, through the argument
+`original = FALSE`. This default is safe because you may not have the
+proprietary software that was originally used.
+
+On the other hand, the data may have lost information in the process of
+the ingestion. Instead, to read the same file but its original version,
+specify `original = TRUE` and set an `.f` argument. In this case, we
+know that `nlsw88.tab` is a Stata `.dta` dataset, so we will use the
 `haven::read_dta` function.

 ``` r
@@ -185,6 +184,13 @@ attr(nlsw_original$race, "labels") # original dta has value labels
 ## white black other
 ## 1 2 3

+**Caching**: When the dataset to be downloaded is large, downloading the
+dataset from the internet can be time consuming, and users want to run
+the download only once in a script they run multiple times. As of
+version 0.3.15, our package will cache the download data if the user
+specifies which version of the Dataverse dataset they download from. See
+the `version` argument in the help page.
+
 ### Data Upload and Archiving

 **Note**: *There are known issues to using to dataverse creation and
@@ -288,7 +294,7 @@ offer metadata download from any web repository that is compliant with
 the [Open Archives Initiative](https://www.openarchives.org:443/)
 standards. Additionally,
 [rdryad](https://cran.r-project.org/package=rdryad) uses OAIHarvester to
-interface with [Dryad](https://datadryad.org/stash). The
+interface with [Dryad](https://datadryad.org/). The
 [rfigshare](https://cran.r-project.org/package=rfigshare) package works
 in a similar spirit to **dataverse** with <https://figshare.com/>.

inst/constants.yml

Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
 server: "demo.dataverse.org"
-api_token: "15372813-c54f-471f-a3e8-c269ee6a610f"
-api_token_expiration: "2025-05-10"
+api_token: "e7563e83-1e8c-4ca3-8c01-03e274a8277b"
+api_token_expiration: "2026-05-20"
 api_token_name: "shirokuriwaki"

man-roxygen/version.R

Lines changed: 6 additions & 7 deletions
@@ -1,11 +1,10 @@
 #' @param version A character specifying a version of the dataset.
 #' This can be of the form `"1.1"` or `"1"` (where in `"x.y"`, x is a major
-#' version and y is an optional minor version), or
-#' `":latest"` (the default, the latest published version).
-#' We recommend using the number format so that
-#' the function stores a cache of the data (See \code{\link{cache_dataset}}).
-#' If the user specifies a `key` or `DATAVERSE_KEY` argument, they can access the
-#' draft version by `":draft"` (the current draft) or `":latest"` (which will
-#' prioritize the draft over the latest published version.
+#' version and y is an optional minor version). As of v0.3.14, setting a version
+#' in this way will cache the dataset (See example in \code{\link{cache_dataset}})
+#' so that it will not re-download the file the second time and read from the cache.
 #' Finally, set `use_cache = "none"` to not read from the cache and re-download
 #' afresh even when `version` is provided.
+#' If the user specifies a `key` or `DATAVERSE_KEY` argument, they can access the
+#' draft version by `":draft"` (the current draft) or `":latest"` (which will
+#' prioritize the draft over the latest published version).
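
The `version` documentation in this hunk distinguishes numbered version strings (`"1.1"` or `"1"`), which enable caching, from the moving targets `":latest"` and `":draft"`. A minimal sketch of that distinction follows. The helper `is_pinned_version()` is hypothetical, written only for illustration; it is not part of the dataverse package, and the package may check versions differently internally.

```r
# Hypothetical helper for illustration only; NOT part of the dataverse package.
# Returns TRUE when `version` has the "x" or "x.y" form described in the
# roxygen text above, i.e. a fixed published version whose contents do not
# change between runs and can therefore be cached safely.
is_pinned_version <- function(version) {
  grepl("^[0-9]+(\\.[0-9]+)?$", version)
}

is_pinned_version("1.1")      # major.minor form
is_pinned_version("1")        # major-only form
is_pinned_version(":latest")  # moving target: can change between runs
is_pinned_version(":draft")   # moving target: the current draft
```

Read this way, a download that passes a numbered `version` can be served from the cache on a second run, while `use_cache = "none"` forces a fresh download even when `version` is provided.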

man/cache.Rd

Lines changed: 6 additions & 7 deletions
Generated file; diff not rendered.

man/files.Rd

Lines changed: 6 additions & 7 deletions
Generated file; diff not rendered.

man/get_dataframe.Rd

Lines changed: 6 additions & 7 deletions
Generated file; diff not rendered.

man/get_dataset.Rd

Lines changed: 6 additions & 7 deletions
Generated file; diff not rendered.

tests/testthat/tests-dataset_metadata.R

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ test_that("check versions format", {
 "fileAccessRequest", "files", "id", "lastUpdateTime", "latestVersionPublishingState",
 "license", "metadataBlocks", "publicationDate", "releaseTime",
 "storageIdentifier", "UNF", "versionMinorNumber", "versionNumber",
-"versionState")
+"versionState", "deaccessionLink")
 expect_setequal(names(actual[[1]]), expected_names)
 expect_s3_class(actual[[2]], "dataverse_dataset_version")
 })
