Greater Goal
I want to extract package names (and more details later) from the package bibliography in R (generated with write_bib()), in order to create a table with columns for the most relevant information on the packages used in my analyses (e.g. name of the package, version, maintainer, citation).
In the example entry from the bibliography below, I want to get the string R-base:
"@Manual{R-base, title = {R: A Language and Environment for Statistical Computing}, author = {{R Core Team}}, organization = {R Foundation for Statistical Computing}, address = {Vienna, Austria}, year = {2020}, url = {https://www.R-project.org/}, }"
Extracting the package name, i.e. the substring between { and the following comma, works with the regex "(?<=\\{).*(?=,)"
-> this returns R-base
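One caveat, sketched below under assumptions: `.*` is greedy, so the pattern only stops at the first comma when the citation key sits on a line of its own, because `.` does not match a newline by default in stringr. The shortened single-line entry is made up for illustration, and the negated character class `[^,]+` is just one possible alternative, not part of the original attempt.

```r
library(stringr)

# Shortened, single-line entry (made up for illustration)
entry_one_line <- "@Manual{R-base, title = {R: A Language}, year = {2020}, }"

# Greedy .* runs on to the LAST comma of the line
greedy <- str_extract(entry_one_line, "(?<=\\{).*(?=,)")

# A negated character class stops at the FIRST comma instead
key <- str_extract(entry_one_line, "(?<=\\{)[^,]+")

print(greedy)
print(key)
```

With the multi-line form of the entry both patterns should give the same key, since the match cannot run past the end of the first line.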
Current problem
Outside of a loop, the code below produces the desired output, R-base:
library(stringr)  # provides str_extract()
teststring <- "@Manual{R-base,
title = {R: A Language and Environment for Statistical Computing},
author = {{R Core Team}},
organization = {R Foundation for Statistical Computing},
address = {Vienna, Austria},
year = {2020},
url = {https://www.R-project.org/},
}"
str_extract(teststring,"(?<=\\{).*(?=,)")
However, when I try to do exactly the same thing inside a for loop, I get multiple matches from the str_extract() function.
library(knitr)  # provides write_bib()
bibliography <- write_bib()
for (entry in bibliography[1]) {
  # currently, for testing purposes, only the first entry in bibliography
  print(typeof(entry))
  print(str_extract(entry, "(?<=\\{).*(?=,)"))
}
[1] "character"
[1] "R-base"
[2] "R: A Language and Environment for Statistical Computing}"
[3] "{R Core Team}}"
[4] "R Foundation for Statistical Computing}"
[5] "Vienna, Austria}"
[6] "2020}"
[7] "https://www.R-project.org/}"
[8] NA
[9] NA
This is strange to me. I also included typeof() for validation purposes. The character vector entry in the loop should be identical to teststring.
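A likely explanation, judging from the nine results printed above: typeof() reports "character" for a character vector of any length, so it cannot tell a single string from a vector with one element per line. The sketch below assumes that each entry is such a per-line vector and rebuilds a shortened one by hand (`entry_lines` is a made-up name). str_extract() is vectorised, so it returns one result per element.

```r
library(stringr)

# Hand-built stand-in for one bibliography entry: one element per line
entry_lines <- c(
  "@Manual{R-base,",
  "  title = {R: A Language and Environment for Statistical Computing},",
  "  year = {2020},",
  "}"
)

print(typeof(entry_lines))   # "character", same as for a single string
print(length(entry_lines))   # but four elements, not one

# Vectorised: one result per line, NA where the line has no "{"
per_line <- str_extract(entry_lines, "(?<=\\{).*(?=,)")

# Collapsing the lines into one string first gives a single result
whole <- str_extract(paste(entry_lines, collapse = "\n"), "(?<=\\{).*(?=,)")

print(per_line)
print(whole)
```

Checking length(entry) next to typeof(entry) inside the loop would show the difference from teststring directly.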
Edit
I found that the solution has to do with the list structure of bibliography as returned by write_bib().
Specifying which element to use solved it somehow.
name <- str_extract(entry[1],"(?<=\\{).*(?=,)")
Try res <- sapply(bibliography, function(x) str_extract(paste(x, collapse="\n"), "(?<=\\{).*(?=,)")), then names(res) <- NULL, and check res. – Wiktor Stribiżew
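As a rough sketch of how the suggestion in the comment could feed the table described under "Greater Goal": the `bibliography` list below is a hand-written stand-in for the write_bib() result, and the column names, the use of vapply() instead of sapply(), and the step that strips the "R-" prefix are assumptions for illustration only.

```r
library(stringr)

# Hypothetical stand-in for the list returned by write_bib()
bibliography <- list(
  base = c("@Manual{R-base,",
           "  title = {R: A Language and Environment for Statistical Computing},",
           "}"),
  stringr = c("@Manual{R-stringr,",
              "  title = {stringr: Simple, Consistent Wrappers for Common String Operations},",
              "}")
)

# Collapse each entry into one string, then extract the citation key
keys <- vapply(
  bibliography,
  function(x) str_extract(paste(x, collapse = "\n"), "(?<=\\{).*(?=,)"),
  character(1)
)
keys <- unname(keys)

# Assumed table layout: citation key, package name, installed version
pkg_table <- data.frame(
  key = keys,
  package = str_remove(keys, "^R-"),
  stringsAsFactors = FALSE
)
pkg_table$version <- vapply(
  pkg_table$package,
  function(p) as.character(packageVersion(p)),
  character(1),
  USE.NAMES = FALSE
)

print(pkg_table)
```

Further columns such as the maintainer could presumably be filled in the same way from packageDescription().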