I am not familiar with the RIS format, but if the fields in each string are separated by commas, and each field pairs a column name with its value via an equals sign, then here is a quick-and-dirty function that uses base R and data.table:
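library(data.table)

RIS_parser_fn <- function(x) {
  # Split each string on "," into fields, then each field on "=" into keys and values;
  # gsub("\\W", "", k) strips every non-word character (spaces, quotes, punctuation).
  string_parse_list <- lapply(lapply(x,
    function(i) tstrsplit(i, ",")),
    function(j) lapply(tstrsplit(j, "="),
      function(k) t(gsub("\\W", "", k))))
  # Stack keys over values, promote the key row to column names, then bind all rows.
  datatable_format <- rbindlist(lapply(lapply(string_parse_list,
    function(i) data.table(Reduce("rbind", i))),
    function(j) setnames(j, unlist(j[1, ]))[-1]), fill = TRUE)
  return(datatable_format)
}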
The first statement builds a list of lists. The outer list has one element per string in the input vector. Each inner list holds exactly two one-row matrices, whose number of columns equals the number of comma-separated fields in that string. The first matrix holds the column headers (the text before each '=' sign) and the second holds the corresponding values. The gsub call removes every character that is not a letter, digit, or underscore, which takes care of the spaces and quotes left in the matrices. You may need to modify it if you want non-alphanumeric characters to be kept in the values; there were none in your example.
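If the values can contain punctuation, one gentler way to clean them is to trim whitespace and strip only the surrounding double quotes instead of removing every non-word character. The following is a minimal base R sketch of that idea for a single string, not part of the function above. The helper name `parse_fields` is hypothetical, and the sketch assumes that values contain no commas and that every field contains an '=' sign.

```r
# Hypothetical helper: split one 'KEY = "value"' string into a named character vector.
# Assumes no commas inside values and an "=" in every field.
parse_fields <- function(s) {
  # Split the string into fields on commas.
  fields <- strsplit(s, ",", fixed = TRUE)[[1]]
  # Split each field at its first "=" only, so an "=" inside a value survives.
  pos    <- regexpr("=", fields, fixed = TRUE)
  keys   <- trimws(substr(fields, 1, pos - 1))
  values <- trimws(substr(fields, pos + 1, nchar(fields)))
  # Strip one leading and one trailing double quote, keeping everything else.
  values <- gsub('^"|"$', "", values)
  setNames(values, keys)
}

res <- parse_fields("TY = \"txt-TY1\",TI = \"A title: part 2\"")
print(res)
```

Here `res` is a named character vector with names TY and TI, and the hyphen, colon, and spaces inside the values are preserved.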
The second statement converts these lists into a single data.table. Reduce rbinds the two matrices of each inner list into one two-row matrix, which data.table() then converts, so there is now a single list with one data.table per input string. The "j" function passed to lapply sets the column names from the first row and then drops that row. The final rbindlist call combines these data.tables, which can have differing numbers of columns. Setting fill to TRUE allows this and assigns NA to cells whose field is missing from a given string.
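To see the fill behaviour in isolation, here is a small sketch with two hand-built one-row tables standing in for the per-string data.tables. The variable names `dt1`, `dt2`, and `combined` are only for illustration.

```r
library(data.table)

# Two one-row tables with different column sets, like the parser builds per string.
dt1 <- data.table(TY = "txtTY1", TI = "txtTI1")
dt2 <- data.table(TY = "txtTY2", TI = "txtTI2", TF = "txtTF2")

# fill = TRUE matches columns by name and pads the missing ones with NA.
combined <- rbindlist(list(dt1, dt2), fill = TRUE)
print(combined)
```

Without fill set to TRUE, rbindlist would stop with an error here because the two tables do not have the same number of columns.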
I added a second string element with one more field to test the code:
PubStr <- c("TY = \"txtTY1\",TI = \"txtTI1\"",
            "TY = \"txtTY2\",TI = \"txtTI2\" ,TF = \"txtTF2\"")
RIS_parser_fn(PubStr)
Returns this:
       TY     TI     TF
1: txtTY1 txtTI1   <NA>
2: txtTY2 txtTI2 txtTF2
Hopefully this will help you out and/or stimulate some ideas for more efficient code. Best of luck!