I have a column 'product_list' in a data frame which looks like this: ";165533;3;1050.00;,;165535;1;700.00;"
This is the list of products bought under one purchase ID. In the example above, 165533 is the SKU, 3 is the quantity purchased, 1050.00 is the purchase amount, and so on. The field can contain multiple SKUs, and the record for each SKU is separated from the next by a comma. I want to extract only the SKUs from this string in R using regex, which I am new to.
str = c(";165533;3;1050.00;,;165535;1;700.00;")
I was able to split the string into one record per SKU using:
strsplit(str, ",;")
My question is: how do I extract only the first value (the SKU) from each of the comma-separated records?
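To show what that split gives me, here is a small check on the example string. The variable names `x` and `parts` are only for this illustration (I use `x` to avoid masking the base function `str`), and `fixed = TRUE` is my assumption that the separator should be matched literally rather than as a regex.

```r
# Example string from above, stored as `x` for this illustration
x <- ";165533;3;1050.00;,;165535;1;700.00;"

# Split on the literal two-character separator ",;"
parts <- strsplit(x, ",;", fixed = TRUE)[[1]]

# The first piece keeps its leading ";" while the second piece loses it,
# because the ";" after the comma is consumed as part of the separator
print(parts)
```

So the pieces are not uniform: only the first one still starts with a semicolon, which makes it awkward to pick out the SKU the same way for every piece.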
I want the final output to look like this:
Purchase ID SKU
123 165533
123 165535
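A rough sketch of one way this output could be produced in base R, in case it helps to clarify what I am after. The two-row data frame `df` is a made-up stand-in for my data, the names `records`, `skus` and `result` are hypothetical, and it assumes that a comma never appears inside a record and that every record starts with ";".

```r
# Hypothetical two-row stand-in for the real data (same column names)
df <- data.frame(
  post_purchaseid = c(123L, 456L),
  post_product_list = c(";165533;3;1050.00;,;165535;1;700.00;",
                        ";153147;1;100.00;,;165535;1;700.00;"),
  stringsAsFactors = FALSE
)

# Split each product list into records on the literal comma;
# as.character() guards against the column being a factor
records <- strsplit(as.character(df$post_product_list), ",", fixed = TRUE)

# From each record keep only the text between the first and second ";"
skus <- lapply(records, function(r) sub("^;([^;]*);.*$", "\\1", r))

# Build one row per (purchase ID, SKU) pair
result <- data.frame(
  post_purchaseid = rep(df$post_purchaseid, lengths(skus)),
  SKU = unlist(skus),
  stringsAsFactors = FALSE
)
print(result)
```

Splitting on "," alone rather than ",;" leaves every record with the same leading ";", so a single pattern can be applied to all of them. The SKU is returned as character here, since some of my SKU values are text rather than numbers.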
Is there a better way to extract this data?
Here is the dput output:
dput(Purchase_test[, c(1, 2)])
structure(list(post_purchaseid = c(123L, 456L, 321L, 888L, 345L,
938L, 647L, 657L, 687L, 547L, 647L, 711L, 811L, 911L, 1001L),
post_product_list = structure(c(6L, 4L, 11L, 9L, 2L, 5L, 7L, 1L,
3L, 4L, 10L, 8L, 4L, 12L, 13L), .Label = c(
";153147;1;100.00;,;165533;1;350.00;,;165537;1;3800.00;",
";153147;1;100.00;,;165533;3;1050.00;,;165531;1;200.00;,;165535;1;700.00;",
";153147;1;100.00;,;165533;3;1050.00;,;165536;1;2750.00;",
";153147;1;100.00;,;165535;1;700.00;",
";153147;1;100.00;,;165535;2;1400.00;",
";153147;1;12.05;,;165531;1;24.11;,;153418;5;500.00;",
";153147;1;15.34;,;165533;1;53.70;",
";153147;1;31.51;,;153418;2;200.00;",
";153147;1;43.84;,;165531;1;87.67;",
";153147;1;49.86;,;165533;1;174.52;",
";165533;3;1050.00;,;165535;1;700.00;",
";creating your first text;1;4200.00;207=4200.00;,;Get started with;1;3900.00;207=3900.00;",
";Get started with;1;3900.00;207=3900.00;"), class = "factor")),
class = "data.frame", row.names = c(NA, -15L))
Please use dput to show the example – akrun
Where does 123 come from? – Cary Swoveland
The first row is ";153147;1;12.05;,;165531;1;24.11;,;153418;5;500.00;", so do you expect the output to have 153147, 165531, 153418 as new rows? – akrun