Text cleaning in R

- August 15, 2014

I have a single column in R that looks like this:

    path column
    ag.1.4->ao.5.5->iv.9.12->ag.4.35
    ao.11.234->iv.345.455.1.2->ag.9.531

I want to transform it into:

    path column
    ag->ao->iv->ag
    ao->iv->ag

How can I do this?

Thank you.

Here is the full dput of the data:

    structure(list(rank = c(10394749L, 36749879L), count = c(1L, 1L),
        percent = c(0.001011122, 0.001011122),
        path = c("ao.legacy payment.not_completed->ao.legacy payment.not_completed->ao.legacy payment.completed",
            "ao.legacy payment.not_completed->agent.payment.completed")),
        .names = c("rank", "count", "percent", "path"),
        class = "data.frame", row.names = c(NA, -2L))

You can use gsub to match a literal . followed by one or more digits (\\.[0-9]+) and replace each match with ''.

    df1$path.column <- gsub('\\.[0-9]+', '', df1$path.column)
    df1
    #           path.column
    #1 ag -> ao -> iv -> ag
    #2       ao -> iv -> ag
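As a quick check, the same digit-only pattern can be run on the two strings from the question. The sketch below also shows why a different pattern is needed once the suffixes are text rather than numbers, as in the dput data. The variable names `paths`, `cleaned`, `text_path`, and `unchanged` are illustrative only.

```r
# Sample strings copied from the question
paths <- c("ag.1.4->ao.5.5->iv.9.12->ag.4.35",
           "ao.11.234->iv.345.455.1.2->ag.9.531")

# Remove every "." that is followed by one or more digits
cleaned <- gsub("\\.[0-9]+", "", paths)
print(cleaned)
# [1] "ag->ao->iv->ag" "ao->iv->ag"

# A path whose suffixes are words, taken from the dput data
text_path <- "ao.legacy payment.not_completed->agent.payment.completed"

# No "." is followed by a digit here, so nothing is removed
unchanged <- gsub("\\.[0-9]+", "", text_path)
print(identical(unchanged, text_path))
# [1] TRUE
```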

Update

For the new dataset df2:

    gsub('\\.[^->]+(?=(->|\\b))', '', df2$path, perl=TRUE)
    #[1] "ao->ao->ao" "ao->agent"

And for the strings shown in the OP's post:

    str2 <- c('ag.1.4->ao.5.5->iv.9.12->ag.4.35',
        'ao.11.234->iv.345.455.1.2->ag.9.531')
    gsub('\\.[^->]+(?=(->|\\b))', '', str2, perl=TRUE)
    #[1] "ag->ao->iv->ag" "ao->iv->ag"
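If the lookahead pattern is hard to read, one possible alternative (not the approach used in the answer above) is to split each path on the separator, keep only the part of each step before its first ".", and paste the steps back together. This is a minimal sketch; the helper name `strip_suffixes` is hypothetical, and it assumes the separator is exactly "->" with no surrounding spaces and that there are no missing values.

```r
# Hypothetical helper: keep only the prefix of each step in a "->" path
strip_suffixes <- function(paths) {
  # Split each path into its steps on the literal "->" separator
  steps <- strsplit(paths, "->", fixed = TRUE)
  # Within each step, drop everything from the first "." onward, then rejoin
  vapply(steps,
         function(s) paste(sub("\\..*$", "", s), collapse = "->"),
         character(1))
}

res_text <- strip_suffixes(
  "ao.legacy payment.not_completed->agent.payment.completed")
print(res_text)
# [1] "ao->agent"

res_num <- strip_suffixes(c("ag.1.4->ao.5.5->iv.9.12->ag.4.35",
                            "ao.11.234->iv.345.455.1.2->ag.9.531"))
print(res_num)
# [1] "ag->ao->iv->ag" "ao->iv->ag"
```

For a data frame like df2, this could be applied as `df2$path <- strip_suffixes(df2$path)`.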

Data

    df1 <- structure(list(path.column = c("ag.1 -> ao.5 -> iv.9 -> ag.4",
        "ao.11 -> iv.345 -> ag.9")), .names = "path.column",
        class = "data.frame", row.names = c(NA, -2L))

    df2 <- structure(list(rank = c(10394749L, 36749879L), count = c(1L, 1L),
        percent = c(0.001011122, 0.001011122),
        path = c("ao.legacy payment.not_completed->ao.legacy payment.not_completed->ao.legacy payment.completed",
            "ao.legacy payment.not_completed->agent.payment.completed")),
        .names = c("rank", "count", "percent", "path"),
        class = "data.frame", row.names = c(NA, -2L))
