regex - How to use separate() properly?


I am having difficulty extracting an id of the form:

27da12ce-85fe-3f28-92f9-e5235a5cf6ac 

from a data frame:

a <- c("name_27da12ce-85fe-3f28-92f9-e5235a5cf6ac_thomas_myr",
       "name_94773a8c-b71d-3be6-b57e-db9d8740bb98_thimo",
       "name_1ed571b4-1aef-3fe2-8f85-b757da2436ee_alex",
       "name_9fbeda37-0e4f-37aa-86ef-11f907812397_john_tya",
       "name_83ef784f-3128-35a1-8ff9-daab1c5f944b_bishop",
       "name_39de28ca-5eca-3e6c-b5ea-5b82784cc6f4_due_to",
       "name_0a52a024-9305-3bf1-a0a6-84b009cc5af4_wis_michal",
       "name_2520ebbb-7900-32c9-9f2d-178cf04f7efc_sarah_lu_van_gar/thomas")

Basically, it is the part between the first and second underscores.

I usually approach this with:

library(tidyr)
df$a <- as.character(df$a)
df <- df[grep("_", df$a), ]
df <- separate(df, a, c("id", "name"), sep = "_")
df$a <- as.numeric(df$id)

However, this time there are many underscores, and this approach fails. Is there a way to extract the id?

I think you should use extract instead of separate. You need to specify the patterns you want to capture. I'm assuming here that the id always starts with a number, so I'm capturing everything from the first number up to the next _, and then everything after it.

library(tidyr)

df <- data.frame(a)
# drop = FALSE keeps df a data frame even though it has a single column
df <- df[grep("_", df$a), , drop = FALSE]
# group 1: from the first digit up to the next "_"; group 2: everything after it
extract(df, a, c("id", "name"), "[A-Za-z].*?(\\d.*?)_(.*)")
#                                     id                    name
# 1 27da12ce-85fe-3f28-92f9-e5235a5cf6ac              thomas_myr
# 2 94773a8c-b71d-3be6-b57e-db9d8740bb98                   thimo
# 3 1ed571b4-1aef-3fe2-8f85-b757da2436ee                    alex
# 4 9fbeda37-0e4f-37aa-86ef-11f907812397                john_tya
# 5 83ef784f-3128-35a1-8ff9-daab1c5f944b                  bishop
# 6 39de28ca-5eca-3e6c-b5ea-5b82784cc6f4                  due_to
# 7 0a52a024-9305-3bf1-a0a6-84b009cc5af4              wis_michal
# 8 2520ebbb-7900-32c9-9f2d-178cf04f7efc sarah_lu_van_gar/thomas
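If the id might not start with a digit, separate() itself may still work, because its extra = "merge" argument limits the number of splits. The sketch below is not part of the original answer. It uses two values copied from the question, and the column name prefix is made up for illustration. A base R variant using strsplit() is included for comparison.

```r
library(tidyr)

# Two sample values copied from the question's vector `a`
df <- data.frame(
  a = c("name_27da12ce-85fe-3f28-92f9-e5235a5cf6ac_thomas_myr",
        "name_94773a8c-b71d-3be6-b57e-db9d8740bb98_thimo"),
  stringsAsFactors = FALSE
)

# Split on "_" into at most three pieces; extra = "merge" keeps any
# remaining underscores inside the last column instead of dropping them.
# "prefix" is a hypothetical name for the leading "name" part.
res <- separate(df, a, c("prefix", "id", "name"), sep = "_", extra = "merge")

# Base R alternative: take the second "_"-delimited field of each string
ids <- sapply(strsplit(df$a, "_", fixed = TRUE), `[`, 2)
```

Both variants rely only on the id being the second underscore-delimited field, which matches the description in the question. They would give wrong results if the leading part itself contained an underscore.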
