bioinformatics - Trim DNA sequence using R
I have DNA sequence files in which many sequences start with "cccatgcagacatagtg" or "ctccatgcagacatagtg". They contain the tag sequence "atgca". I want to remove "atgca" together with the leading "cc" or "ctc", so that the final product is "gacatagtg".
Does anyone know an R function that can do that? I tried trimLRPatterns in Biostrings, but it does not work, since it trims at the ends and not within the sequence. Please let me know if you have a solution. Thanks.
Try this:
    # dummy DNA
    mydna <- c("cccatgcagacatagtg", "ctccatgcagacatagtg")

    # define tag
    tag <- "atgca"

    # remove character(s) before tag, including tag
    gsub(paste0("^.*", tag), "", mydna)

    # output
    # [1] "gacatagtg" "gacatagtg"
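One caveat with the pattern above: "^.*" is greedy, so if the tag happens to occur again further along a sequence, everything up to the last occurrence is removed. A minimal sketch of a variant that stops at the first occurrence is shown here. The helper name trim_to_tag and the third test sequence are made up for illustration, and the sketch assumes the tag contains no regex metacharacters.

```r
# Hypothetical helper (not part of the original answer): remove everything up to
# and including the FIRST occurrence of the tag in each sequence.
trim_to_tag <- function(seqs, tag) {
  # "^.*?" is a lazy match, so it stops at the first tag;
  # perl = TRUE selects PCRE syntax for the lazy quantifier.
  # sub() replaces at most one match per string.
  sub(paste0("^.*?", tag), "", seqs, perl = TRUE)
}

# The third sequence is invented here to show a tag that occurs twice.
mydna <- c("cccatgcagacatagtg", "ctccatgcagacatagtg", "ccatgcagatgcatt")
trimmed <- trim_to_tag(mydna, "atgca")
print(trimmed)
# [1] "gacatagtg" "gacatagtg" "gatgcatt"

# For comparison, the greedy pattern trims the third sequence down to "tt".
greedy <- gsub(paste0("^.*", "atgca"), "", mydna)
```

Sequences that do not contain the tag are returned unchanged by both versions, so it may be worth checking for those separately if every read is expected to carry the tag.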