sql - Number of palindromes in character strings -


I'm trying to gather a list of 6-letter palindromes and the number of times they occur, using Postgres 9.3.5.

This is the query I've tried:

select word, count(*)
from  (
   select regexp_split_to_table(read_sequence, '([atcg])([atcg])([atcg])(\3)(\2)(\1)') as word
   from   reads
   ) t
group  by word;

However, this brings up results that a) aren't palindromic and b) are longer or shorter than 6 letters.

\d reads
         Table "public.reads"
    Column     |  Type   | Modifiers
---------------+---------+-----------
 read_header   | text    | not null
 read_sequence | text    |
 option        | text    |
 quality_score | text    |
 pair_end      | text    | not null
 species_id    | integer |
Indexes:
    "reads_pkey" PRIMARY KEY, btree (read_header, pair_end)

read_sequence contains DNA sequences, 'atgctgatgcggcgtagctggatcga' for example.

I'd like to see the number of palindromes in each sequence, so the example would contain 1, another sequence might have 4, another 3, and so on.

Count per row:

select read_header, pair_end, substr(read_sequence, i, 6) as word, count(*) as ct
from   reads r
     , generate_series(1, length(r.read_sequence) - 5) i
where  substr(read_sequence, i, 6) ~ '([atcg])([atcg])([atcg])\3\2\1'
group  by 1,2,3
order  by 1,2,3,4 desc;

Count per read_header and palindrome:

select read_header, substr(read_sequence, i, 6) as word, count(*) as ct
...
group  by 1,2
order  by 1,2,3 desc;

Count per read_header:

select read_header, count(*) as ct
...
group  by 1
order  by 1,2 desc;

Count per palindrome:

select substr(read_sequence, i, 6) as word, count(*) as ct
...
group  by 1
order  by 1,2 desc;

SQL Fiddle.

Explain

A palindrome can start at any position up to 5 characters shy of the end, which leaves room for a length of 6, and palindromes can overlap. So:

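  1. Generate a list of possible starting positions with generate_series() in a lateral join and, based on these, all possible 6-character strings.

  2. Test each string for being a palindrome with a regular expression using back references, similar to what you had. However, regexp_split_to_table() is not the right function here: it splits the string at every match and returns the pieces between the matches, not the matches themselves. Use a regular expression match (~) instead.

  3. Aggregate, depending on what you want.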
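
The three steps can be tried without the reads table. The following is only a minimal sketch: the CTE named sample, its single row, and the start_pos alias are assumptions made for illustration, and the sequence is the example from the question.

```sql
-- Hypothetical stand-in for the reads table: one row holding the
-- example sequence from the question.
WITH sample(read_header, read_sequence) AS (
   VALUES ('example'::text, 'atgctgatgcggcgtagctggatcga'::text)
)
SELECT s.read_header
     , substr(s.read_sequence, i, 6) AS word
     , min(i)                        AS start_pos  -- first 1-based start position of this word
     , count(*)                      AS ct         -- step 3: aggregate per header and word
FROM   sample s
     , generate_series(1, length(s.read_sequence) - 5) AS i  -- step 1: every possible start
WHERE  substr(s.read_sequence, i, 6)
       ~ '([atcg])([atcg])([atcg])\3\2\1'                    -- step 2: back-reference test
GROUP  BY 1, 2
ORDER  BY 1, 2;
```

For the example sequence this should return a single row, the word gcggcg starting at position 9 with a count of 1, which matches the one palindrome expected in the question. Note that the character class is lower case here only because the example data is; it has to match the case of the stored sequences.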

