I am trying to generate a nucleotide motif that will code chosen amino acids. For example - histidine is coded by CAT, CAC. Arginine is CGT, CGC, CGA, CGG,AGA and AGG. The pattern is:
position in codon - C or A
position in codon - A or G
position - A, T, C or G
by that rule you can define chosen amino acids (H and R) but also the amino acids that i dont want (for example AAA is lysine, AAT is asparagine...). So I need to define the pattern that matches only my chosen AAs, in case above it can be: [C][A or G][T], that pattern defines only histidine and arginine, but not the other amino acids. I am trying to work out an algorithm which will do this thing with any amino acids which i choose (more than two) and if the pattern does not exist it should find the possibilities for less amino acids (for example if pattern for 5 amino acids does not exist, it will find the patterns for four amino acids from the query) - this final optimization problem is probably the hardest part. Any suggestions? Thanks a lot and sorry for my poor english.