Frequency cipher in F#

Question

I'm currently working on a frequency-substitution cipher in F#. That is, I count the occurrences of each letter in a text, and once that is done I want to replace the letters based on letter frequency in English.

What I've done so far is create a (char * float * char) list that contains (letter, frequency percentage, recommended letter). Say the letter P is the most frequent letter in my ciphered text (13.5 percent of letters are P) and E is the most used letter in English texts; the list element will then look like this: ('P', 13.5, 'E'). This procedure is done for all letters in the text, so we end up with a list of all letters and their recommended replacements.

The problem I have is that I don't really know how to replace the letters in the cipher text with their recommended replacements.

Letter frequency in the English alphabet.
[(' ', 20.0); ('E', 12.02); ('T', 9.1); ('A', 8.12); ('O', 7.68); ('I', 7.31);
 ('N', 6.95); ('S', 6.28); ('R', 6.02); ('H', 5.92); ('D', 4.32); ('L', 3.98);
 ('U', 2.88); ('C', 2.71); ('M', 2.61); ('F', 2.3); ('Y', 2.11); ('W', 2.09);
 ('G', 2.03); ('P', 1.82); ('B', 1.49); ('V', 1.11); ('K', 0.69); ('X', 0.17);
 ('Q', 0.11); ('J', 0.1); ('Z', 0.07)]


Letter frequency in cipher.
[('W', 21.18); ('Z', 8.31); ('I', 7.7); ('P', 6.96); ('Y', 5.5); ('H', 5.48);
 ('G', 5.35); ('K', 5.3); ('N', 4.31); ('O', 4.31); ('M', 3.66); (' ', 2.83);
 ('A', 2.58); ('T', 2.38); ('Q', 2.22); ('B', 2.11); ('F', 2.11); ('.', 2.04);
 ('R', 1.62); ('S', 1.37); ('E', 1.06); ('X', 0.97); ('U', 0.25); ('L', 0.16);
 ('V', 0.11); ('J', 0.07); ('C', 0.02); ('D', 0.02)]


Recommended letter changes.
[('W', 21.18, ' '); ('Z', 8.31, 'E'); ('I', 7.7, 'T'); ('P', 6.96, 'A');
 ('Y', 5.5, 'O'); ('H', 5.48, 'I'); ('G', 5.35, 'N'); ('K', 5.3, 'S');
 ('N', 4.31, 'R'); ('O', 4.31, 'H'); ('M', 3.66, 'D'); (' ', 2.83, ' ');
 ('A', 2.58, 'L'); ('T', 2.38, 'U'); ('Q', 2.22, 'C'); ('B', 2.11, 'M');
 ('F', 2.11, 'F'); ('.', 2.04, 'Y'); ('R', 1.62, 'W'); ('S', 1.37, 'G');
 ('E', 1.06, 'P'); ('X', 0.97, 'B'); ('U', 0.25, 'V'); ('L', 0.16, 'K');
 ('V', 0.11, 'X'); ('J', 0.07, 'Q'); ('C', 0.02, 'J'); ('D', 0.02, 'Z')]

If anyone has any ideas that would point me in the right direction on how to tackle the problem, I'd be very appreciative, since I've been stuck on this problem for a while now.

There is something wrong with your sample data. Two lists have non-equal length. And you have whitespace twice in your recommended letter changes ('W', 21.18, ' ') and (' ', 2.83, ' ') — Sergey Berezovskiy

Sergey Berezovskiy · Accepted Answer · 2020-05-30T21:20:58

I believe you are missing the '.' frequency in the English alphabet list (it should be between D and L). Once you add the missing value to the alphaFreq list, both lists will have the same length, and you can produce the map of recommended changes by zipping the two ordered lists:

let changes =
    alphaFreq // list with letter frequency in the English alphabet
    |> List.zip cipherFreq // zipping with cipher frequency list
    |> List.map (fun ((cipherLetter,_), (alphaLetter,_)) -> (alphaLetter, cipherLetter))
    |> Map.ofList

Encoding test:

"HELLO WORLD" |> String.map (fun ch -> changes.[ch]) |> printfn "%s"
// OZAAYWRYNAM

To get a decoder map, just swap the letter order in the mapping function so that it returns (cipherLetter, alphaLetter) instead.
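As a rough sketch of that decoder, the snippet below builds the swapped map and applies it to a string. The two frequency lists are shortened, hypothetical samples (not the full data from the question), and the `decoder` and `decode` names are made up for illustration. It also assumes you want characters without a recommendation left unchanged, which is why it uses Map.tryFind rather than the indexer.

```fsharp
// Hypothetical, shortened frequency lists (most frequent first), for illustration only.
let alphaFreq  = [ (' ', 20.0); ('E', 12.02); ('T', 9.1); ('A', 8.12) ]
let cipherFreq = [ ('W', 21.18); ('Z', 8.31); ('I', 7.7); ('P', 6.96) ]

// Decoder map: cipher letter -> plain letter (note the swapped tuple order).
// List.zip requires both lists to have the same length.
let decoder =
    List.zip cipherFreq alphaFreq
    |> List.map (fun ((cipherLetter, _), (alphaLetter, _)) -> (cipherLetter, alphaLetter))
    |> Map.ofList

// Map.tryFind returns None for characters that have no recommendation,
// so those are passed through unchanged instead of raising KeyNotFoundException.
let decode (text: string) =
    text |> String.map (fun ch -> decoder |> Map.tryFind ch |> Option.defaultValue ch)

decode "ZPIWIZP!" |> printfn "%s"
// EAT TEA!
```

If you already have the (char * float * char) list from the question, something like `List.map (fun (c, _, r) -> (c, r)) >> Map.ofList` should give you the same kind of decoder map directly.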
