I know this is a late reply, but I think future visitors may find it useful.
Below is how I understood the passage above (all code is in OpenCV-Python v2.4-beta):
I take this as the input image. It is a simple image, chosen for the sake of understanding.

First we generate the binary image of the given image by thresholding it at 80% of its intensity and inverting the resulting image.
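import cv2
import numpy as np
img = cv2.imread('doc4.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# threshold at 80% of the maximum intensity and invert (type 1 is cv2.THRESH_BINARY_INV)
ret,thresh = cv2.threshold(gray,0.8*gray.max(),255,cv2.THRESH_BINARY_INV)
# colour copy to draw on (assumed definition of thresh2); taken before contour finding, which modifies its input in OpenCV 2.4
thresh2 = cv2.cvtColor(thresh,cv2.COLOR_GRAY2BGR)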
Thresholded image :

We considered simple 8-neighborhood connectivity and performed connected component (contour) analysis of the binary image leading to the segmentation of the textual components.
It is simply contour finding in OpenCV, which plays the role of connected-component labelling here. It selects all white blobs (components) in the image.
contours, hier = cv2.findContours(thresh,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
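To make "8-neighborhood connectivity" concrete, here is a minimal pure-Python sketch of connected-component labelling on a small binary grid. It is only an illustration of the idea, not what cv2.findContours does internally; the function name label_components and the list-of-lists input are my own assumptions.

```python
from collections import deque

def label_components(binary):
    """Label 8-connected groups of non-zero pixels in a 2D list.

    Returns (labels, count), where labels has the same shape as binary
    and holds 0 for background or a component number starting at 1.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 0 or labels[r][c] != 0:
                continue  # background, or already labelled
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill
                cr, cc = queue.popleft()
                # visit all 8 neighbours (diagonals included)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and binary[nr][nc] != 0
                                and labels[nr][nc] == 0):
                            labels[nr][nc] = count
                            queue.append((nr, nc))
    return labels, count

grid = [[1, 0, 0, 0],
        [0, 1, 0, 1],
        [0, 0, 0, 1]]
labels, count = label_components(grid)
```

In this grid the two pixels on the diagonal touch only at a corner, so they form one component under 8-connectivity, whereas 4-connectivity would split them.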
Contours :

For next part of algorithm we use the minimum bounding rectangle of contours.
Now we find the bounding rectangle of each detected contour, then discard contours with small areas to get rid of commas etc. See the statement:
Smaller connected patterns were discarded based on the assumption that they may have originated due to noise dependent on image acquisition system and does not in any way contribute to the final layout. Also punctuation marks were neglected using smaller size criterion e.g. comma, full-stop etc.
We also find the average height, avgh.
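height = 0
num = 0
letters = []
ht = []
for (i,cnt) in enumerate(contours):
    (x,y,w,h) = cv2.boundingRect(cnt)
    if w*h<200:
        # too small (noise, comma, full stop): paint it black
        cv2.drawContours(thresh2,[cnt],0,(0,0,0),-1)
    else:
        # keep it: draw a green rectangle and record its height
        cv2.rectangle(thresh2,(x,y),(x+w,y+h),(0,255,0),1)
        height = height + h
        num = num + 1
        letters.append(cnt)
        ht.append(h)
# average letter height (integer division under Python 2)
avgh = height/num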
After this, all commas etc. are removed, and green rectangles are drawn around the selected contours:

At this level we also segregate the fonts based on the height of the bounding rect using avgh (average height) as threshold. Two thresholds are used to classify fonts into three categories - small, normal and large (as per given equations in passage).
The average height, avgh, obtained here is 40. So a letter is small if its height is less than 26.66 (i.e. 40x2/3), normal if 26.66 < height < 60, and large if height > 60 (i.e. 40x3/2). But in the given image all heights fall in the range (28,58), so all letters are normal and you can't see the difference.
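As a quick check of those numbers, the rule can be sketched as a small function. The name classify_height is my own, and the 2/3 and 3/2 factors are taken from the thresholds quoted above (26.66 and 60 for avgh = 40); treat it as an illustration, not the paper's exact code.

```python
def classify_height(h, avgh):
    """Return 'small', 'normal' or 'large' for a bounding-rect height."""
    if h < avgh * 2.0 / 3.0:   # below 2/3 of the average height
        return 'small'
    if h > avgh * 3.0 / 2.0:   # above 3/2 of the average height
        return 'large'
    return 'normal'

avgh = 40
for h in (20, 28, 58, 61):
    print('%d -> %s' % (h, classify_height(h, avgh)))
```

With avgh = 40, both ends of the observed range (28 and 58) come out as normal, which is why nothing changes visually in this image.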
So I just made a small modification to make it easy to visualize: small if height <= 30, normal if 30 < height <= 50, and large if height > 50.
for (cnt,h) in zip(letters,ht):
    print(h)
    # fill each letter with a colour (BGR) according to its height class
    if h<=30:
        cv2.drawContours(thresh2,[cnt],0,(255,0,0),-1)    # small: blue
    elif 30 < h <= 50:
        cv2.drawContours(thresh2,[cnt],0,(0,255,0),-1)    # normal: green
    else:
        cv2.drawContours(thresh2,[cnt],0,(0,0,255),-1)    # large: red
cv2.imshow('img',thresh2)
cv2.waitKey(0)
cv2.destroyAllWindows()
Now you get the result with the letters categorized as small, normal and large:

These rectangles were then sorted top-to-bottom and left-to-right order, using 2D point information of leftmost-topmost corner.
This part I have omitted. It is just sorting all the bounding rects with respect to their leftmost-topmost corner.
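For anyone who wants that step, here is a minimal sketch of one way to do it, assuming the rects are (x, y, w, h) tuples as returned by cv2.boundingRect. The function name sort_rects and the row_tol parameter are hypothetical, not from the passage. A plain sort on (y, x) follows the description literally; row_tol is an optional extra that treats rects whose top edges are within a few pixels of each other as the same text line.

```python
def sort_rects(rects, row_tol=0):
    """Sort (x, y, w, h) rects top-to-bottom, then left-to-right.

    row_tol is a hypothetical tolerance in pixels: a rect whose top y is
    within row_tol of the first rect of the current row joins that row.
    """
    # literal reading of the passage: order by top-left corner (y, then x)
    ordered = sorted(rects, key=lambda r: (r[1], r[0]))
    if row_tol <= 0:
        return ordered
    rows = []
    for r in ordered:
        if rows and abs(r[1] - rows[-1][0][1]) <= row_tol:
            rows[-1].append(r)   # same text line as the current row
        else:
            rows.append([r])     # start a new row
    # inside each row, order strictly left-to-right
    return [r for row in rows for r in sorted(row, key=lambda r: r[0])]

rects = [(50, 12, 10, 20), (10, 10, 10, 20), (30, 60, 10, 20), (5, 62, 10, 20)]
plain = sort_rects(rects)
by_rows = sort_rects(rects, row_tol=5)
```

The plain sort can put letters of the same line out of order when their tops differ by a pixel or two (here (30, 60) comes before (5, 62)), which is the case the tolerance is meant to handle.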