2
votes

I am trying to implement the Porter stemming algorithm, but I am stuck at this point:

Step 1b

(m>0) EED -> EE                    feed      ->  feed
                                   agreed    ->  agree
(*v*) ED  ->                       plastered ->  plaster
                                   bled      ->  bled
(*v*) ING ->                       motoring  ->  motor
                                   sing      ->  sing

Isn't the m of feed equal to 1? feed >> [c]vvc[] >> [c]vc[].

If so, why doesn't the rule convert feed to fee? I know that result would be wrong. Can anyone clear that up?

You can check the original algorithm here: http://tartarus.org/~martin/PorterStemmer/def.txt

thanks

1
FWIW, the Porter algorithm is already implemented in C as part of the "Snowball" library: snowball.tartarus.org/download.php - Bradley Grainger
I know, but I am implementing it for fun. Thanks again. - mike

1 Answer

1
votes

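The m of 'feed' is indeed 1, but you need to re-read the document carefully. The m in the condition refers to the measure of the stem, that is, the part of the word that precedes the suffix, not the measure of the whole word. For feed, the stem before EED is f, and m(f) = 0, so the condition (m>0) fails and you don't do the replacement. (Checking the would-be result gives the same verdict here: m(fee) = 0.) For agreed, the stem is agr with m(agr) = 1, so agreed -> agree.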
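
To make the stem check concrete, here is a minimal Python sketch of the measure computation and of the EED rule only. The function names (`is_consonant`, `measure`, `step1b_eed`) are made up for illustration and are not taken from the original paper or from Snowball; the sketch assumes lowercase ASCII input and ignores the other Step 1b rules.

```python
VOWELS = set("aeiou")


def is_consonant(word, i):
    """Porter's definition: a letter other than a, e, i, o, u,
    and other than 'y' preceded by a consonant."""
    ch = word[i]
    if ch in VOWELS:
        return False
    if ch == "y":
        # 'y' counts as a consonant at the start of the word
        # or when the previous letter is a vowel.
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem):
    """Return m for a stem written as [C](VC)^m[V]:
    the number of vowel-to-consonant transitions."""
    m = 0
    prev_vowel = False
    for i in range(len(stem)):
        cons = is_consonant(stem, i)
        if cons and prev_vowel:
            m += 1
        prev_vowel = not cons
    return m


def step1b_eed(word):
    """Apply only the rule (m>0) EED -> EE.
    The condition is tested on the stem before the suffix."""
    if word.endswith("eed"):
        stem = word[:-3]
        if measure(stem) > 0:
            return stem + "ee"
    return word


if __name__ == "__main__":
    for w in ("feed", "agreed"):
        print(w, "->", step1b_eed(w))
```

With this sketch, `measure("feed")` is 1 but `measure("f")` is 0, which is why feed is left alone while agreed (stem agr, m = 1) becomes agree.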

Also thanks for the algorithm! It was interesting!