1
votes

Similar to the question "porter stemming algorithm implementation question?", but expanded.

Basically, step1b is defined as:

Step1b

`(m>0) EED -> EE     feed      ->  feed
                    agreed    ->  agree
(*v*) ED  ->        plastered ->  plaster
                    bled      ->  bled
(*v*) ING ->        motoring  ->  motor
                    sing      ->  sing`

My question is: why does feed stem to feed and not fe? Every online Porter stemmer I've tried stems it to feed, but from what I can see, it should stem to fe.

My train of thought is:

`feed` does not pass through `(m>0) EED -> EE`, as the measure of `feed` minus the suffix `eed` is `m(f)`, hence `=0`.

`feed` will pass through `(*v*) ED  ->`, as there is a vowel in the stem `fe` once the suffix `ed` is removed. So it will stem at this point to `fe`.

Can someone explain to me how online Porter Stemmers manage to stem to feed?

Thanks.

3
Your question is not similar, it is exactly the same. - axiom
No, it isn't the same question. The referred post asks about the measure of feed, while this one asks why feed is not converted to fe - geekazoid
Not completely sure, but I think that (*v*) refers to having a vowel and something else to the right. Which would be equivalent to having m > 1... - geekazoid
But checking the NLTK implementation, it maps feed to feed, yet tried to tri. I can't understand what's going on - geekazoid

3 Answers

0
votes

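It's because the stem "f" that is left after removing "eed" doesn't contain a VC (vowel/consonant) combination, therefore m = 0 and the `(m>0) EED -> EE` rule does not fire. The "ed" rule is never tried: within a step, only the rule with the longest matching suffix is considered, and for "feed" that is "eed" (check the conditions for each step).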

0
votes

The rules for removing a suffix are given in the form

`(condition) S1 -> S2`

This means that if a word ends with the suffix S1, and the stem before S1 satisfies the given condition, S1 is replaced by S2. The condition is usually given in terms of m, e.g.

`(m > 1) EMENT ->`

Here S1 is `EMENT` and S2 is null. This would map REPLACEMENT to REPLAC, since REPLAC is a word part for which m = 2.

Now, in your example:

`(m>0) EED -> EE     feed -> feed`

Before `EED`, are there vowel(s) followed by consonant(s), repeated more than zero times? The answer is no: before `EED` there is only "F", which contains no vowel(s) followed by consonant(s).
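If it helps to see the measure computed rather than described, here is a minimal Python sketch. The names `cv_form` and `measure` are made up for illustration and don't come from any library. It assumes lowercase input and treats `y` as a vowel only when it follows a consonant, as Porter's definition does. It collapses a stem into its C/V form and counts the `VC` pairs:

```python
from itertools import groupby

VOWELS = "aeiou"


def cv_form(stem):
    """Collapse a stem into runs of consonants (C) and vowels (V)."""
    kinds = []
    for i, ch in enumerate(stem):
        if ch in VOWELS:
            kinds.append("V")
        elif ch == "y" and i > 0 and kinds[-1] == "C":
            # 'y' counts as a vowel only when it follows a consonant
            kinds.append("V")
        else:
            kinds.append("C")
    # Merge adjacent identical kinds: "CVCCVC" becomes "CVCVC"
    return "".join(kind for kind, _ in groupby(kinds))


def measure(stem):
    """m is the number of VC pairs in the collapsed form [C](VC)^m[V]."""
    return cv_form(stem).count("VC")


print(cv_form("replac"), measure("replac"))  # CVCVC 2
print(cv_form("f"), measure("f"))            # C 0
```

So the stem `f` of feed has m = 0, while the stem `replac` of replacement has m = 2.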

0
votes

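In feed, m counts the vowel-consonant (VC) pairs in the stem before `EED`. That stem is f, which has no such pair.

But in agreed the stem is agr, which contains the "VC" pair a + gr. Hence agreed is replaced by agree. The condition is m>0.

For feed, m=0.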
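To make this concrete, here is a rough Python sketch of just the three suffix rules of step 1b. The function names are my own, not from NLTK or any other library. It relies on the rule from Porter's paper that, within a step, only the rule with the longest matching suffix is obeyed, and it leaves out the follow-up cleanup that the full step 1b performs after removing `ED` or `ING` (restoring `AT -> ATE`, undoubling consonants, and so on), so treat it as an illustration rather than a complete stemmer:

```python
def is_consonant(word, i):
    """Porter's definition: not a, e, i, o, u, and not a 'y' after a consonant."""
    ch = word[i]
    if ch in "aeiou":
        return False
    if ch == "y":
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem):
    """Count vowel-run to consonant-run transitions, i.e. m in [C](VC)^m[V]."""
    m = 0
    prev_vowel = False
    for i in range(len(stem)):
        vowel = not is_consonant(stem, i)
        if prev_vowel and not vowel:
            m += 1
        prev_vowel = vowel
    return m


def contains_vowel(stem):
    """The (*v*) condition: the stem has at least one vowel."""
    return any(not is_consonant(stem, i) for i in range(len(stem)))


def step1b(word):
    # EED is checked first because it is the longest suffix. Once it matches,
    # the ED rule is never consulted, even if the (m>0) condition fails.
    if word.endswith("eed"):
        stem = word[:-3]
        return stem + "ee" if measure(stem) > 0 else word
    for suffix in ("ed", "ing"):
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            return stem if contains_vowel(stem) else word
    return word


for w in ("feed", "agreed", "plastered", "bled", "motoring", "sing", "tried"):
    print(w, "->", step1b(w))
```

With this, feed stays feed because it is claimed by the `EED` rule and fails m>0, whereas tried only matches `ED`, passes `(*v*)`, and becomes tri.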