1
votes

I have an Iterable object called values (Iterable<Text> values), and I want to add its elements to a list of distinct elements.

for (Text val : values) {
    if (!mylist.contains(val)) {
        mylist.add(val);
    }
}

It only adds one element to the list. If I remove the condition that checks for distinctness, I see that all the elements are repeated.

I have tried many things. I thought maybe I should use a .get() method, like this:

for (Text val : values) {
    if (!mylist.contains(val.get())) {
        mylist.add(val.get());
    }
}

but then the compiler reports that it cannot find the method get() in class Text:

>editorPairs.java:67: cannot find symbol
>symbol  : method get()
>location: class org.apache.hadoop.io.Text
>                    mylist.add(val.get());
>                                  ^
>1 error

The full code is below:

public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {

        List<Text> mylist = new ArrayList<Text>();

        for (Text val : values) {
            if (!mylist.contains(val)) {
                mylist.add(val);
            }
        }

        if(mylist.size() > 1) {
            int size = mylist.size();
            for (int i=0; i<size; ++i) {
                Text t1 = mylist.get(i);
                context.write(t1, t1);
            }
        }
}
2
Why not use a Set? Also, what is context.write(t1, t1); supposed to do? - Elliott Frisch
I tried a Set too, but the same thing happens. I know that a Set can only contain distinct values, but in my Hadoop program the outputs were the same. - Vahid Mirjalili

2 Answers

1
votes

Use a [Set][1] to collect the distinct values: a [Set][1] does not add a value that it already contains, so there is no need to check with contains() first. For the set to recognize duplicates, the element class must override equals() and hashCode() consistently. In our case that class is Text, which, as far as I know, already overrides both.

This example explains what needs to be done.
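
As a minimal sketch of the equals()/hashCode() contract mentioned above: the Word class below is hypothetical (it is not part of Hadoop) and exists only to show how a HashSet treats two instances with the same content as one element once both methods are overridden.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class WordSetDemo {
    // Hypothetical value class, used only for illustration.
    static final class Word {
        private final String text;

        Word(String text) {
            this.text = text;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Word)) return false;
            return text.equals(((Word) other).text);
        }

        @Override
        public int hashCode() {
            // Must agree with equals(): equal objects produce equal hash codes.
            return Objects.hash(text);
        }
    }

    // Counts distinct words by letting the set discard duplicates.
    static int countDistinct(String... raw) {
        Set<Word> seen = new HashSet<>();
        for (String s : raw) {
            seen.add(new Word(s)); // add() returns false and changes nothing for a duplicate
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct("a", "b", "a")); // prints 2
    }
}
```

Without the two overrides, each new Word would count as a separate element, because the default equals() compares object identity.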

0
votes

The better thing to do is to use a Set.

Instantiate a HashSet, which uses the equals() and hashCode() methods of your objects to add a value only if it is distinct.
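
One caveat that may explain why a Set alone did not help the asker: Hadoop is documented to reuse the value object it passes to reduce(), so every element stored by reference can end up being the same instance. The sketch below does not use Hadoop. It simulates that behaviour with a hypothetical reusing() helper built on StringBuilder, and stores an immutable copy of each element instead of the element itself. In a real reducer the analogous step would presumably be storing val.toString() or new Text(val).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

public class DistinctCopies {
    // Hypothetical helper: yields the SAME mutable instance on every step,
    // with its contents replaced, to mimic an iterator that reuses objects.
    static Iterable<StringBuilder> reusing(String... items) {
        final StringBuilder shared = new StringBuilder();
        return () -> new Iterator<StringBuilder>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return i < items.length;
            }

            @Override
            public StringBuilder next() {
                if (!hasNext()) throw new NoSuchElementException();
                shared.setLength(0);      // overwrite the previous contents
                shared.append(items[i++]);
                return shared;
            }
        };
    }

    // Collects distinct values, storing an immutable String copy of each
    // element so later mutation of the reused instance cannot affect the set.
    static <T> List<String> distinct(Iterable<T> values) {
        Set<String> seen = new LinkedHashSet<>(); // keeps first-seen order
        for (T val : values) {
            seen.add(val.toString());
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        System.out.println(distinct(reusing("x", "y", "x"))); // prints [x, y]
    }
}
```

Storing the elements directly in a list instead would leave the list holding several references to one object, which matches the "all the elements are repeated" symptom described in the question.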