7
votes

An array is said to have a majority element if more than half of its elements are the same. Is there a divide-and-conquer algorithm for determining if an array has a majority element?

I normally do the following, but it is not using divide-and-conquer. I do not want to use the Boyer-Moore algorithm.

int find(int[] arr, int size) {
    int count = 0, i, mElement;

    for (i = 0; i < size; i++) {
        if (count == 0) mElement = arr[i];

        if (arr[i] == mElement) count++;
        else count--;
    }

    count = 0;
    for (i = 0; i < size; i++) {
        if (arr[i] == mElement) count++;
    }

    if (count > size / 2) return mElement;
    return -1;
}
4
Why don't you want to use Boyer-Moore, it's so efficient ? - Yves Daoust
i actually want to do it using divide and conquer algorithm , cos of that reason i do not wish to use boyer-moore algorithm - user4652599
I don't think this is such a good idea: if you split the array in two halves, the most frequent element may not be the most frequent in the two halves. For instance, CCCAAAAABBBBBCCC. - Yves Daoust
@YvesDaoust: That example has no majority element (a majority element would occur at least 9 times out of 16). - MSalters
@MSalters: the same restriction applies to the majority element: it won't be majoritarian in all subdivisions and D&Q recurrence isn't straightforward. - Yves Daoust

4 Answers

4
votes

I can see at least one divide and conquer method.

Start by finding the median, such as with Hoare's Select algorithm. If one value forms a majority of the elements, the median must have that value, so we've just found the value we're looking for.

From there, find (for example) the 25th and 75th percentile items. Again, if there's a majority element, at least one of those would need to have the same value as the median.

Assuming you haven't ruled out there being a majority element yet, you can continue the search. For example, let's assume the 75th percentile was equal to the median, but the 25th percentile wasn't.

When then continue searching for the item halfway between the 25th percentile and the median, as well as the one halfway between the 75th percentile and the end.

Continue finding the median of each partition that must contain the end of the elements with the same value as the median until you've either confirmed or denied the existence of a majority element.

As an aside: I don't quite see how Boyer-Moore would be used for this task. Boyer-Moore is a way of finding a substring in a string.

4
votes

There is, and it does not require the elements to have an order.

To be formal, we're dealing with multisets (also called bags.) In the following, for a multiset S, let:

  • v(e,S) be the multiplicity of an element e in S, i.e. the number of times it occurs (the multiplicity is zero if e is not a member of S at all.)
  • #S be the cardinality of S, i.e. the number of elements in S counting multiplicity.
  • ⊕ be the multiset sum: if S = LR then S contains all the elements of L and R counting multiplicity, i.e. v(e;S) = v(e;L) + v(e;R) for any element e. (This also shows that the multiplicity can be calculated by 'divide-and-conquer'.)
  • [x] be the largest integer less than or equal to x.

The majority element m of S, if it exists, is that element such that 2 v(m;S) > #S.

Let's call L and R a splitting of S if LR = S and an even splitting if |#L - #R| ≤ 1. That is, if n=#S is even, L and R have exactly half the elements of S, and if n is odd, than one has cardinality [n/2] and the other has cardinality [n/2]+1.

For an arbitrary split of S into L and R, two observations:

  1. If neither L nor R has a majority element, then S cannot: for any element e, 2 v(e;S) = 2 v(e;L) + 2 v(e;R) ≤ #L + #R = #S.

  2. If one of L and R has a majority element m with multiplicity k, then it is the majority element of S only if it has multiplicity r in the other half, with 2(k+r) > #S.

The algorithm majority(S) below returns either a pair (m,k), indicating that m is the majority element with k occurrences, or none:

  1. If S is empty, return none; if S has just one element m, then return (m,1). Otherwise:
  2. Make an even split of S into two halves L and R.
  3. Let (m,k) = majority(L), if not none:

    a. Let k' = k + v(m;R).

    b. Return (m,k') if 2 k' > n.

  4. Otherwise let (m,k) = majority(R), if not none:

    a. Let k' = k + v(m;L).

    b. Return (m,k') if 2 k' > n.

  5. Otherwise return none.

Note that the algorithm is still correct even if the split is not an even one. Splitting evenly though is likely to perform better in practice.

Addendum

Made the terminal case explicit in the algorithm description above. Some sample C++ code:

struct majority_t {
    int m; // majority element
    size_t k; // multiplicity of m; zero => no majority element

    constexpr majority_t(): m(0), k(0) {}
    constexpr majority_t(int m_,size_t k_): m(m_), k(k_) {}

    explicit operator bool() const { return k>0; }
};

static constexpr majority_t no_majority;

size_t multiplicity(int x,const int *arr,size_t n) {
    if (n==0) return 0;
    else if (n==1) return arr[0]==x?1:0;

    size_t r=n/2;
    return multiplicity(x,arr,r)+multiplicity(x,arr+r,n-r);
}

majority_t majority(const int *arr,size_t n) {
    if (n==0) return no_majority;
    else if (n==1) return majority_t(arr[0],1);

    size_t r=n/2;
    majority_t left=majority(arr,r);
    if (left) {
        left.k+=multiplicity(left.m,arr+r,n-r);
        if (left.k>r) return left;
    }

    majority_t right=majority(arr+r,n-r);
    if (right) {
        right.k+=multiplicity(right.m,arr,r);
        if (right.k>r) return right;
    }

    return no_majority;
}
2
votes

A simpler divide and conquer algorithm works for the case that there exists more than 1/2 elements which are the same and there are n = 2^k elements for some integer k.

FindMost(A, startIndex, endIndex)
{  // input array A

if (startIndex == endIndex)  // base case
        return A[startIndex]; 
x = FindMost(A, startIndex, (startIndex + endIndex - 1)/2);
y = FindMost(A, (startIndex + endIndex - 1)/2 + 1, endIndex);

if (x == null && y == null) 
    return null;
else if (x == null && y != null) 
    return y;
else if (x != null && y == null) 
    return x;
else if (x != y) 
    return null;
else return x

}

This algorithm could be modified so that it works for n which is not exponent of 2, but boundary cases must be handled carefully.

0
votes

Lets say the array is 1, 2, 1, 1, 3, 1, 4, 1, 6, 1.

If an array contains more than half of elements same then there should be a position where the two consecutive elements are same.

In the above example observe 1 is repeated more than half times. And the indexes(index start from 0) index 2 and index 3 have same element.