2
votes

Suppose I have a list of tuples whose elements are are all possible pairings from a list:

i.e. matchup=[('Mike','John'),('Mike','Mary'),('Mike','Jane'),('John','Mary'),('John','Jane'),('Mary','Jane')...]

I'd like to reduce the list such that each person's name appears twice irrespective of whether they are the first element in the pairing or the second. A tuple element may be selected more than twice if it is not possible to create a new pair without doing so.

Thanks in advance.

edit: originally in the list, I used a for loop to pair each person with another person at random ala:

list=["John","Mike","Mary","Jane"]
pairing=[]
for person in list:
    for i in range(2):
        person2=random.sample(list(list),1)
        this_match=str(person)+str(person2)
        while this_match in pairing:
            person2=random.sample(list(list),1)
            this_match=str(person)+str(person2)
        pairing.append(this_match)

This led to same person duplications. My second try is this:

from itertools import combinations
import pandas as pd
from collections import Counter

possible_games = combinations(list, 2)

games = list(possible_games)
dupe_check=Counter(games)
print(dupe_check)
print (games, len(games))

However, I cannot reduce the elements of each tuple to appear as close to twice as possible.

One possible output might look like:

[('Mike','John'),('Mike','Mary'),('John','Mary'),("Mary","Jane"),("Jane","Mike")]

John appears twice. Jane appears twice. Mike appears three times in order for Jane to appear twice. Mary appears three times for Jane to appear twice.

2
What have you tried, And also add how the output should look, That should provide more clarity.Vineeth Sai
In your possible output, Mike and Mary are repeated 3 times, whyVineeth Sai
because someone must be paired with Jane for her to be repeated twice, Mike was selected randomly. Alternatively, John could've been selected as he only appeared twice. Mary would not have been selected because she has already appeared three times.machump
I see it as a dfs of graph where nodes are the name :)BlueSheepToken

2 Answers

1
votes

The easiest way to get each name exactly twice is the following, I guess:

lst = ["John", "Mike", "Mary", "Jane"]  # not shadowing 'list'

pairs = list(zip(lst, lst[1:]+lst[:1]))
pairs
# [('John', 'Mike'), ('Mike', 'Mary'), ('Mary', 'Jane'), ('Jane', 'John')]

This essentially circles the list and pairs each element with its two neighbours. If you need more randomness, you can shuffle the list beforehand or break up the list in chunks and apply this to the chunks.

1
votes

The following code will solve your problem fully. result will give you the answer in this code.

import itertools
import random
import numpy as np

# lst is a list of names that I have chosen.
lst = ['Apple', 'Boy', 'Cat', 'Dog', 'Eagle']

# create a list of tuples (pairs of names).
matchup = list(itertools.product(lst, lst)) 

# randomly shuffle the pairs of names.
random.shuffle(matchup)


def func(inp):
    out = []
    out += [ inp[0] ]

    # Unique array of names.
    unq = np.unique( (zip(*inp))[0] )

    # Stores counts of how many times a given name features in the final list of tuples.
    counter = np.zeros(len(unq))

    indx0 = np.where( out[0][0]==unq )[0][0]
    indx1 = np.where( out[0][1]==unq )[0][0]    
    counter[indx0]+=1
    counter[indx1]+=1    

    reserve = []

    #first try of filling output list with tuples so that no name enters the output list more than once.   
    for i in range(1,len(matchup)):
        tup = matchup[i]

        indx0 , indx1 = np.where(tup[0]==unq)[0][0], np.where(tup[1]==unq)[0][0]

        temp = counter.copy()

        temp[indx0]+=1
        temp[indx1]+=1

        if ( (temp[indx0]<=2) and (temp[indx1]<=2) ):
            out += [tup]
            counter[indx0]+=1
            counter[indx1]+=1

        else: reserve += [tup]     

    #A tuple element may be selected more than twice if it is not possible to create a new pair without doing so.    
    while(np.any(counter==1)):
        tup = reserve[0]

        indx0 , indx1 = np.where(tup[0]==unq)[0][0], np.where(tup[1]==unq)[0][0]

       # Create a copy of counter array. 
       temp = counter.copy()

        if ( (temp[indx0]<2) or (temp[indx1]<2) ):
            out += [tup]
            counter[indx0]+=1
            counter[indx1]+=1 

        reserve.pop(0)    

    return out  

result = func(matchup)
print (result)

The output of result will vary in different runs because list of tuples (of names) is randomly shuffled in each run. One example of result is the following.

[('Cat', 'Dog'), ('Eagle', 'Boy'), ('Eagle', 'Dog'), ('Cat', 'Boy'), ('Apple', 'Apple')]