0
votes

I am trying to count the number of inversions that happen in the following implementation of a mergesort algorithm in Python, which uses just one simple function definition (i.e., one def statement for mergesort). I have commented this mergesort algorithm for clarity:

import csv

def safeint(val):
   try:
      return int(val)
   except ValueError:
      return val

list = []
with open('file.txt') as f:
    lines = csv.reader(f, delimiter='\n')
    for line in lines:
        line = map(safeint, line)
        list.append(line)

def mergesort(list):
    mid = len(list)//2  #MIDPOINT FOR DIVISION
    lft, rgt = list[:mid], list[mid:]
    if len(lft) > 1: lft = mergesort(lft)   #SORT BY HALVES
    if len(rgt) > 1: rgt = mergesort(rgt)
    res = []
    while lft and rgt:  #NEITHER HALF IS EMPTY
        if lft[-1] >=rgt[-1]:   #lft HAS GREATEST LAST VALUE
            res.append(lft.pop())   #APPEND IT
        else:   #rgt HAS GREATEST LAST VALUE
            res.append(rgt.pop())   #APPEND IT
    res.reverse()   #RESULT IS BACKWARD
    return (lft or rgt) + res   #ALSO ADD THE REMAINDER

print mergesort(list)

My input, file.txt, is of the form:

1
2
3
4
5
55
60
82
19

My output is (as expected):

[[1], [2], [3], [4], [5], [19], [55], [60], [82]]

Is it possible to incorporate an "inversions calculator" into this code without adding extra def statements? There's already plenty multi-function examples online, for example: Counting Inversions Using Merge Sort, https://codereview.stackexchange.com/questions/12922/inversion-count-using-merge-sort

Can we be more succinct than this?

1

1 Answers

0
votes

Python-ish pseudocode:

while lft and rgt:  #NEITHER HALF IS EMPTY
    if lft[-1] >=rgt[-1]:   #lft HAS GREATEST LAST VALUE
        # if the last of lft is greater than the last of rgt (which is sorted),
        # then it is also greater than everything before the last element of rgt, 
        # so it generates as many inversions as the remaining elements in rgt
        inversions += len(rgt)
        res.append(lft.pop())   #APPEND IT
    else:   #rgt HAS GREATEST LAST VALUE
        res.append(rgt.pop())   #APPEND IT

Where inversions is either a global, a parameter (for example make it a 1 element list so it's mutable) or something that you return (make sure to return the sum of inversions in both halves in that case).