I need to generate a matrix with n
columns and m
rows where each cell can be either 0
or 1
, such that sum of numbers in each column equals c
and sum of numbers in each row equals r
. If it's not possible (most of the time probably), it should have columns sum to c+/-d
and rows to r+/-d
so that max(d over all rows and columns) is as small as possible. In other words, every sum of a row should be close to r
and every sum of a column should be close to c
.
To illustrate what I'm looking for when no perfect solution exists, this solution:
1.1.1.......
1.1.1.......
...1..1...1.
...1.1..1...
....1...1..1
is better than this solution:
1.1.1.......
1.1.1.......
...1..1...1.
...1.1..1...
....11..1...
Because the last row has a sum 0 (compared to 1) with the previous solution which is further away from the desired row sum of 2.
Creating a matrix where it holds just for rows is easy - take permutations of a vector with exactly r
1s
.
How to satisfy the second condition though? Find a column with sum that's too high and another that's too low and swap some numbers? Is that going to help, how long will it take, is it even going to terminate, if it's impossible to get a perfect result, when do I stop? Is there a better way?
You can use pseudocode or your language of choice or just random blurbs.
I've found Finding if binary matrix exists given the row and column sums which talks about answering whether it's possible or not and even constructing a solution if it is possible. If it isn't possible such solution will have really large ds though.
There is also this paper https://www.sciencedirect.com/science/article/pii/S0012365X06003980 which I don't really understand but it seems to also be concerned with constructing the perfect solutions