0
votes

I want to split a 2D array this way:

Example:

From this 4x4 2D array:

np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])

Create these five 2x2 2D arrays, with unitary displacement (shift):

np.array([[1,2],[3,4]])
np.array([[4,5],[6,7]])
np.array([[7,8],[9,10]])
np.array([[10,11],[12,13]])
np.array([[13,14],[15,16]])

In a general case, from an NXN 2D array (square arrays) create Y 2D arrays of MXM shape, as many as possible.

Just to be more precise: to create the output array, not necessarily it will be made of all values from the row.

Example:

From a 2D 8x8 array, with values from 1 to 64, if I want to split this array in 2D 2x2 arrays, the first row from 8x8 array is a row from 1 to 8, and the first output 2D 2x2 array will be np.array([[1,2],[3,4]]), and the second output 2D 2x2 array will be np.array([[4,5],[6,7]])... It continues until the last output 2D array, that will be np.array([[61,62],[63,64]]). Look that each 2D 2x2 array was not filled with all the values from the row (CORRECT). And that exists a unitary displacement (shift) from previous array to next array.

There is a Numpy method that do this?

MSeifert answered here (How to split an 2D array, creating arrays from "row to row" values) a question that solves almost 95% of this question, except the unitary displacement (shift) part.

So, from the 4x4 2D array example:

np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])

Instead of create these FOUR 2x2 2D arrays (without unitary shift/displacement):

np.array([[1,2],[3,4]])
np.array([[5,6],[7,8]])
np.array([[9,10],[11,12]])
np.array([[13,14],[15,16]])

Create these FIVE 2x2 2D arrays (with unitary shift/displacement):

np.array([[1,2],[3,4]])
np.array([[4,5],[6,7]])
np.array([[7,8],[9,10]])
np.array([[10,11],[12,13]])
np.array([[13,14],[15,16]])

And, of course, it should work for the general case of, given a square NXN 2D array, to create Y MXM square 2D arrays.

Example: from a 60x60 square 2d array, create Y MXM square 2D arrays (10x10, for example).

Plus: I need to know what is the rule that relates the number of points of the original square 2D array (4x4 2D array, in the example), with points of mini square 2D arrays (2X2 2D arrays, in the example). In the case, given 16 points (4x4 2D array), it is possible create 5 2x2 2D arrays (each one with 4 points).

1
This problem is ill-posed as stated. You cannot generalize to MxM for any shape of input array. How would you create 3x3 arrays out of the original example array? [[1 2 3] [4 5 6] [7 8 9]], [[9 10 11] [12 13 14] [15 16 ??]] The example only works because N=M^2. Your 60x60 array with 10x10 blocks for e.g. would not work.alkasm
@AlexanderReynolds Sure i can't generalize, i am just inquiring, given an original 2D array, with which scales (mini arrays) I can work. In my example, i can say that given a 4x4 2D array, I can generate 2x2 2D arrays.Marco
@AlexanderReynolds So... 2 questions: 1) How can I know (mathematical rule), given a 2D array, for which scales I can generate mini 2D arrays (and, sure, which algorithm generates the mini arrays with the quoted displacement/shift)? 2) Regarding the above question, considering that there is a mathematical rule, I can also know, given a possible scale (generation of mini arrays), how many mini arrays are possible to be generated for this scale?Marco

1 Answers

3
votes

The condition for the subarrays to exactly fit is (M+1)*(M-1) divides (N+1)*(N-1), the number of subarrays you can put is the quotient of these numbers. Note that these numbers are equal to M*M-1 and N*N-1. In this form the rule also applies to non square matrices.

Examples

M      N      M*M-1  N*N-1  Y
-----------------------------
3      5      8      24     3
3      7      8      48     6
5      7      24     48     2
4      11     15     120    8

Implementation: Please note that this returns overlapping views into the original array. If you want to modify them you may want to make a copy. Also note that this implementation fits as many subsquares as fit, any leftover elements in the larger matrix are dropped.

Update: I've added two functions which calculate given N all possible M and vice versa.

Output:

# Testing predictions ...
# ok

# M = 105
# solutions: [105, 1273, 1377, 4135, 4239, 5407, 5511, 5513]
# this pattern repeats at offsets 5512, 11024, 16536, ...

# N = 1000001
# solutions: [2, 3, 4, 5, 7, 9, 11, 31, 49, 1000001]

# example N, M = (5, 3)
# [[[ 0  1  2]
#   [ 3  4  5]
#   [ 6  7  8]]

#  [[ 8  9 10]
#   [11 12 13]
#   [14 15 16]]

#  [[16 17 18]
#   [19 20 21]
#   [22 23 24]]]

Code:

import numpy as np
import sympy
import itertools as it
import functools as ft
import operator as op

def get_subsquares(SqNN, M0, M1=None):
    M1 = M0 if M1 is None else M1
    N0, N1 = SqNN.shape
    K = (N0*N1-1) // (M0*M1-1)
    SqNN = SqNN.ravel()
    s, = SqNN.strides
    return np.lib.stride_tricks.as_strided(SqNN, (K, M0, M1),
                                           (s*(M0*M1-1), s*M1, s))


def get_M_for_N(N):
    """Given N return all possible M
    """
    assert N >= 2
    f = 1 + (N & 1)
    factors = sympy.factorint((N+1)//f)
    factors.update(sympy.factorint((N-1)//f))
    if f == 2:
        factors[2] += 2
    factors = [ft.reduce(op.mul, fs) for fs in it.product(
        *([a**k for k in range(n+1)] for a, n in factors.items()))]
    return [fs + 1 for fs in sorted(set(factors) & set(fs - 2 for fs in factors)) if (N*N-1) % (fs * (fs+2)) == 0]

def get_N_for_M(M):
    """Given M return all possible N in the form period, smallest

    smallest is a list of all solutions between M and M*M if M is even
    and between M and (M*M+1) / 2 if M is odd, all other solutions can be
    obtained by adding multiples of period
    """
    assert M >= 2
    f = 1 + (M & 1)
    factors = sympy.factorint((M+1)//f)
    factors.update(sympy.factorint((M-1)//f))
    factors = [k**v for k, v in factors.items()]
    rep = (M+1)*(M-1) // f
    f0 = [ft.reduce(op.mul, fs) for fs in it.product(*zip(it.repeat(1), factors))]
    f1 = [rep // (f*a) for a in f0]
    inv = [f if b==1 else f*b + 2 if a==1 else 2 * sympy.mod_inverse(a, b)
           for a, b in zip(f1, f0)]
    if f==1:
        inv[1:-1] = [a%b for a, b in zip(inv[1:-1], f0[1:-1])]
    return rep, sorted(a*b - 1 for a, b in zip(f1, inv))

def test_predict(N):
    def brute_force(a, b):
        return [i for i in range(a, b) if (i*i-1) % (a*a-1) == 0]
    for x in range(2, N+1):
        period, pred = get_N_for_M(x)
        assert brute_force(x, period*4+2) \
            == [a + b for b in range(0, 4*period, period) for a in pred]
    def brute_force(b):
        return [i for i in range(2, b+1) if (b*b-1) % (i*i-1) == 0]
    for x in range(2, N+1):
        assert brute_force(x) == get_M_for_N(x)
    print('ok')

# test
print("Testing predictions ...")
test_predict(200)
print()
# examples
M = 105
period, pred = get_N_for_M(M)
print(f"M = {M}")
print(f"solutions: {pred}")
print(f"this pattern repeats at offsets {period}, {2*period}, {3*period}, ...")
print()
N = 1000001
pred = get_M_for_N(N)
print(f"N = {N}")
print(f"solutions: {pred}")
print()
N, M = 5, 3
SqNN = np.arange(N*N).reshape(N, N)
print(f"example N, M = ({N}, {M})")
print(get_subsquares(SqNN, M))