One option is to combine np.repeat and np.tile, then reshape the result into an (n, n, 2) array of pairs:
>>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> n = a.size
>>> np.vstack((np.repeat(a, n), np.tile(a, n))).T.reshape(n, n, 2)
array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])
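To see why this works, it can help to look at the intermediate arrays. The following is only a small sketch; the names `first`, `second`, and `pairs` are illustrative and not part of the original answer:

```python
import numpy as np

a = np.array([1, 2, 3])
n = a.size

first = np.repeat(a, n)   # each element repeated n times: [1 1 1 2 2 2 3 3 3]
second = np.tile(a, n)    # whole array repeated n times:  [1 2 3 1 2 3 1 2 3]

# Stack into shape (2, n*n), transpose to (n*n, 2) so each row is one pair,
# then reshape so that pairs[i, j] == [a[i], a[j]].
pairs = np.vstack((first, second)).T.reshape(n, n, 2)
```

Both `first` and `second` are freshly allocated arrays of length n*n, and `np.vstack` copies them again, which is likely part of why this version is the slower one.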
Or, as @Jaime suggested, you can get close to a 10x speedup (about 8x in the timings below) by taking advantage of broadcasting:
>>> a = np.array([1, 2, 3])
>>> n = a.size
>>> perm = np.empty((n, n, 2), dtype=a.dtype)
>>> perm[..., 0] = a[:, None]  # a[i] broadcast along each row
>>> perm[..., 1] = a           # a[j] broadcast along each column
>>> perm
array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])
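If you need this in more than one place, the broadcasting version can be wrapped in a small function. This is only a sketch: `all_pairs` is a hypothetical helper name (not a NumPy function), and it assumes the input is one-dimensional.

```python
import numpy as np

def all_pairs(a):
    """Return an (n, n, 2) array with out[i, j] == [a[i], a[j]].

    `all_pairs` is a hypothetical helper name, not a NumPy function.
    Assumes `a` is one-dimensional.
    """
    a = np.asarray(a)
    n = a.size
    out = np.empty((n, n, 2), dtype=a.dtype)
    out[..., 0] = a[:, None]  # shape (n, 1): a[i] is broadcast along each row
    out[..., 1] = a           # shape (n,):   a[j] is broadcast along each column
    return out

# Sanity check against the repeat/tile version.
a = np.array([1, 2, 3])
n = a.size
expected = np.vstack((np.repeat(a, n), np.tile(a, n))).T.reshape(n, n, 2)
same = np.array_equal(all_pairs(a), expected)
```

The only allocation here is the output array itself; the two assignments write into it directly, with no intermediate n*n temporaries.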
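Timing comparison in IPython on a 300-element array (note that n has to be recomputed for the new a):
>>> a = np.array([1, 2, 3]*100)
>>> n = a.size
>>> %%timeit
... np.vstack((np.repeat(a, n), np.tile(a, n))).T.reshape(n, n, 2)
...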
1000 loops, best of 3: 934 µs per loop
>>> %%timeit
... perm = np.empty((n, n, 2), dtype=a.dtype)
... perm[..., 0] = a[:, None]
... perm[..., 1] = a
...
10000 loops, best of 3: 111 µs per loop
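Outside IPython, a similar comparison can be run with the standard `timeit` module. This is a rough sketch: the function names `with_repeat_tile` and `with_broadcasting` are made up for illustration, and the absolute numbers will vary with your machine and NumPy version.

```python
import timeit
import numpy as np

a = np.array([1, 2, 3] * 100)
n = a.size

def with_repeat_tile():
    return np.vstack((np.repeat(a, n), np.tile(a, n))).T.reshape(n, n, 2)

def with_broadcasting():
    perm = np.empty((n, n, 2), dtype=a.dtype)
    perm[..., 0] = a[:, None]
    perm[..., 1] = a
    return perm

# Best of 3 repeats, 100 calls each, converted to seconds per call.
t_repeat_tile = min(timeit.repeat(with_repeat_tile, number=100, repeat=3)) / 100
t_broadcasting = min(timeit.repeat(with_broadcasting, number=100, repeat=3)) / 100

print(f"repeat/tile:  {t_repeat_tile * 1e6:.0f} µs per call")
print(f"broadcasting: {t_broadcasting * 1e6:.0f} µs per call")
```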