2
votes

Define three row vectors, A = (1,2,3) B = (10,20,30,40) C = (100,200,300,400,500)

I want to construct a new matrix D which will be a have 3x4x5 = 60 elements and contains the averages of these elements as illustrated below:

D = 
(1+10+100)/3, (1+10+200)/3,…, (1+10+ 500)/3 \
(1+20+100)/3, (1+20+200)/3,…, (1+20+ 500)/3 \
(1+30+100)/3, (1+30+200)/3,…, (1+30+ 500)/3 \
(1+40+100)/3, (2+40+200)/3,…, (2+40+ 500)/3 \

(2+10+100)/3, (2+10+200)/3,…, (2+10+ 500)/3 \
(2+20+100)/3, (2+20+200)/3,…, (2+20+ 500)/3 \
(2+30+100)/3, (2+30+200)/3,…, (2+30+ 500)/3 \
(2+40+100)/3, (2+40+200)/3,…, (2+40+ 500)/3 \

(3+10+100)/3, (3+10+200)/3,…, (3+10+ 500)/3 \
(3+20+100)/3, (3+20+200)/3,…, (3+20+ 500)/3 \
(3+30+100)/3, (3+30+200)/3,…, (3+30+ 500)/3 \
(3+40+100)/3, (3+40+200)/3,…, (3+40+ 500)/3 \

The way it is set up in this example it will be a 12x5 matrix, but I am fine if it is a 1X60 vector or 60X1 vector.

How to do this efficiently in Mata? I am new to Mata and I had this running in Stata using multiple forval loops (in this case, there would be 3 forval loops). But this becomes very time consuming as I have up to 8 row vectors and about 120 elements in each of them.

I figured that I can use for loops in Mata and it will be much faster, but I believe if I can do this as a matrix manipulation instead of using for loops then it will be even faster. The problem is I am having a hard time visualizing how to write such a program (or if it's even possible) and any help would be highly appreciated.

2
8 vectors with 120 elements each will result in 120^8 = 4.300e+16 elements. Mata can hold a maximum of 2147483647*2147483647 = 4.612e+18 elements, so you're within the theoretical limit. But memory requirements to hold your matrix (assuming the final result is a row vector) is approx. 1*(120^8)*8 = 3.440e+17 bytes, which equal 320374965 Gigabytes. (See help [M-1] limits.) Am I missing something or am I doing the math incorrectly? This looks really difficult.Roberto Ferrer
Hi Roberto, your calculation seems right. So, maybe getting the D matrix using matrix manipulation is not really feasible for me. In reality, I need to compare [120^8] elements with a number (say, 100) and count how many of the elements are bigger than that number. So, when using the forval loop, I didn't have to worry about storing all the elements. I am inclined to use Mata to re-write the loops now. However, I would still appreciate if you could answer my questions for smaller vectors, so that I can learn the technique. Many thanks.ec27

2 Answers

1
votes

The clever solution by @AspenChen offers huge speed gains over for loops, as shown with some testing:

clear all
set more off

mata

timer_clear() 

//----- change data -----

fa = 250
fb = fa + 1
fc = fa + 2


//----- Method 1 -----

timer_on(1)

A = (1..fa)                 // 1 x fa
B = (1..fb)*10              // 1 x fb
C = (1..fc)*100             // 1 x fc

F = J(1, cols(A) * cols(B) * cols(C), .)
col = 0
for (i=1; i<=cols(A); i++) {
    for (j=1; j<=cols(B); j++) {
        for (k=1; k<=cols(C); k++) {

            col++
            F[1,col] = A[1,i] + B[1,j] + C[1,k]

        }   
    }
}

timer_off(1)


//----- Method 2 (Aspen Chen) -----

timer_on(2)

A = (1::fa)                 // fa x 1
B = (1::fb)*10              // fb x 1
C = (1::fc)*100             // fc x 1

// tensor sum for A and B
a = J(1,rows(B),1)          // 1 x fb with values of 1
b = J(1,rows(A),1)          // 1 x fa with values of 1
T = (A#a) + (b#B)'          // fa x fb

T = vec(T)                  // fa*fb x 1

// tensor sum for T and C
c = J(1,rows(T),1)          // 1 x fa*fb with values of 1
t = J(1,rows(C),1)          // 1 x fc with values of 1
T = (C#c) + (t#T)'          // fc x fa*fb

timer_off(2)

timer()

end

Resulting in:

timer report
  1.       8.78 /        1 =     8.776
  2.       .803 /        1 =      .803

If the original poster still wants to use for loops due the large number of elements that will be compared, (s)he can use something along the lines of:

<snip>

larger = 0
for (i=1; i<=cols(A); i++) {
    for (j=1; j<=cols(B); j++) {    
        for (k=1; k<=cols(C); k++) {

            larger = larger + (A[1,i] + B[1,j] + C[1,k] > 7)

        }   
    }
}

larger

<snip>

Edit

Further tests with for loops only:

clear all
set more off

mata

timer_clear() 

//----- change data -----

fa = 500
fb = fa + 1
fc = fa + 2


//----- Method 1 -----

timer_on(1)

A = (1..fa)                 // 1 x fa
B = (1..fb)*10              // 1 x fb
C = (1..fc)*100             // 1 x fc

larger = 0
for (i=1; i<=cols(A); i++) {
    for (j=1; j<=cols(B); j++) {
        for (k=1; k<=cols(C); k++) {

            larger = larger + (A[1,i] + B[1,j] + C[1,k] > 7)

        }   
    }
}

larger

timer_off(1)


//----- Method 2 (ec27) -----

timer_on(2)

A = (1..fa)                 // 1 x fa
B = (1..fb)*10              // 1 x fb
C = (1..fc)*100             // 1 x fc

larger = 0
for (i=1; i<=cols(A); i++) {
    for (j=1; j<=cols(B); j++) {
        for (k=1; k<=cols(C); k++) {

            placebo = A[1,i] + B[1,j] + C[1,k]
            if (placebo > 7) larger = larger + 1

        }   
    }
}

larger 

timer_off(2)

timer()

end
1
votes

For the smaller example, you are basically asking for the summation version of the Kronecker Product. I found this Matlab discussion thread on the subject, in which it is referred to as the Tensor Sum (did not see this phrase used very often).

Here's a quick attempt to replicate the operation in Mata. Not a very neat code, so feel free to edit or correct.

clear
mata
// input data
A=(1,2,3)'                  // 3x1
B=(10,20,30,40)'            // 4x1
C=(100,200,300,400,500)'    // 5x1

// tensor sum for A and B
a=J(1,4,1)  // 1x4 with values of 1
b=J(1,3,1)  // 1x3 with values of 1
T=(A#a)+(b#B)'
T           // 3x4
T=vec(T)    // 12x1

// tensor sum for T and C
c=J(1,12,1) // 1x12 with values of 1
t=J(1,5,1)  // 1x5 with values of 1
T=(C#c)+(t#T)'

// divide by 3
T=T/3
T'  // transposed just for better display
end