0
votes

Good Morning, I am trying to progam the next simple function in SAS using proc iml, but I obtain the next error " not enough memory to store all matrices". I am trying to read two matrices one call "matriz_product" and the other one "matriz_segment", these table have a dimension of 21x(more than)1.000.000 and the values are characters. After reading this matrices I want to create one vector from each of the tables where the column picked is the one that is specified in position (another vector that I read).

The code is the following:

proc iml;
use spain.Tabla_product;
read all var {a_def_prdt1 b_def_prdt2 c_def_prdt3 d_def_prdt4 e_def_prdt5 f_def_prdt6 g_def_prdt7 h_def_prdt8 i_def_prdt9 j_def_prdt10 k_def_prdt11 l_def_prdt12 m_def_prdt13 n_def_prdt14 o_def_prdt15 p_def_prdt16 q_def_prdt17 r_def_prdt18 s_def_prdt19 t_def_prdt20} into matrizProduct;
use spain.Tabla_segment;
read all var {a_def_sgmt1 b_def_sgmt2 c_def_sgmt3 d_def_sgmt4 e_def_sgmt5 f_def_sgmt6 g_def_sgmt7 h_def_sgmt8 i_def_sgmt9 j_def_sgmt10 k_def_sgmt11 l_def_sgmt12 m_def_sgmt13 n_def_sgmt14 o_def_sgmt15 p_def_sgmt16 q_def_sgmt17 r_def_sgmt18 s_def_sgmt19 t_def_sgmt20} into matrizsegment;
use spain.contratonodato;
read all var {posi} into position;
n=nrow(matrizsegment);
DEF_PRDT=j(n,1,"zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz");
DEF_SGMT=j(n,1,"zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz");
do i =1 to n;
DEF_PRDT[i,1]=matrizproduct[i,position[i]];
DEF_SGMT[i,1]=matrizsegment[i,position[i]];
end;
create contratosnodato_modi var {"DEF_SGMT" "DEF_PRDT"};
append;
run;

Thank you very much.

1

1 Answers

0
votes

Base SAS reads row-by-row and so rarely runs out of memory. Proc IML reads the entire dataset into memory and so can easily run out of memory for larger datasets. For this reason, I only use proc IML when absolutely necessary (e.g. doing matrix multiplication), and when I do I will:

  1. Chunk datasets into smaller pieces that will fit into memory, and do these sequentially.

  2. Optimise algorithms to be able to run within constraints - for example, exploit the structure of a matrix I need to invert to avoid inverting the whole matrix.

Fortunately in this case you don't even appear to need proc IML at all - what you're trying to do can be done in a data step. Try this:

data contratosnodato_modi;
  format DEF_PRDT $40. DEF_SGMT $40.;
  set spain.Tabla_product;
  set spain.Tabla_segment;
  set spain.contratonodato;
  array product {20} a_def_prdt1 b_def_prdt2 c_def_prdt3 d_def_prdt4 e_def_prdt5 f_def_prdt6 g_def_prdt7 h_def_prdt8 i_def_prdt9 j_def_prdt10 k_def_prdt11 l_def_prdt12 m_def_prdt13 n_def_prdt14 o_def_prdt15 p_def_prdt16 q_def_prdt17 r_def_prdt18 s_def_prdt19 t_def_prdt20;
  array segment {20} a_def_sgmt1 b_def_sgmt2 c_def_sgmt3 d_def_sgmt4 e_def_sgmt5 f_def_sgmt6 g_def_sgmt7 h_def_sgmt8 i_def_sgmt9 j_def_sgmt10 k_def_sgmt11 l_def_sgmt12 m_def_sgmt13 n_def_sgmt14 o_def_sgmt15 p_def_sgmt16 q_def_sgmt17 r_def_sgmt18 s_def_sgmt19 t_def_sgmt20;
  DEF_PRDT = product{posi};
  DEF_SGMT = segment{posi};
  keep DEF_PRDT DEF_SGMT;
run;

Here I'm reading all data in at once, storing the columns of interest as arrays and only accessing the columns specified in the position dataset.