Converting MNIST array data to one-hot encoded arrays for each pixel value

Question

I'm trying to turn a matrix of normalized pixel data from MNIST into a matrix of one-hot arrays, where each pixel's target depends on the example's class and on whether the pixel value is greater than 0.

Here's an example of the input, more specifically a single row from a single example in the dataset:

[[0.        ]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.01176471]
 [0.07058824]
 [0.07058824]
 [0.07058824]
 [0.49411765]
 [0.53333336]
 [0.6862745 ]
 [0.10196079]
 [0.6509804 ]
 [1.        ]
 [0.96862745]
 [0.49803922]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.        ]]

This is a 28-element array of pixel brightness values. Now, let's say this input's true class is the digit "2". The script would go through and replace each pixel value with a one-hot array.

If the pixel is == 0, then it would be converted from

[0.        ]

to

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

since I'm using the entry at index 11 to represent a blank pixel.

If the pixel's value is > 0, it would instead insert a one-hot array for the example's true label, a "2":

[0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]

This is the code I have so far

#copying the shape of the train data matrix of arrays into a new
#array
new_y_train = np.copy(x_train)

for ex_index, example in enumerate(x_train):
  for row_index, row in enumerate(example):
    for px_index, pixel in enumerate(row):
      temp_arr = np.zeros(11)
      if pixel > 0:
        pixel = np.insert(temp_arr, np.argmax(y_train[ex_index]), 1)
        new_y_train[ex_index][row_index][px_index] = pixel
      else:
        pixel = np.insert(temp_arr, 11, 1)
        new_y_train[ex_index][row_index][px_index] = pixel

However, when I get to this line

new_y_train[ex_index][row_index][px_index] = pixel

I get a broadcasting error when assigning the array to the pixel's slot: "could not broadcast input array from shape (12) into shape (1)"

Not sure how to resize or modify the array to allow for the input of this new one-hot array into the matrix.

Any help would be great!

Dev Khadka · Accepted Answer · 2019-09-27T15:05:33

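You can do this with NumPy fancy (integer-array) indexing: compute one column index per pixel and set those positions to 1.

import numpy as np

arr = np.array([[0.        ],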
 [0.        ],
 [0.        ],
 [0.        ],
 [0.        ],
 [0.        ],
 [0.        ],
 [0.        ],
 [0.        ],
 [0.        ],
 [0.        ],
 [0.        ],
 [0.01176471],
 [0.07058824],
 [0.07058824],
 [0.07058824],
 [0.49411765],
 [0.53333336],
 [0.6862745 ],
 [0.10196079],
 [0.6509804 ],
 [1.        ],
 [0.96862745],
 [0.49803922],
 [0.        ],
 [0.        ],
 [0.        ],
 [0.        ]])

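# placeholder labels (0-9), one per pixel row; use the real labels in practice
labels = np.random.choice(10, len(arr)).reshape(-1,1)

# row index of every pixel, and the column to set in that row:
# the label for non-zero pixels, 10 (the "blank" class) otherwise
ind_row = np.arange(len(arr))
ind_col = np.where(arr>0, labels, 10).ravel()

one_hot_coded_arr = np.zeros((len(arr), 11))
one_hot_coded_arr[ind_row,ind_col]=1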

one_hot_coded_arr
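
As for the error in the question: np.copy(x_train) presumably gives new_y_train the same shape as x_train, (N, 28, 28, 1), so each pixel slot holds a single value and cannot take a longer one-hot vector. Below is a minimal sketch of the loop version with the target array allocated separately. It assumes x_train has shape (N, 28, 28, 1) and y_train is one-hot with 10 columns, and it uses the 11-class layout from this answer (index 10 for a blank pixel). The small random arrays are stand-ins, not real MNIST data.

```python
import numpy as np

# Stand-in data (assumption): 2 images of shape 28x28x1 and one-hot labels.
rng = np.random.default_rng(0)
x_train = rng.random((2, 28, 28, 1)).astype(np.float32)
x_train[x_train < 0.5] = 0.0          # force some blank pixels
y_train = np.eye(10)[[2, 7]]          # true classes 2 and 7

NUM_CLASSES = 11                      # digits 0-9 plus index 10 for "blank"

# Allocate the target with a trailing axis of length NUM_CLASSES instead of
# copying x_train, whose trailing axis has length 1.
new_y_train = np.zeros(x_train.shape[:-1] + (NUM_CLASSES,))

for ex_index, example in enumerate(x_train):
    label = np.argmax(y_train[ex_index])
    for row_index, row in enumerate(example):
        for px_index, pixel in enumerate(row):
            # pixel has shape (1,), so read its single value explicitly
            cls = label if pixel[0] > 0 else NUM_CLASSES - 1
            new_y_train[ex_index, row_index, px_index, cls] = 1
```

This keeps the structure of the original loops, so it will be slow on the full dataset; the fancy-indexing approach avoids the Python-level loops.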

Edit

If you have data for 10 images, the conversion to your desired shape can be done like this:

data = np.random.choice(256, (10,28,28))   # random pixel values 0-255
labels = np.random.choice(10, (10,1))      # random digit labels 0-9

## do your calculation of brightness here
## and expand it to one row per pixel
arr = data.reshape(-1,1)/255
## repeat each label to match its image's expanded pixels
labels = labels.repeat(28*28).reshape(-1,1)

ind_row = np.arange(len(arr))
ind_col = np.where(arr>0, labels, 10).ravel()

one_hot_coded_arr = np.zeros((len(arr), 11))
one_hot_coded_arr[ind_row,ind_col]=1

## convert back to desired shape
one_hot_coded_arr.reshape(-1, 28,28,11)
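
The same idea can also be applied without flattening first. This is only a sketch under assumptions: pixelwise_one_hot is a hypothetical helper name, x is assumed to have shape (N, H, W, 1), and the labels are assumed to be one-hot with 10 columns (as the np.argmax call in the question suggests). It looks up rows of an identity matrix instead of assigning into a zero array.

```python
import numpy as np

def pixelwise_one_hot(x, y_onehot, num_digits=10):
    """Hypothetical helper: x is (N, H, W, 1), y_onehot is (N, num_digits)."""
    labels = np.argmax(y_onehot, axis=1)                  # (N,)
    # Broadcast each image's label over its pixels; blank pixels get
    # the extra class index num_digits.
    cls = np.where(x[..., 0] > 0, labels[:, None, None], num_digits)
    # Row i of the identity matrix is the one-hot vector for class i,
    # so indexing with cls yields shape (N, H, W, num_digits + 1).
    return np.eye(num_digits + 1)[cls]

# Tiny usage example with made-up data: two mostly blank images.
x = np.zeros((2, 28, 28, 1))
x[0, 5, 5, 0] = 0.5
x[1, 3, 4, 0] = 1.0
y = np.eye(10)[[2, 7]]        # true classes 2 and 7
out = pixelwise_one_hot(x, y)
print(out.shape)
```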
