3
votes

I tried to port this Python implementation of a continuous RBM to MATLAB: http://imonad.com/rbm/restricted-boltzmann-machine/

I generated 2-dimensional training data in the shape of a (noisy) circle and trained the RBM with 2 visible and 8 hidden units. To test the implementation, I fed uniformly distributed random data to the RBM and plotted the reconstructed data (same procedure as used in the link above).
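
Roughly, the data was generated along these lines (just a sketch; the centre, radius, and noise level here are placeholders, not necessarily the exact values I used):

nDat  = 500;                        % number of training points (placeholder)
theta = 2*pi*rand(nDat, 1);         % random angles around the circle
r     = 0.4 + 0.02*randn(nDat, 1);  % radius with Gaussian noise (placeholder values)
dat   = [0.5 + r.*cos(theta), 0.5 + r.*sin(theta)]; % circle centred in (0,1)x(0,1)
% shifting dat (e.g. dat - 0.5 or dat - 1) gives the other ranges mentioned below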

Now the confusing part: with training data in the range (0,1)x(0,1) I get very satisfying results, but with training data in the range (-0.5,0.5)x(-0.5,0.5) or (-1,0)x(-1,0) the RBM reconstructs only data in the very top-right of the circle. I don't understand what causes this. Is it just a bug in my implementation that I don't see?

Some plots: the blue dots are the training data, the red dots are the reconstructions.

Here is my implementation of the RBM.

Training:

maxepoch = 300;
ksteps = 10;
sigma = 0.2;        % cd standard deviation
learnW = 0.5;       % learning rate W
learnA  = 0.5;      % learning rate A
nVis = 2;           % number of visible units
nHid = 8;           % number of hidden units
nDat = size(dat, 1);% number of training data points
cost = 0.00001;     % cost
moment = 0.9;      % momentum
W = randn(nVis+1, nHid+1) / 10; % weights
dW = randn(nVis+1, nHid+1) / 1000; % change of weights
sVis = zeros(1, nVis+1);    % state of visible neurons
sVis(1, end) = 1.0;         % bias
sVis0 = zeros(1, nVis+1);   % initial state of visible neurons
sVis0(1, end) = 1.0;        % bias
sHid = zeros(1, nHid+1);    % state of hidden neurons
sHid(1, end) = 1.0;         % bias
aVis  = 0.1*ones(1, nVis+1);% A visible
aHid  = ones(1, nHid+1);    % A hidden
err = zeros(1, maxepoch);
e = zeros(1, maxepoch);
for epoch = 1:maxepoch
    wPos = zeros(nVis+1, nHid+1);
    wNeg = zeros(nVis+1, nHid+1);
    aPos = zeros(1, nHid+1);
    aNeg = zeros(1, nHid+1);
    for point = 1:nDat
        sVis(1:nVis) = dat(point, :);
        sVis0(1:nVis) = sVis(1:nVis); % initial sVis
        % positive phase
        activHid;
        wPos = wPos + sVis' * sHid;
        aPos = aPos + sHid .* sHid;
        % negative phase
        activVis;
        activHid;
        for k = 1:ksteps
            activVis;
            activHid;
        end
        tmp = sVis' * sHid;
        wNeg = wNeg + tmp;
        aNeg = aNeg + sHid .* sHid;
        delta = sVis0(1:nVis) - sVis(1:nVis);
        err(epoch) = err(epoch) + sum(delta .* delta);
        e(epoch) = e(epoch) - sum(sum(W' * tmp));
    end
    dW = dW*moment + learnW * ((wPos - wNeg) / numel(dat)) - cost * W;
    W = W + dW;
    aHid = aHid + learnA * (aPos - aNeg) / (numel(dat) * (aHid .* aHid));
    % error
    err(epoch) = err(epoch) / (nVis * numel(dat));
    e(epoch) = e(epoch) / numel(dat);
    disp(['epoch: ' num2str(epoch) ' err: ' num2str(err(epoch)) ...
    ' ksteps: ' num2str(ksteps)]);
end
save(['rbm_' filename '.mat'], 'W', 'err', 'aVis', 'aHid');

activHid.m:

sHid = (sVis * W) + randn(1, nHid+1);
sHid = sigFun(aHid .* sHid, datRange);
sHid(end) = 1.; % bias

activVis.m:

sVis = (W * sHid')' + randn(1, nVis+1);
sVis = sigFun(aVis .* sVis, datRange);
sVis(end) = 1.; % bias

sigFun.m:

function [sig] = sigFun(X, datRange)
    a = ones(size(X)) * datRange(1);
    b = ones(size(X)) * (datRange(2) - datRange(1));
    c = ones(size(X)) + exp(-X);
    sig = a + (b ./ c);
end
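
For illustration, sigFun squashes its input into the open interval (datRange(1), datRange(2)); for example, with datRange = [0 1] (the actual value of datRange is set elsewhere and not shown here):

datRange = [0 1];        % illustration only
x = [-10 -1 0 1 10];
y = sigFun(x, datRange); % every entry of y lies strictly between 0 and 1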

Reconstruction:

nSamples = 2000;
ksteps = 10;
nVis = 2;
nHid = 8;
sVis = zeros(1, nVis+1);    % state of visible neurons
sVis(1, end) = 1.0;         % bias
sHid = zeros(1, nHid+1);    % state of hidden neurons
sHid(1, end) = 1.0;         % bias
input = rand(nSamples, 2);
output = zeros(nSamples, 2);
for sample = 1:nSamples
    sVis(1:nVis) = input(sample, :);
    for k = 1:ksteps
        activHid;
        activVis;
    end
    output(sample, :) = sVis(1:nVis);
end
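
The plots mentioned above come from something like this (a sketch, assuming dat from the training script is still in the workspace):

figure; hold on;
plot(dat(:, 1), dat(:, 2), 'b.');       % blue: training data (noisy circle)
plot(output(:, 1), output(:, 2), 'r.'); % red: reconstructions
axis equal; hold off;
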
2
So I was trying to replicate your error. Can you please give me the value of the datRange variable? I get "Undefined function or variable 'datRange'. Error in activHid (line 2)". Also, don't we need a target vector to train the input vectors, or is it unsupervised learning? – Siddhartha Agarwal

2 Answers

2
votes

RBMs were originally designed to work only with binary data, but they also work with data between 0 and 1; that is part of the algorithm. Further reading
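
In practice that suggests rescaling the training data into [0, 1] before training and mapping the reconstructions back afterwards; a minimal sketch, reusing the dat and output variables from the question:

lo = min(dat);  hi = max(dat);                                   % per-dimension min/max
datScaled = bsxfun(@rdivide, bsxfun(@minus, dat, lo), hi - lo);  % now in [0, 1]
% ... train the RBM on datScaled instead of dat ...
outputOrig = bsxfun(@plus, bsxfun(@times, output, hi - lo), lo); % map reconstructions back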

2
votes

The input is in the range [0, 1] for both x and y; that is why the reconstructions stay in that area. Changing the input to input = (rand(nSamples, 2)*2) - 1; samples the input from the range [-1, 1], and therefore the red dots will be more spread out around the circle.
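
For context, that is a one-line change to the reconstruction script in the question (everything else stays as posted). Note that the visible units are still passed through sigFun, so the reconstructions remain inside the interval given by datRange:

input = (rand(nSamples, 2) * 2) - 1; % uniform samples in [-1, 1] instead of [0, 1]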