5
votes

I'm quite new to using LSTMs in PyTorch. I'm trying to create a model that takes a sequence of 62 tensors, each of size 42 (so 62 tensors of size 42 each, i.e. the shape is [62, 42]). Call this the input tensor.

From it I want to predict a sequence of 8 tensors of size 1 (so 8 tensors in a sequence, each of size 1). Call this the label tensor.

The connection between those tensors is this: the input tensor is made of columns A1 A2 A3 ... A42, while the label tensor is more like just A3.

What I'm trying to show is that, if needed, the label tensor can be padded with zeros in all places other than the value of A3, so that it reaches a length of 42.

How can I do this? From what I read in the PyTorch documentation I can only predict at the same ratio (1 point predicts 1 point), while I want to predict a sequence of 8 tensors of size 1 from a sequence of 62 tensors of size 42. Is it doable? Do I need to pad the predicted tensor from size 1 to size 42? Thanks!

A good solution would be to use seq2seq, for example.

1
Can you clarify your question? Are you saying that each input consists of 42 sequences of length 62? And then each output would be 42 sequences of length 8? – Evan Weissburg
@EvanWeissburg I'm saying that each sequence has a length of 42, and I have 62 such sequences each time. Using all of this I want to predict a tensor of size 8. – user2323232
So you're aiming to predict 62 sequences of length 42 given a single tensor of length 8? – Evan Weissburg
@EvanWeissburg The opposite: the label is the single tensor of size 8. – user2323232
How exactly do you expect to map between these 62 sequences of length 42 and the single vector of length 8? – Evan Weissburg

1 Answer

3
votes

If I understand your question correctly, given a sequence of length 62 you want to predict a sequence of length 8, where the order of your outputs matters (this is the case if you are doing some time series forecasting). In that case a seq2seq model is a good choice; here is a tutorial for it: link. Globally, you need to implement an encoder and a decoder. Here is an example of such an implementation:

import torch
import torch.nn as nn

# assuming a device has been defined, e.g.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class EncoderRNN(nn.Module):
    def __init__(self, input_dim=42, hidden_dim=100):
        super(EncoderRNN, self).__init__()
        self.hidden_dim = hidden_dim

        self.lstm = nn.LSTM(input_dim, hidden_dim)

    def forward(self, input, hidden):
        output, hidden = self.lstm(input, hidden)
        return output, hidden

    def initHidden(self):
        return torch.zeros(1, 1, self.hidden_dim, device=device)


class DecoderRNN(nn.Module):
    def __init__(self, hidden_dim, output_dim):
        super(DecoderRNN, self).__init__()
        self.hidden_dim = hidden_dim

        self.lstm = nn.LSTM(hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, output_dim)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        output, hidden = self.lstm(input, hidden)
        output = self.softmax(self.out(output[0]))
        return output, hidden

    def initHidden(self):
        return torch.zeros(1, 1, self.hidden_dim, device=device)
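
Below is a minimal sketch of how these two modules could be wired together for a single example. The shapes, the target length of 8, and the choice of feeding the encoder's last output to the decoder at each step are illustrative assumptions, not a full training loop (teacher forcing, loss, batching, etc. are left out). Also note the LogSoftmax in the decoder comes from classification-style tutorials; for real-valued targets you would typically drop it and train with something like an MSE loss.

# assumes the EncoderRNN / DecoderRNN classes and `device` defined above
hidden_dim = 100
encoder = EncoderRNN(input_dim=42, hidden_dim=hidden_dim).to(device)
decoder = DecoderRNN(hidden_dim=hidden_dim, output_dim=1).to(device)

# one input sequence: 62 time steps, batch size 1, 42 features -> (seq_len, batch, features)
src = torch.randn(62, 1, 42, device=device)

# for an LSTM the hidden state is a (h, c) tuple
hidden = (torch.zeros(1, 1, hidden_dim, device=device),
          torch.zeros(1, 1, hidden_dim, device=device))

encoder_outputs, hidden = encoder(src, hidden)

# decode 8 steps; here the decoder input is simply the encoder's last output
decoder_input = encoder_outputs[-1:]            # shape (1, 1, hidden_dim)
predictions = []
for _ in range(8):
    out, hidden = decoder(decoder_input, hidden)  # out: (1, output_dim)
    predictions.append(out)

predicted_sequence = torch.cat(predictions, dim=0)  # shape (8, 1)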

If the order of your 8 outputs has no importance, then you can simply add a Linear layer with 8 units after the LSTM layer, and you can use this code directly in that case:

class Net(nn.Module):
    def __init__(self, hidden_dim=100, input_dim=42, output_size=8):
        super(Net, self).__init__()
        self.hidden_dim = hidden_dim

        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)

        # The linear layer that maps from hidden state space to output space
        self.fc = nn.Linear(hidden_dim, output_size)

    def forward(self, seq):
        lstm_out, _ = self.lstm(seq)
        # use the LSTM output at the last time step to predict the 8 values
        output = self.fc(lstm_out[:, -1, :])
        return output
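
As a quick shape check, here is how the Net above would be called; the sizes are assumptions matching the question (62 time steps of 42 features, 8 predicted values per sequence) and the batch size of 4 is arbitrary:

import torch

model = Net(hidden_dim=100, input_dim=42, output_size=8)
batch = torch.randn(4, 62, 42)   # (batch, seq_len, features) since batch_first=True
out = model(batch)
print(out.shape)                 # torch.Size([4, 8])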