I want to perform text classification using XLNet.
I initialized the XLNet model and added two layers on top of it (one fully connected layer and one softmax layer):
from transformers import XLNetTokenizer, XLNetModel
from tensorflow.keras import Model
from tensorflow.keras import layers
import tensorflow as tf

def create_model():
    tokenizer = XLNetTokenizer.from_pretrained('xlnet-large-cased')
    xlnet_model = XLNetModel.from_pretrained('xlnet-large-cased')
    text = ["Hello, my dog is cute", "I'm a very happy", "I'm sad"]
    inputs = tokenizer(text, return_tensors="pt", padding=True)
    outputs = xlnet_model(**inputs)
    flatten = layers.Flatten()(outputs.last_hidden_state.detach().numpy())
    fc1 = layers.Dense(units=256, activation="relu")(flatten)
    softmax = layers.Dense(units=3, activation="softmax")(fc1)
    txt_model = Model(inputs=tf.keras.Input(outputs.last_hidden_state.detach().numpy()), outputs=softmax)
    return xlnet_model, txt_model

def main():
    xlnet_model, txt_model = create_model()
I intend to train only the fc layer, which is why I'm initializing a separate Keras model, and this is where the problem occurs. The input of that model should be the output of XLNet's last layer, and its output should be the output of the softmax layer.
I have trouble initializing the model's input with the output of XLNet's last layer.
My predictions for the problems (in the code above):

- Problem with the input to txt_model: I'm directly passing in the values (instead of passing something like model.layer.output into tf.keras.Model), but XLNetModel in transformers is a PyTorch model and returns a PyTorch tensor (see the sketch after this list).
- Problem with the output of txt_model: I believe that because I call detach on outputs.last_hidden_state, an error might arise from the missing metadata of the previous layers. (I used detach since outputs.last_hidden_state is a PyTorch tensor.)
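To make the type and shape mismatch concrete, here is a minimal sketch of what I mean when inspecting the XLNet output (assuming xlnet-large-cased, whose hidden size is 1024; the exact sequence length depends on the tokenized batch):

import tensorflow as tf
from transformers import XLNetTokenizer, XLNetModel

tokenizer = XLNetTokenizer.from_pretrained('xlnet-large-cased')
xlnet_model = XLNetModel.from_pretrained('xlnet-large-cased')

inputs = tokenizer(["Hello, my dog is cute"], return_tensors="pt", padding=True)
outputs = xlnet_model(**inputs)

# The output is a PyTorch tensor of shape (batch, seq_len, hidden),
# with hidden = 1024 for xlnet-large-cased.
print(type(outputs.last_hidden_state))
print(outputs.last_hidden_state.shape)

# tf.keras.Input, by contrast, expects a shape for a symbolic placeholder,
# not actual tensor values:
placeholder = tf.keras.Input(shape=tuple(outputs.last_hidden_state.shape[1:]))
print(placeholder)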
So, presuming my predictions about the problems are right, I need the right way to initialize the txt_model input. Any suggestions are welcome.
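For reference, the rough direction I'm currently leaning toward is sketched below: keep XLNet frozen as a PyTorch feature extractor, convert its output to numpy once, and give txt_model a symbolic Input with a matching shape. This is only a sketch under that assumption (I'm not sure it's the right way), and the labels array is just a placeholder for my real training targets.

import numpy as np
import tensorflow as tf
from tensorflow.keras import Model, layers
from transformers import XLNetTokenizer, XLNetModel

tokenizer = XLNetTokenizer.from_pretrained('xlnet-large-cased')
xlnet_model = XLNetModel.from_pretrained('xlnet-large-cased')

text = ["Hello, my dog is cute", "I'm a very happy", "I'm sad"]
inputs = tokenizer(text, return_tensors="pt", padding=True)

# Precompute the features once; XLNet itself is not trained here.
features = xlnet_model(**inputs).last_hidden_state.detach().numpy()

# Symbolic input matching the feature shape (without the batch dimension).
inp = tf.keras.Input(shape=features.shape[1:])
x = layers.Flatten()(inp)
x = layers.Dense(units=256, activation="relu")(x)
out = layers.Dense(units=3, activation="softmax")(x)
txt_model = Model(inputs=inp, outputs=out)

txt_model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
labels = np.array([0, 1, 2])  # placeholder labels, one class per sentence
txt_model.fit(features, labels, epochs=1)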