3
votes

I have this code for a semantic search engine built using a pre-trained BERT model. I want to convert this model into TFLite for deploying it to Google ML Kit. I want to know how to convert it, and whether converting it is even possible. It might be, since conversion is documented on the official TensorFlow site: https://www.tensorflow.org/lite/convert. But I don't know where to begin.

Code:


import scipy.spatial  # needed for the cosine-distance computation below
from sentence_transformers import SentenceTransformer

# Load the BERT model. Various models trained on Natural Language Inference (NLI) https://github.com/UKPLab/sentence-transformers/blob/master/docs/pretrained-models/nli-models.md and 
# Semantic Textual Similarity are available https://github.com/UKPLab/sentence-transformers/blob/master/docs/pretrained-models/sts-models.md

model = SentenceTransformer('bert-base-nli-mean-tokens')

# A corpus is a list with documents split by sentences.

sentences = ['Absence of sanity', 
             'Lack of saneness',
             'A man is eating food.',
             'A man is eating a piece of bread.',
             'The girl is carrying a baby.',
             'A man is riding a horse.',
             'A woman is playing violin.',
             'Two men pushed carts through the woods.',
             'A man is riding a white horse on an enclosed ground.',
             'A monkey is playing drums.',
             'A cheetah is running behind its prey.']

# Each sentence is encoded as a 1-D vector with 768 columns
sentence_embeddings = model.encode(sentences)

print('Sample BERT embedding vector - length', len(sentence_embeddings[0]))

print('Sample BERT embedding vector - note includes negative values', sentence_embeddings[0])

#@title Semantic Search Form

# code adapted from https://github.com/UKPLab/sentence-transformers/blob/master/examples/application_semantic_search.py

query = 'Nobody has sane thoughts' #@param {type: 'string'}

queries = [query]
query_embeddings = model.encode(queries)

# Find the closest sentences of the corpus for each query sentence based on cosine similarity
number_top_matches = 3 #@param {type: "number"}

print("Semantic Search Results")

for query, query_embedding in zip(queries, query_embeddings):
    distances = scipy.spatial.distance.cdist([query_embedding], sentence_embeddings, "cosine")[0]

    results = zip(range(len(distances)), distances)
    results = sorted(results, key=lambda x: x[1])

    print("\n\n======================\n\n")
    print("Query:", query)
    print("\nTop", number_top_matches, "most similar sentences in corpus:")

    for idx, distance in results[0:number_top_matches]:
        print(sentences[idx].strip(), "(Cosine Score: %.4f)" % (1-distance))

3 Answers

0
votes

First of all, you need to have your model in TensorFlow; the package you are using is written in PyTorch. Huggingface's Transformers provides TensorFlow versions of BERT that you can start from. In addition, they also have TFLite-ready models for Android.

In general, you start from a TensorFlow model. Then, save it in the SavedModel format:

tf.saved_model.save(pretrained_model, "/tmp/pretrained-bert/1/")

You can then run the TFLite converter on the SavedModel directory.
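A minimal sketch of the save-then-convert flow. To keep it runnable anywhere, a tiny Keras model stands in for the pretrained TensorFlow BERT model (in practice you would load one from Transformers, e.g. `TFBertModel`); the two conversion steps are the same:

```python
import tensorflow as tf

# Stand-in for the pretrained TensorFlow model (a real BERT would be
# loaded from Huggingface Transformers instead).
pretrained_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(2),
])

# 1. Export to the SavedModel format.
tf.saved_model.save(pretrained_model, "/tmp/pretrained-bert/1/")

# 2. Run the TFLite converter on the SavedModel directory.
converter = tf.lite.TFLiteConverter.from_saved_model("/tmp/pretrained-bert/1/")
tflite_model = converter.convert()

# Write the flatbuffer to disk; this .tflite file is what ML Kit consumes.
with open("/tmp/model.tflite", "wb") as f:
    f.write(tflite_model)
```

Note that a real BERT graph may need extra converter settings (e.g. supported-ops options) depending on the ops it uses; the snippet above only shows the basic path.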

0
votes

Have you tried running the converter tool (tflite_convert)? Did it complain about anything?

BTW, you may want to check out the QA example from the TFLite team, which uses a BERT model: https://github.com/tensorflow/examples/tree/master/lite/examples/bert_qa/android

0
votes

I couldn't find any information about using a BERT model on mobile to obtain document embeddings and compute a k-nearest-documents search, as in your example. It might also not be a good idea: BERT models are expensive to execute, and their large number of parameters means a large model file size (400 MB+) as well.

However, you can now use BERT and MobileBERT for text classification and question answering on mobile. Maybe you can start with their demo app, which runs a MobileBERT tflite model, as Xunkai mentioned. I am sure there will be better support for your use case in the near future.
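For context on what running a converted model looks like, here is a sketch of the TFLite `Interpreter` loop that on-device code (or a desktop sanity check) performs. A tiny stand-in model is converted inline so the example is self-contained; a real MobileBERT model would instead take token-id tensors as input:

```python
import numpy as np
import tensorflow as tf

# Build and convert a tiny stand-in model (a real app would load a
# pre-converted MobileBERT .tflite file instead).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(2),
])
tflite_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()

# Load the flatbuffer into the TFLite interpreter and allocate tensors.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed one input tensor and read the output, as on-device code would.
x = np.ones((1, 4), dtype=np.float32)
interpreter.set_tensor(input_details[0]["index"], x)
interpreter.invoke()
y = interpreter.get_tensor(output_details[0]["index"])
print(y.shape)  # (1, 2)
```

The same `set_tensor` / `invoke` / `get_tensor` pattern applies regardless of the model, which is why checking your converted model this way on desktop is a good first step before wiring it into ML Kit.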