
I'm developing an Android app that will bundle a TensorFlow Lite model for offline inference.

I know it's impossible to completely prevent someone from stealing my model, but I'd like to make it hard for anyone who tries.

My idea is to ship the .tflite model inside the .apk, but without the weights of the last layer. At runtime, the app would download the last layer's weights and load them into memory.

That way, anyone who extracts the model from the APK gets a useless file, since it can't be run without the missing last-layer weights.

  1. Is it possible to generate a .tflite model without the weights of the last layer?
  2. Is it possible to load those weights into a model that is already loaded in memory?

This is how I load my .tflite model:

    tflite = new Interpreter(loadModelFile(), tfliteOptions);

    // Memory-maps the .tflite model file from the APK's assets
    private MappedByteBuffer loadModelFile() throws IOException {
        AssetFileDescriptor fileDescriptor = mAssetManager.openFd(chosen);
        FileInputStream inputStream = new FileInputStream(fileDescriptor.getFileDescriptor());
        FileChannel fileChannel = inputStream.getChannel();
        long startOffset = fileDescriptor.getStartOffset();
        long declaredLength = fileDescriptor.getDeclaredLength();
        return fileChannel.map(FileChannel.MapMode.READ_ONLY, startOffset, declaredLength);
    }
  3. Are there other approaches to making my model safer? I really need to run inference locally.

1 Answer


If we are talking about Keras models (or any other TF model), we can easily remove the last layer and then convert the truncated model to TF Lite with tf.lite.TFLiteConverter. That should not be a problem.
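A minimal sketch of that step (the file name `model.h5` and the assumption that the final layer is a single Dense layer are placeholders for your setup):

    import tensorflow as tf

    # Assumption: a trained Keras model saved as "model.h5" whose
    # final Dense layer is the one we want to strip out.
    full_model = tf.keras.models.load_model("model.h5")

    # Rebuild the model up to, but not including, the last layer.
    headless = tf.keras.Model(inputs=full_model.input,
                              outputs=full_model.layers[-2].output)

    # Convert the truncated model to TF Lite and write it to disk;
    # this is the file you would ship inside the APK.
    converter = tf.lite.TFLiteConverter.from_keras_model(headless)
    with open("headless_model.tflite", "wb") as f:
        f.write(converter.convert())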

Now, in Python, get the last layer's weights and write them to a JSON file. This JSON file can be hosted in the cloud (for example, on Firebase Cloud Storage) and downloaded by the app at runtime.
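Continuing the sketch above (assuming the last layer is a Dense layer with both a kernel and a bias):

    import json

    # Dump the last Dense layer's kernel and bias to JSON so the
    # file can be hosted remotely and fetched by the app.
    kernel, bias = full_model.layers[-1].get_weights()
    with open("last_layer.json", "w") as f:
        json.dump({"kernel": kernel.tolist(), "bias": bias.tolist()}, f)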

The weights can be parsed back into an array. The activations from the truncated TF Lite model are then dot-multiplied with the parsed weights (plus the bias), and finally an activation function is applied to produce the predictions we actually need.
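As a reference for that last step, here is the math in Python/NumPy; on the device this would be a few lines of Java, and softmax is my assumed final activation (use whatever your original model used):

    import json

    import numpy as np

    # Reference math only; the app would reproduce these few lines in
    # Java after downloading the JSON.
    with open("last_layer.json") as f:
        w = json.load(f)
    kernel = np.array(w["kernel"])  # shape: (penultimate_units, num_classes)
    bias = np.array(w["bias"])

    # Placeholder for the headless TF Lite model's output on a real input.
    activations = np.random.rand(kernel.shape[0])

    # Dot-multiply the activations with the downloaded weights, add the
    # bias, then apply softmax to get prediction probabilities.
    logits = activations @ kernel + bias
    probs = np.exp(logits - logits.max()) / np.exp(logits - logits.max()).sum()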

The truncated model's weights are trained so specifically for your task that they would rarely be useful for any other use case, so I don't think you need to worry much about that.

Also, where possible, it is safer to host the model on a cloud platform and serve predictions through requests to an API, rather than shipping a raw model at all; since you need local inference, the approach above is a reasonable compromise.