How to get the inference compute graph of a PyTorch model?

Question

I want to hand-write a framework to perform inference for a given neural network. The network is quite complicated, so to make sure my implementation is correct, I need to know exactly how the inference (forward) pass is executed on the device.
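To be concrete about the granularity I mean, here is a minimal sketch that records which layers run during a forward pass, in order, using forward hooks. The toy `nn.Sequential` model and the names `executed` and `make_hook` are hypothetical stand-ins, not my real network. It only gives a flat list of layers, not the graph I am looking for.

```python
import torch
import torch.nn as nn

# Hypothetical toy model standing in for the real (much larger) network.
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 2),
).eval()

executed = []  # (module name, module type, output shape), in execution order


def make_hook(name):
    # Forward hooks are called as hook(module, inputs, output) after forward().
    def hook(module, inputs, output):
        executed.append((name, type(module).__name__, tuple(output.shape)))
    return hook


# Hook only leaf modules so each layer is recorded once, at layer granularity.
handles = [
    module.register_forward_hook(make_hook(name))
    for name, module in model.named_modules()
    if len(list(module.children())) == 0
]

with torch.no_grad():
    model(torch.randn(1, 4))

# Remove the hooks so later forward passes are not recorded.
for handle in handles:
    handle.remove()

for entry in executed:
    print(entry)
```

This shows layers such as `Linear` and `ReLU` as single units, which is the level of detail I want to see in the graph.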

I tried using torchviz to visualize the network, but what I got seems to be the backpropagation (autograd) compute graph, which is really hard to understand.

Then I tried converting the PyTorch model to ONNX format, following the linked instructions, but when I visualized the result, the original layers of the model had been separated into very small operators.

I just want to get a result like this:

How can I get this? Thanks!