Mate, I need your help. I'm trying to convert my CNN model into an RNN model. First, I load my labels:
labels = pickle.load(open("./labels.p", "rb"))
print("1")
print(labels)
print("2")
print(type(labels))
print(labels.shape)
Below is the output:
1
[[ 902.69956724 512.52732211 12.54330104]
[1145.09702932 401.66131612 10.68199206]
[ 461.22967364 56.20169521 8.11038109]
 ...
[1467.77484467 241.84709196 9.10076222]]
2
<class 'numpy.ndarray'>
(100, 3)
Then, I load the image tensor:
fetched_image_list = pickle.load(open("./image_list/STFT_image_list.p", "rb"))
fetched_image_list = tf.convert_to_tensor(fetched_image_list)
print("3")
print(fetched_image_list.shape)
print(type(fetched_image_list))
Below is the output for the image tensor:
3
(100, 128, 128, 3)
<class 'tensorflow.python.framework.ops.EagerTensor'>
Next, I assembled the dataset as below:
dataset = tf.data.Dataset.from_tensor_slices((fetched_image_list, labels))
dataset = dataset.batch(32)
print("4")
print(dataset)
Below is the output:
4
<BatchDataset element_spec=(TensorSpec(shape=(None, 128, 128, 3), dtype=tf.float64, name=None), TensorSpec(shape=(None, 3), dtype=tf.float64, name=None))>
My CNN structure is:
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), strides=(2, 2), dilation_rate=(1, 1), input_shape=(128, 128, 3), activation='relu'),
    tf.keras.layers.Conv2D(64, (3, 3), strides=(2, 2), padding="same", dilation_rate=(1, 1), activation='relu'),
    tf.keras.layers.Conv2D(128, (3, 3), strides=(2, 2), dilation_rate=(1, 1), activation='relu'),
    tf.keras.layers.Conv2D(128, (3, 3), strides=(2, 2), dilation_rate=(1, 1), activation='relu'),
    tf.keras.layers.Conv2D(128, (3, 3), strides=(2, 2), padding="same", dilation_rate=(1, 1), activation='relu'),
    tf.keras.layers.Conv2D(128, (3, 3), strides=(2, 2), padding="same", dilation_rate=(1, 1), activation='relu'),
    tf.keras.layers.Dropout(0.30),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(3)
])
For a CNN, I know the input shape is as simple as (width, height, channels); in this case, input_shape is (128, 128, 3).
However, when building the LSTM, the configuration gets more complex.
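For reference, a trimmed-down sketch of the CNN (first two conv layers only, to keep it short) confirming that it happily consumes (128, 128, 3) inputs and maps them to 3 outputs:

```python
import tensorflow as tf

# Trimmed-down sanity check of the CNN pipeline: a plain 4-D image batch
# (batch, height, width, channels) goes straight in, no time axis needed.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), strides=(2, 2), input_shape=(128, 128, 3), activation='relu'),
    tf.keras.layers.Conv2D(64, (3, 3), strides=(2, 2), padding="same", activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(3),
])

# One random dummy image through the network.
out = model(tf.random.normal((1, 128, 128, 3)))
print(out.shape)  # (1, 3)
```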
This is the ConvLSTM2D layer I built:
model = tf.keras.Sequential([
    tf.keras.layers.ConvLSTM2D(filters=32, kernel_size=3, input_shape=(128, 128, 3), return_sequences=True),
    tf.keras.layers.Dropout(0.30),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(label_size)
])
When I run the code, the following error popped up:
ValueError: Input 0 of layer "conv_lstm2d" is incompatible with the layer: expected ndim=5, found ndim=3. Full shape received: (128, 128, 3)
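Reading the error, it seems ConvLSTM2D wants 5-D input, (batch, time, rows, cols, channels), while my images are 4-D. Below is my sketch of a possible fix, using random dummy data in place of my real pickled tensors and a length-1 time axis (I'm not sure this is the right approach):

```python
import numpy as np
import tensorflow as tf

# Dummy stand-ins with the same shapes as my real data.
images = np.random.rand(100, 128, 128, 3).astype("float32")
labels = np.random.rand(100, 3).astype("float32")

# ConvLSTM2D expects (batch, time, rows, cols, channels),
# so give each single image a time axis of length 1.
images_seq = np.expand_dims(images, axis=1)  # (100, 1, 128, 128, 3)

model = tf.keras.Sequential([
    tf.keras.layers.ConvLSTM2D(32, kernel_size=3, input_shape=(1, 128, 128, 3)),
    tf.keras.layers.Dropout(0.30),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(3),
])

dataset = tf.data.Dataset.from_tensor_slices((images_seq, labels)).batch(32)
model.compile(optimizer="adam", loss="mse")
print(model.output_shape)  # (None, 3)
```

I dropped return_sequences=True here, since I only want one prediction per image; with it, the time axis would be kept in the output (Flatten would still absorb it, though).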
So:
- How can I reshape the images to fit the LSTM layer?
- How can I create an LSTM layer to implement the image classification task?
- Is there anything else I should know to get this done?
Many thanks.