Optimize for Model

Links:

  • HandOut
  • fruit image classification model
  • object detection model
  • playGround
  • YouTube - GitHub - tputflite
  • optimize book
  • optimize - YouTube
  • blog about mAP

Hyperparameter Analysis

For a beginner, training a convolutional neural network can be either simple or complex. The simple part is that you can just plug the parameters in; the hard part is tuning those parameters to the right values. I will share what I have learned and show you how to optimize your own model.

  1. Model: SSD with MobileNet v2 FPN-Lite
  2. Training platform: Colab
  3. Type: object detection model

The above is the model I used and the platform I trained it on.

Batch size

What is batch size?

Suppose you need to train a model on 1,000 images. Feeding all 1,000 through at once is far too slow, so we split them into groups of 10, giving 100 groups, which is much more efficient. The question is: how many images should each group contain to maximize training speed while also improving accuracy?
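The grouping arithmetic above can be sketched in a few lines of plain Python (the 1,000-images-in-groups-of-10 example is from the text; the helper name `batches_per_epoch` is my own):

```python
import math

def batches_per_epoch(num_images: int, batch_size: int) -> int:
    """Number of batches (training steps) one full pass over the dataset takes."""
    return math.ceil(num_images / batch_size)

# The example above: 1,000 images split into groups of 10
print(batches_per_epoch(1000, 10))  # -> 100
```

Note that when the dataset size is not divisible by the batch size, the last batch is smaller, which is part of why odd batch sizes can waste accelerator capacity.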

  1. First, we need to know what batch size actually affects: it affects your accelerator's throughput as well as the model's accuracy. Why is that?

Because the accelerator's memory is limited, it can only process so many images at a time:

For example, suppose a group has 32 people but the bus can only carry 16, so the remaining 16 cannot board. Since the bus only makes one trip, this greatly reduces efficiency and wastes resources. So we should simply make each group 16 people. Too few people per group is also inefficient, so we should set each group as close to 16 as possible.

In short, batch size directly affects performance on the validation set.

  1. We cannot simply calculate which batch size gives the fastest training and the best accuracy.

We already know the batch size can be neither too large nor too small, so there must be a sweet spot somewhere; we can find it by running short training jobs on a smaller amount of data.

Below are my constants and variables:

  1. Total steps = 1000
  2. Variable: batch size
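In the TF Object Detection API, the batch size for each run lives in `pipeline.config`, so a sweep like this one can be automated by rewriting that field between runs. A minimal sketch, assuming the usual `train_config { batch_size: N }` text format; the helper name and the inline config fragment are my own:

```python
import re

def set_batch_size(config_text: str, batch_size: int) -> str:
    """Replace the batch_size field in a pipeline.config-style text file."""
    return re.sub(r"batch_size:\s*\d+", f"batch_size: {batch_size}", config_text)

# Hypothetical fragment of a pipeline.config
config = "train_config {\n  batch_size: 16\n  num_steps: 1000\n}"
print(set_batch_size(config, 17))
```

In a real sweep you would read the file from disk, rewrite it for each candidate batch size, and relaunch training; the TF OD API also ships `config_util` helpers for editing configs programmatically.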

Before training, you can use the playGround website to see what happens when the number of steps goes past the critical point.

My dataset has 300 images in total: 240 for training, 30 for validation, and 30 for testing.
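That is a standard 80/10/10 split; as a quick sanity check (the helper name is mine):

```python
def split_dataset(total: int, train_frac: float = 0.8, val_frac: float = 0.1):
    """Return (train, val, test) counts for an 80/10/10 split."""
    train = int(total * train_frac)
    val = int(total * val_frac)
    test = total - train - val  # remainder goes to the test set
    return train, val, test

print(split_dataset(300))  # -> (240, 30, 30)
```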

Test 1: batch_size = 16

num_steps = 1000
batch_size = 16
epochs = 1
average time per step = 0.325 s
total time = 7 min 37 s

(screenshots: optimizer config; TensorBoard curves for classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP on the test set)

Test 2: batch_size = 17

batch_size = 17
total time = 9 min 25 s

(screenshots: optimizer config; TensorBoard curves for classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec)

Test 3: batch_size = 15

batch_size = 15
total time = 7 min 18 s

(screenshots: optimizer config; TensorBoard curves for classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Test 4: batch_size = 18

batch_size = 18
num_steps = 1000
total time = 8 min 1 s

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Test 5: batch_size = 14

batch_size = 14
num_steps = 1000
total time = 6 min 46 s

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Test 6: batch_size = 19

batch_size = 19
num_steps = 1000
total time = 9 min 20 s

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Test 7: batch_size = 13

batch_size = 13
num_steps = 1000
total time = 7 min 10 s

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Test 8: batch_size = 20

batch_size = 20
total time = 9 min 30 s

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Test 9: batch_size = 12

batch_size = 12
total time = 6 min 14 s

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Test 10: batch_size = 21

batch_size = 21
total time = 9 min 12 s

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Test 11: batch_size = 22

batch_size = 22
total time = 9 min 26 s

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Test 12: batch_size = 23

batch_size = 23
total time = 10 min 17 s
total_loss = 0.322

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; screenshot of results)

Test 13: batch_size = 24

batch_size = 24
num_steps = 1000
total time = 10 min 27 s

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Test 14: batch_size = 25

batch_size = 25
total time = 10 min 45 s

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; screenshot of mAP results)

Test 15: batch_size = 26

batch_size = 26
total time = 11 min 36 s

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Test 16: batch_size = 27

batch_size = 27
total time = 11 min 45 s

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec)

Conclusion

| batch_size | total_loss | time (min:s) | mAP (after TFLite conversion) | learning_rate |
|-----------:|-----------:|-------------:|------------------------------:|--------------:|
| 12 | 0.41  | 6:14  | 25.73 | 0.075 |
| 13 | 0.38  | 7:10  | 48.28 | 0.075 |
| 14 | 0.355 | 6:46  | 42.57 | 0.075 |
| 15 | 0.34  | 7:18  | 49.62 | 0.075 |
| 16 | 0.36  | 7:37  | 48.97 | 0.075 |
| 17 | 0.348 | 9:25  | 51.34 | 0.075 |
| 18 | 0.323 | 8:01  | 51.16 | 0.075 |
| 19 | 0.35  | 9:20  | 40.76 | 0.075 |
| 20 | 0.34  | 9:26  | 33.26 | 0.075 |
| 21 | 0.325 | 9:12  | 42.33 | 0.075 |
| 22 | 0.30  | 9:26  | 52.90 | 0.075 |
| 23 | 0.322 | 10:17 | 46.67 | 0.075 |
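The winner can be read off the table programmatically; the sketch below just transcribes the total_loss and mAP columns:

```python
# total_loss and mAP (after TFLite conversion) per batch size, from the table above
results = {
    12: (0.41, 25.73), 13: (0.38, 48.28), 14: (0.355, 42.57),
    15: (0.34, 49.62), 16: (0.36, 48.97), 17: (0.348, 51.34),
    18: (0.323, 51.16), 19: (0.35, 40.76), 20: (0.34, 33.26),
    21: (0.325, 42.33), 22: (0.30, 52.90), 23: (0.322, 46.67),
}

best_map = max(results, key=lambda b: results[b][1])      # highest mAP
lowest_loss = min(results, key=lambda b: results[b][0])   # lowest total_loss
print(best_map, lowest_loss)  # -> 22 22
```

Batch size 22 wins on both metrics here, which is why the later learning-rate runs use 17 and 22.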

(summary plots: total_loss, mAP results, training time, and learning_rate vs. batch_size)

Learning rate

Learning_rate = 0.08, batch_size = 17

learning_rate_base = 0.08
warmup_learning_rate = 0.0266
batch_size = 17
num_steps = 10000
total time = 59 min 17 s
total_loss = 0.1537

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)
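The learning_rate_base and warmup_learning_rate fields above come from the cosine-decay-with-warmup schedule this model's config uses: a linear ramp from the warmup rate up to the base rate, then a cosine decay toward zero. A sketch of its shape in plain Python; the warmup_steps value is my assumption, since the post does not state it:

```python
import math

def cosine_decay_with_warmup(step, base_lr=0.08, warmup_lr=0.0266,
                             warmup_steps=1000, total_steps=10000):
    """Learning rate at a given step: linear warmup, then cosine decay to 0."""
    if step < warmup_steps:
        # Linear ramp from warmup_lr up to base_lr
        return warmup_lr + (base_lr - warmup_lr) * step / warmup_steps
    # Cosine decay from base_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))

print(round(cosine_decay_with_warmup(0), 4))      # -> 0.0266
print(round(cosine_decay_with_warmup(10000), 4))  # -> 0.0
```

This is why the logged 'learning_rate' values below (e.g. 0.0735 for a base of 0.08) sit slightly under the configured base: the decay has already started by the time the log is written.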

Learning_rate = 0.08, batch_size = 22

'Loss/localization_loss': 0.021741068,
'Loss/regularization_loss': 0.09711991,
'Loss/total_loss': 0.148702,
'learning_rate': 0.07352352

(screenshot: mAP results)

Learning_rate = 0.07, batch_size = 22

total_steps = 10000
total_loss = 0.152

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Learning_rate = 0.06, batch_size = 22

total time = 1 h 19 min 26 s
num_steps = 10000
total_loss = 0.168

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Learning_rate = 0.07, batch_size = 17

total time = 1 h 0 min 11 s
num_steps = 10000
total_loss = 0.162

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; screenshot of results)

Learning_rate = 0.06, batch_size = 17

total time = 57 min 16 s
num_steps = 10000
total_loss = 0.178

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; screenshot of results)

Learning_rate = 0.05, batch_size = 22

num_steps = 10000
total time = 52 min 42 s
total_loss = 0.1648

(TensorBoard curves: classification_loss, localization_loss, regularization_loss, total_loss, learning_rate, steps_per_sec; mAP results)

Restarting the learning-rate sweep with num_steps = 7000

Lr = 0.08

'Loss/localization_loss': 0.02736458,
'Loss/regularization_loss': 0.11061755,
'Loss/total_loss': 0.1774221,
'learning_rate': 0.07707667

total time = 56 min 46 s

(screenshot)

For quantization:

(screenshot)

Lr = 1.0

batch_size = 22
num_steps = 7000
total time = 54 min 40 s

'Loss/localization_loss': 0.036090124,
'Loss/regularization_loss': 0.096989244,
'Loss/total_loss': 0.19243404,
'learning_rate': 0.9634584

(screenshots: training curves and mAP results)

For quantization:

(screenshot)

Lr = 1.2

batch_size = 22
total time = 57 min 27 s
total_loss = 0.243

(screenshots: training curves)

Lr = 0.14

  • Copyrights © 2022-2024 Jessy Huang

Buy me a coffee~