
I have just started learning Faster R-CNN and have some doubts about the choice of optimizer for this network. As I understand it, Adam performs much better than SGD in a lot of networks. However, the Faster R-CNN paper uses SGD instead of Adam, and many of the Faster R-CNN implementations I found on GitHub use SGD as well.

My guess is that, for Faster R-CNN, Adam may not perform better. When I looked into this, I found this linked answer, which gave me a rough idea: it suggests that SGD generalizes better than Adam. But I still need a more detailed explanation.

Here are my questions:

  1. Can we use Adam as the optimizer for Faster R-CNN? If anyone has used Adam for Faster R-CNN, could you share some results on its performance?
  2. As the linked answer above suggests, Adam may perform worse in some special cases. In what kinds of cases does Adam perform poorly? Can anyone give some examples? And is Faster R-CNN one of these special cases?
icebear

1 Answer


Great question, and I'm also very curious. All I can point to is this mini Weights and Biases report, which shows Adam and AdamW outperforming SGD:

https://wandb.ai/ap-wt/great-barrier-reef/reports/FasterRCNNs-with-different-optimizers--VmlldzoxNDY0MjI0

My impression is that people tend to consider Adam safer and more robust, while a properly tuned SGD can often squeeze out slightly higher performance. I'm not sure how much of that applies specifically to Faster R-CNN, though.
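One possible reason Adam gets described as more robust is that its update is normalized by a running estimate of the gradient magnitude, so the step size depends much less on how the loss is scaled, whereas a plain SGD step is directly proportional to the gradient. Below is a toy sketch in plain Python that compares the size of the very first update under both rules. It is not Faster R-CNN code: the function names, the linear toy loss, and the hyperparameter values are all assumptions made up for illustration.

```python
import math

def sgd_momentum_step(w, g, v, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update; returns (new_w, new_velocity)."""
    v = momentum * v + g
    return w - lr * v, v

def adam_step(w, g, m, s, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update with bias correction; returns (new_w, new_m, new_s)."""
    m = b1 * m + (1 - b1) * g          # running mean of gradients
    s = b2 * s + (1 - b2) * g * g      # running mean of squared gradients
    m_hat = m / (1 - b1 ** t)          # bias-corrected first moment
    s_hat = s / (1 - b2 ** t)          # bias-corrected second moment
    return w - lr * m_hat / (math.sqrt(s_hat) + eps), m, s

def first_step_sizes(grad_scale, lr=0.01):
    """Size of the first update for the toy loss grad_scale * w at w = 1."""
    w0 = 1.0
    g = grad_scale  # derivative of grad_scale * w with respect to w
    w_sgd, _ = sgd_momentum_step(w0, g, 0.0, lr=lr)
    w_adam, _, _ = adam_step(w0, g, 0.0, 0.0, 1, lr=lr)
    return abs(w0 - w_sgd), abs(w0 - w_adam)

if __name__ == "__main__":
    for scale in (0.01, 1.0, 100.0):
        sgd_size, adam_size = first_step_sizes(scale)
        print(f"grad scale {scale:>6}: SGD step {sgd_size:.6f}, Adam step {adam_size:.6f}")
```

In this toy setting the SGD step grows and shrinks with the gradient scale, while Adam's first step stays close to the learning rate. That would be consistent with SGD needing more careful learning-rate tuning, but it says nothing by itself about which optimizer reaches better final accuracy on a detection model.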