
I know that, in the house price regression problem, the features are the input variables and the weights are their coefficients (their "importance"), and that minimizing the least-squares loss gives the values of those coefficients. My questions are:

  1. How does a CNN do bounding box regression?

I actually did a lot of googling to find an intuitive explanation, but no luck.

  2. What do the features and weights represent in bounding box regression (BBR)?

I don't think the features can be $T$, $L$, $W$, and $H$ directly, because these absolute values vary a lot with distance/scale and perspective, but the ratio $\frac{W}{H}$ is a reasonable feature (as far as I can tell) because it is a relative value.

Alex Luya

1 Answer


It depends on the specific model, but let's consider the most popular single-stage model for object detection: SSD.

It has a set of default bounding boxes (prior boxes); for each of them it predicts:

  • A probability distribution over the set of classes (this is a classification problem, solved with a cross-entropy loss)
  • Offsets for the default location of the prior box (the x, y coordinates of its center) and for its height/width. This is a regression problem, optimized with a smooth L1 loss (the localization loss):

(image: localization loss)
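Roughly, the localization loss used by SSD is a smooth L1 loss over the matched (positive) prior boxes; a sketch of it, omitting the matching-indicator bookkeeping from the paper:

$$
L_{loc} = \sum_{i \in \text{Pos}} \; \sum_{m \in \{cx,\, cy,\, w,\, h\}} \text{smooth}_{L_1}\!\left(l_i^{m} - \hat{g}_i^{m}\right),
\qquad
\text{smooth}_{L_1}(u) =
\begin{cases}
0.5\,u^{2}, & |u| < 1 \\
|u| - 0.5, & \text{otherwise}
\end{cases}
$$

where $l_i^{m}$ are the offsets predicted for prior box $i$ and $\hat{g}_i^{m}$ are the encoded offsets of the ground-truth box matched to it.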

The exact definition is a bit more complicated, as it includes variance terms and exponents, but the general idea is as above. Also, depending on the bounding box encoding type, we may predict offsets not for (cx, cy, w, h) but for (xmin, ymin, xmax, ymax).
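As a concrete illustration of the (cx, cy, w, h) encoding, here is a minimal numpy sketch of how a ground-truth box is typically turned into regression targets relative to a prior box; the function name and the variance values (0.1, 0.2) are illustrative choices, not taken from any specific library:

```python
import numpy as np

def encode(gt_box, prior_box, variances=(0.1, 0.2)):
    """Turn an absolute ground-truth box into SSD-style regression targets
    relative to a prior box. Both boxes are given as (cx, cy, w, h)."""
    g_cx, g_cy, g_w, g_h = gt_box
    d_cx, d_cy, d_w, d_h = prior_box
    # Center offsets are measured in units of the prior's width/height;
    # size offsets are log-ratios -- this is where the exponents come from.
    t_cx = (g_cx - d_cx) / (d_w * variances[0])
    t_cy = (g_cy - d_cy) / (d_h * variances[0])
    t_w = np.log(g_w / d_w) / variances[1]
    t_h = np.log(g_h / d_h) / variances[1]
    return np.array([t_cx, t_cy, t_w, t_h])

# Example: a ground-truth box twice as wide as its prior, same center:
# encode((50, 50, 40, 20), (50, 50, 20, 20)) -> [0., 0., ~3.47, 0.]
```

Because the targets are relative to each prior box, objects of very different absolute sizes produce offsets on a comparable scale, which is what makes this regression well behaved.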

The predicted shape offsets for the prior boxes are not absolute values, of course; they are relative to the default size (and position) of the prior box, and they have to be decoded to get the final box coordinates.

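A sketch of the decoding, assuming the standard SSD parameterization (with the same variance factors as above; the exact constants vary between implementations): the predicted offsets $(l^{cx}, l^{cy}, l^{w}, l^{h})$ are turned back into an absolute box $(b^{cx}, b^{cy}, b^{w}, b^{h})$ as

$$
\begin{aligned}
b^{cx} &= d^{cx} + l^{cx}\,\sigma_{xy}\, d^{w}, &\qquad b^{w} &= d^{w}\exp\!\left(l^{w}\,\sigma_{wh}\right),\\
b^{cy} &= d^{cy} + l^{cy}\,\sigma_{xy}\, d^{h}, &\qquad b^{h} &= d^{h}\exp\!\left(l^{h}\,\sigma_{wh}\right),
\end{aligned}
$$

where $(d^{cx}, d^{cy}, d^{w}, d^{h})$ is the prior box and $\sigma_{xy}, \sigma_{wh}$ are the variance factors (commonly 0.1 and 0.2). This is simply the inverse of the encoding shown earlier.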

Dmytro Prylipko
  • Thanks. What I want to know is: 1) How is BBR done "intuitively" in a CNN? 2) What are the weight and feature variables in BBR? Your answer is too high-level or abstract for me. – Alex Luya Jan 21 '19 at 15:44