I am trying to understand the multivariable proof of the least squares estimator, and I have difficulty differentiating the loss function with respect to $\vec{\beta}$:
\begin{align} \frac{\partial L\left(D, \vec{\beta}\right)}{\partial\vec{\beta}} &= \frac{\partial \left(Y^\textsf{T}Y - Y^\textsf{T}X\vec{\beta} - \vec{\beta}^\textsf{T}X^\textsf{T}Y + \vec{\beta}^\textsf{T}X^\textsf{T}X\vec{\beta}\right)}{\partial \vec{\beta}} \\ &= -2X^\textsf{T}Y + 2X^\textsf{T}X\vec{\beta} \end{align}
I am stuck on differentiating, with respect to $\vec{\beta}$, a term that contains $\vec{\beta}^\textsf{T}$. How should such a term be handled?
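
For what it is worth, the final line does seem to hold numerically, so my problem is only with the intermediate step. Below is a minimal sketch of the check in plain Python, assuming $L$ is the sum of squared residuals $(Y - X\vec{\beta})^\textsf{T}(Y - X\vec{\beta})$, which expands to the four terms above. The function names, the sizes `n` and `p`, and the random data are made up for illustration.

```python
import random

def loss(X, Y, beta):
    """Sum of squared residuals L = (Y - X beta)^T (Y - X beta)."""
    total = 0.0
    for row, y in zip(X, Y):
        pred = sum(x * b for x, b in zip(row, beta))
        total += (y - pred) ** 2
    return total

def analytic_gradient(X, Y, beta):
    """-2 X^T Y + 2 X^T X beta, computed as -2 X^T (Y - X beta)."""
    p = len(beta)
    grad = [0.0] * p
    for row, y in zip(X, Y):
        resid = y - sum(x * b for x, b in zip(row, beta))
        for j in range(p):
            grad[j] += -2.0 * row[j] * resid
    return grad

def numeric_gradient(X, Y, beta, h=1e-6):
    """Central finite differences, one coordinate of beta at a time."""
    grad = []
    for j in range(len(beta)):
        up = list(beta)
        up[j] += h
        down = list(beta)
        down[j] -= h
        grad.append((loss(X, Y, up) - loss(X, Y, down)) / (2 * h))
    return grad

# Hypothetical small problem: n observations, p coefficients.
random.seed(0)
n, p = 5, 3
X = [[random.uniform(-1, 1) for _ in range(p)] for _ in range(n)]
Y = [random.uniform(-1, 1) for _ in range(n)]
beta = [random.uniform(-1, 1) for _ in range(p)]

analytic = analytic_gradient(X, Y, beta)
numeric = numeric_gradient(X, Y, beta)
max_diff = max(abs(a - b) for a, b in zip(analytic, numeric))
print("largest gap between the two gradients:", max_diff)
```

The two gradients agree up to rounding error, which supports the stated result but does not explain how the $\vec{\beta}^\textsf{T}$ terms are differentiated.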