First of all, let me say that the answer to this question is likely to confirm a not-so-minor error in a very popular (and excellent) textbook on optimization, as you'll see below.
Background
Suppose that we have a real-valued function $f(X)$ whose domain is the set of $n\times n$ nonsingular symmetric matrices. Clearly, $X$ does not have $n^2$ independent variables; it has $n(n+1)/2$ independent variables, since it is symmetric. As is well known, one important use of the Taylor expansion is to find the derivative of a function via its best first-order approximation. That is, if one can find a matrix $D \in \mathbb{R}^{n\times n}$ that is a function of $X$ and satisfies
$$f(X+V) = f(X) + \langle D, V \rangle + \text{h.o.t.}, $$ where $\text{h.o.t.}$ stands for higher-order terms and $\langle \cdot, \cdot \rangle$ is an inner product, then the matrix $D$ is the derivative of $f$ w.r.t. $X$.
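To illustrate the recipe in the unconstrained case, here is a small finite-difference sketch of my own (not from any reference): it takes $f(X)=\log\det X$ on a generic nonsymmetric nonsingular $X$ with $n^2$ independent entries, where the claimed derivative is $D = X^{-T}$ and $\langle A, B \rangle = \text{trace}(A^T B)$.

```python
import numpy as np

# Finite-difference check of the recipe above in the unconstrained case
# (my own sketch): f(X) = log det X on a generic nonsingular X with n^2
# independent entries, candidate derivative D = X^{-T}, <A, B> = trace(A^T B).
rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n)) + n * np.eye(n)   # generic, well away from singular
V = rng.standard_normal((n, n))                   # arbitrary direction

f = lambda M: np.linalg.slogdet(M)[1]             # log |det M|, numerically stable
D = np.linalg.inv(X).T                            # candidate derivative X^{-T}

for t in (1e-1, 1e-2, 1e-3):
    gap = f(X + t * V) - f(X) - np.trace(D.T @ (t * V))
    print(t, gap)                                 # gap shrinks like t^2
```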
Question
Now my question is: What is the right inner product $\langle \cdot, \cdot \rangle$ to use here if the matrix is symmetric? I know that if the entries of $X$ were independent (i.e., if $X$ were not symmetric), then $\langle A, B \rangle = \text{trace}(A^T B)$ would be the correct inner product. But I suspect that this is not true in general for a symmetric matrix. More specifically, my guess is that even if the trace inner product leads to the correct expansion in the equation above, the resulting matrix $D$ will not be the correct derivative. Here is why I think this is the case.
A while ago, I asked a question about the derivative of the $\log\det X$ function, because I suspected that the formula in the book Convex Optimization by Boyd & Vandenberghe is wrong. The formula indeed seems to be wrong, as the accepted answer makes clear. I tried to understand what went wrong in the proof in the Convex Optimization book. The approach used in the book is precisely the approach that I outlined above in Background. The authors show that the first-order Taylor approximation of $f(X)=\log\det X$ for symmetric $X$ is $$ f(X+V) \approx f(X)+\text{trace}(X^{-1}V). $$
The authors prove this approximation using a decomposition specific to symmetric matrices (the proof is in Appendix A.4.1; the book is publicly available). Now, this approximation is correct, but $X^{-1}$ is not the correct derivative of $\log\det X$ for symmetric $X$; the correct derivative is $2X^{-1}-\text{diag}(\text{diag}(X^{-1}))$. Interestingly, the same approximation holds for nonsymmetric invertible matrices too (this can be shown via the SVD), and in that case it does give the right derivative, because the derivative of $\log\det X$ is indeed $X^{-T}$ for a matrix with $n^2$ independent entries. Therefore I suspect that trace is not the right inner product $\langle \cdot, \cdot \rangle$ for symmetric matrices, as it ignores the fact that the entries of $X$ are not independent. Can anyone shed light on this question?
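For concreteness, here is a small numerical sketch of my own (not from the book) of the discrepancy I mean: the trace expansion above is accurate for symmetric $X$ and symmetric $V$, yet finite differences over the $n(n+1)/2$ free entries produce $2X^{-1}-\text{diag}(\text{diag}(X^{-1}))$ rather than $X^{-1}$.

```python
import numpy as np

# Numerical sketch (mine, not from the book): for symmetric X the expansion
# f(X+V) ~ f(X) + trace(X^{-1} V) is accurate, yet finite differences over the
# n(n+1)/2 free entries give 2*X^{-1} - diag(diag(X^{-1})), not X^{-1}.
rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
X = A @ A.T + n * np.eye(n)                  # symmetric positive definite
V = rng.standard_normal((n, n))
V = V + V.T                                  # symmetric perturbation
f = lambda M: np.linalg.slogdet(M)[1]        # log det (positive definite input)

# (i) the trace expansion is accurate to first order in the symmetric direction V
t = 1e-4
print(f(X + t * V) - f(X), np.trace(np.linalg.inv(X) @ (t * V)))

# (ii) derivative w.r.t. the free entries: perturbing X_ij also moves X_ji
h, G = 1e-6, np.zeros((n, n))
for i in range(n):
    for j in range(i, n):
        E = np.zeros((n, n))
        E[i, j] = h
        if i != j:
            E[j, i] = h                      # keep the perturbed matrix symmetric
        G[i, j] = G[j, i] = (f(X + E) - f(X)) / h

Xinv = np.linalg.inv(X)
print(np.round(G, 4))
print(np.round(2 * Xinv - np.diag(np.diag(Xinv)), 4))  # agrees with G above
```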
Added: A simpler question
Based on a comment, I understand that the general answer to my question may be difficult, so let me ask a simpler question. The answer to this question may be sufficient to show what went wrong in the proof in the Convex Optimization book.
Suppose $g(X)$ is a function $g: \mathbb{R}^{n\times n} \to \mathbb{R}$. Is it true that the first-order Taylor approximation with the trace inner product, i.e.,
$$g(X+V) \approx g(X) + \text{trace}\left( \nabla g (X)^T V \right), $$
implicitly assumes that the entries of $X$ are independent? In other words, is it true that this approximation may not hold if the entries of $X$ are not independent (e.g., if $X$ is symmetric)?
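In case it helps to experiment, here is one way to probe this numerically, with a hypothetical example of my own (not tied to $\log\det$): pick a concrete smooth $g$, form $\nabla g(X)$ as if all $n^2$ entries were free, and compare both sides of the approximation for a general $V$ and for a symmetric $V$.

```python
import numpy as np

# Hypothetical probe of the simpler question (my own example): g(X) = trace(X^3),
# whose gradient under the independent-entries convention is 3 (X @ X)^T.
# Compare g(X+V) - g(X) with trace(grad^T V) for a general V and a symmetric V.
rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
X = A + A.T                                   # restrict X itself to be symmetric

g = lambda M: np.trace(M @ M @ M)
grad = 3 * (X @ X).T                          # gradient treating all n^2 entries as free

t = 1e-4
for symmetric_direction in (False, True):
    V = rng.standard_normal((n, n))
    if symmetric_direction:
        V = V + V.T                           # perturbation that stays symmetric
    gap = g(X + t * V) - g(X) - np.trace(grad.T @ (t * V))
    print(symmetric_direction, gap)           # inspect the gap in each case
```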