I will give you one magic definition that covers the derivative across a vast range of uses. The thing you need to understand is the Fréchet derivative.
Let $V$ and $W$ be normed vector spaces, and let $U \subset V$ be an open subset of $V$. A function $f: U \to W$ is called Fréchet differentiable at $x \in U$ if there exists a bounded linear operator $A: V \to W$ such that
$$ \lim_{ || h || \to 0 } \frac{||f(x+h) - f(x) - Ah||_W} {||h||_V}=0$$
If you do not know the definitions of these terms, here is the intuition:
- Normed vector space: Think of it as a vector space in which it makes sense to talk about the length of a vector.
- Bounded linear operator: A linear map $A$ for which the length of the output is at most a constant multiple of the length of the input, i.e. $\|Av\|_W \le M \|v\|_V$ for some $M$ and all $v \in V$. Note that the notion of length in the codomain need not be the same as the one in the domain.
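
To see the definition in action, here is a small worked example (my own illustration; the choice of $f(X) = X^2$ on square matrices with a submultiplicative norm is an assumption, not part of the definition). Expanding,
$$ f(X+H) = (X+H)^2 = X^2 + XH + HX + H^2$$
so the natural candidate for the derivative at $X$ is the linear map $A : H \mapsto XH + HX$, and the remainder is $H^2$. Then
$$ \frac{\| f(X+H) - f(X) - AH \|}{\| H \|} = \frac{\| H^2 \|}{\| H \|} \le \frac{\| H \|^2}{\| H \|} = \| H \| \to 0$$
so $f$ is Fréchet differentiable at every $X$, with derivative $A$. Notice that $A$ depends on the base point $X$ but acts linearly on the displacement $H$, exactly as the definition demands.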
What's the motivation here?
In the single-variable case, we have the first-order Taylor expansion:
$$ f(x+h) = f(x) + f'(x) h + o(h)$$
The idea is that the coefficient of $h$ in this expansion is the derivative. Now, we can rearrange this as:
$$ \frac{ f(x+h) - f(x) - f'(x) h}{h} = \frac{o(h)}{h}$$
As $h \to 0$, the right-hand side goes to zero, so the left-hand side goes to zero exactly when $f'(x)$ is well defined; this is the usual limit definition of the derivative in disguise.
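
As a quick numerical sanity check (a minimal sketch of my own, assuming $f(x) = \sin x$ so that $f'(x) = \cos x$; neither the function nor the code comes from the discussion above), the quotient really does shrink with $h$:

```python
import math

# Check that |f(x+h) - f(x) - f'(x) h| / |h| -> 0 as h -> 0,
# using f(x) = sin(x), f'(x) = cos(x), at the point x = 1.0.
x = 1.0
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    quotient = abs(math.sin(x + h) - math.sin(x) - math.cos(x) * h) / abs(h)
    print(f"h = {h:.0e}  quotient = {quotient:.2e}")
# Each printed quotient is roughly 10x smaller than the previous one,
# i.e. the quotient is O(h), consistent with the rearranged expansion.
```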
Similarly, in the multivariable case we want to know how the function changes when we add a small displacement $h$ to the input. We can again use the same idea:
$$ f(x+h) = f(x) + Ah + o(\|h\|)$$
where $A$ is now a bounded linear map. Rearranging and dividing by $\|h\|$ gives exactly the Fréchet condition stated at the top.
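
Here is a minimal numerical sketch of that condition (my own illustration; it relies on the finite-dimensional fact that for a smooth map $\mathbb{R}^n \to \mathbb{R}^m$ the Fréchet derivative $A$ is the Jacobian matrix, and the specific map `f` below is just an arbitrary example):

```python
import numpy as np

def f(x):
    # A smooth map from R^2 to R^2.
    return np.array([x[0] ** 2 + x[1], np.sin(x[0]) * x[1]])

def jacobian(x):
    # Hand-computed Jacobian of f at x; this plays the role of the
    # bounded linear operator A in the Fréchet definition.
    return np.array([
        [2 * x[0], 1.0],
        [np.cos(x[0]) * x[1], np.sin(x[0])],
    ])

x = np.array([0.7, -1.3])
A = jacobian(x)

# Pick a fixed direction and shrink the displacement along it.
rng = np.random.default_rng(0)
direction = rng.normal(size=2)
direction /= np.linalg.norm(direction)

# ||f(x+h) - f(x) - A h|| / ||h|| should go to 0 as ||h|| -> 0.
for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    h = eps * direction
    quotient = np.linalg.norm(f(x + h) - f(x) - A @ h) / np.linalg.norm(h)
    print(f"||h|| = {eps:.0e}  quotient = {quotient:.2e}")
```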