When it comes to the specific values of the components of these arrays, that will take some care and time. But your question about their shape is not so hard to answer.
We want these operators to preserve the shape of the original function. So, as you said, in one variable, if our function $f$ is sampled at points $x_1,\dots,x_n$, it can be represented as a vector,
$$\begin{bmatrix}f(x_1) \\ \vdots \\ f(x_n)\end{bmatrix}\equiv\begin{bmatrix}f^1 \\ \vdots \\ f^n\end{bmatrix}\equiv \boldsymbol f$$
The derivative of a function $\Bbb R\to\Bbb R$ is still a function from $\Bbb R\to \Bbb R$, so whatever array operation we use to represent the derivative, it must be of the right shape so that it maps column vectors to column vectors. Clearly, a matrix is the right choice. In particular, if $\mathbf D$ is the symbol for the matrix representation of the derivative, then
$$(Df)^i=D^i{}_jf^j$$
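To make the shapes concrete, here is a minimal sketch in Python/NumPy (the uniform grid, the central-difference stencil, and the test function $\sin x$ are my own illustrative assumptions, not something forced by the argument above): the derivative operator really is just an $n\times n$ matrix acting on the sampled column vector.

```python
import numpy as np

# Sample f(x) = sin(x) at n uniformly spaced points (an assumed test function).
n = 100
x = np.linspace(0.0, 2 * np.pi, n)
h = x[1] - x[0]
f = np.sin(x)                      # the (1,0) tensor f^i

# One possible derivative matrix D^i_j: central differences in the interior,
# one-sided differences at the two endpoints.  Any other finite-difference
# or spectral scheme would have the same (n, n) shape.
D = np.zeros((n, n))
for i in range(1, n - 1):
    D[i, i - 1], D[i, i + 1] = -1.0 / (2 * h), 1.0 / (2 * h)
D[0, 0], D[0, 1] = -1.0 / h, 1.0 / h
D[-1, -2], D[-1, -1] = -1.0 / h, 1.0 / h

df = D @ f                         # (Df)^i = D^i_j f^j, still a column vector
print(df.shape)                    # (100,)
print(np.max(np.abs(df - np.cos(x))))  # error dominated by the first-order endpoint stencils
```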
You can see that by contracting a matrix ($(1,1)$ tensor) with a vector ($(1,0)$ tensor), we get a $(1,0)$ tensor. In particular,
$$\text{Contracting an}~(a,b)~\text{tensor with a}~(c,d)~\text{tensor}~n~\text{times gives an}~(a+c-n,\,b+d-n)~\text{tensor}.$$
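As a quick sanity check of that counting rule, here is a sketch using NumPy's `tensordot` (which does not distinguish upper from lower indices, but does show the total number of indices dropping by two for each contraction):

```python
import numpy as np

A = np.random.rand(4, 4)        # rank-2 array, standing in for a (1,1) tensor
v = np.random.rand(4)           # rank-1 array, standing in for a (1,0) tensor

# Contracting once: 2 + 1 - 2*1 = 1 index remains, i.e. a vector.
w = np.tensordot(A, v, axes=([1], [0]))
print(w.shape)                  # (4,)

T = np.random.rand(4, 4, 4, 4)  # rank-4 array, standing in for a (2,2) tensor
M = np.random.rand(4, 4)        # rank-2 array, standing in for a (1,1) tensor

# Contracting twice: 4 + 2 - 2*2 = 2 indices remain, i.e. a matrix again.
N = np.tensordot(T, M, axes=([2, 3], [0, 1]))
print(N.shape)                  # (4, 4)
```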
But what if we have a function $F:\Bbb R^2\to \Bbb R$? Now of course we need a matrix representation:
$$\begin{bmatrix}F(\boldsymbol x_{1,1})&\cdots&F(\boldsymbol x_{1,n}) \\ \vdots & \ddots & \vdots \\ F(\boldsymbol x_{n,1}) & \cdots & F(\boldsymbol x_{n,n})\end{bmatrix}\equiv\begin{bmatrix}F^1{}_1 & \cdots & F^1{}_n \\ \vdots & \ddots & \vdots \\ F^n{}_1 & \cdots & F^n{}_n\end{bmatrix}\equiv\mathbf F$$
Now, again, partial derivatives should preserve the shape of the function. Our function here is represented as a matrix, i.e. a $(1,1)$ tensor, so we need an array operation that takes a $(1,1)$ tensor to a $(1,1)$ tensor. That means that instead of two indices, the operator needs four; we need something that looks like
$$(\partial F)^i{}_j=\partial^{il}{}_{jk}F^k{}_l$$
So for functions $\Bbb R^2\to \Bbb R$, the partial derivative operator is a $(2,2)$ tensor, or, in rather crude terms, a "matrix of matrices".
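Here is the same kind of sketch for the two-variable case (again Python/NumPy; the grid, the test function $\sin x\cos y$, and the choice of building the $x$-derivative as $\partial^{il}{}_{jk}=D^i{}_k\,\delta^l{}_j$ are illustrative assumptions). The point is only the shapes: a four-index array contracted twice with the two-index $\mathbf F$ returns a two-index array of the same shape.

```python
import numpy as np

n = 50
x = np.linspace(0.0, 2 * np.pi, n)
h = x[1] - x[0]

# Sample F(x, y) = sin(x) * cos(y) on an n x n grid (an assumed test function).
X, Y = np.meshgrid(x, x, indexing="ij")
F = np.sin(X) * np.cos(Y)          # the (1,1) tensor F^k_l

# A 1D derivative matrix D, as in the one-variable sketch above.
D = np.zeros((n, n))
for i in range(1, n - 1):
    D[i, i - 1], D[i, i + 1] = -1.0 / (2 * h), 1.0 / (2 * h)
D[0, 0], D[0, 1] = -1.0 / h, 1.0 / h
D[-1, -2], D[-1, -1] = -1.0 / h, 1.0 / h

# The (2,2) operator for d/dx, built from D and the identity:
#   partial^{il}_{jk} = D^i_k * delta^l_j,
# so contracting it with F^k_l gives D^i_k F^k_j, a (1,1) tensor again.
partial_x = np.einsum("ik,lj->iljk", D, np.eye(n))

dFdx = np.einsum("iljk,kl->ij", partial_x, F)   # (dF)^i_j = partial^{il}_{jk} F^k_l
print(dFdx.shape)                               # (50, 50) -- same shape as F
print(np.max(np.abs(dFdx - np.cos(X) * np.cos(Y))))  # small finite-difference error
```

In practice one would not materialize the full four-index array (it has $n^4$ entries) and would instead apply $\mathbf D$ along one axis of $\mathbf F$ directly; the explicit `partial_x` is only there to mirror the index picture above.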
You can generalize this approach easily to any function from $\Bbb R^M$ to $\Bbb R^N$.