Grover's algorithm works by iteration of the operator $$\tag{1}G:=U_S^\perp U_f=H^{\otimes n}U_0^\perp H^{\otimes n}.$$ I know how to write down the matrix for $U_S^\perp$ (the inversion about the average value of the amplitudes), and also how it is implemented in a circuit (essentially, a CNOT gate between $X$ and $H$ gates).
But what about $U_f$? In the theory, the oracle is defined by its action of the tensor product of the input and output registers as $$\tag{2}U_f|x\rangle |y\rangle=|x\rangle |y\oplus f(x)\rangle,$$ where $f$ returns $0$ for all input states but for a specified $x_0$, in which case it returns $1$. But more concretely, I have the following questions:
- how is the matrix form of $U_f$ for a generic $n$ derived (specifically for the first few qubits $n=1,2,3$), and
- what does its implementation in a circuit actually look like in terms of fundamental gates?