I don't know of a universal theory of all places where Bernoulli numbers arise, but Euler-Maclaurin summation explains many of their more down-to-earth occurrences.
The heuristic explanation (due to Lagrange) is as follows. The first difference operator defined by $\Delta f(n) = f(n+1)-f(n)$ and summation are inverses, in the same sense in which differentiation and integration are inverses. This just amounts to a telescoping series: $\sum_{a \le i < b} \Delta f(i) = f(b) - f(a)$.
Now by Taylor's theorem, $f(n+1) = \sum_{k \ge 0} f^{(k)}(n)/k!$ (under suitable hypotheses, of course). If we let $D$ denote the differentation operator defined by $Df = f'$, and $S$ denote the shift operator defined by $Sf(n) = f(n+1)$, then Taylor's theorem tells us that $S = e^D$. Thus, because $\Delta = S-1$, we have $\Delta = e^D - 1$.
Now summing amounts to inverting $\Delta$, or equivalently applying $(e^D-1)^{-1}$. If we expand this in terms of powers of $D$, the coefficients are Bernoulli numbers (divided by factorials). Because of the singularity at "$D=0$", the initial term involves antidifferentiation $D^{-1}$, i.e., integration. Thus, we have expanded a sum as an integral plus correction terms involving higher derivatives, with Bernoulli number coefficients.
Specifically,
$$
\sum_{a \le i < b} f(i) = \int_a^b f(x) \, dx + \sum_{k \ge 1} \frac{B_k}{k!} (f^{(k-1)}(b) - f^{(k-1)}(a)).
$$
(Subtracting the values at $b$ and $a$ just amounts to the analogue of turning an indefinite integral into a definite integral.)