Here are two cases of interest; both will be families of diffeomorphisms of $I=[0,1]$ parameterized by $T=[0,1]$. For $\mu$ a Borel probability measure on $T$, denote by $A_\mu$ the averaging operator:
$$A_\mu:[T\to[I\to I]] \to [I\to I],\,\,\phi_\bullet\mapsto \left[x\mapsto \int_T \phi_t(x)\, d\mu(t)\right]$$
Ex.1 (Negative): Put $\mu=\dfrac{1}{2}\delta_0+\dfrac{1}{2}\delta_1$ so that $A_\mu(\phi_\bullet)(x)=\dfrac{1}{2}\phi_0(x)+\dfrac{1}{2}\phi_1(x)$ and consider
$$\phi_t(x)=\begin{cases}x&\text{, if } t\geq\dfrac{1}{2}\\1-x&\text{, if } t<\dfrac{1}{2}\end{cases}.$$
Then the averaged map is constantly $\dfrac{1}{2}$, hence not invertible.
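As a quick sanity check of Ex.1, here is a minimal numerical sketch. The function names `phi` and `average_two_atoms` are just labels I chose for illustration; the family and the measure are the ones defined above.

```python
import numpy as np

def phi(t, x):
    """Family from Ex.1: the identity for t >= 1/2, the flip x -> 1 - x for t < 1/2."""
    return x if t >= 0.5 else 1.0 - x

def average_two_atoms(x):
    """A_mu for mu = (delta_0 + delta_1) / 2: only t = 0 and t = 1 contribute."""
    return 0.5 * phi(0.0, x) + 0.5 * phi(1.0, x)

xs = np.linspace(0.0, 1.0, 11)
avg = average_two_atoms(xs)
print(avg)  # all entries agree with 0.5 up to rounding
```

Each $\phi_t$ is a diffeomorphism of $I$, yet the average collapses $I$ to a point.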
(It might seem unnatural to consider a family that is merely integrable (rather than continuous) in $t$; however, for differentiability that is all that is needed, together with $L^1$ domination of the derivatives, essentially by way of the Dominated Convergence Theorem; see Thm. 2.27 on p. 56 of Folland's Real Analysis (2e). See also the discussions e.g. at When to differentiate under the integral sign?, Differentiating under the integral sign: requirements, and Leibniz's Rule for differentiation under the integral.)
Ex.2 (Positive): Now consider an arbitrary $\mu$ and a family $\phi_\bullet: T\to \operatorname{Diff}^r_+(I)$ of orientation preserving $C^r$ diffeomorphisms of $I$ such that the uncurried map $\phi:T\times I\to I$ is continuous.
As each $\phi_t$ is orientation preserving, we have that $\phi_t(0)=0$, $\phi_t(1)=1$ and $\dfrac{\partial \phi}{\partial x}(t,x)>0$. Thus, since $\mu$ is a probability measure, we have
$$A_\mu(\phi_\bullet)(0)=0, A_\mu(\phi_\bullet)(1)=1.$$
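Further, again by "differentiation under the integral sign" (assuming the derivatives $\dfrac{\partial\phi}{\partial x}$ are dominated by a $\mu$-integrable function, e.g. bounded on $T\times I$), we have that $A_\mu$ commutes with derivatives: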
$$\dfrac{d(A_\mu(\phi_\bullet))}{dx} = A_\mu\left(\dfrac{\partial\phi_\bullet}{\partial x}\right).$$
(In particular, the average of a family of $C^r$ maps will be $C^r$.)
Written out explicitly we have
$$\dfrac{d}{dx} \int_T \phi_t(x)\, d\mu(t)= \int_T \dfrac{\partial\phi}{\partial x} (t,x) \, d\mu(t)>0,$$
so that $A_\mu(\phi_\bullet)$ is strictly increasing with nowhere vanishing derivative and fixes the endpoints; thus it is a $C^r$ diffeomorphism of $I$ (see e.g. Global inverse using Inverse Function Theorem or extension/"globalization" of inverse function theorem).
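Here is a small numerical sketch of Ex.2. Everything concrete in it is an assumption made for illustration: the family $\phi_t(x)=x+c(t)\,x(1-x)$ with $c(t)=0.9(2t-1)$ (so that $\partial\phi/\partial x\geq 0.1$), and the purely atomic probability measure with 201 atoms and weights proportional to $t^2+0.1$. It checks that the average fixes the endpoints, that averaging commutes with $d/dx$, and that the average is strictly increasing.

```python
import numpy as np

def phi(t, x):
    """Hypothetical family: phi_t(x) = x + c(t) x (1 - x) with |c(t)| <= 0.9."""
    c = 0.9 * (2.0 * t - 1.0)
    return x + c * x * (1.0 - x)

def dphi_dx(t, x):
    """x-derivative of phi; bounded below by 1 - 0.9 = 0.1 on T x I."""
    c = 0.9 * (2.0 * t - 1.0)
    return 1.0 + c * (1.0 - 2.0 * x)

# Hypothetical atomic probability measure on T: atoms ts with normalized weights.
ts = np.linspace(0.0, 1.0, 201)
weights = ts**2 + 0.1
weights /= weights.sum()

def average(f, x):
    """A_mu(f_bullet)(x) = sum_k weights[k] * f(ts[k], x)."""
    x = np.asarray(x, dtype=float)
    return np.tensordot(weights, f(ts[:, None], x[None, :]), axes=1)

xs = np.linspace(0.0, 1.0, 1001)
avg = average(phi, xs)                     # A_mu(phi_bullet)
avg_of_derivative = average(dphi_dx, xs)   # A_mu(d phi_bullet / dx)
finite_difference = np.gradient(avg, xs, edge_order=2)  # numerical d/dx of the average

print("endpoints:", avg[0], avg[-1])
print("min of averaged derivative:", avg_of_derivative.min())
print("max commutation error:", np.abs(finite_difference - avg_of_derivative).max())
print("strictly increasing:", bool(np.all(np.diff(avg) > 0)))
```

For an atomic measure the average is a finite sum, so the commutation with $d/dx$ is just linearity of the derivative; the point of the sketch is only to make the three claims of Ex.2 visible on a grid.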
Here are some comments:
When $A_\mu(\phi_\bullet)$ is invertible, it might be interesting to compare $(A_\mu(\phi_\bullet))^{-1}$ to $A_\mu(\phi_\bullet^{-1})$. Here $\phi_\bullet^{-1}: t\mapsto \phi_t^{-1}$ is the inverse family. (A fun question: Given a continuous family $\phi_\bullet$, classify the probability measures $\mu$ on $T$ such that $d_{C^1}((A_\mu(\phi_\bullet))^{-1},A_\mu(\phi_\bullet^{-1}))=0$ (or is sufficiently small). Recall that Dirac measures are such measures.)
In Ex.2, the orientation preserving assumption is without loss of generality: since $T$ is connected and $\phi$ is continuous, the members of the family either all preserve or all reverse orientation, and the orientation reversing case is handled by the same argument with $\dfrac{\partial\phi}{\partial x}<0$.
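To see that the inverse of the average and the average of the inverses genuinely differ in general, here is a minimal sketch. The two-member family $x\mapsto x+c\,x(1-x)$ with $c=\pm 0.5$, the measure $\frac12\delta_0+\frac12\delta_1$, and the helper names are all illustrative assumptions; the sketch measures only a pointwise gap at one point, not the $C^1$ distance.

```python
def phi(c, x):
    """Hypothetical diffeomorphism of [0, 1]: x + c x (1 - x), valid for |c| < 1."""
    return x + c * x * (1.0 - x)

def invert(f, y, tol=1e-12):
    """Invert an increasing bijection f of [0, 1] at y by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gap(c0, c1, y):
    """|(A_mu phi)^{-1}(y) - A_mu(phi^{-1})(y)| for mu = (delta_0 + delta_1) / 2."""
    averaged = lambda x: 0.5 * phi(c0, x) + 0.5 * phi(c1, x)
    inverse_of_average = invert(averaged, y)
    average_of_inverses = 0.5 * invert(lambda x: phi(c0, x), y) \
                        + 0.5 * invert(lambda x: phi(c1, x), y)
    return abs(inverse_of_average - average_of_inverses)

two_atom_gap = gap(0.5, -0.5, 0.25)  # two different maps: the gap is visibly nonzero
dirac_gap = gap(0.5, 0.5, 0.25)      # both atoms carry the same map: Dirac-like case
print(two_atom_gap, dirac_gap)
```

In this instance the average is the identity, while the average of the inverses at $y=\frac14$ is $\frac12\left(\frac32-\sqrt{\frac74}\right)+\frac12\left(\sqrt{\frac34}-\frac12\right)\approx 0.2716$, so the two maps already disagree pointwise.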
Ex.2 seems to generalize to arbitrary continuous families of (local or global) diffeomorphisms of the real line.
If (and only if) signed measures are allowed, one can come up with negative examples of continuous (or better) families of orientation preserving diffeomorphisms of $I$ by using atomic measures by way of interpolation.
Despite the positive example, the intuition I've declared in the comments above hasn't changed: the way I see it, the average of diffeomorphisms will typically be non-invertible (in high dimensions, even when the family is continuous or better). In dimension one, orientation preservation prevents the family from "turning too much", which seems to be the most obvious obstruction for the average to be a diffeomorphism. A nice analog of the situation in higher dimensions is to consider a family of local diffeomorphisms of a Lie group. (One can also take an arbitrary manifold and embed it into some euclidean space to use addition; this seems more artificial though.)
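Here is one worked instance of such a negative example; the family and the measure are my own illustrative choices, not taken from any reference. Take the continuous family
$$\phi_t(x)=x+\dfrac{9t}{10}\,x(1-x),\qquad \dfrac{\partial\phi}{\partial x}(t,x)=1+\dfrac{9t}{10}(1-2x)\geq\dfrac{1}{10}>0,$$
so that each $\phi_t$ is an orientation preserving diffeomorphism of $I$, and the signed atomic measure $\mu=3\delta_0-2\delta_1$ of total mass $1$. Then
$$A_\mu(\phi_\bullet)(x)=3\phi_0(x)-2\phi_1(x)=x-\dfrac{9}{5}\,x(1-x),\qquad \dfrac{d(A_\mu(\phi_\bullet))}{dx}(0)=1-\dfrac{9}{5}=-\dfrac{4}{5}<0.$$
Thus the average still fixes $0$ and $1$ but is decreasing near $0$ (for instance $A_\mu(\phi_\bullet)\left(\dfrac{1}{10}\right)=-\dfrac{31}{500}$), so it is not even a self-map of $I$, let alone a diffeomorphism.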
Another interesting thing to consider would be the generalization of the setup where the probability measure on $T$ is also a family $\mu_\bullet: I\to \operatorname{Prob}(T)$ (one can also consider a family of densities to use with a predetermined measure). One again has an averaging operator $A(\mu_\bullet,\phi_\bullet): x\mapsto \int_T \phi_t(x)\, d\mu_x(t)$. Now it is significantly harder for the average to be invertible; still it is fun to think of bi-families $(\mu_\bullet,\phi_\bullet)$ whose average is a diffeomorphism.
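A minimal sketch of the bi-family setup, under assumptions that are entirely mine: only the atoms at $t=0$ and $t=1$ are used, with $\mu_x=w(x)\delta_0+(1-w(x))\delta_1$, the maps are $\phi_0(x)=x-0.9\,x(1-x)$ and $\phi_1(x)=x+0.9\,x(1-x)$, and two hypothetical weight functions `gentle` and `steep` are compared. It illustrates that the same pair of diffeomorphisms can average to a monotone map or to a non-monotone one, depending on how $\mu_x$ varies with $x$.

```python
import numpy as np

def phi0(x):
    """Hypothetical diffeomorphism of [0, 1] lying below the diagonal."""
    return x - 0.9 * x * (1.0 - x)

def phi1(x):
    """Hypothetical diffeomorphism of [0, 1] lying above the diagonal."""
    return x + 0.9 * x * (1.0 - x)

def bi_average(w, x):
    """A(mu_bullet, phi_bullet)(x) for mu_x = w(x) delta_0 + (1 - w(x)) delta_1."""
    return w(x) * phi0(x) + (1.0 - w(x)) * phi1(x)

def gentle(x):
    """Weight on delta_0 that varies slowly with x."""
    return x

def steep(x):
    """Weight on delta_0 that switches quickly from 0 to 1 near x = 1/2."""
    return 1.0 / (1.0 + np.exp(-40.0 * (x - 0.5)))

xs = np.linspace(0.0, 1.0, 2001)
gentle_is_increasing = bool(np.all(np.diff(bi_average(gentle, xs)) > 0))
steep_is_increasing = bool(np.all(np.diff(bi_average(steep, xs)) > 0))
print(gentle_is_increasing, steep_is_increasing)
```

With the steep weight the average drops from roughly $0.62$ at $x=0.45$ to roughly $0.38$ at $x=0.55$, so it cannot be injective, even though every $\mu_x$ is a probability measure and both maps preserve orientation.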
Regarding references, I am not aware of any specific references addressing the averaging problem. However, it seems to me the setup can be considered as a special case of "functional integrals" in the sense of Feynman. I'm not sure whether the question of when averages of diffeomorphisms are diffeomorphisms has been studied, though. A good start would be Simon's Functional Integration and Quantum Physics.
Another direction with a similar setup is that of random dynamics. One may pick the diffeomorphism $\phi_t$ randomly according to $\mu(t)$ (or $\mu([0,t])$); composing the successive resulting diffeomorphisms then produces the dynamics on the interval. In this area typically more information is needed (a stationary measure to begin with, for instance), and as far as I understand the main questions are about the asymptotic properties of the random dynamics, as well as the invariant structures one can associate to them. A good start would be Kifer's Ergodic Theory of Random Transformations.
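For concreteness, the random composition can be sketched as follows. The family $\phi_t(x)=x+0.9(2t-1)\,x(1-x)$, the choice of $\mu$ as the uniform measure on $T$, and the name `random_orbit` are assumptions made only for this illustration.

```python
import random

def phi(t, x):
    """Hypothetical family of orientation preserving diffeomorphisms of [0, 1]."""
    c = 0.9 * (2.0 * t - 1.0)
    return x + c * x * (1.0 - x)

def random_orbit(x0, steps, rng):
    """Return [x0, x1, ..., x_steps], where x_{k+1} = phi_{t_k}(x_k)
    and the t_k are drawn independently from the uniform measure on T."""
    orbit = [x0]
    for _ in range(steps):
        t = rng.random()                  # sample t ~ mu (assumed uniform here)
        orbit.append(phi(t, orbit[-1]))   # apply the sampled diffeomorphism
    return orbit

rng = random.Random(0)                    # fixed seed so the run is reproducible
orbit = random_orbit(0.3, 50, rng)
print(orbit[:5])
```

The averaged map $A_\mu(\phi_\bullet)$ describes the expected position after one step, whereas the questions in random dynamics concern the long-run behavior of such orbits.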