From the syntactic point of view, the difference between the $G\times X\to X$ formalism and the $G\to\text{Bij}(X)$ formalism is that the former is often more convenient for describing analytical properties (e.g., assuming further that $G$ and $X$ are topological objects, that the action is continuous, or differentiable if a smooth structure is available), whereas the latter is more convenient for describing algebraic properties (e.g., compare writing out the axioms for a group action using $G\times X\to X$ with simply saying that one has a group homomorphism $G\to\text{Bij}(X)$). As an example where $X$ has further algebraic structure and the latter formalism is more useful, take $X$ to be a vector space with $G$ acting by linear maps. Simply saying that one has a group homomorphism $G\to \text{GL}(X)$ is sufficient to describe this situation. Of course, in spelling out what $\text{GL}(X)$ is and what it means to have a group homomorphism into $\text{GL}(X)$, one ultimately has to unpack the same lengthy descriptions, so in the end the difference is a matter of packaging.
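To make the comparison concrete, here is a sketch of the two descriptions side by side. The names $\alpha$ and $\rho$ for the two maps and $e$ for the identity of $G$ are notation introduced here for illustration only. In the first formalism one has a map $\alpha$ satisfying two axioms:

$$\alpha: G\times X\to X,\qquad \alpha(e,x)=x,\qquad \alpha\big(g,\alpha(h,x)\big)=\alpha(gh,x)\quad\text{for all } g,h\in G,\ x\in X.$$

In the second formalism one has a single requirement:

$$\rho: G\to\text{Bij}(X),\qquad \rho(gh)=\rho(g)\circ\rho(h)\quad\text{for all } g,h\in G.$$

The two are related by $\rho(g)(x)=\alpha(g,x)$. Going from $\alpha$ to $\rho$, each $\alpha(g,\cdot)$ is a bijection because $\alpha(g^{-1},\cdot)$ is its inverse. Going from $\rho$ to $\alpha$, the identity axiom comes for free, since a group homomorphism sends $e$ to $\text{id}_X$. For the linear case, replacing $\text{Bij}(X)$ by $\text{GL}(X)$ in the second line corresponds to additionally requiring each $\alpha(g,\cdot)$ to be linear in the first.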
See also the discussions at "How a group represents the passage of time?" and "What does a local flow tell us that an integral curve does not?" for further details along these lines and more specific examples.