One such example is $\ce{[Mn(CN)6]^3-}$, in which the metal has a $\mathrm{3d^4}$ configuration. The lone pairs of the six $\ce{CN-}$ ligands populate the $\mathrm{d^2sp^3}$ hybrid orbitals, giving an octahedral geometry. That is how valence bond theory (VBT) describes it. To answer your question, you can switch to a different theory. According to crystal field theory (CFT), the five d orbitals split into two groups, $t_{2g}$ and $e_g$. The four $\mathrm{3d}$ electrons then all go into the $t_{2g}$ orbitals. Usually the pair (one spin-up, one spin-down) is drawn in the leftmost box rather than the rightmost one, with one unpaired electron in each of the other two $t_{2g}$ boxes; these two unpaired electrons have parallel spins and can be drawn either both up or both down. You can simply treat this as a result of the aufbau principle (together with Hund's rule for the two unpaired electrons). I know this does not directly answer your question about the spin flip. However, it offers an alternative picture in which that question does not arise.
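As a rough sketch of the CFT bookkeeping for a $\mathrm{d^4}$ octahedral complex: the symbols $\Delta_o$ (octahedral splitting) and $P$ (pairing energy) are my own notation here, and I am assuming the usual barycenter convention in which $t_{2g}$ lies at $-0.4\,\Delta_o$ and $e_g$ at $+0.6\,\Delta_o$.

$$
\begin{aligned}
\text{low spin } (t_{2g}^4\,e_g^0,\ \uparrow\downarrow\ \uparrow\ \uparrow):\quad & E_\mathrm{LS} = 4(-0.4\,\Delta_o) + P = -1.6\,\Delta_o + P \\
\text{high spin } (t_{2g}^3\,e_g^1,\ \uparrow\ \uparrow\ \uparrow\ |\ \uparrow):\quad & E_\mathrm{HS} = 3(-0.4\,\Delta_o) + 1(+0.6\,\Delta_o) = -0.6\,\Delta_o \\
E_\mathrm{LS} < E_\mathrm{HS} \quad & \Longleftrightarrow \quad \Delta_o > P
\end{aligned}
$$

Cyanide is a strong-field ligand, so the low-spin case is the relevant one here, which leaves two unpaired electrons in the $t_{2g}$ set.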
Please note that several theories are used to understand the bonding in coordination compounds, including VBT, CFT, and molecular orbital (MO) theory (https://chem.libretexts.org/Bookshelves/Inorganic_Chemistry/Inorganic_Coordination_Chemistry_(Landskron)/07%3A_Coordination_Chemistry_II_-_Bonding/7.01%3A_Theories_of_Electronic_Structure). Each theory looks at the problem at hand from a different perspective. In a way, it is like the well-known Indian parable of several blind men trying to tell what an elephant looks like by touching different parts of its body. I should point out that every theory has its own limitations. As scientists trained after modern MO theory was established, we tend to rely on the more advanced MO theory, but VBT and CFT are still simple, elegant, and useful. Note that even MO theory cannot explain everything and sometimes runs into problems, in which case we have to resort to higher-level theories that include electron correlation and/or relativistic effects. For now, different theories can be used together to build a unified understanding, as long as their interpretations do not contradict each other.