I'll take a crack at this.
Any time you make a logical argument, you maybe presume that some things are true (your axioms) and always presume some rules about how you can work with true premises to reach true conclusions (your rules). Together, they form a theory.
Maybe I'm doing some set theory, and I'm working in the theory $\mathsf{ZFC}$. Or maybe I'm doing some arithmetic and I'm working in Peano Arithmetic, $\mathsf{PA}$. But then I get curious, or maybe a little manic, or maybe my name is Hilbert, and I start thinking things like "I wonder if I can be sure the assumptions of my theory don't contradict each other?" or "I wonder if I can prove everything that is true with this theory?" or "My name is Hilbert and I can prove everything with this theory." Then a smart kid by the name of Gödel comes along and shows that actually, you can't. You can encode the syntax of $\mathsf{PA}$ (its formulas and proofs) inside $\mathsf{PA}$ and then show that, provided $\mathsf{PA}$ is consistent, $\mathsf{PA}\not\vdash\ulcorner\mathsf{PA\ is\ consistent}\urcorner$.
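To pin that last claim down a little, here is one standard way to write it out. The notation is an assumption of mine, not something fixed above: $\mathrm{Prov}_{\mathsf{PA}}(x)$ is an arithmetic formula saying "$x$ is the code of a sentence provable in $\mathsf{PA}$", and $\mathrm{Con}(\mathsf{PA})$ is what I wrote as $\ulcorner\mathsf{PA\ is\ consistent}\urcorner$.
$$\mathrm{Con}(\mathsf{PA}) := \neg\mathrm{Prov}_{\mathsf{PA}}(\ulcorner 0=1\urcorner)$$
$$\mathsf{PA}\not\vdash 0=1\ \Longrightarrow\ \mathsf{PA}\not\vdash\mathrm{Con}(\mathsf{PA})$$
Note that $\mathrm{Con}(\mathsf{PA})$ is itself just a sentence about numbers, while the $\Longrightarrow$ on the second line is an implication in whatever theory we are using to talk about $\mathsf{PA}$.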
It sounds like this all probably isn't news to you. But here's the key point:
$$\mathsf{ZFC}\vdash\ulcorner\mathsf{PA\ is\ consistent}\urcorner$$
and trivially so! $\mathsf{PA}$ is intended to describe the natural numbers and very basic properties of them, so the natural numbers within $\mathsf{ZFC}$ serve as a model quite readily.
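In slightly more detail, the argument usually runs in two steps. This is only a sketch, in notation I'm assuming for illustration: $\omega$ is the set of finite ordinals, $S$ is the successor operation on it, and $\models$ is satisfaction as defined inside $\mathsf{ZFC}$.
$$\mathsf{ZFC}\vdash(\omega,0,S,+,\cdot)\models\mathsf{PA}$$
$$\mathsf{ZFC}\vdash\text{"every theory that has a model is consistent"}$$
Put the two together and you get the displayed claim. The first step is where the set-theoretic strength is used: $\mathsf{ZFC}$ can build $\omega$ as a completed object and talk about truth in it, which $\mathsf{PA}$ cannot do for itself.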
The takeaway here is that whether or not something is provable is not just a function of its truth, but also of the theory the proof is to be conducted in. Always. Even (especially) if this theory is informal.
When we're proving theorems about a theory, whether explicitly, by making statements like the ones above, or implicitly, by, say, doing axiomatic set theory and working within a particular object theory like $\mathsf{ZFC}$, the theory the theorems are argued in is called the metatheory. In fields like model theory, semantics, and constructive mathematics, we're often explicit about what our metatheory is. In logic and other axiomatized fields, we're usually at least careful to be precise about what properties it has. In other fields, we tend to be less careful about these things.
We are always working in a metatheory.
What metatheory? Well, it depends, and also often doesn't matter; much of mathematics is not all that precise about it. Here's a selection of properties that are important:
Is your metatheory classical?
Informal status: if unstated, almost certainly yes.
That is, is $P\vee\neg P$ true for all statements $P$? It's very intuitive that this should be true, and used frequently. "Either P or not P. If P, Q. Also if not P, Q. Thus, Q." But it has some nasty properties that lead some fields of math to avoid it. Due to how intuitive it is, it's really easy to footgun yourself by accidentally using it in informal reasoning when you mean to avoid it. It also likes to hide in other seemingly innocuous statements. Informal reasoning tends to just take it as fact.
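Written out as an inference rule, the quoted argument is just case analysis on an instance of excluded middle. The layout below (premises on top, conclusion underneath) is a notational choice of mine:
$$\frac{P\vee\neg P\qquad P\rightarrow Q\qquad\neg P\rightarrow Q}{Q}$$
As an example of it hiding: double-negation elimination, $\neg\neg P\rightarrow P$, looks harmless, but it follows from excluded middle (if $P$, done; if $\neg P$, that contradicts $\neg\neg P$), and taken as a general principle it gives excluded middle right back. Using either one commits you to a classical metatheory.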
To what extent does your metatheory admit infinities?
Informal status: Usually. Sometimes rejected so that the metatheory has nice properties w.r.t. computation or as a philosophical exercise. Countable infinities crop up here and there informally without concern. Uncountable infinities are very suspicious (but also very rare).
Is your metatheory okay with making arguments about an infinite collection? Generally the bar is "effectively enumerable," meaning that in principle you could go through all the elements in a particular order. Then the infinity is tamer, and this becomes the same as asking...
Does your metatheory admit induction?
Informal status: Usually. May be avoided as a philosophical exercise.
Not much more to say here. Talking about unending generalities is useful and done frequently, with only the strictest of finitists rejecting it as a philosophical exercise.
I will mention two theorems that are commonly taken for granted but do come from metatheoretic induction, because knowing that they are arguments in the metatheory and not provable in their respective theories is an important point of order: the deduction theorem, that if you can deduce $Q$ from assuming $P$ then you can conclude $P\rightarrow Q$, and predicate extensionality, $\forall x\forall y[x = y\rightarrow(\phi(x)\leftrightarrow\phi(y))]$.
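For concreteness, here is how the deduction theorem is usually stated, with $\Gamma$ standing for whatever other assumptions are in play (my notation, assumed for illustration):
$$\Gamma\cup\{P\}\vdash Q\ \Longrightarrow\ \Gamma\vdash P\rightarrow Q$$
The $\Longrightarrow$ is an implication in the metatheory, not a connective of the object theory. In a Hilbert-style proof system the usual argument is by induction on the length of the derivation of $Q$ (with the usual side conditions on free variables in the first-order case): you show that each line $R$ of that derivation can be turned into a derivation of $P\rightarrow R$. That induction over derivations is exactly the metatheoretic induction in question.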
Does your metatheory admit choice?
Informal status: Depends. If you're doing very object level math where the metatheory is implicitly some sort of set theory, like analysis or topology, then usually yes, although the use of choice is generally noteworthy. Otherwise usually not, although category theorists are notoriously bad about this.
This one is mind-bending, and a whole thing. Basically, argument forms like "There is an X, consider one such X" or "Consider each X in turn", which are innocuous if you make finitely many such choices, can become a whole other thing if you have to make infinitely many such choices or range over an infinite set. Much has been written about choice and why one may consider leaving it behind; I will not pretend to be able to summarize it succinctly here.
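For reference, one common way to state choice in the language of set theory is the following. The particular formulation is my pick; there are many equivalent ones over $\mathsf{ZF}$.
$$\forall X\left[\emptyset\notin X\rightarrow\exists f\colon X\to\textstyle\bigcup X\ \ \forall A\in X\,[f(A)\in A]\right]$$
When $X$ is finite, the existence of such an $f$ is already provable in $\mathsf{ZF}$ by induction on the size of $X$, which is why finitely many choices never raise an eyebrow. The axiom only does real work when $X$ is infinite and there is no definable rule for picking from each $A$.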
Those are some of the properties you might consider. Technically there are other non-trivial properties, like for example whether talking about "functions from A to B" even makes sense, but virtually every metatheory takes them for granted so you will rarely worry about them. It otherwise usually doesn't matter exactly what the metatheory is, because the point of much of mathematics is that the metatheory is abstracted away, so you can focus on the object theory. In most cases you will be working in a classical metatheory with induction and infinity, with or without choice. It might as well be $\mathsf{ZFC}$ or $\mathsf{ZF}$, depending. If you're more into modern foundations you might consider the metatheories to be $\mathsf{MLTTW + EM}$ with or without non-constructive choice. If you're really out there you might consider yourself working in a topos with or without enough projectives. It really... just... doesn't matter. Pick your poison. There isn't a best theory, and none of these theories, even unspecified informal ones, are immune to formalization and analysis by a yet more meta theory, if you were so inclined. As you might be catching on to...
There isn't a "most" meta theory. No one theory is "best" or "canonical."
So how do we know that it's fair to consider the informal theories we adopt casually to coincide with our formal ones?
Well, at one point, we didn't! For a time we had $\mathsf{ZF}$, and then choice was discovered when close inspection of informal theories of analysis showed that presuming the reals admitted a well-order was a non-trivial assumption. In the very beginning we had unrestricted comprehension, until we found out that it's inconsistent. But at this point, we've learned a lot about theories of logic and sets, and we know all of known informal mathematics (modulo some sticky points at the limits of category and type theory) can be modelled by $\mathsf{ZFC}$, and furthermore that $\mathsf{ZFC}$ is a comically powerful theory, so identifying informal reasoning and $\mathsf{ZFC}$ is a conservative assumption.
We identify formal and informal proofs because the formal notion of proof in modern mathematics is very strong, and it's a conservative assumption to identify them.
But just as we've been wrong about comprehension and choice in the past, we could be wrong about this too. Indeed, informal reasoning outside of the known reaches of $\mathsf{ZFC}$ is common, and is important to steering how research is conducted. For example, if you ask working computer scientists whether $\mathsf{P}=\mathsf{NP}$, they likely have an informal argument that the two are (probably) not equal. Most research then takes the flavor of trying to find descriptions of these classes that are precise and tractable enough to distinguish them conclusively.
Once an argument coalesces into something formalizable, it becomes natural to then ask how this relates to other theories with known properties. Can the premises be proven from another theory? If not, is it philosophically sensible to include these premises in our metatheory from here on out? Without answering these questions, an informal argument really fails to be very meaningful. After all, as far as logic is concerned "suppose $\neg\mathsf{CH}$, then $\neg\mathsf{CH}$" is as good an informal proof as any that the continuum hypothesis is false, but it doesn't elucidate anything. That is why we asked these questions, and discovered it was independent of $\mathsf{ZFC}$, and that both supposing it and supposing its negation have interesting consequences.
We don't know that an informal "proof" of the continuum hypothesis doesn't exist (and I'd argue there are actually some fairly convincing arguments for taking it as true), but we do know it is independent of $\mathsf{ZFC}$, so whatever metatheory that proof may live in, it has to assume something $\mathsf{ZFC}$ cannot prove.
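Spelled out, "independent of $\mathsf{ZFC}$" is the following pair of statements, both of which carry the standing assumption that $\mathsf{ZFC}$ is consistent (an inconsistent theory proves everything):
$$\mathsf{ZFC}\not\vdash\mathsf{CH}\qquad\text{and}\qquad\mathsf{ZFC}\not\vdash\neg\mathsf{CH}$$
The second is due to Gödel, via the constructible universe, and the first to Cohen, via forcing. Note that both are themselves theorems of a metatheory reasoning about $\mathsf{ZFC}$, which is the whole point.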
It's tempting to cling to the informal notion of proof as some truth, but in the echo of every proof is always a metatheory. If that metatheory is philosophically compelling, then you may consider the proof philosophically compelling as well. If it isn't, then the proof is perhaps a mere mathematical curiosity (or triviality, as the case may be).
It's metatheories all the way down. Where you decide to stop is a matter between you and the fractal.
And, a humble word of advice from a constructivist: don't try to get smart with $\mathsf{ZFC}$. It's got hands. It's probably already whispering more lies than truths. It'll treat you well.
Classes: An Appendix
Aah yes. Classes. The budding set theorist's first point of metatheoretic subtlety. Here's the deal with classes. If we're playing strictly by the rules, there are no classes. They are merely syntactic abbreviations, as follows:
$$\{x\ |\ \phi(x)\}\in\{y\ |\ \psi(y)\} := \exists x[\forall y[y\in x\leftrightarrow\phi(y)]\wedge x\in\{y\ |\ \psi(y)\}]$$
$$x\in\{y\ |\ \psi(y)\} := \psi(x)$$
This is confusing, both because you very quickly build up layers upon layers of nested abbreviations, and because informal arguments start getting a little fast and loose about whether classes are really objects of the theory. But they are just abbreviations.
Yes, classes are just abbreviating statements about the properties of sets. Working with classes is in a very precise sense just working with properties of their elements.
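As a small worked example of unrolling (the class names are ones I'm introducing purely for illustration), write $V$ for $\{y\ |\ y=y\}$ and $R$ for $\{x\ |\ x\notin x\}$. Membership of a set $z$ in $R$ unrolls by the second abbreviation alone, and membership of $R$ in $V$ unrolls by the first and then the second:
$$z\in R := z\notin z$$
$$R\in V := \exists x[\forall y[y\in x\leftrightarrow y\notin y]\wedge x=x]$$
The second statement is refutable: take $y$ to be $x$ and you get $x\in x\leftrightarrow x\notin x$. That is the precise sense in which the Russell class "is not a set". It is not a claim about some mysterious object $R$, just a theorem about sets with the abbreviation stripped off.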
The idea is, in principle, you could unroll those abbreviations and pin down the formalities, and you'd just get theorems about intensely complicated statements not involving classes. You just really don't want to do that. For example, here's a humble extension axiom for $\mathsf{ZFC}$ concerning large sets. With 163 symbols (not including the expansion of equality which is taken as primitive in this metatheory), it's basically illegible without abbreviations. But it can be done.
Now $\mathsf{NBG}$ is just a theory that says, look, if we're serious about classes being "just an abbreviation" and our informal treatment of classes kinda sorta seems like they're sets, surely this should be formalizable. And it turns out it is, and furthermore, it has a nice, expected property: we say $\mathsf{NBG}$ is a conservative extension of $\mathsf{ZFC}$, meaning that if a provable statement of $\mathsf{NBG}$ looks like a statement of $\mathsf{ZFC}$ (or in the case of $\mathsf{NBG}$ even stronger: if it can be interpreted as a statement of $\mathsf{ZFC}$ by the natural interpretation of classes in $\mathsf{ZFC}$), then that statement is also provable in $\mathsf{ZFC}$. This is why informally we consider the theories interchangeable, because there's a very narrow formal sense in which they prove the same things.
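In symbols, and glossing over the bookkeeping of reading a $\mathsf{ZFC}$ sentence inside $\mathsf{NBG}$ (its quantifiers get restricted to sets), conservativity says:
$$\mathsf{NBG}\vdash\sigma\ \Longrightarrow\ \mathsf{ZFC}\vdash\sigma\qquad\text{for every sentence }\sigma\text{ in the language of }\mathsf{ZFC}$$
The converse direction holds too, since $\mathsf{NBG}$ proves the axioms of $\mathsf{ZFC}$ about its sets. So the two theories agree on every statement that only talks about sets, and can only differ on statements that genuinely quantify over classes.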