Question: "So what is the definition of moduli space, in math vs in physics, after all these?"
Answer: In algebra and algebraic geometry, mathematical physics etc. you want to classify objects such as algebraic varieties/schemes, vector bundles, vector bundles with a connection, vector bundles with a flat connection etc.
Example: Let $C$ be a smooth projective curve over a field $k$ and let $E$ be a finite rank vector bundle on $C$. How large is the "parameter space" of connections on $E$? What about the "parameter space of flat connections on $E$"? In this case there is an obstruction class $a(E)$ which measures when $E$ has a connection: $a(E)=0$ iff $E$ has a connection. There is also a "parameter space"
$$ Conn(C,E):=H^0(C, End(E)\otimes \Omega^1_{C/k})$$
parametrizing all connections on $E$: if $E$ has one connection $\nabla_0$, every other connection is of the form $\nabla_0+\phi$ for a unique $\phi \in Conn(C,E)$. The space $Conn(C,E)$ is a finite dimensional vector space over $k$. There is in many cases an algebraic group $G$ acting on $Conn(C,E)$, and the "orbits" of this action correspond to isomorphism classes of connections $(E,\nabla)$. This is where "moduli spaces" appear: You want to take the "quotient" $Mod(C,E):=Conn(C,E)/G$ of the action of $G$ and you want the quotient $Mod(C,E)$ to "behave" like a scheme. You want to have access to notions such as dimension, non-singularity, irreducibility, vector bundles, algebraic cycles, cohomology/homology etc. This is what "moduli spaces" do: they are objects with properties similar to schemes.
Question: "So what is the definition of moduli space?"
Answer: If $X$ is a scheme parametrizing a set of objects, and if $G$ is an algebraic group (or groupoid) acting on $X$, the associated moduli space is the "quotient" $X/G$. Sometimes you write $[X/G]$ if you consider the stack quotient.
Example: If $X:=Spec(A)$ is an affine scheme and if $G$ is a finite group acting on $A$, then $X/G:=Spec(A^G)$ and the quotient map $\pi: X \rightarrow X/G$ is given by the inclusion $A^G \subseteq A$, where $A^G$ is the subring of $G$-invariant elements.
Let $A:=k[x]$ and let $S_d$ be the symmetric group on $d$ elements, with $B:=k[x]\otimes_k \cdots \otimes_k k[x] \cong k[t_1,..,t_d]$ ($d$ factors) and the obvious action $S_d \times B \rightarrow B$ permuting the factors. By the fundamental theorem of symmetric polynomials, the invariant ring is $B^{S_d}=k[s_1,..,s_d]$, where the $s_i$ are the elementary symmetric polynomials in the $t_j$. The polynomials $s_i$ are algebraically independent, hence $B^{S_d}$ is a polynomial ring. It follows
$$Spec(B^{S_d}) \cong \mathbb{A}^d_k.$$
By definition $Sym^d(X):=X^{\times d}/S_d$, hence $Sym^d(\mathbb{A}^1_k)\cong \mathbb{A}^d_k$. If $k$ is algebraically closed it follows any $k$-rational point $p\in \mathbb{A}^d_k(k)$ corresponds to a maximal ideal
$$(t_1-a_1,..,t_d-a_d) \subseteq k[t_1,..,t_d].$$
We may view the point $p$ as the polynomial
$$f_p(T):=(T-a_1)\cdots (T-a_d)$$
in the variable $T$. When we multiply out $f_p(T)$ we get
$$f_p(T)=T^d-s_1(a_i)T^{d-1}+\cdots +(-1)^ds_d(a_i)$$
where $s_j(a_i)$ are the elementary symmetric polynomials evaluated at the numbers $a_i$.
The map
$$\pi_d: (\mathbb{A}^1_k)^{\times d} \rightarrow Sym^d(\mathbb{A}^1_k)$$
does the following:
$$\pi_d(a_1,..,a_d):=(s_1(a_i),..,s_d(a_i))$$
hence it maps the polynomial $f_p(T):=\prod_i (T-a_i)$ to the polynomial
$$\pi_d(f_p(T)):=T^d-s_1(a_i)T^{d-1}+\cdots +(-1)^ds_d(a_i).$$
Hence $Sym^d(\mathbb{A}^1_k)$ is the "moduli space" parametrizing monic polynomials of degree $d$ by their coefficients.
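The map $\pi_d$ can be checked on a small example. The following is a minimal sketch over the integers; the function names `elementary_symmetric`, `pi_d` and `monic_coefficients` are hypothetical helpers introduced only for this illustration. It sends a tuple of roots to the point $(s_1,..,s_d)$ and confirms that every reordering of the roots lands on the same point.

```python
from itertools import combinations, permutations
from math import prod


def elementary_symmetric(roots):
    """Return [s_1, ..., s_d], the elementary symmetric polynomials at `roots`."""
    d = len(roots)
    return [sum(prod(c) for c in combinations(roots, j)) for j in range(1, d + 1)]


def pi_d(roots):
    """Sketch of pi_d: an ordered tuple of roots goes to the point (s_1, ..., s_d)."""
    return tuple(elementary_symmetric(roots))


def monic_coefficients(roots):
    """Coefficients [1, -s_1, s_2, ..., (-1)^d s_d] of prod_i (T - a_i)."""
    s = elementary_symmetric(roots)
    return [1] + [(-1) ** j * s_j for j, s_j in enumerate(s, start=1)]


roots = (1, 2, 3)
point = pi_d(roots)                  # (s_1, s_2, s_3)
coeffs = monic_coefficients(roots)   # coefficients of (T-1)(T-2)(T-3)

# All 3! orderings of the roots map to one and the same point of Sym^3(A^1).
images = {pi_d(p) for p in permutations(roots)}

print(point, coeffs, len(images))
```

For the roots $1,2,3$ this gives the point $(6,11,6)$, matching $(T-1)(T-2)(T-3)=T^3-6T^2+11T-6$, and the six orderings collapse to a single image.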
Note: What is mysterious here is that $\mathbb{A}^d_k/S_d \cong \mathbb{A}^d_k$. By the "lying over theorem" (part of the Cohen-Seidenberg "going up" results), since the ring extension
$$A:=k[s_i ] \subseteq B:=k[t_i]$$
is integral, it follows for any prime ideal $\mathfrak{q} \subseteq A$ there is a prime ideal $\mathfrak{p} \subseteq B$ with $\mathfrak{p}\cap A=\mathfrak{q}$. In particular if $\mathfrak{q}:=(s_1-a_1,..,s_d-a_d)$ is maximal, there is a maximal ideal $\mathfrak{p}:=(t_1-u_1,..,t_d-u_d) \subseteq B$ with $\pi_d(\mathfrak{p})=\mathfrak{q}$. Hence the map
$$\pi_d: \mathbb{A}^d_k \rightarrow Sym^d(\mathbb{A}^1_k) \cong \mathbb{A}^d_k$$
is surjective at the level of topological spaces with finite fibers. This is a paradox: The map $(\pi_d)_{t}$
$$(\pi_d)_t: (\mathbb{A}^d_k)_t \rightarrow (\mathbb{A}^d_k)_t$$
is a surjective "endomorphism" of affine space which is not injective. This paradox is due to the fact that the subring $k[s_1,..,s_d] \subseteq k[t_1,..,t_d]$ is isomorphic to $k[t_1,..,t_d]$: The map
$$\phi:k[t_j] \rightarrow k[s_j]$$
defined by $\phi(t_i):=s_i$ is an isomorphism of rings. Hence $k[t_j]$ is isomorphic to a proper subring of itself. When dealing with infinite sets we "must accept" such paradoxes.
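The case $d=2$ makes the note concrete (a sketch, assuming $k$ is algebraically closed as above). Here
$$\pi_2(a_1,a_2)=(a_1+a_2,\ a_1a_2),$$
and the fiber over a point $(s_1,s_2)$ consists of the orderings of the roots of $T^2-s_1T+s_2$. Since
$$(a_1-a_2)^2=s_1^2-4s_2,$$
the fiber has two points when $s_1^2-4s_2\neq 0$ and one point when $s_1^2-4s_2=0$. For instance $\pi_2(1,2)=\pi_2(2,1)=(3,2)$, so $\pi_2$ is not injective, while every $(s_1,s_2)$ is hit because the quadratic always has roots in $k$.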
Group actions on sets: If you (naively) have a finite set $S$ and a finite (non trivial) group $G$ acting non-trivially on $S$ via
$$\sigma: G \times S \rightarrow S$$
it follows the quotient $S/G$ has fewer elements than $S$. For schemes you may end up with an isomorphism $X/G \cong X$.
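For finite sets the count can be checked directly. The following is a minimal sketch; the helper `orbits` and the example action $x \mapsto 4-x$ on $\{0,..,4\}$ are assumptions chosen for illustration. It computes the orbits of the group generated by the given permutations and compares the size of $S/G$ with the size of $S$.

```python
def orbits(elements, generators):
    """Partition `elements` into orbits of the group generated by `generators`.

    Each generator is a dict sending every element to its image, i.e. a
    permutation of the finite set. For a finite set, closing under the
    generators alone already yields the full orbit.
    """
    remaining = set(elements)
    result = []
    while remaining:
        start = remaining.pop()
        orbit = {start}
        frontier = [start]
        while frontier:
            x = frontier.pop()
            for g in generators:
                y = g[x]
                if y not in orbit:
                    orbit.add(y)
                    frontier.append(y)
        remaining -= orbit
        result.append(frozenset(orbit))
    return result


S = range(5)
swap = {x: 4 - x for x in S}   # generator of a group of order 2 acting on S
quotient = orbits(S, [swap])   # the set S/G of orbits

print(len(S), len(quotient))
```

Here the orbits are $\{0,4\}$, $\{1,3\}$ and $\{2\}$, so $S/G$ has $3$ elements while $S$ has $5$.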
Closed subgroups of algebraic groups: If $H \subseteq G \subseteq GL_k(V)$ are closed subgroups, with $k$ a field and $V$ a finite dimensional $k$-vector space, you may always construct the "quotient"
$$\pi: G \rightarrow G/H,$$
and $G/H$ is a smooth quasi-projective algebraic variety of finite type over $k$. The grassmannian and flag varieties are examples of such quotient varieties. If $W\cong k^m \subseteq V\cong k^n$ is a sub vector space and if $P \subseteq SL(V)$ is the (closed) subgroup stabilizing $W$ (that is, mapping $W$ into itself), it follows $SL(V)/P \cong \mathbb{G}(m,V)$ is the grassmannian variety, parametrizing $m$-dimensional subspaces of $V$.
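A dimension count gives a quick consistency check on this quotient. The block description below is a sketch that assumes a basis of $V$ whose first $m$ vectors span $W$; in such a basis
$$P=\left\{\begin{pmatrix} A & B \\ 0 & D\end{pmatrix} : A \in GL_m(k),\ D\in GL_{n-m}(k),\ \det(A)\det(D)=1\right\},$$
so $\dim P = n^2-m(n-m)-1$ and
$$\dim SL(V)/P=(n^2-1)-(n^2-m(n-m)-1)=m(n-m).$$
For $m=1$ this gives $n-1$, the dimension of the projective space of lines in $V$.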
There are many paradoxes in this theory, but if you are interested there is a homepage devoted to this (the "stacks homepage" in NY) where a "general theory of moduli spaces" is developed. You will find much material on this page (more than 6000 pages of math).
Here you find relevant discussions at MSE and MO:
https://math.stackexchange.com/questions/tagged/algebraic-stacks
https://mathoverflow.net/questions/tagged/stacks
https://mathoverflow.net/questions/123942/how-many-flat-connections-has-a-line-bundle-in-algebraic-geometry/393770#393770
This phenomenon that you get paradoxical situations when taking quotients appears elsewhere. In this thread I construct a "curve" $C$ with a non-trivial equivalence relation $R$ such that the "quotient" $S:=C/R$ has dimension $2$.
"Infinity" in mathematics and an elementary question on dimension.
Ex: Let $A:=\mathbb{Z}$ and $Q:=\mathbb{Z}/p\mathbb{Z}$ with $p$ a prime number. Define the map $\phi: A \rightarrow A$ by $\phi(n):=pn$. It follows $\phi \in End(A)$ is an injective endomorphism of the abelian group $A$ giving an exact sequence
$$0 \rightarrow A \rightarrow^{\phi} A \rightarrow Q \rightarrow 0,$$
hence the map $\phi$ is an isomorphism between $A$ and a proper subgroup $\phi(A) \subseteq A$ of $A$. Hence this type of construction/set theoretic paradox appears everywhere in mathematics. You should ask a set theorist how to fix this problem - a brilliant set theorist.
As you can see: You may ask similar questions for "stacks" as you ask for schemes:
- What are the irreducible components of a "stack"?
- What is its dimension?
- What is the "etale fundamental group" of a "stack"?
- Can I view "the theory of stacks" as a "black box" and prove important theorems?