I already asked a similar question about why replacement is true in the hierarchy of sets. After thinking about it for a while and having the discussion there, I finally understand it. The next axiom I have trouble understanding is the axiom of regularity. It says that if we have a nonempty set $A$, then there is an element $x\in A$ such that $x$ and $A$ are disjoint. The usual explanation is that we choose for $x$ an element of $A$ which is created at a stage "as low as possible". Then all elements of $x$ are created before $x$, and all elements of $A$ are created after $x$ or at the same stage as $x$. Thus $x\cap A=\emptyset$.
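To pin down what I mean, here is that explanation in symbols. The notation $\rho(y)$ for the stage at which a set $y$ is created is my own shorthand for this post (an assumption, not taken from the sources I quote). The axiom says

$$\forall A\,\bigl(A\neq\emptyset \rightarrow \exists x\in A\;(x\cap A=\emptyset)\bigr).$$

If $x\in A$ can be chosen with $\rho(x)$ minimal among the members of $A$, then

$$y\in x \implies \rho(y)<\rho(x), \qquad y\in A \implies \rho(y)\ge\rho(x),$$

so no $y$ can lie in both $x$ and $A$.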
But how can one be sure that there is always such a "minimal stage" for a nonempty set $A$? Why isn't it possible that, in the transfinite hierarchy, we can find a set $A$ such that for each member $x$ of $A$ there is another member $y$ of $A$ which is created before $x$?
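Writing $\rho(x)$ for the stage at which $x$ is created (my own shorthand, not notation from the sources I quote), the situation I cannot rule out is

$$\forall x\in A\;\exists y\in A\;\bigl(\rho(y)<\rho(x)\bigr),$$

that is, the collection of stages $\{\rho(x) : x\in A\}$ would be nonempty but have no least element.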
I found this MathOverflow answer by Andreas Blass to a very similar question:
Begin with some non-set entities called atoms ("some" could be "none" if you want a world consisting exclusively of sets), then form all sets of these, then all sets whose elements are atoms or sets of atoms, etc. This "etc." means to build more and more levels of sets, where a set at any level has elements only from earlier levels (and the atoms constitute the lowest level). This iterative construction can be continued transfinitely, through arbitrarily long well-ordered sequences of levels.
This so-called cumulative hierarchy is what I (and most set theorists) mean when we talk about sets. A set is anything that is formed at some level of this hierarchy. This meaning of "set" has replaced older meanings.
The axiom of regularity is clearly true with this understanding of what a set is. It expresses the idea that the stages of the cumulative hierarchy come in a well-ordered sequence. (Without well-ordering, the instructions for each level, namely "form all sets whose elements are at earlier levels," would not be an inductive definition but a circularity.)
I am always depressed when somebody says that something is "clearly true" and I have trouble understanding it. On the other hand, MathOverflow is intended for professional mathematicians, so Andreas Blass may have written this to be understood by other professional mathematicians (and not by beginners like me).
In this thread, I am basically asking for a more detailed explanation that is easier to understand. What do inductive definitions have to do with minimal stages?