In (symbolic) logic, we first of all define a language $\mathcal L$. That is, we describe the type of expressions we want to use. These include common "logical" ones, like $\land, \lor, \exists, \forall$ ("and", "or", "there exists", "for all"), but also some additional "symbols", for lack of a better word that isn't jargon.
In the language of set theory $\mathcal L$, the symbols $\in$ and $=$ are exactly such additional, undefined symbols. More specifically, they are "binary relation symbols". This terminology means nothing more than that we are allowed to write down $A \in B$ and $A = B$ as expressions of $\mathcal L$. We could also have introduced a "unary function symbol" $f$, and then we could have used expressions like $f(x)$, $f(f(x))$, and so on.
Any expression that is permissible in $\mathcal L$ we call a formula ($\mathcal L$-formula if we want to be really precise).
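To make "permissible expression" a bit more concrete, here is a minimal sketch of how the formulas of $\mathcal L$ could be represented as data. The class names (`Atom`, `Not`, `BinOp`, `Quant`) and the function `show` are hypothetical and chosen only for illustration. The sketch also includes negation, which was not listed above but is one of the usual logical symbols.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical encoding of the language of set theory. Variables are plain
# strings; the only non-logical symbols are the binary relation symbols
# "in" and "=".

@dataclass(frozen=True)
class Atom:
    rel: str    # "in" or "="
    left: str   # variable name
    right: str  # variable name

@dataclass(frozen=True)
class Not:
    body: "Formula"

@dataclass(frozen=True)
class BinOp:
    op: str     # "and" or "or"
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class Quant:
    kind: str   # "exists" or "forall"
    var: str    # the variable being bound
    body: "Formula"

Formula = Union[Atom, Not, BinOp, Quant]

def show(phi: Formula) -> str:
    """Render a formula as an ASCII string."""
    if isinstance(phi, Atom):
        return f"{phi.left} {phi.rel} {phi.right}"
    if isinstance(phi, Not):
        return f"not ({show(phi.body)})"
    if isinstance(phi, BinOp):
        return f"({show(phi.left)} {phi.op} {show(phi.right)})"
    if isinstance(phi, Quant):
        return f"{phi.kind} {phi.var} ({show(phi.body)})"
    raise TypeError(f"not a formula: {phi!r}")

# "There is a set with no elements."
empty_set_exists = Quant("exists", "e",
                         Quant("forall", "x", Not(Atom("in", "x", "e"))))
```

Anything that can be built from these constructors counts as a formula in this sketch. Note that nothing here says what `in` means; that is still pure syntax.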
We are now ready to begin moving towards semantics, i.e. assigning meaning to what we have been doing. Once the language has been specified, we elect some formulas as axioms. For axiomatic set theory, these are usually the Zermelo-Fraenkel axioms. Such a set of axioms is also known as a theory. Let us call the theory for Zermelo-Fraenkel set theory $\sf ZF$.
Now the big leap of faith is that we can come up with some structure that, in an intuitive way, provides a means to interpret the formulas from $\mathcal L$ and decide if they are true or false. In particular, we need to be able to make sense of expressions like $A \in B$. We aptly call such an intuitive construct an $\mathcal L$-structure.
The final step is to consider a model $\mathcal M$ of $\sf ZF$: an $\mathcal L$-structure in which all formulas from the theory $\sf ZF$ (that is, all of its axioms) are true.
This is basically where the intuition ends. We can now start defining things in terms of $\in$ and $=$, like $\subseteq$, the notion of a function, and indeed, a relation. All of these concepts are formulated in the language $\mathcal L$, and we look at them from the perspective of our model $\mathcal M$.
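For instance, $\subseteq$ is not a new symbol of $\mathcal L$ but an abbreviation for a formula that only uses $\in$. One standard way to write this down (using the implication arrow $\to$, another of the usual logical symbols) is:

$$A \subseteq B \;:\Longleftrightarrow\; \forall x\,(x \in A \to x \in B)$$

Whenever $A \subseteq B$ appears, it can be replaced by the right-hand side, so no expressive power has been added to the language.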
The fundamental distinction here is that what we intuitively think of as a "relation" is modeled (or axiomatised) in the language $\mathcal L$ (e.g. "A relation $\mathcal R$ between sets $A$ and $B$ is a subset $\mathcal R \subseteq A \times B$"). The strength of this axiomatic approach is that it allows us to be very precise and certain about the validity and justification of what we are doing (in that it does not depend on two people sharing the same intuition); its weakness is that it cannot completely reflect our intuition about sets (see e.g. Russell's paradox).
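The quoted definition can be mimicked with finite sets. The sketch below is only an analogy: the sets `A` and `B`, the relation `less_than`, and the helper `related` are made up for the example, and Python tuples stand in for ordered pairs, which in set theory would themselves have to be defined in terms of $\in$.

```python
from itertools import product

# Two small example sets (hypothetical, chosen only for illustration).
A = {1, 2, 3}
B = {2, 3, 4}

# The Cartesian product A x B: all ordered pairs (a, b) with a in A, b in B.
cartesian = set(product(A, B))

# The intuitive relation "a is less than b", modeled as a set of pairs.
less_than = {(a, b) for a in A for b in B if a < b}

def related(R, a, b):
    """In the modeled sense, 'a R b' just means that (a, b) is an element of R."""
    return (a, b) in R

# A relation between A and B is nothing more than a subset of A x B.
assert less_than <= cartesian
```

Here the intuitive statement "1 is less than 2" becomes the membership statement `(1, 2) in less_than`, which is the shift in perspective described above.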
Your confusion arose from the blurred distinction between the intuitive and the axiomatic. This is an intricate subject that takes some time to get used to, but I hope that, at the very least, this helps you recognize that the two perspectives exist.