I'll look at this question only from the perspective of classical propositional logic and classical first-order logic, not taking into account all the other kinds of exotic logics out there.
As you said, a formal language $\mathcal L$ is a subset of the set of all strings over an alphabet $\Sigma$, which is commonly written as $\Sigma^*=\bigcup_{n\in\mathbb N}\Sigma^n$.
In the case of classical propositional logic we want to reason about formulas, that is, strings over a certain alphabet, built from a set of atomic propositional variables, say $Var=\{p_0,p_1,\dots\}$, using Boolean connectives. For this, we may define an alphabet $\Sigma_p=Var\cup\{(,),\land,\lor,\neg\}$ consisting of the variables and a bunch of auxiliary symbols we want to use later in our formulas. The question that remains is how we "filter" the correct formulas, like $(p_0\lor p_1)$, about which we would like to reason later, out of the whole set $\Sigma_p^*$, which contains a lot of nonsensical strings, e.g. $((\lor\neg$.
In other words, we may phrase the question as how to obtain the language of classical propositional logic, say $\mathcal L_p$, from the set of strings $\Sigma_p^*$. This can, for example, be achieved by specifying $\mathcal L_p$ to be the smallest subset $X$ of $\Sigma_p^*$ satisfying the following closure conditions.
- $Var\subseteq X$,
- If $\phi,\psi\in X$, then $(\phi\land\psi),(\phi\lor\psi),\neg\phi\in X$.
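The closure conditions above translate directly into a recursive membership test. Here is a minimal sketch in Python (the concrete string representation, with variables written `p0`, `p1`, …, is an assumption for illustration):

```python
import re

def is_formula(s: str) -> bool:
    """Check membership in L_p via the closure conditions:
    variables p0, p1, ... are formulas; if phi, psi are formulas,
    then so are ¬phi, (phi∧psi) and (phi∨psi)."""
    if re.fullmatch(r"p\d+", s):        # base case: Var ⊆ X
        return True
    if s.startswith("¬"):               # negation clause
        return is_formula(s[1:])
    if s.startswith("(") and s.endswith(")"):
        depth = 0
        for i, ch in enumerate(s):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch in "∧∨" and depth == 1:   # main connective found
                return is_formula(s[1:i]) and is_formula(s[i + 1 : -1])
    return False

print(is_formula("(p0∨p1)"))   # True
print(is_formula("((∨¬"))      # False
```

Note how "smallest subset" corresponds to the recursion bottoming out: a string is accepted only if it can be justified by finitely many applications of the clauses.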
There are of course many other ways to define a language for classical propositional logic. On the basis of this language, however, we can then move on to specify things like semantic and syntactic consequence (and all the other things commonly studied in logic).
Note that you may just as well add other logical symbols like $\rightarrow$ and $\leftrightarrow$ to the alphabet and extend the language via additional closure clauses for $\mathcal L_p$. I chose to restrict myself to a subset of connectives that is sufficient to specify the other connectives semantically by abbreviations (if we later have such a thing as a semantics).
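To make the abbreviation idea concrete, one common choice is to treat the missing connectives as notational shorthand:
$$(\phi\rightarrow\psi):\equiv(\neg\phi\lor\psi),\qquad(\phi\leftrightarrow\psi):\equiv((\phi\rightarrow\psi)\land(\psi\rightarrow\phi)).$$
These definitions agree with the usual truth tables, so nothing expressible is lost by keeping the official alphabet small.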
We may proceed in a similar way for classical first-order logic, starting by defining a base alphabet (which is now a little more complex).
For this, let $\sigma=\sigma_f\cup\sigma_r\cup\sigma_c$ be a so-called signature: the set of function ($\sigma_f$), relation ($\sigma_r$) and constant symbols ($\sigma_c$) of the first-order language we want to define. We also define a function $\mathrm{ar}:\sigma\to\mathbb N$ (the so-called arity function, telling you, well, what arity each symbol has). Every first-order language is relative to such a signature. We commonly set $\mathrm{ar}(c)=0$ for $c\in\sigma_c$, i.e. constant symbols have arity $0$.
We also fix a set of variables $V=\{x_0,x_1,\dots\}$. Then, we can define the alphabet of first-order logic over the signature $\sigma$ as
$$\Sigma_\sigma=V\cup\sigma\cup\{(,),\neg,\lor,\land,=,\forall,\exists\}$$
where it is also sometimes customary to leave out the equality symbol.
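For concreteness, a standard example (the signature of ordered arithmetic, chosen here purely for illustration) would be
$$\sigma_f=\{+,\cdot\},\quad\sigma_r=\{<\},\quad\sigma_c=\{0,1\},\qquad\mathrm{ar}(+)=\mathrm{ar}(\cdot)=\mathrm{ar}(<)=2,$$
so that $\Sigma_\sigma$ contains everything needed to write strings like $\forall x_0\,\exists x_1\,(x_0<x_1)$.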
Again, we now pose the question of how we can distill the desired language of first-order logic over $\sigma$, say $\mathcal{L}(\sigma)$, from the set of all strings $\Sigma_\sigma^*$.
For this, you commonly go on to specify what you mean by terms over the signature $\sigma$ and then define the formulas in $\mathcal{L}(\sigma)$ from that.
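This two-stage definition (terms first, then formulas built on top of them) can be sketched as follows. The tiny signature used here, with one binary function symbol `f`, one binary relation symbol `R` and a constant `c`, as well as the string representation, are assumptions for illustration only:

```python
import re

ARITY = {"f": 2, "R": 2, "c": 0}   # the arity function ar

def split_args(s: str):
    """Split a comma-separated argument list at top parenthesis level."""
    args, depth, start = [], 0, 0
    for i, ch in enumerate(s):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            args.append(s[start:i])
            start = i + 1
    args.append(s[start:])
    return args

def is_term(s: str) -> bool:
    """Terms over sigma: variables x0, x1, ..., constants,
    and f(t1,...,tn) for terms t1, ..., tn with n = ar(f)."""
    if re.fullmatch(r"x\d+", s) or s == "c":
        return True
    m = re.fullmatch(r"f\((.*)\)", s)
    if m:
        args = split_args(m.group(1))
        return len(args) == ARITY["f"] and all(is_term(t) for t in args)
    return False

def is_atomic(s: str) -> bool:
    """Atomic formulas: R(t1,...,tn) and equations t1=t2."""
    m = re.fullmatch(r"R\((.*)\)", s)
    if m:
        args = split_args(m.group(1))
        return len(args) == ARITY["R"] and all(is_term(t) for t in args)
    if "=" in s:
        left, _, right = s.partition("=")
        return is_term(left) and is_term(right)
    return False

print(is_atomic("R(f(x0,c),x1)"))   # True
```

The full set $\mathcal L(\sigma)$ is then obtained by closing the atomic formulas under $\neg$, $\land$, $\lor$ and the quantifier clauses $\forall x_i\,\phi$ and $\exists x_i\,\phi$, exactly as in the propositional case.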
Let me know in the comments if I should elaborate on that.