Unambiguous set abstraction (set-builder) notation with parameters

Question

I know two common variants of set abstraction notation. An example of the first variant is $$ \{\,(x, y)\in\mathbf{R}^2\,|\,x^2 + y^2 = 1\,\} $$ (it is reminiscent of the axiom schema of specification of ZF), and an example of the second is $$ \{\,(\cos t,\sin t)\,|\,t\in\mathbf{R}\,\} $$ (it is reminiscent of the axiom schema of replacement of ZF). I think that in common mathematical tradition, these two "set-builder" expressions unambiguously denote the same set.

The first form of the notation can always be used instead of the second, for example: $$ \{\,(\cos t,\sin t)\,|\,t\in\mathbf{R}\,\} = \{\,(x, y)\in\mathbf{R}^2\,|\,(\exists t\in\mathbf{R})(x =\cos t\wedge y =\sin t)\,\}. $$ However, the "translated" expression is noticeably more verbose.
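As a rough illustration of this translation, here is a minimal Python sketch on finite stand-ins. The map `f(t) = (t, t*t)` and the ranges `T`, `X`, `Y` are assumptions chosen only so that everything is finite and exact; they replace $(\cos t,\sin t)$ and $\mathbf{R}$ from the text.

```python
from itertools import product

T = range(-3, 4)   # finite stand-in for the parameter domain (assumption)
X = range(-3, 4)   # finite stand-in for the first coordinate (assumption)
Y = range(0, 10)   # finite stand-in for the second coordinate (assumption)


def f(t):
    # Hypothetical map used instead of (cos t, sin t) to keep arithmetic exact.
    return (t, t * t)


# Second form (replacement-like): the image of T under f.
image_form = {f(t) for t in T}

# First form (specification-like): filter a product set by an
# existential condition, mirroring "(exists t)(x = ... and y = ...)".
spec_form = {
    (x, y)
    for x, y in product(X, Y)
    if any((x, y) == f(t) for t in T)
}

print(image_form == spec_form)
```

The two sets agree here, but the filtered version has to scan the whole product `X × Y`, which mirrors the extra verbosity of the translated formula.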

Sometimes the two are combined: $$ \{\,(\cos t,\sin t)\,|\,t\in\mathbf{R}\wedge 0 < t <\pi\,\}, $$ or simply $$ \{\,(\cos t,\sin t)\,|\,0 < t <\pi\,\}. $$

The first form can be translated into the combined form without much "overhead": $$ \{\,(x, y)\in\mathbf{R}^2\,|\,x^2 + y^2 = 1\,\} = \{\,(x, y)\,|\,x^2 + y^2 = 1\wedge (x, y)\in\mathbf{R}^2\,\}. $$ The second form is just a particular case of this combined form.

In practice, the combined form of "set-builder" notation is quite convenient, but it appears to be ambiguous, unless some artificial syntactic conventions are made. For example, what set $S$ is defined in the following sentence?

Let $p = 1$ and $S = \{\,(p + q, p - q)\,|\,pq = 1\wedge p\in\mathbf{R}\wedge q\in\mathbf{R}\wedge 1 < 2\,\}$.

I see two equally legitimate values for $S$ so defined:

$S =\{(2, 0)\}$ or
$S =\{\,(x, y)\in\mathbf{R}^2\,|\,x^2 - y^2 = 4\,\}$.

Once you think of it, there is also a possibility for $S$ to be empty, if both $p$ and $q$ are viewed as parameters and if the value of $q$ happens to be different from $1$.
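The three readings can be mimicked in Python, where a comprehension makes binding explicit. This is only a sketch under assumptions: `D` is a small set of rationals standing in for $\mathbf{R}$, and the value `q = 3` for the third reading is arbitrary.

```python
from fractions import Fraction

# Finite stand-in for R: a few nonzero rationals (an assumption of this sketch).
D = [Fraction(n, d) for n in (-4, -2, -1, 1, 2, 4) for d in (1, 2, 4)]

p = 1

# Reading 1: p is the outer parameter, only q is bound.
S_param = {(p + q, p - q) for q in D if p * q == 1}

# Reading 2: both p and q are bound; the comprehension's p shadows the outer p.
S_bound = {(p + q, p - q) for p in D for q in D if p * q == 1}

# Reading 3: both p and q are parameters; with q = 3 the condition fails.
q = 3  # arbitrary value different from 1 (assumption)
S_none = {(p + q, p - q)} if p * q == 1 else set()

print(S_param)
print(len(S_bound))
print(S_none)
```

In the second reading every element $(x, y)$ satisfies $x^2 - y^2 = 4pq = 4$, which matches the hyperbola above, restricted to the finite stand-in.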

Does there exist an unambiguous form of the combined variant of "set-builder" notation?

In general, I do not see how to specify in an expression like $$ \{\,\{\,f(\bar a,\bar b,\bar c)\,|\,\Phi(\bar a,\bar b,\bar c)\,\}\,|\,\Psi(\bar a,\bar b,\bar c)\,\} $$ that, for example, $\bar c$ are parameters, and $\bar b$ are parameters in the inner "set-builder" but not in the outer one. (Note that though all the variables are listed in parentheses each time, it does not mean that all of them are "used": $f$, $\Phi$ and $\Psi$ may very well not depend on some of these "formal parameters.")

For reference, here is how some programming languages deal with a similar issue in list comprehension. More precisely, in programming languages there is no issue, because iterating over a collection and testing a predicate are two different operations in programming, while they are indistinguishable in mathematics.

In Haskell,

[ (m, n) | m <- [0..5], 0 <= n, n <= 5, m^2 + n^2 == 25 ]

is the list of all pairs $(m, n)$, where $m$ is an integer from $0$ to $5$, $0\le n\le 5$, and $m^2 + n^2 = 25$. Note that this very sentence in English is ambiguous! The above expression in Haskell is only valid if $n$ is defined. The value of this expression is [(3,4)] if the value of $n$ is 4, but it is [] (the empty list) if the value of $n$ is 2, for example.

A similar definition in Python will be:

[ (m, n) for m in range(0, 6) if 0 <= n and n <= 5 and m**2 + n**2 == 25 ]

However, the analogous "set-builder" notation $$ \{\,(m, n)\,|\,m\in\{0,\dotsc,5\}\wedge 0\le n\le 5 \wedge m^2 + n^2 = 25\,\} $$ is ambiguous (are "$m$" and "$n$" parameters or bound variables of this expression?).
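To make the contrast concrete, here is a small Python sketch of the two readings of that expression. The function names `pairs` and `pairs_both_bound` are hypothetical, introduced only for this illustration.

```python
def pairs(n):
    # n is a parameter here; only m is bound by the comprehension.
    return [(m, n) for m in range(0, 6) if 0 <= n and n <= 5 and m**2 + n**2 == 25]


def pairs_both_bound():
    # Both m and n are bound: each gets its own "for" clause.
    return [(m, n) for m in range(0, 6) for n in range(0, 6) if m**2 + n**2 == 25]


print(pairs(4))            # [(3, 4)]
print(pairs(2))            # []
print(pairs_both_bound())  # [(0, 5), (3, 4), (4, 3), (5, 0)]
```

The predicate is identical in both functions; only the `for` clauses say which names are bound, and that is exactly the information the set-builder expression leaves out.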

Here is another example of an ambiguous situation: $$ \{\,\{\,x\,|\,x\in\mathbf{Z}\}\,|\,x\in\mathbf{R}\,\}. $$

For my personal needs, I am considering using my own notation $\{\ldots|\ldots|\ldots\}$ with bound variables listed in the middle, with the following formal "translation": $$ S =\{\,f(\bar x, \bar y)\,|\,\bar x\,|\,\Phi(\bar x, \bar y)\,\}\\ \equiv (\forall\xi)(\xi\in S\Leftrightarrow(\exists\bar x)(\xi = f(\bar x, \bar y)\wedge\Phi(\bar x, \bar y))), $$ where the new variable $\xi$ is distinct from all variables in $\bar x$ and $\bar y$. For example: $$ A =\{\,(\cos t,\sin t)\,|\,t\,|\,t\in\mathbf{R}\,\},\\ B =\{\,(x, y)\,|\,x, y\,|\,x^2 + y^2 = 1\wedge (x, y)\in\mathbf{R}^2\,\},\\ C =\{\,(m, n)\,|\,m\,|\,m\in\{0,\dotsc,5\}\wedge 0\le n\le 5 \wedge m^2 + n^2 = 25\,\},\\ D =\{\,\{\,f(\bar a,\bar b,\bar c)\,|\,\bar a\,|\,\Phi(\bar a,\bar b,\bar c)\,\}\,|\,\bar a,\bar b\,|\,\Psi(\bar a,\bar b,\bar c)\,\},\\ E =\{\,\{\,x\,|\,|\,x\in\mathbf{Z}\,\}\,|\,x\,|\,x\in\mathbf{R}\,\}\quad (=\{\,\{x\}\,|\,x\,|\,x\in\mathbf{Z}\,\}\cup\{\{\}\}), $$ or, alternatively, with words instead of symbols: $$ A =\{\,(\cos t,\sin t)\ \text{for}\ t\ \text{such that}\ t\in\mathbf{R}\,\}. $$ Isn't there some kind of standard alternative?
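The three-part notation can be prototyped as a small Python function. This is a sketch, not a faithful model: the helper `set_builder` is hypothetical, and the unbounded quantifier $(\exists\bar x)$ is replaced by iteration over a finite `universe`, which is an assumption of the sketch. Names listed as bound range over the universe; all other names are parameters read from `env`.

```python
from fractions import Fraction
from itertools import product


def set_builder(term, bound, pred, universe, env=None):
    """Evaluate { term | bound | pred } over a finite universe.

    Names listed in `bound` range over `universe`; every other name is a
    parameter whose value is looked up in `env`.
    """
    env = dict(env or {})
    result = set()
    for values in product(universe, repeat=len(bound)):
        scope = {**env, **dict(zip(bound, values))}  # bound names shadow parameters
        if pred(scope):
            result.add(term(scope))
    return result


def circle_pred(s):
    # "m in {0..5}" is encoded by the universe range(6) below.
    return 0 <= s["n"] <= 5 and s["m"] ** 2 + s["n"] ** 2 == 25


def C(n):
    # C = { (m, n) | m | ... }: m is bound, n is a parameter.
    return set_builder(lambda s: (s["m"], s["n"]), ["m"], circle_pred, range(6), {"n": n})


# Same term and predicate, but with both m and n listed as bound.
C_both = set_builder(lambda s: (s["m"], s["n"]), ["m", "n"], circle_pred, range(6))

# E = { { x | | x in Z } | x | x in R } on a small rational stand-in for R.
reals = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3, 2)]


def inner(x):
    # No bound variables: x is a parameter of the inner builder.
    return frozenset(
        set_builder(lambda s: s["x"], [], lambda s: s["x"].denominator == 1, reals, {"x": x})
    )


E = set_builder(lambda s: inner(s["x"]), ["x"], lambda s: True, reals)

print(C(4), C(2), C_both, E)
```

On this stand-in, `E` contains the singletons of the integers in `reals` together with the empty set, in line with the equality stated for $E$ above.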

It seems that the Z notation might be an example of what I am looking for, but at first glance it looks incompatible with the usual mathematical "set-builder" syntax, and I wonder how standard it is.

Some comments and answers suggest never using the same variable name for different variables, at least when their scopes are nested, thus avoiding variable shadowing altogether. IMO, such a restriction is unusual (surely uncommon in programming or formal languages), unnecessary, not easy to observe, and it would require explicit introduction of all parameters (consider omitting "Let $p = 1$" in my example above while still using "$p$" as a parameter).

Quoting the HoTT Book (page 23),

Of course, this should all be familiar to any mathematician: it is the same phenomenon as the fact that if $f(x):\equiv\int_1^2\frac{dt}{x - t}$, then $f(t)$ is not $\int_1^2\frac{dt}{t - t}$, but rather $\int_1^2\frac{ds}{t - s}$.

How do you interpret $S$ as anything but $\{(2,0)\}$? Since $p=1$, you can rewrite $S$ as $\{(1+q,1-q)\mid q=1,q\in \mathbb R,1<2\}$ (also, what is the point of the "$1<2$"?), so the only possible value of $q$ is $1$. — Mike Earnest, Aug 10 '18 at 14:04
@DanielWainfleet Nah you get $\{(2,0)\}$ because the first one is the sum. I don't think there is a standard alternative and I don't really see a reason for one. What you wrote is just bad form. There are two possible situations: either the $p$ inside is the $p$ outside (that is, it's unbound inside the set), in which case you shouldn't write $p\in\mathbb{R}$ since there is no reason for it ($p$ is $1$), or the $p$ inside the set is bound by an existential quantifier, and then you are rebinding it, which is also bad form. We can write $\forall p \forall q \exists p P(p,q)$ but we don't. — DRF, Aug 10 '18 at 14:06
Mathematical texts aren't supposed to be machine parsable as a rule. They are supposed to be legible and unambiguous for other researchers. Which is why you stick to reasonable notation and don't do things such as $\forall x(\exists y (\forall x (x\geq y)\wedge (yx=y)))$ even if this is actually unambiguous and holds in $\mathbb{N}_0$. It's just stupid. If the goal is to say that there exists a zero and that anything times $0$ is $0$ there are better ways to do so and so we use those. — DRF, Aug 10 '18 at 14:17
@MikeEarnest Oh I can see how you COULD parse it another way. Just assume that the whole thing actually stands for $\exists p ((p=1) \wedge\exists S(\forall p,q (((p\in\mathbb R) \wedge (q\in\mathbb{R})\wedge (pq=1))\implies ((p+q,p-q)\in S))))$. It's just really bad form to write things like that so I think most people would assume whoever wrote it was lazy and did a copy paste instead of intentionally rescoping $p$. — DRF, Aug 10 '18 at 14:27
@DRF, if something is not parsable by a Turing Machine, it will not be parsable by any mathematician. — Alexey, Aug 10 '18 at 15:09
@MikeEarnest, the same way as $\{\,(x, y)\in\mathbf{R}^2\,|\,x^2 + y^2 = 1\,\}$ does not necessarily stand for $\{(1,0)\}$, even if somewhere else in the text $x$ stands for $1$. The point of "$1 < 2$" is to make you think about it. — Alexey, Aug 10 '18 at 15:10
@Alexey That is a very strong view of Turing machines for one. And I didn't say it wasn't parsable by Turing Machine I said that wasn't their goal. The point is that when we write things we assume some foreknowledge. Some of it is explicit and very standardized (There are no zero divisors in the reals.) some of it is fairly explicit but not all that standardized (natural numbers include 0), lots of it is implicit (we don't use parentheses for commutative expressions, x,y are variables, a,b are constants). All of these can be occasionally broken and mathematicians deal with it without issue. — DRF, Aug 10 '18 at 15:16
@DRF, so, practically, how do i disambiguate? I am looking to deal with this (in general), and I have an issue, this is why i ask. — Alexey, Aug 10 '18 at 15:23
By the way, this issue of free variables vs. parameters is not just in set builder notation - $\int_1^x x^2\,dx$ is another example where some would say the value of that expression is $x^3/3-1$, while others would say that the expression is confusing or even impossible to interpret. In practice, there are many ways to avoid the ambiguity by choosing variables appropriately. — Carl Mummert, Aug 10 '18 at 15:34
@CarlMummert, the value of that integral expression is $x^3/3 - 1$, there is no problem with it, contrary to those in my question. — Alexey, Aug 10 '18 at 15:36
The ambiguity is not in the set-builder notation, it is in the definition of $p$. — Dark Malthorp, Aug 10 '18 at 15:39
@Alexey - see for example https://math.stackexchange.com/questions/109105/limit-of-integration-cant-be-the-same-as-variable-of-integration . Those who view it as malformed usually are thinking of the notation $\int_0^x x^2\,dx$ as denoting a particular area on the plane - apparently the area between $0$ and $x$ horizontally on the $x$ axis and between $0$ and $y = x^2$ vertically. Because that does not make much sense, they view the expression as malformed. — Carl Mummert, Aug 10 '18 at 15:43
You say $p = 1$ and then inside your set $S$ you have the qualifier $p \in \mathbb{R}$, which is meaningless here as it just reduces to "True" because $p$ by your definition is in $\mathbb{R}$, so it might lead a reader to interpret the meaning of the $p$ inside the definition of $S$ as not the same as the $p$ that you defined previously. — Dark Malthorp, Aug 10 '18 at 15:44
In short if $p = 1$, then you don't need the qualification $p \in \mathbb{R}$, so including that adds ambiguity because it is either redundant or a re-definition of $p$ but it is not totally clear which. — Dark Malthorp, Aug 10 '18 at 15:46
@DarkMalthorp, how can you know then the value of $\{\,(\cos t,\sin t)\,|\,t\in\mathbf{R}\,\}$ if you do not know the value of $t$? — Alexey, Aug 10 '18 at 15:56
$t$ in $\{(\cos t, \sin t) | t\in\mathbb{R}\}$ is not a constant with a fixed value. When you write $| t \in \mathbb{R}$, it is taken to mean that the set is the image of $\mathbb{R}$ under the function $(t \to (\cos t, \sin t))$. — Dark Malthorp, Aug 10 '18 at 16:01
@DarkMalthorp, in a completely analogous situation you treated $p$ as a constant, just because somewhere else there was a constant with the same name. Here you say $t$ is not a constant, even though you do not know the context. By the way, isn't "$t\in\mathbf{R}$" just a predicate on $t$? Wouldn't any other predicate in its place make a correct expression? — Alexey, Aug 10 '18 at 16:05
I treated $p$ as a constant because you defined it as a constant in the same sentence. The point is, you have to be careful with notation otherwise things can become ambiguous because of overloading. If it's not clear whether or not $t$ was already defined previously, you might include a statement after defining $\{(\cos t, \sin t) | t \in \mathbb{R}\}$, saying something like "recall we defined $t$ as..." — Dark Malthorp, Aug 10 '18 at 16:14
But if $t$ is a constant, for the sake of clarity, you shouldn't put in the unnecessary predicate $t \in \mathbb{R}$. The notation is unambiguous, but may be unclear to the reader because sometimes (though it is bad practice) people use the same letters to mean different things — Dark Malthorp, Aug 10 '18 at 16:17
How about unknown constants? An unknown constant is just a quantified variable. I am not asking for advice on writing style. I am looking for convenient and unambiguous expressions which I would be able to easily manipulate as mathematical objects using a meta-language, for example. (Like a form of lambda-calculus.) I'll think if I could add some more tricky examples where the "informal" set-builder notation "breaks apart". — Alexey, Aug 10 '18 at 16:26
The notation is unambiguous, but becomes ambiguous because sometimes people write ambiguous notation and it becomes ambiguous because it's unclear if you're being unambiguous or being ambiguous. — Dark Malthorp, Aug 10 '18 at 16:35
I guess what I'm trying to say is, if you write $| p \in \mathbb{R}$ it's unclear if you're being sloppy by redefining $p$ without explicitly saying so or if you're being confusing by adding in redundant information. — Dark Malthorp, Aug 10 '18 at 16:40

score 4 · Answer 1 · answered Aug 10 '18 at 14:45

As I tried to point out in the comments there is no other notation in use, because the notations we have aren't really particularly ambiguous. The example you make would be parsed by pretty much everyone as $S=\{(2,0)\}$ unless the rest of the paper makes this untenable.

There is a simple reason for that and that is that it's very bad form to rescope variables in math. By rescope here I mean we don't write $\forall x\exists y\forall x\cdots$. This is actually well defined as I tried to point out by the example of $\forall x(\exists y(\forall x (x\geq y)\wedge (xy=y)))$. This means the same as $\forall x\exists y\forall z (z\geq y)\wedge (xy=y)$ which is something that's true for natural numbers with 0.

It is very hard to read things like that though and very bad form to write them. Which means that in a given theorem, lemma, etc. the scope of variable and constant names is obvious and they are not reused in strange ways. The ambiguity thus doesn't really arise.
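The claim that the shadowed and the renamed formulas mean the same thing can be spot-checked in Python, whose generator expressions have their own scope. This is only a finite check on an initial segment of the naturals (the bound `20` is an arbitrary assumption), not a proof.

```python
N = range(0, 20)  # finite initial segment standing in for the naturals with 0

# Shadowed form: the inner generator rebinds x, but only inside
# "x >= y for x in N"; the x in "x * y == y" is the outer one.
shadowed = all(
    any(all(x >= y for x in N) and x * y == y for y in N)
    for x in N
)

# Renamed form: the inner variable is called z instead.
renamed = all(
    any(all(z >= y for z in N) and x * y == y for y in N)
    for x in N
)

print(shadowed, renamed)
```

Both expressions evaluate to the same truth value, with `y = 0` as the witness, so the shadowing is well defined, although the renamed version is clearly easier to read.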

answered Aug 10 '18 at 14:45

DRF

An explanation for the downvote would be welcome. – DRF Aug 10 '18 at 15:17
Sorry, i give -1 for "The example you make would be parsed by pretty much everyone as $S=\{(2,0)\}$." Then how do you parse $\{\,(\cos t,\sin t)\,|\,t\in\mathbf{R}\,\}$ without knowing the value of $t$? – Alexey Aug 10 '18 at 15:18
Writing $\forall x\exists y\forall x\cdots$ is correct syntax, and the meaning is well-defined. Doing lambda-calculus with forbidden repetition of variables would be somewhat painful IMO. – Alexey Aug 10 '18 at 15:28
"Which means that in a given theorem, lemma, etc. the scope of variable and constant names is obvious and they are not reused in strange ways." -- what if the theorem itself is a return value of some function I am trying to define, and I need a syntax to define this function (just for example)? – Alexey Aug 10 '18 at 15:34
I think that "don't use the same variable for two things at the same time" is a key point that most mathematicians learn by example, although it is not explicitly discussed very often. When people learn formal logic, they are sometimes surprised to find out that nested quantifiers can use the same variable, for example. – Carl Mummert Aug 10 '18 at 15:39
@CarlMummert, I've never learned such a thing. Surely nested quantifiers can use the same variable. If I write "let p = 42", then from that point until the end of the proof p is 42, regardless of whether in the introduction it was a path in the plane or something else, and in general I do not need to keep in mind all previously used letters to write a new statement with bound variables, or reuse letters for new constants. – Alexey Aug 10 '18 at 15:44
@Alexey: I would say that is not using the same variable for two things at the same time, just using it for two things at different times. Inside a single proof, it would be somewhat unusual to use $p$ as both a number and a path, even if technically everything is sound. – Carl Mummert Aug 10 '18 at 15:47
@CarlMummert, "that is not using the same variable for two things at the same time, just using it for two things at different times" -- just like it is with nested quantifiers: one time (under the inner quantifier scope) i use the variable for one thing, the other time (outside the inner scope) i use it for other things. – Alexey Aug 10 '18 at 15:48
@DRF The term you're looking for for "rescoped" is "shadowed". – Derek Elkins left SE Aug 11 '18 at 18:49
@CarlMummert Mathematicians and educators of mathematics fail to explicitly address issues of scoping? That's unpossible! – Derek Elkins left SE Aug 11 '18 at 18:53
"The ambiguity thus doesn't really arise." -- well, it arose once I tried to build more complex expressions with nested set-builders. – Alexey Sep 29 '18 at 21:08

score 3 · Answer 2 · answered Aug 10 '18 at 15:32

An interesting point about set-builder notation is that it is used as part of informal set theory. Formal set theories such as ZFC do not include set-builder notation, and instead use the comprehension scheme which defines a set using a formula of set theory. Set builder notation is, in some sense, just an abbreviation for the comprehension scheme. It's not meant to be a formal language or a programming language, just a way to convey the definition of a set to another mathematician.

This means that set-builder notation is more of a "natural language" than a formal language. Issues such as free vs. bound variables, or variables vs. parameters, are handled by convention, just like many other issues in ordinary mathematics. So, just as in other areas of ordinary mathematics, it is possible to write vague definitions, or definitions that can be read in multiple ways. Mathematicians avoid that hazard in natural language all the time, not only with set builder notation.

I understand your answer as "there is no standard unambiguous notation of this kind". Then I am left with having to invent my own one, I am afraid... — Alexey, Aug 10 '18 at 15:39
I think so, if you want something that has unambiguous parsing rules that could be programmed. — Carl Mummert, Aug 10 '18 at 15:45
I do not intend to program them yet, i just need to be able to read back what i wrote. — Alexey, Aug 10 '18 at 15:47

score 3 · Answer 3 · edited Aug 10 '18 at 16:49

Here is my summary of the two notations at play here. $$ \begin{array}{ll} \{x\in S\mid\psi(x,t_1,t_2,\dots,t_n)\} & \text{The subset of $S$ satisfying property $\psi$}\\ \{f(x,t_1,\dots,t_n)\mid x\in S\} & \text{The image of $S$ under $f$} \end{array} $$ In both examples, the $t_i$ are variables which could be defined elsewhere, and every definition gives rise to a different set. However, the variable $x$ plays a special role; it is used as a placeholder to specify a generic element of $S$.

The above two notations are unambiguous. The ambiguity comes when we mix the two notations. $$ \{f(x,t_1,\dots,t_n)\mid x\in S,\psi(x,s_1,\dots,s_m)\}\qquad \text{The image of the subset of $S$ satisfying $\psi$ under $f$.} $$ The problem is that the statement $\psi$ might contain a clause like "$t_i\in T$," so it makes $t_i$ look like the "special" placeholder variable, and it is unclear whether $x$ or $t_i$ or both is supposed to fulfill this role. Your example was $$ \{f(p,q)\mid p\in \mathbb R,q\in \mathbb R,pq=1,1<2\} $$ where $f(p,q)=(p+q,p-q)$. Here, it is unclear whether $p$ and $q$ are both supposed to range freely over $\mathbb{R}$, or $p$ is supposed to be a fixed value while $q$ ranges over $\mathbb R$, or vice versa.

There is a simple way to avoid this ambiguity:

Assume that any variable $x$ which appears on both sides of the divider $\mid$, and which appears on the right side in the form of a clause $x\in S$ for some set $S$, is intended to be interpreted only in the context of this notation, as a variable freely ranging over $S$.

Under this rule, the set $S$ you asked about would mean the second interpretation. I realize this is the opposite of what I strongly said in my comment, but I think there is enough precedent in math for having "locally" bound variables which do not interfere with other instances. For example, you could write "$i=5$ and $n=\sum_{i=1}^{5} i$", and most people would agree that $n=15$, that is, $i$ is not equal to $5$ at each point in the summation.
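The summation example has a direct analogue in Python, where the index of a generator expression is local to that expression. A minimal sketch:

```python
i = 5
n = sum(i for i in range(1, 6))  # the generator's i is local to the generator

print(i, n)  # 5 15
```

The outer `i` keeps its value, and the sum is computed over the locally bound `i`, which matches the reading that most people give to "$i=5$ and $n=\sum_{i=1}^{5} i$".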

edited Aug 10 '18 at 16:49

Eric Wofsey

answered Aug 10 '18 at 16:04

Mike Earnest

This is not the issue. $\{f(p,q)\mid p\in \mathbb R,q\in \mathbb R,pq=1,1<2\}$ is totally unambiguous in isolation, and only becomes ambiguous when written in a context where $p$ has been given an external definition. – Eric Wofsey Aug 10 '18 at 16:12
@EricWofsey, "becomes ambiguous when written in a context where ..." means ambiguous IMO, because the "complete context" can be usually viewed as unknown. (Think of a lecturer who said at the first lecture that in the course $p$ will mean $42$, and of a student who comes later and sees $\{(p, q)|0\le p + q\le100\}$.) – Alexey Aug 10 '18 at 16:18
@EricWofsey I would argue $S=\{f(p,q)\mid p\in \mathbb R,q\in \mathbb R\}$ is ambiguous even without an external definition of $p$. Do you mean the image of $f$ applied to all of $\mathbb R^2$, or do you mean the image of $f$ applied to the vertical line $\{(p,q)\mid q\in \mathbb R\}$, where $p$ is some value to be determined later? In the latter case, the contents of $S$ would be a function of $p$. – Mike Earnest Aug 10 '18 at 16:24
@MikeEarnest: That's just wrong, if you are talking about ordinary mathematical usage. No mathematician would ever write $S=\{f(p,q)\mid p\in \mathbb R,q\in \mathbb R\}$ with your second meaning. The correct notation for that meaning would be, as you wrote, $\{(p,q)\mid q\in\mathbb{R}\}$. – Eric Wofsey Aug 10 '18 at 16:25
The reason for this is subtle: for instance, if you instead had $S=\{f(p,q)\mid q\in\mathbb{R}, p^2+q^2<1\}$ then I would agree that it is possible that $p$ is an external parameter (and in fact that would be the more usual interpretation). The difference is that the condition $p\in\mathbb{R}$ depends only on $p$ and not on any other variables, so it would be redundant to include it if $p$ were externally defined. So, it is taken as a signal that $p$ must be a bound variable. – Eric Wofsey Aug 10 '18 at 16:33
@EricWofsey I think we still agree. In my post I concluded $\{f(p,q)\mid p\in \mathbb R,q\in \mathbb R\}$ should be interpreted the way any sensible mathematician would mean it. Also, I did address the issue about what to do when $p$ is defined externally; if $p$ appears in the form $p\in S$ to the right of the divider, you should ignore its external definition and treat it as ranging over $S$. Otherwise, use the external definition. – Mike Earnest Aug 10 '18 at 16:40
OK, I think I misinterpreted the thrust of your answer a bit. I still feel that your description of where the ambiguity comes from is a bit misleading. For instance, I would say that $\{x\in S\mid t\in T\}$, while not particularly ambiguous (though it could be ambiguous if one could plausibly imagine that $t$ is implicitly supposed to be defined in terms of $x$ in a way specified elsewhere), is very strange notation that is likely to confuse your reader. Here you're just using the first notation, but it is still confusing because you are including redundant information. – Eric Wofsey Aug 10 '18 at 16:56
So, the confusion in OP's example really comes from the fact that whichever way you interpret it, they must be violating some rule of mathematical writing: either they are using a variable with two different meanings at the same time, or they are including a strangely redundant condition. I agree with your ultimate diagnosis that the first violation is less serious (as in your example with the summation) and so the second interpretation is "correct", but really the correct answer is to just not write such a thing at all. – Eric Wofsey Aug 10 '18 at 16:58
@EricWofsey I agree with your points about how ambiguity would not arise if you write well. These are strange corner cases that would only arise if a machine was trying to interpret human written math for some strange reason. – Mike Earnest Aug 10 '18 at 17:02
"[...] which appears on the right side in the form of a clause $x\in S$ [...]" -- this is what i meant by "artificial syntactic conventions" (IMO): what if the right-hand side is written as just "$\Phi(x,y,z)$," where the predicate $\Phi$ does not have to depend on all 3 of its formal parameters, and where $\Phi$ itself may be unknown (for example, it may be a return value of some function)? Also, if the left-hand side is written "$f(x,y,z)$," this does not mean that all 3 variables are "used" (it is possible that $f(x,y,z) = 42$ for all $x$, $y$, $z$). – Alexey Aug 12 '18 at 07:54

score 1 · Answer 4 · answered Dec 21 '22 at 19:01

You can avoid the ambiguity in standard set-builder notation by falling back to "single-variable comprehension" whenever the LHS contains variables that are not bound in the set-builder expression. The downside is that the paraphrase in the presence of free variables makes the expression longer and requires some explicit existential quantifiers.

First, note that using a single variable as the term to the left of the bar can express every set comprehension. For example $\{ z \mathop| \exists x \exists y \mathop. z = (x, y) \land (x, y) \in \mathbb{R}^2 \land x^2 + y^2 = 1 \}$ instead of $\{ (x, y) \in \mathbb{R}^2 \mathop| x^2 + y^2 = 1 \}$. You can explicitly state that, whenever you use this notation, the variable appearing in $z$'s position is always bound and never free.

This fallback option is not convenient, but it lets you reserve the abbreviated version for cases where all variable symbols in the term to the left of the bar are bound.

For example, in $\{ (p+q, p-q) \mathop| pq=1 \land p \in \mathbb{R} \land q \in \mathbb{R} \}$, both $p$ and $q$ would unambiguously be bound.

To make $p$ free, you would use $\{ z \mathop| \exists q \mathop. z = (p, q) \land p \in \mathbb{R} \land q \in \mathbb{R}\} $.

I think this convention is simple and standard enough that it could be used in writing without explaining it, which would not be true for a variant of the comprehension notation with an explicit way of annotating bound variables.
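The convention can be illustrated with a Python sketch on finite stand-ins. The range `D` replaces $\mathbb{R}$ and the set `Z` replaces the universe that $z$ ranges over; both are assumptions made only to keep the example finite.

```python
D = range(-2, 3)                     # finite stand-in for R (assumption)
Z = {(a, b) for a in D for b in D}   # finite stand-in for the universe of z (assumption)

p = 1

# Abbreviated form: every variable in the left-hand term is bound,
# so the comprehension's p shadows the outer p.
both_bound = {(p, q) for p in D for q in D}

# Single-variable fallback: z is the only bound variable on the left,
# q is bound by an explicit existential, and p stays free.
p_free = {z for z in Z if any(z == (p, q) for q in D)}

print(len(both_bound), sorted(p_free))
```

Here `both_bound` is the whole grid, while `p_free` is the single column with first coordinate `p`, so the two spellings select the two readings without any extra annotation.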
