The book "The Algorithmic Foundations of Differential Privacy" (DOI: 10.1561/0400000042) formally introduces the "universe" and the "database" on page 17, roughly as:

  • $\mathcal{X}$ is a universe
  • databases $x$ are collections of records from the universe
  • For convenience, we represent $x$ by a histogram over the types in the universe $\mathcal{X}$, such that $x \in \mathbb{N}^{|\mathcal{X}|}$, where each entry $x_i$ is the number of elements of type $i \in \mathcal{X}$ in the database $x$

If you take the example from Wikipedia:

Table

  • The universe $\mathcal{X}$ is a set $\{0, 1\}$?
  • The database $x$ is
    • a vector [3, 3] (assuming the universe is ordered)?
    • or a map {0:3, 1:3}?

My two questions are:

  • Is my understanding correct?
  • Why is it "convenient" to do so? What would be the non-convenient alternatives?
John Doe

1 Answer

I am a little late, but your understanding seems correct. A full record in the table lives in $Names \times \{0,1\}$, where $Names$ is the set of all possible names. If you keep only the binary column, the universe is $\mathcal{X}=\{0,1\}$, so $|\mathcal{X}|=2$ and $x \in \mathbb{N}^{2}$.

Then you can choose one column and build the histogram over the type you want: either $x=(x_0,x_1)=(3,3)$ over the binary column, or $x=(x_{Ross},\dots,x_{Rachel})=(1,\dots,1)$ over the names. As you can see, the first one is more convenient.
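To make the two representations from the question concrete, here is a minimal Python sketch. The record list is hypothetical (only the counts 3 and 3 come from this thread), and the helper names `histogram` and `l1_distance` are my own. The last step illustrates one reason the histogram form may be convenient, assuming the book's definition of the distance between two databases as the $\ell_1$ norm of the difference of their histograms: removing one record changes exactly one count by one.

```python
from collections import Counter

# Hypothetical (name, bit) records; only the totals (three 0s, three 1s)
# are taken from the thread, the individual rows are made up.
records = [("Ross", 1), ("Rachel", 0), ("A", 1), ("B", 0), ("C", 1), ("D", 0)]

# Universe restricted to the binary column, with a fixed order.
universe = [0, 1]


def histogram(values, universe):
    """Count how often each universe element occurs, in universe order."""
    counts = Counter(values)
    return [counts[t] for t in universe]


def l1_distance(x, y):
    """l1 distance between two histograms over the same universe."""
    return sum(abs(a - b) for a, b in zip(x, y))


bits = [bit for _, bit in records]

x_vector = histogram(bits, universe)    # vector form: [3, 3]
x_map = dict(zip(universe, x_vector))   # map form:    {0: 3, 1: 3}

# A neighbouring database: the same data with the first record removed.
y_vector = histogram(bits[1:], universe)  # [3, 2]

print(x_vector, x_map, l1_distance(x_vector, y_vector))
```

The vector and the map carry the same information; the vector only needs an agreed ordering of the universe. A histogram over names would instead need one entry per possible name, which is why the two-entry version is the easier one to work with.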