I am trying to learn how to model programming problems in a mathematical way. I am a software engineer, but I have recently been running into roadblocks where I can't solve some problems efficiently because I can't reason about them mathematically. So this question comes from that perspective: I want to learn how to formulate programming problems mathematically, so I can start manipulating them using set theory and similar tools.
One problem I run into a lot is modeling data. Here is how I currently describe it to myself:
- You have a single database schema.
- There is a set of models in that schema (or database tables).
- Each model/table has a set of attributes (or for a table, columns).
- Some attributes might be relations. That is, a model has "simple" attributes and "relational" attributes. The relations just point to other models.
So a specific example might be a blog:
- A blog schema might have 3 models: user, article, comment
- The user model might have these attributes: firstName, lastName, email
- The article model might have attributes: title, body, userId
- The comment model might have attributes: message, articleId
The "simple" attributes are ones like email/firstName/title/etc. The "relational" attributes are ones that point to other models, like userId in the article model, or articleId in the comment model.
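To make the example concrete, here is a minimal Python sketch of the blog schema written as plain sets. The names `SCHEMA`, `RELATIONS`, `simple_attributes`, and `edges` are my own illustrative choices, and the mapping from each relational attribute to its target model is an assumption based on the attribute names.

```python
# Hypothetical encoding of the blog schema: model name -> set of attribute names.
SCHEMA = {
    "user": {"firstName", "lastName", "email"},
    "article": {"title", "body", "userId"},
    "comment": {"message", "articleId"},
}

# Assumed mapping: (model, relational attribute) -> model it points to.
RELATIONS = {
    ("article", "userId"): "user",
    ("comment", "articleId"): "article",
}


def simple_attributes(model):
    """Attributes of `model` that do not point to another model."""
    relational = {attr for (m, attr) in RELATIONS if m == model}
    return SCHEMA[model] - relational


def edges():
    """Directed edges (source model, target model) induced by the relations."""
    return {(source, target) for (source, _attr), target in RELATIONS.items()}
```

Here `simple_attributes("article")` is the set difference between the article's attributes and its relational attributes, and `edges()` is the edge set of the graph I have in mind.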
In trying to wrap my head around what's actually happening, I try to think about this in terms of sets, because in the end, the models and attributes in a schema form a graph. But I don't know how to formulate these sets properly as equations, and without that I can't easily reason about things like graph traversals and shortest paths. That is the reason for asking: how would you model the description above in a mathematical way?
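For instance, this is the sort of traversal I would like to be able to reason about once relations are treated as directed edges between models. It is only a sketch: the adjacency list `GRAPH` is hard-coded from the blog example, and `shortest_path` is a hypothetical helper using a plain breadth-first search.

```python
from collections import deque

# Assumed adjacency list for the blog example: model -> models it points to.
GRAPH = {
    "user": [],
    "article": ["user"],
    "comment": ["article"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of models, or None if unreachable."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

With these directed edges, a path exists from comment to user (through article), but not from user to comment.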
Here is what I've tried so far:
The set of all attributes on a model $m$ is denoted $a(m)$. The set of all models in a schema is denoted $M(s)$, where $s$ is a single schema. So then you could say $m \in M(s)$. (Is this how you would formulate it as well, or what am I doing wrong?)
Then you can say, the set of all attributes in a schema $s$ is defined as $a(s) = \bigcup_{m\in M(s)} a(m)$. Is that how you would write that, or is there a different way? Maybe instead of $a(s)$ you would do $a_s$? I don't know.
Then you could say, the set of relations on a model is a subset of the attributes on a model: $r(m) \subseteq a(m)$. And so, the set of all relations in a schema is $r(s) = \bigcup_{m\in M(s)} r(m)$.
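To check that these definitions at least hang together, here is a small Python sketch that computes them for the blog example. The function names mirror my notation ($M(s)$, $a(m)$, $r(m)$), with `a_schema` and `r_schema` standing in for $a(s)$ and $r(s)$. The data layout is an assumption I made for illustration.

```python
# Hypothetical schema s: each model m maps to the pair (a(m), r(m)).
SCHEMA = {
    "user": ({"firstName", "lastName", "email"}, set()),
    "article": ({"title", "body", "userId"}, {"userId"}),
    "comment": ({"message", "articleId"}, {"articleId"}),
}


def M(s):
    """The set of models in schema s."""
    return set(s)


def a(s, m):
    """The set of attributes a(m) of model m."""
    return s[m][0]


def r(s, m):
    """The set of relational attributes r(m) of model m."""
    return s[m][1]


def a_schema(s):
    """a(s): the union of a(m) over all m in M(s)."""
    return set().union(*(a(s, m) for m in M(s)))


def r_schema(s):
    """r(s): the union of r(m) over all m in M(s)."""
    return set().union(*(r(s, m) for m in M(s)))
```

One caveat with this sketch: attributes are bare names, so two models that both had an attribute with the same name would contribute a single element to the union. Representing each attribute as a pair (model, name) would keep them distinct.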
Is this correct? Or what is the correct/preferred way to formulate this? I am not looking for a proof or anything, but just how you would formulate this using mathematical symbols. Any tips would be greatly appreciated :)