I have the following Python code that I want to express mathematically.

W = 0
for i in range(5):
    if (A > i_1) and ( (A < i_2) or (A < i_3) ):
        W = W + 1

This is the general pattern I am trying to express: loop over some values and check a condition. If the condition is satisfied, add 1 to a counter (W in this case); otherwise, skip to the next value. How can I express this using a mathematical formula?

To be more specific:

for _, row in DF1.iterrows():  # iterrows() yields (index, row) pairs
    pTime = row['A1_time']
    W = 0  # counter for the current row
    for _, row2 in DF1.iterrows():  # compare the current row against every row
        a1 = row2['A1_time']
        a2 = row2['A2_time']
        a3 = row2['A3_time']

        if (pTime > a1) and ((pTime < a2) or (pTime < a3)):
            W = W + 1

  • It will be either 5 or 0 depending on if your condition is met or not. Did you intend the index i to change something in your conditional statement? – rschwieb Jul 26 '20 at 05:55
  • The actual loop is iterating over rows in a dataframe. To simplify the question, for i in range (5) represents looping over some values. I will correct the code to be more clear. – O.Mohsen Jul 26 '20 at 05:58
  • Unclear the same way, how does i_1, i_2, i_3, A depend on i? And sum(1 for i in range(5) if (A > i_1) and ( (A < i_2) or (A < i_3) )) could be more pythonic. – Alexey Burdin Jul 26 '20 at 06:01
  • I still think the answer I wrote so far applies. And also it’s pretty unpythonic to explicitly loop that way. You would get the same thing if you did a set comprehension with the rows of the data frame filtering on your predicate, and just used ‘len’ on the result. – rschwieb Jul 26 '20 at 06:03
  • Loops aren’t mathematical in themselves: they’re more procedural. If your thing is a more complicated algorithm then it might matter but right now it looks irrelevant. – rschwieb Jul 26 '20 at 06:05
  • I added to the code a 'simplified' example of the actual problem. I hope it makes it more clear. What I am trying to do is loop over each row in the dataframe. Each time, I will compare the values of one row to all other rows, and IF the value of the current row is less than the value of any row, I increment the counter. – O.Mohsen Jul 26 '20 at 06:20
  • @O.Mohsen So why do some of your inequalities run the other way? – J.G. Jul 26 '20 at 15:53
  • @J.G. These are the conditions of the problem at hand. – O.Mohsen Jul 28 '20 at 03:32

1 Answer

To me it looks like you are just computing the cardinality of the set of elements meeting your conditions.

So something like $|\{x : P(x)\}|$ where $P(x)$ means $x$ satisfies your condition.

I can’t be totally sure how to elaborate until you clarify what the index has to do with the inner conditions.
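
As a rough illustration of the cardinality idea, here is a minimal sketch in plain Python. The list `bounds` and the value `A` are made-up data, assuming each loop step supplies its own triple `(i_1, i_2, i_3)`; the question does not say how these depend on `i`. The set is built over step indices rather than over the triples themselves, so two steps with identical values are still counted separately.

```python
# Hypothetical data: one (i_1, i_2, i_3) triple per loop step.
bounds = [(1, 4, 2), (3, 5, 9), (0, 2, 3), (6, 8, 7), (2, 3, 10)]
A = 4  # assumed fixed value being tested


def P(triple):
    """The condition from the question, applied to one triple."""
    i_1, i_2, i_3 = triple
    return (A > i_1) and ((A < i_2) or (A < i_3))


# Procedural version: the explicit counter loop.
W = 0
for triple in bounds:
    if P(triple):
        W += 1

# Set version: |{k : P(bounds[k])}|, indexed by step number k.
hits = {k for k, triple in enumerate(bounds) if P(triple)}

assert W == len(hits)
```

With this data, both the loop and the set count the same two steps.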


Borrowing a bit of Python syntax and combining it with set notation, what I said above still holds for your latest example:

$|\{(x,y)\in DF1\times DF1 : x.a1time > y.a1time \text{ and } x.a1time < \max(y.a2time, y.a3time)\}|$
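
The formula uses $p < \max(b, c)$ in place of "$p < b$ or $p < c$". A quick brute-force check over a small range of integers (an arbitrary range chosen for illustration) suggests the two conditions agree for ordinary numbers. It says nothing about missing values such as NaN, where comparisons behave differently.

```python
from itertools import product

# Arbitrary small test range; any ordinary comparable numbers would do.
values = range(-3, 4)

for p, b, c in product(values, repeat=3):
    # "p < b or p < c" should match "p < max(b, c)" for every triple.
    assert ((p < b) or (p < c)) == (p < max(b, c))

checked = len(values) ** 3  # number of triples tested
```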

I was able to simplify the OR condition slightly by using max. In fact, this is far closer to how I would do it with real python:

len([(x, y) for _, x in DF1.iterrows() for _, y in DF1.iterrows()
     if x['A1_time'] > y['A1_time'] and x['A1_time'] < max(y['A2_time'], y['A3_time'])])

It can be compressed a little more using the itertools library, but this version is more elementary Python.

Since it looks like you don't care about the actual entries, just their count, another trick is to produce a $1$ for each successful hit and then use sum. This might be a good alternative because you can do it all with generator expressions, conserving memory.

from itertools import product
sum(1 for (_, x), (_, y) in product(DF1.iterrows(), DF1.iterrows())
    if x['A1_time'] > y['A1_time'] and x['A1_time'] < max(y['A2_time'], y['A3_time']))
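
To see the pieces working together without needing a real dataframe, here is a small self-contained sketch. The list `rows` is made-up data standing in for DF1 (with pandas, something like `DF1.to_dict("records")` should give rows in this shape), and `hit` is a helper name introduced only for this example. It compares the nested loop using the original or-condition against the generator-based count.

```python
from itertools import product

# Hypothetical stand-in for DF1: one dict per row.
rows = [
    {"A1_time": 1, "A2_time": 5, "A3_time": 3},
    {"A1_time": 2, "A2_time": 4, "A3_time": 6},
    {"A1_time": 4, "A2_time": 2, "A3_time": 3},
]


def hit(x, y):
    """Predicate on a pair of rows, using max in place of the 'or'."""
    return x["A1_time"] > y["A1_time"] and x["A1_time"] < max(y["A2_time"], y["A3_time"])


# Nested loops with the original 'or' condition, counting over all pairs.
loop_total = 0
for x in rows:
    for y in rows:
        p = x["A1_time"]
        if (p > y["A1_time"]) and ((p < y["A2_time"]) or (p < y["A3_time"])):
            loop_total += 1

# Generator version: one 1 per matching pair, summed lazily.
total = sum(1 for x, y in product(rows, rows) if hit(x, y))

# One counter per row x, in case a separate W per row is what is wanted.
per_row = [sum(1 for y in rows if hit(x, y)) for x in rows]

assert loop_total == total == sum(per_row)
```

The `per_row` list corresponds to resetting W for each outer row, as the question's code does; the single `total` corresponds to the cardinality of the set of pairs.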

rschwieb
  • I added to the code a 'simplified' example of the actual problem. I hope it makes it more clear. What I am trying to do is loop over each row in the dataframe. Each time, I will compare the values of one row to all other rows, and IF the value of the current row is less than the value of any row, I increment the counter by 1. – O.Mohsen Jul 26 '20 at 06:23
  • @O.Mohsen Answer still doesn't change. I added how to use what I said for the newest example. – rschwieb Jul 26 '20 at 13:12
  • what you mentioned here is useful |{(x,y)∈DF1×DF1:x.a1time>y.a1time and x.a1time<max(y.a2time,y.a3time)}| – O.Mohsen Jul 28 '20 at 03:35
  • In a similar manner, how can I include (mathematical notation) a condition where I need to increment a variable, say L = L + someValue, instead of a counter? This someValue is another attribute in the dataframe DF1. – O.Mohsen Jul 28 '20 at 03:38
  • @O.Mohsen I would just define a function $L$ such that $L(x)$ is the increment you want, and then say $\sum \{L(x) : x\in X \text{ and } P(x) \text{ holds}\}$ where $P$ is whatever predicate you want $x$ to satisfy. – rschwieb Jul 28 '20 at 14:32
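
The weighted version from the last comment can be sketched the same way. The data, the `time` and `value` keys, and the particular predicate are all assumptions made up for illustration. The sum runs over the elements $x$, not over the distinct values $L(x)$, so equal increments coming from different rows are each added.

```python
# Hypothetical elements of X: 'value' plays the role of someValue.
X = [
    {"time": 1, "value": 2.5},
    {"time": 4, "value": 1.0},
    {"time": 6, "value": 4.0},
    {"time": 9, "value": 0.5},
]


def P(x):
    """Example predicate: keep elements whose time lies strictly between 2 and 8."""
    return 2 < x["time"] < 8


def L(x):
    """Increment contributed by element x."""
    return x["value"]


# Procedural version: L = L + someValue whenever the condition holds.
acc = 0.0
for x in X:
    if P(x):
        acc = acc + L(x)

# Formula version: sum of L(x) over all x in X satisfying P.
total = sum(L(x) for x in X if P(x))

assert acc == total
```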