Simplifying the deleted question: given two large distinct prefix bytestrings $p_0$ and $p_1$, it asks for two suffixes $s_0$ and $s_1$ such that $H(p_0\mathbin\|s_0)=H(p_1\mathbin\|s_1)$ for some $b=80$-bit cryptographic hash $H$ (obtained by truncating SHA-3, but that's immaterial). I concluded:
The number of hashes to compute is in the order of $2^{42}$.
The standard technique for this is to define a function
$$\begin{align}f:\{0,1\}^b&\to\{0,1\}^b\\
x\quad&\mapsto\begin{cases}H(p_0\mathbin\|r_0\mathbin\|x)&\text{if the low-order bit of }x\text{ is }0\\
H(p_1\mathbin\|r_1\mathbin\|x)&\text{otherwise}\end{cases}\end{align}$$
where $r_0$ and $r_1$ are short fixed bytestrings whose lengths are chosen to align $x$ so that repeated evaluations of $f$ are as fast as possible.
Observe that if we get $x_0$ and $x_1$ differing in their low-order bit and with $f(x_0)=f(x_1)$, that yields the desired suffixes $s_0=r_0\mathbin\|x_0$ and $s_1=r_1\mathbin\|x_1$.
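A minimal sketch of such an $f$, with hypothetical prefixes and pads standing in for the real $p_0$, $p_1$, $r_0$, $r_1$ from the problem:

```python
import hashlib

B = 80  # hash output size in bits, as in the question

# Hypothetical placeholders; the real prefixes come from the problem,
# and the pads would be sized for alignment as discussed above.
p0, p1 = b"prefix-zero", b"prefix-one"
r0, r1 = b"\x00" * 2, b"\x00" * 2

def H(msg: bytes) -> int:
    """80-bit hash obtained by truncating SHA3-256."""
    return int.from_bytes(hashlib.sha3_256(msg).digest()[:B // 8], "big")

def f(x: int) -> int:
    """Branch on the low-order bit of x, per the definition above."""
    xb = x.to_bytes(B // 8, "big")
    if x & 1 == 0:
        return H(p0 + r0 + xb)
    return H(p1 + r1 + xb)
```

Any collision $f(x_0)=f(x_1)$ with $x_0$, $x_1$ differing in their low-order bit then translates directly into the desired suffixes.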
If we evaluate $f$ on incremental inputs $x$ and look for all exploitable collisions, then for $2n$ hashes ($n$ with each value of the low-order bit), the probability of collision is slightly below $1-(1-2^{-b})^{n^2}\approx1-e^{-n^2/2^b}$. Thus for $n=2^{b/2}$, that is $2^{b/2+1}$ hashes, and $b$ of cryptographic interest, the probability is about 63%. It's only 22% for $2^{b/2}$ hashes. That's lower than for plain hash collisions at the same number of hashes, because collisions between inputs with the same low-order bit are worthless, and about half of all collisions fall into that category.
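These percentages follow from the birthday approximation $1-e^{-n^2/2^b}$; a quick numerical check:

```python
from math import exp

b = 80
# 2n hashes total, n with each value of the low-order bit; the
# exploitable-collision probability is about 1 - exp(-n^2 / 2^b).
for label, n in (("2^(b/2+1) hashes:", 2**(b // 2)),
                 ("2^(b/2) hashes:  ", 2**(b // 2 - 1))):
    p = 1 - exp(-(n * n) / 2**b)
    print(label, f"{p:.0%}")  # prints 63% and 22% respectively
```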
We can now answer
shouldn't it be $2^{40}$ considering the root of $2^{80}$ ?
by: No. Even with the best possible collision-detection technique, it's rather unlikely we can solve the problem with $2^{40}$ hashes. We can be cautiously optimistic with $2^{41}$.
Further, that exact strategy would require $O(2^{b/2}\,b)$ memory, which is impractically large; and much of its execution time would be spent searching that memory. The simplest solution to that issue is Floyd's cycle-finding algorithm, which reduces the memory requirement to a negligible $O(b)$, but markedly increases the expected number of evaluations of $f$. I don't know the exact factor, which depends on strategy; perhaps $3$, which would put us above $2^{42}$.
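A minimal sketch of Floyd's method specialized to collision finding (generic in $f$; for our problem $f$ would be the function above, the trail's starting point is assumed to lie off the cycle, and the returned pair is usable only if the two inputs differ in their low-order bit):

```python
def floyd_collision(f, x0):
    """Find distinct a, b with f(a) == f(b), using O(1) memory.

    Assumes the trail from x0 enters a cycle after a non-empty tail;
    returns None if x0 already lies on the cycle."""
    # Phase 1: tortoise/hare race until they meet inside the cycle.
    tort, hare = f(x0), f(f(x0))
    while tort != hare:
        tort, hare = f(tort), f(f(hare))
    # Phase 2: restart the tortoise from x0; stepping both in lockstep,
    # they first coincide at the cycle entry, so the step just before
    # that gives two distinct preimages of the same value.
    tort = x0
    if tort == hare:
        return None  # x0 is on the cycle: no tail, no collision
    while f(tort) != f(hare):
        tort, hare = f(tort), f(hare)
    return tort, hare
```

For example, for the toy function mapping $0\to1\to2\to3\to1$, starting from $0$ it returns the pair $(0,3)$, both of which map to $1$.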
There are a number of improvements to cycle-finding. One key idea is distinguished points: we define $g$ as iterating $f$ until the output has some distinguishing characteristic (e.g. its top 24 bits are zero), and try to find collisions in $g$ for inputs with that characteristic. When we get one, only a little work remains to find a collision for $f$, and it has a 50% chance of being usable (the colliding inputs must differ in their low-order bit). That enables a time/memory trade-off which should get us to roughly $2^{41.5}$.
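A hedged sketch of the distinguished-point idea (names `iterate_to_distinguished` and `dp_collision_search` are mine; only trail endpoints are stored, so memory scales with the number of trails, not with the number of hashes):

```python
def iterate_to_distinguished(f, x, mask):
    """Iterate f from x until the output is a distinguished point
    (here: the bits under `mask` are all zero). Returns (endpoint, steps)."""
    steps = 0
    while True:
        x = f(x)
        steps += 1
        if x & mask == 0:
            return x, steps

def dp_collision_search(f, seeds, mask):
    """Store only endpoint -> (seed, trail length); when two trails share
    an endpoint, replay them in lockstep to locate the collision in f."""
    table = {}
    for seed in seeds:
        end, steps = iterate_to_distinguished(f, seed, mask)
        if end in table:
            seed2, steps2 = table[end]
            # Align the longer trail so both are equally far from the end.
            a, b = seed, seed2
            for _ in range(max(steps - steps2, 0)):
                a = f(a)
            for _ in range(max(steps2 - steps, 0)):
                b = f(b)
            if a == b:
                continue  # one trail is a suffix of the other: no collision
            while f(a) != f(b):
                a, b = f(a), f(b)
            return a, b
        table[end] = (seed, steps)
    return None
```

With the real $f$, a returned pair still needs the 50% low-order-bit check before it yields the suffixes.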
There's also the issue of distributing the work over several machines or execution units of a GPU, which is a necessity if we use SHA-3 and want the result fast. The trade-off becomes one between work and communication, and sometimes we are willing to compute sizably more hashes than the minimum, say on the order of $2^{b/2+2}$. The standard reference is Paul C. van Oorschot & Michael J. Wiener's *Parallel Collision Search with Cryptanalytic Applications*, in the Journal of Cryptology, 1999.