We write it in terms of the number of bits because it accurately models the reality of how computers work.
Computers don't natively support addition of arbitrarily sized numbers in one time step. Instead, they break such operations into additions of constant-sized pieces (say single bits, for simplicity), and then use these as the basis for an algorithm that adds larger numbers. Essentially, to measure the time complexity of an algorithm, we need to state which operations "cost" one time step, and any choice we make should be informed by how computers are built in practice.
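To make this concrete, here's a minimal sketch (in Python, with names of my own choosing) of adding two numbers given as bit lists, using one constant-cost step per bit position. The point is that the work grows with the bit length instead of being a single unit-cost operation.

```python
def add_bits(a, b):
    """Add two numbers given as lists of bits (least significant bit first)."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))   # pad both inputs to the same length
    b = b + [0] * (n - len(b))
    result, carry = [], 0
    for i in range(n):           # one constant-cost step per bit position
        total = a[i] + b[i] + carry
        result.append(total & 1)
        carry = total >> 1
    if carry:
        result.append(carry)
    return result

# 13 is binary 1101 and 11 is binary 1011; LSB-first they are [1,0,1,1] and [1,1,0,1].
# Their sum 24 is binary 11000, i.e. [0,0,0,1,1] LSB-first.
print(add_bits([1, 0, 1, 1], [1, 1, 0, 1]))  # [0, 0, 0, 1, 1]
```

Adding two $n$-bit numbers this way takes $O(n)$ bit operations, which is exactly the cost model used in the rest of this answer.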
You're also slightly misunderstanding the complexity of sorting. Algorithms like quicksort let you sort an $n$-element array using $O(n\log n)$ comparisons. But how much does each comparison cost? If all elements of the array are assumed to be constant-sized, each comparison is $O(1)$, so the overall time complexity is $O(n\log n)$.
But what if the size of the elements grows with the array itself? If you have an array of $n$ elements, each of which is an $n$-bit number, quicksort still uses $O(n\log n)$ comparisons. How much does each comparison cost, though? In terms of bit operations, it should be $O(n)$ time. So the complexity of sorting this array is:
$$
O(n\log n) \text{ comparisons} \times O(n) \frac{\text{time}}{\text{comparison}} = O(n^2\log n)\text{ time}
$$
So sorting this array is actually $O(n^2\log n)$ time.
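To see where the extra factor comes from, here's a rough sketch (the comparator and example are my own, not any particular library's): comparing two $n$-bit numbers stored as bit lists walks over up to $n$ bits, so each comparison costs $O(n)$, and a comparison sort then multiplies that by its $O(n\log n)$ comparison count.

```python
from functools import cmp_to_key

def compare_bits(a, b):
    """Compare two equal-length MSB-first bit lists; return -1, 0, or 1."""
    for x, y in zip(a, b):       # up to n constant-cost steps per comparison
        if x != y:
            return -1 if x < y else 1
    return 0

# Three 4-bit numbers: 10 (1010), 3 (0011), 12 (1100), written MSB-first.
nums = [[1, 0, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0]]
print(sorted(nums, key=cmp_to_key(compare_bits)))
# [[0, 0, 1, 1], [1, 0, 1, 0], [1, 1, 0, 0]]  i.e. 3, 10, 12
```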
In your example, you're right that it takes $O(p)$ trials. But if $p$ is a $k$-bit number, so that $p$ can be as large as $2^k$, why should this be considered "exponential"?
Consider just the most basic part of your algorithm: setting a counter $i = 0$ and incrementing it all the way up to $p$. How many bit operations does this simple part of the algorithm take? While this may sound like a dumb question, actually designing such an algorithm and evaluating its cost (in terms of bit operations) may be useful for your understanding.
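If it helps, here's one possible such experiment (a sketch under a simplified cost model where each single-bit flip counts as one step; `count_to` is just a name I made up). It increments a binary counter bit by bit from $0$ up to $p$ and tallies the bit operations:

```python
def count_to(p):
    """Count from 0 up to p with bit-level increments; return total bit flips."""
    counter = [0] * (p.bit_length() + 1)   # LSB-first bit array
    steps = 0
    for _ in range(p):                     # p increments in total
        i = 0
        while counter[i] == 1:             # carry: flip trailing 1s to 0
            counter[i] = 0
            i += 1
            steps += 1
        counter[i] = 1                     # set the first 0 bit
        steps += 1
    return steps

for k in range(4, 21, 4):
    p = 2 ** k
    print(f"k = {k:2d}, p = 2^{k}: {count_to(p)} bit flips")
```

Even though each individual increment is cheap, the total number of bit flips comes out to roughly $2p$, which is exponential in the number of bits of $p$. That is precisely why an algorithm that does $O(p)$ work on a $k$-bit input is called exponential-time.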