
I need to sort a large number of rows (1 GB to 3 GB of data) by EpochTime (a single value in each row).

What is the fastest in-place sorting algorithm for this task? Radix Sort?

I would like the fastest sorting algorithm, but I think I might hit the memory limit, so I would prefer an in-place algorithm.

EpochTime values should be unique (I haven't checked). The minimum and maximum values span a range of about 4 hours, so they are fairly close together.

I'm asking because I think some sorting algorithms could be optimized for this case, since the range of the epoch times is largely known in advance. I looked at Counting Sort (after all, these are integers), but it didn't look like the best choice.

Tizianoreica
  • OK, what about the minimal difference between values? Are these values not equidistributed (the same difference between them)? Is the 1-3 GB the whole data or the timestamps only? – Evil Apr 12 '16 at 16:22
  • What research have you done? There is lots written about sorting algorithms. Have you read Wikipedia? Have you browsed through [tag:sorting] on this site? This question seems like a duplicate of http://cs.stackexchange.com/q/18536/755 -- is there any reason why what you are asking is different? If so, you should edit the question. We expect you to do a significant amount of research before asking, and show us your research in the question. – D.W. Apr 12 '16 at 17:20
  • Can we assume the data fits into RAM? Then many algorithms apply. If not, things become more interesting (cc @D.W. -- the question you link is for in-memory only, I guess). – Raphael Apr 13 '16 at 12:08
  • Yes @Raphael. We can assume all the data fits into RAM, but the sort must be in-place. I think that if we start using an auxiliary structure (as Radix Sort does), it could hit the memory limit. – Tizianoreica Apr 13 '16 at 12:16
  • So, what's wrong with e.g. Quicksort? If you take care of worst-case events (cf. the literature), it takes only logarithmic additional memory for the stack. – Raphael Apr 13 '16 at 12:48
  • @Raphael you're right, but Radix Sort could be linear, or at least $O(n \lg k)$, and with $k$ much less than $n$ that helps a lot. – Tizianoreica Apr 13 '16 at 15:15
  • The "linear" running time of Radixsort is a lie. The running time of radixsort is proportional to the number of items, times the bitwidth per item: $O(nb)$. In realistic problems, $b$ is comparable to $\lg n$, so this is not a big asymptotic win over Quicksort's $O(n \lg n)$. – D.W. Apr 13 '16 at 19:25
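
To make the $O(nb)$ point in the comments concrete, here is a minimal sketch of an in-place binary MSD radix sort (radix exchange sort). It is only an illustration, not a method proposed by anyone in this thread. It assumes the rows sit in a Python list and that each key is an integer; the function name `radix_exchange_sort` and the `key` parameter are hypothetical. Subtracting the minimum key means $b$ covers only the bits that actually vary: if the timestamps were in seconds (an assumption, the question does not state the unit), a 4-hour range of 14400 values would need $b = 14$ bits. The only extra memory is a stack of at most about $b$ pending ranges. Note that the sort is not stable.

```python
def radix_exchange_sort(rows, key=lambda r: r):
    """Sort rows in place by an integer key using binary MSD radix sort."""
    if len(rows) < 2:
        return
    lo_key = min(key(r) for r in rows)
    hi_key = max(key(r) for r in rows)
    # b: number of bits needed for the key range, not for the full epoch value.
    bits = (hi_key - lo_key).bit_length()

    # Explicit stack of (start, end, bit) ranges; end is exclusive.
    stack = [(0, len(rows), bits - 1)]
    while stack:
        start, end, bit = stack.pop()
        if end - start < 2 or bit < 0:
            continue
        mask = 1 << bit
        i, j = start, end - 1
        # Partition: keys with this bit clear go left, keys with it set go right.
        while i <= j:
            if (key(rows[i]) - lo_key) & mask:
                rows[i], rows[j] = rows[j], rows[i]
                j -= 1
            else:
                i += 1
        # rows[start:i] have the bit clear, rows[i:end] have it set.
        stack.append((start, i, bit - 1))
        stack.append((i, end, bit - 1))
```

Each bit level touches every row at most once, so the work is proportional to $n \cdot b$; whether that beats an in-place Quicksort in practice depends on how $b$ compares with $\lg n$ for the actual data, which would have to be measured.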
