          flash_sort (another hybrid algorithm), by comparison with spreadsort,
          is 𝑶(N) for evenly distributed lists. The problem is that flash_sort
          is merely an MSD radix sort combined with 𝑶(N*N) insertion sort to
          deal with the small subsets where MSD radix sorting is inefficient,
          so it is inefficient with chunks of data around the size at which it
          switches to insertion_sort, and otherwise ends up operating as an
          enhanced MSD radix sort. For uneven distributions this makes it
          especially inefficient.
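The small-chunk weakness described above can be seen in a minimal sketch of the same structure: an MSD radix sort that hands small buckets to 𝑶(N*N) insertion sort. This is an illustration of the technique, not flash_sort itself; the 8-bit digit width and the FALLBACK threshold are assumed values chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed threshold for switching to the comparison-based fallback.
constexpr std::size_t FALLBACK = 16;

// O(N*N) comparison-based fallback for small buckets.
void insertion_sort(std::vector<std::uint32_t>& v, std::size_t lo, std::size_t hi) {
    for (std::size_t i = lo + 1; i < hi; ++i) {
        std::uint32_t key = v[i];
        std::size_t j = i;
        while (j > lo && v[j - 1] > key) { v[j] = v[j - 1]; --j; }
        v[j] = key;
    }
}

// MSD radix sort on 8-bit digits; shift selects the current digit.
void msd_radix_sort(std::vector<std::uint32_t>& v, std::size_t lo,
                    std::size_t hi, int shift) {
    if (hi - lo <= FALLBACK || shift < 0) {  // small chunk: fall back
        insertion_sort(v, lo, hi);
        return;
    }
    // Count the 256 buckets for the current digit.
    std::size_t count[256] = {};
    for (std::size_t i = lo; i < hi; ++i) ++count[(v[i] >> shift) & 0xFF];
    std::size_t start[256];
    std::size_t pos = lo;
    for (int b = 0; b < 256; ++b) { start[b] = pos; pos += count[b]; }
    // Scatter into bucket order via a temporary copy.
    std::vector<std::uint32_t> tmp(v.begin() + lo, v.begin() + hi);
    std::size_t next[256];
    std::copy(std::begin(start), std::end(start), next);
    for (std::uint32_t x : tmp) v[next[(x >> shift) & 0xFF]++] = x;
    // Recurse on each bucket with the next digit.
    for (int b = 0; b < 256; ++b)
        if (count[b] > 1)
            msd_radix_sort(v, start[b], start[b] + count[b], shift - 8);
}
```

Calling `msd_radix_sort(v, 0, v.size(), 24)` on 32-bit keys starts at the top digit; any bucket at or below FALLBACK elements is finished by insertion sort, which is exactly the regime the text above identifies as the weak spot when many buckets land just around that size.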
        
          flash_sort's 𝑶(N) performance for even distributions comes at the
          cost of cache misses, which on modern architectures are extremely
          expensive; in testing on modern systems it ends up slower than
          cutting the data up in multiple, cache-friendly steps, as
          integer_sort and float_sort do. Also worth noting is that on most
          modern computers,
          modern computers, log2(available
          RAM)/log2(L1 cache
          size)
          is around 3, a cache miss takes more than 3 times as long as an in-cache
          random-access, and the size of max_splits is tuned
          to the size of the cache. On a computer where cache misses aren't this
          expensive, max_splits could be increased to a large
          value, or eliminated entirely, and integer_sort/float_sort
          would have the same 𝑶(N) performance on even distributions.
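A back-of-envelope sketch of this tuning argument, using assumed example sizes (8 GiB of RAM, a 32 KiB L1 data cache) rather than measured ones:

```cpp
#include <cmath>
#include <cstddef>

// Ratio of log2(available RAM) to log2(L1 cache size): roughly how many
// cache-sized splitting steps it takes to get from RAM-sized to
// cache-sized chunks of data.
double cache_depth_ratio(double ram_bytes, double l1_cache_bytes) {
    return std::log2(ram_bytes) / std::log2(l1_cache_bytes);
}

// max_splits keeps one radix pass's bookkeeping cache-resident:
// with max_splits = 11, there are 2^11 = 2048 bins, and one counter
// per bin stays comfortably inside a 32 KiB L1 cache.
std::size_t bin_table_bytes(int max_splits) {
    return (std::size_t{1} << max_splits) * sizeof(std::size_t);
}
```

For 8 GiB of RAM and a 32 KiB L1 cache the ratio is 33/15, a bit over 2; older or smaller caches push it toward the 3 quoted above. The point of the sketch is only that a handful of cache-friendly passes suffices, so paying a cache miss per element (more than 3 times the cost of an in-cache access) to finish in one pass is a bad trade.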
        
          Adaptive Left Radix (ALR) is similar to flash_sort, but more
          cache-friendly. Because ALR uses 𝑶(N*N) insertion_sort as its
          fallback, it isn't efficient to use that comparison-based fallback
          on large lists, and if the data is clustered in small chunks just
          over the fallback size with a few outliers, radix-based sorting
          iterates many times, doing little sorting with high overhead.
          Asymptotically, ALR is still 𝑶(N*log(K/S + S)), but with a very
          small S (about 2 in the worst case), which compares unfavorably with
          the 11 default value of max_splits for Spreadsort.
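A rough way to see why a small S hurts: with K key bits and S bits consumed per radix pass, fully splitting the key range takes about ceil(K/S) passes, each with per-pass overhead. The helper below is hypothetical arithmetic for illustration, not ALR or Spreadsort code:

```cpp
#include <cassert>

// Number of radix passes needed to consume key_bits of key,
// taking bits_per_pass bits (the S above) in each pass.
int radix_passes(int key_bits, int bits_per_pass) {
    return (key_bits + bits_per_pass - 1) / bits_per_pass;  // ceil division
}
```

On 32-bit keys, S of about 2 (the ALR worst case above) means up to 16 high-overhead passes, while Spreadsort's default max_splits of 11 needs only 3.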
        
          ALR also does not have the 𝑶(N*log(N)) fallback, so
          for small lists that are not evenly distributed it is extremely inefficient.
          See the alrbreaker and
          binaryalrbreaker testcases
          for examples; either replace the call to sort with a call to ALR and update
          the ALR_THRESHOLD at the top, or as a quick comparison make get_max_count return
          ALR_THRESHOLD (20 by default
          based upon the paper). These small tests take 4-10 times as long with ALR
          as std::sort
          in the author's testing, depending on the test system, because they are
          trying to sort a highly uneven distribution. Normal Spreadsort does much
          better with them, because get_max_count
          is designed around minimizing worst-case runtime.
        
          burst_sort is an efficient
          hybrid algorithm for strings that uses substantial additional memory.
        
          string_sort, by comparison, uses minimal additional memory.
          postal_sort and string_sort are similar, but no direct comparison
          was possible, as postal_sort was not found in a search for source.
        
          Another difference is not applying the stack-size restriction,
          because the equality check in string_sort keeps recursion depth
          bounded in practice. In testing (on std::strings), the radix portion
          of string_sort on its own had comparable runtime to introsort, but
          making a hybrid of the two allows reduced overhead and substantially
          superior performance.
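The hybrid idea can be sketched as an MSD character-bucketing pass that hands small buckets to comparison-based std::sort, skipping the already-equal prefix in the fallback comparison. This is a simplified illustration, not Boost's string_sort; the FALLBACK threshold and the per-bucket storage are assumptions made for the example:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumed fallback size; larger than a typical pure-radix cutoff so the
// comparison sort absorbs the awkward small buckets.
constexpr std::size_t FALLBACK = 256;

// Sort v[lo, hi), where all strings already share an equal prefix of
// `depth` characters.
void hybrid_string_sort(std::vector<std::string>& v, std::size_t lo,
                        std::size_t hi, std::size_t depth) {
    if (hi - lo <= FALLBACK) {
        // Comparison-based fallback; compare only the unsorted tails,
        // since the first `depth` characters are equal in this bucket.
        std::sort(v.begin() + lo, v.begin() + hi,
                  [depth](const std::string& a, const std::string& b) {
                      return a.compare(depth, std::string::npos,
                                       b, depth, std::string::npos) < 0;
                  });
        return;
    }
    // Bucket by the character at `depth`; bucket 0 holds strings that
    // have already ended.
    std::vector<std::vector<std::string>> buckets(257);
    for (std::size_t i = lo; i < hi; ++i) {
        std::string& s = v[i];
        std::size_t b = depth < s.size()
                            ? 1 + static_cast<unsigned char>(s[depth])
                            : 0;
        buckets[b].push_back(std::move(s));
    }
    // Write buckets back in order and recurse one character deeper.
    std::size_t pos = lo;
    for (std::size_t b = 0; b < buckets.size(); ++b) {
        std::size_t start = pos;
        for (std::string& s : buckets[b]) v[pos++] = std::move(s);
        if (b > 0 && pos - start > 1)
            hybrid_string_sort(v, start, pos, depth + 1);
    }
}
```

The tail-only comparison in the fallback is where the reduced overhead comes from: the radix passes have already established the common prefix, so the comparison sort never re-examines it.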