<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<title>Hybrid Radix</title>
<link rel="stylesheet" href="../../../../../../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.78.1">
<link rel="home" href="../../../index.html" title="Boost.Sort">
<link rel="up" href="../rationale.html" title="Rationale">
<link rel="prev" href="../rationale.html" title="Rationale">
<link rel="next" href="why_spreadsort.html" title="Why spreadsort?">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../boost.png"></td>
<td align="center"><a href="../../../../../../../index.html">Home</a></td>
<td align="center"><a href="../../../../../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../../../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="../rationale.html"><img src="../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../rationale.html"><img src="../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../index.html"><img src="../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="why_spreadsort.html"><img src="../../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="sort.sort_hpp.rationale.hybrid_radix"></a><a class="link" href="hybrid_radix.html" title="Hybrid Radix">Hybrid Radix</a>
</h4></div></div></div>
<p>
There are two primary types of radix-based sorting:
</p>
<p>
Most-significant-digit Radix sorting (MSD) divides the data recursively
based upon the top-most unsorted bits. This approach is efficient for even
distributions that divide nicely, and can be done in-place (limited additional
memory used). There is substantial constant overhead for each iteration
to deal with the splitting structure. The algorithms provided here use
MSD Radix Sort for their radix-sorting portion. The main disadvantage of
MSD Radix sorting is that when the data is cut up into small pieces, the
overhead for additional recursive calls starts to dominate runtime, and
this makes worst-case behavior substantially worse than <span class="emphasis"><em>&#119926;(N*log(N))</em></span>.
</p>
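<p>
The recursive splitting described above can be illustrated with a minimal sketch. It is not the library's implementation: it assumes 32-bit unsigned keys, examines only one bit per level (real MSD radix sorts usually split on several bits at once), and the names <code class="computeroutput">msd_radix_sort</code> and <code class="computeroutput">msd_radix_sort_bits</code> are hypothetical.
</p>

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using iter = std::vector<std::uint32_t>::iterator;

// Hypothetical helper (not part of Boost.Sort): sorts [first, last) by
// splitting on one bit per recursion level, most significant bit first.
inline void msd_radix_sort_bits(iter first, iter last, int bit)
{
    // Nothing to do for pieces of size 0 or 1, or once every bit is used.
    if (last - first < 2 || bit < 0)
        return;

    const std::uint32_t mask = std::uint32_t(1) << bit;

    // In-place split: keys with the current bit clear come first.
    iter middle = std::partition(first, last,
        [mask](std::uint32_t x) { return (x & mask) == 0; });

    // Each half is now sorted on the remaining, less significant bits.
    msd_radix_sort_bits(first, middle, bit - 1);
    msd_radix_sort_bits(middle, last, bit - 1);
}

// Sorts 32-bit unsigned keys in ascending order without a second buffer.
inline void msd_radix_sort(std::vector<std::uint32_t>& data)
{
    msd_radix_sort_bits(data.begin(), data.end(), 31);
}
```

<p>
Note how the recursion continues even when a piece holds only a handful of elements; that per-call overhead on small pieces is the disadvantage discussed above.
</p>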
<p>
By contrast, <code class="literal"><code class="computeroutput"><a class="link" href="../../../boost/sort/spreadsort/integer_sort_idp41299456.html" title="Function template integer_sort">integer_sort</a></code></code>,
<code class="literal"><code class="computeroutput"><a class="link" href="../../../boost/sort/spreadsort/float_sort_idp47034528.html" title="Function template float_sort">float_sort</a></code></code>,
and <code class="literal"><code class="computeroutput"><a class="link" href="../../../boost/sort/spreadsort/string_sort_idp48004640.html" title="Function template string_sort">string_sort</a></code></code>
all check to see whether Radix-based or Comparison-based sorting has better
worst-case runtime, and make the appropriate recursive call. Because Comparison-based
sorting algorithms are efficient on small pieces, the tendency of MSD
<a href="http://en.wikipedia.org/wiki/Radix_sort" target="_top">radix sort</a>
to cut the problem up small is turned into an advantage by these hybrid
sorts. It is hard to conceive of a common use case where pure MSD <a href="http://en.wikipedia.org/wiki/Radix_sort" target="_top">radix sort</a> would
have any significant advantage over hybrid algorithms.
</p>
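<p>
The hybrid idea can be sketched as follows. This is a simplified illustration, not the library's code: the library chooses between the two approaches by comparing worst-case runtimes, whereas this sketch assumes a fixed, hypothetical size cutoff (<code class="computeroutput">small_piece_threshold</code>) and a one-bit-per-level split. The function names are likewise hypothetical.
</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using iter = std::vector<std::uint32_t>::iterator;

// Hypothetical cutoff chosen for illustration only; the library's real
// decision is based on estimated worst-case runtime, not a fixed size.
constexpr std::ptrdiff_t small_piece_threshold = 64;

// Hypothetical helper: MSD radix split on large pieces, comparison-based
// sorting on small ones.
inline void hybrid_sort_bits(iter first, iter last, int bit)
{
    if (last - first < 2 || bit < 0)
        return;

    // Small pieces: hand over to a comparison-based sort, which is
    // efficient here and avoids further recursive radix calls.
    if (last - first <= small_piece_threshold) {
        std::sort(first, last);
        return;
    }

    // Large pieces: split in place on the current most significant bit.
    const std::uint32_t mask = std::uint32_t(1) << bit;
    iter middle = std::partition(first, last,
        [mask](std::uint32_t x) { return (x & mask) == 0; });

    hybrid_sort_bits(first, middle, bit - 1);
    hybrid_sort_bits(middle, last, bit - 1);
}

inline void hybrid_sort(std::vector<std::uint32_t>& data)
{
    hybrid_sort_bits(data.begin(), data.end(), 31);
}
```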
<p>
Least-significant-digit <a href="http://en.wikipedia.org/wiki/Radix_sort" target="_top">radix
sort</a> (LSD) sorts based upon the least-significant bits first. This
requires a complete copy of the data being sorted, using substantial additional
memory. The main advantage of LSD Radix Sort is that aside from some constant
overhead and the memory allocation, it uses a fixed amount of time per
element to sort, regardless of distribution or size of the list. This amount
of time is proportional to the length of the key (the number of radix digits
it contains). The other advantage of LSD Radix Sort is that it is a stable
sorting algorithm, so elements with the same key will retain their original order.
</p>
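<p>
A minimal LSD sketch makes these properties concrete. It assumes 32-bit unsigned keys processed as four 8-bit digits; the <code class="computeroutput">record</code> type and the <code class="computeroutput">lsd_radix_sort</code> name are hypothetical and exist only for this illustration. The buffer is the complete copy of the data, and the four passes run regardless of how the keys are distributed.
</p>

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical element type: the tag is a payload used only to observe
// that elements with equal keys keep their original order.
struct record {
    std::uint32_t key;
    char tag;
};

// Sorts by key in ascending order using four counting passes,
// least significant 8-bit digit first.
inline void lsd_radix_sort(std::vector<record>& data)
{
    std::vector<record> buffer(data.size());  // complete copy of the input

    for (int shift = 0; shift < 32; shift += 8) {
        // offsets[d + 1] first counts the elements whose digit is d.
        std::array<std::size_t, 257> offsets = {};
        for (const record& r : data)
            ++offsets[((r.key >> shift) & 0xFF) + 1];

        // Prefix sum: offsets[d] becomes the first output slot for digit d.
        for (std::size_t i = 1; i < offsets.size(); ++i)
            offsets[i] += offsets[i - 1];

        // Scatter in input order, which is what keeps the sort stable.
        for (const record& r : data)
            buffer[offsets[(r.key >> shift) & 0xFF]++] = r;

        data.swap(buffer);
    }
}
```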
<p>
One disadvantage is that LSD Radix Sort takes the same amount of time on
"easy" sorting problems as on "hard" ones. That time may
exceed what an efficient <span class="emphasis"><em>&#119926;(N*log(N))</em></span>
algorithm such as <a href="http://en.wikipedia.org/wiki/Introsort" target="_top">introsort</a>
spends on "hard" problems with large data sets, depending
on the length of the datatype and the relative speed of comparisons, memory
allocation, and random accesses.
</p>
<p>
The other main disadvantage of LSD Radix Sort is its memory overhead. It's
only faster for large data sets, but large data sets use significant memory,
which LSD Radix Sort needs to duplicate. LSD Radix Sort doesn't make sense
for items of variable length, such as strings; it could be implemented
by starting at the end of the longest element, but would be extremely inefficient.
</p>
<p>
All that said, there are places where LSD Radix Sort is the fastest and
most suitable solution, so it would be worthwhile to create a templated
LSD Radix Sort similar to <code class="literal"><code class="computeroutput"><a class="link" href="../../../boost/sort/spreadsort/integer_sort_idp41299456.html" title="Function template integer_sort">integer_sort</a></code></code>
and <code class="literal"><code class="computeroutput"><a class="link" href="../../../boost/sort/spreadsort/float_sort_idp47034528.html" title="Function template float_sort">float_sort</a></code></code>.
This would be most useful in cases where comparisons are extremely
slow.
</p>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright &#169; 2014 Steven Ross<p>
Distributed under the <a href="http://boost.org/LICENSE_1_0.txt" target="_top">Boost
Software License, Version 1.0</a>.
</p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="../rationale.html"><img src="../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../rationale.html"><img src="../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../index.html"><img src="../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="why_spreadsort.html"><img src="../../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
</body>
</html>