Tutorials Books Videos Forums

Change the theme! Search!
Rambo ftw!

Customize Theme


Color

Background


Done

Table of Contents

Radix Sort

by kirupa   |   filed under Data Structures and Algorithms

Earlier, we looked at Counting Sort, our first non-comparative sorting algorithm that doesn't use comparisons to figure out how to sort values. While counting sort runs in linear time under many conditions, it suffers from one major problem. Counting sort is hugely impractical when the range of the input number (k) is significantly larger than the number of elements (n). For example, imagine that we are sorting only two numbers, 1 and 1,000,000. Our n, in this case, is 2, for we are just sorting two items. The k will be on the order of one million. Loosely visualized, this means that counting sort will create and iterate over intermediate arrays that span a million values for sorting two items:

We can see how this is hugely problematic in the many situations where our smallest and largest numbers aren't within a reasonably narrow range of each other. To address this problem, we have radix sort. Radix sort is another non-comparative sorting algorithm that uses a different technique for sorting numbers. This allows radix sort to easily bypass the sorts of problems that counting sort would face when sorting numbers that are really far apart. Sounds neat, right? In the following sections, we'll walk through how radix sort will sort numbers and dive deep into the details of how it works, the JavaScript code for bringing it to life, and wrap things up with a look at its performance characteristics.

Onwards!

How Radix Sort Works

To understand how radix sort works, we'll start with an unsorted collection of numbers and walk through how we can sort them. Note that the collection has to be numbers (no strings, objects, emojis, etc.) that are in integer form. With that constraint in mind, our starting numbers are as follows:

The first thing we are doing to do is find the largest number that we need to sort. For our example, that would be 2310:

What we are doing with this value is trying to find out how many digits it has. For 2310, the number of digits would be four:

This is an important detail, for the way radix sort works heavily revolves around the number of digits present in the largest number. If we had to describe radix sort's behavior, it would be:

  1. Find the maximum number of digits in the largest number that needs to be sorted.
  2. Start with the least significant digit (the rightmost digit) and sort the numbers based on that digit.
  3. Move to the next digit place (tens, hundreds, etc.) and sort the array based on that digit while maintaining the order from the previous step.
  4. Repeat the process until you have sorted based on the most significant digit (the leftmost digit). This digit would correspond to the highest digit found in step 1.

If these three steps aren't totally clear, don't worry. We are going to walk through them and look at how this all ties together. Our starting point is right after we determined that our largest number (2310) has four digits.

Sorting by the Ones (Units) Digit

We start our sort with the least significant (aka rightmost) digit, and that is going to be our ones or units digit. Let's go ahead and mark those numbers:

What we do next is sort all of our items based on the value from the ones digit. This would cause the items in our array to be rearranged as follows:

The detail to keep in mind is that all of our items in our array aren't sorted in their final positions. They are only sorted with respect to the ones digit, and that is the sorting order we see currently.

Sorting by the Tens Digit

With our values sorted by their ones (units) digit, we keep moving left to sort the values. We are now about to sort by the tens digit:

Just like earlier, we shift all of our items around until they are sorted using their tens digit value as the primary value:

The end result is another round of sorting. Can you guess what we are going to be doing next?

Sorting by the Hundreds Digit

Yes, we will keep going left until we run out of digits to sort. Earlier, we calculated the maximum number of digits by looking at our largest value (2310), and we know that it has four digits all the way up to the thousands position. We are not there yet, for we are now going to sort by the hundreds (aka third least significant digit) next:

Notice that some of the numbers we wish to sort do not have a value for the hundreds position. That is OK. We'll just treat those empty values as having a 0 present there instead. It's time to get sortin':

One thing to observe is that all of the values that don't have a hundreds place jump to the beginning while still maintaining their relative order. This is done to ensure stability. After all of this, we just wrapped up sorting all of our values by their hundreds position. All that is left is to...

Sorting by the Thousands Digit

Our last step is to sort by the thousands digit, which is the final digit as determined by our max value calculation earlier:

We only have one number with a thousands digit, and that is the 2 in 2310. All of the rest of the numbers have a thousands digit value of 0, for they don't actually have a thousands value. Let's go ahead and perform one last sorting operation:

And with this step, we are done. Notice the final ordering of our numbers. They are all properly sorted from smallest to largest. There is an important detail with the sorting here that we glossed over. We'll revisit that next.

It's Counting Sort Time...Again!

In our quest to explain how radix sort works and how it sorts each digit from the least significant digit to the most significant digit, we skipped over the important question of How does the actual sorting of each digit work?

It works by using, now...wait for it, counting sort:

Yes, the same counting sort that we looked at earlier with two small twists.

For the first twist, our particular radix sort implementation works with numbers. More specifically, decimal (aka base-10) numbers. This means that the range of values we sort for each digit goes from 0 to 9. With counting sort, traditionally, the range of values goes from 0 to the maximum value in the collection. In our case, we keep things simple and just say that, for all of our radix sort-related sorting, we will always have our sorting buckets be from 0 to 9.

The second twist is...really twisted! We are sorting by only one digit in the full value (ie: the ones digit or tens digit), but it is the full value that gets shifted around. This will be clearer when we walk through our example in greater detail.

To explain this further, let's say we are sorting our ones digit values:

Our counting sort setup would initially look like this:

Notice that our input is the full set of numbers, not just the ones digit. Our count array has its index positions going from 0 to 9 to correspond to all the digits in our base-10 decimal world. Next, if we do a count of each ones digit and tally the occurrence, our count array will have these values:

Next, we calculate the prefix sum to help us calculate the final position of our input values based on the order of the items in the ones digit:

From this point onwards, we follow through with placing the values in their correct positions in our output array. We iterate backwards through our input items, adjust the value in count to ensure the correct index positions are reflected, and build our output array with the final set of values:

After counting sort has run, the final output is sent back to radix sort where it will kick all of this off again. The difference this time around is that the digit/place we sort from will be the next higher significant digit, which will be the tens place.

That the intermediate count array is representative of the range of digits possible is a huge departure from counting sort used standalone as we saw in the earlier tutorial. This detail is why radix sort is great for sorting numbers even if the smallest and largest numbers are really far apart. With counting sort, our count array would be as large as the distance between the smallest and largest numbers. With radix sort, the distance for a base-10 number will always be 10. ALWAYS!

A Radix Sort Implementation in JavaScript

Taking everything we've seen so far, let's turn all of that into JavaScript so that we have an implementation of radix sort that we can use:

// Function to get the digit at a specific place in a number
function getDigit(num, place) {
  // Calculate the absolute value of the number
  const absNum = Math.abs(num);

  // Divide the number by 10 raised to the power of the place
  const digit = Math.floor(absNum / Math.pow(10, place)) % 10;

  // Return the digit
  return digit;
}

// Function to count the number of digits in a number
function digitCount(num) {
  // If the number is 0, return 1
  if (num === 0) return 1;

  // Otherwise, calculate the logarithm of the absolute value of the number
  // and add 1 to get the number of digits
  return Math.floor(Math.log10(Math.abs(num))) + 1;
}

// Function to find the maximum number of digits in an array of numbers
function mostDigits(nums) {
  // Initialize the maximum number of digits to 0
  let maxDigits = 0;

  // Iterate through the array of numbers
  for (const num of nums) {
    // Calculate the number of digits in the current number
    const numDigits = digitCount(num);

    // Update the maximum number of digits if the current number has more digits
    maxDigits = Math.max(maxDigits, numDigits);
  }

  // Return the maximum number of digits
  return maxDigits;
}

// Function to perform counting sort on an array of numbers based on a specific place
function countingSort(input, place) {
  // Create an output array to store the sorted numbers
  const output = new Array(input.length);

  // Create a count array to store the count of each digit (0-9)
  const count = new Array(10).fill(0);

  // Iterate through the array of numbers
  for (const num of input) {
    // Get the digit at the specified place in the current number
    const digit = getDigit(num, place);

    // Increment the count of the digit
    count[digit]++;
  }

  // Update the count array to contain the cumulative count of each digit
  for (let i = 1; i < 10; i++) {
    count[i] += count[i - 1];
  }

  // Iterate through the array of numbers in reverse order
  for (let i = input.length - 1; i >= 0; i--) {
    // Get the digit at the specified place in the current number
    const digit = getDigit(input[i], place);

    // Calculate the index of the current number in the output array
    const index = count[digit] - 1;

    // Store the current number in the output array at the calculated index
    output[index] = input[i];

    // Decrement the count of the digit
    count[digit]--;
  }

  // Return the output array
  return output;
}

// Function to perform radix sort on an array of numbers
function radixSort(input) {
  // Find the smallest element (to account for negative values)
  const min = Math.min(...input);

  // Shift all values in the input by the min value
  input = input.map(val => val - min);

  // Find the maximum element in the array
  const max = Math.max(...input);

  // Find the maximum number of digits in the array of numbers
  const maxLength = mostDigits(input);

  // Iterate through the digits from 0 to the maximum number of digits
  for (let digit = 0; digit < maxLength; digit++) {
    // Perform counting sort on the array of numbers based on the current digit
    input = countingSort(input, digit);
  }

  // Having accounted for the minimum value,
  // shift all values back
  input = input.map(val => val + min);

  // Return the sorted array of numbers
  return input;
}

To see this radix sort in action, we can test it out with the following input:

const numbers = [100, 43, 2310, 67, 512, 99, 104];
const sortedNumbers = radixSort(numbers);
console.log(sortedNumbers); // Output: [43, 67, 99, 100, 104, 512, 2310]

The unsorted values are the same as the ones we visually walked through our example on, so we can see how the final output matches what we saw earlier!

If we take a few steps back, radix sort looks really complicated. Looks can be deceiving here, for a bunch of the code is caught up in defining various helper functions like mostDigits, getDigit, digitCount, and our modified countingSort. The core radixSort function is much more concise:

// Function to perform radix sort on an array of numbers
function radixSort(input) {
  // Find the smallest element (to account for negative values)
  const min = Math.min(...input);

  // Shift all values in the input by the min value
  input = input.map(val => val - min);

  // Find the maximum element in the array
  const max = Math.max(...input);

  // Find the maximum number of digits in the array of numbers
  const maxLength = mostDigits(input);

  // Iterate through the digits from 0 to the maximum number of digits
  for (let digit = 0; digit < maxLength; digit++) {
    // Perform counting sort on the array of numbers based on the current digit
    input = countingSort(input, digit);
  }

  // Having accounted for the minimum value,
  // shift all values back
  input = input.map(val => val + min);

  // Return the sorted array of numbers
  return input;
}

What our radixSort function does is first calculate the largest number that it needs to sort and the number of digits that number contains. After that, the main work is to call countingSort on each digit in our input to sort them from the least significant digit all the way to the most significant digit. The bulk of the work is ironically done by our modified counting sort whose:

  1. count array always has a fixed size of 10
  2. the sorting is done on individual digits of a value instead of the whole value

You can see a full working example of Radix Sort by going to its page in theData Structures & Algorithms Github folder.

Our Implementation Works on Negative Numbers

Typically, radix sort requires ttat he numbers that are being sorted be positive. As we saw with our discussion on negative numbers and counting sort, the same approach used then to allow negative numbers is the same approach we applied to our radix sort implementation here.

Performance

Before we wrap it all up, let's talk about the performance of radix sort. The TL;DR is that it's pretty efficient and runs in linear time.

Suppose that's all you needed to know, congratulations! For most of us, we'll need to get a little bit deeper, so hold on to your seats for just a few more minutes.

What Influences Radix Sort Performance?

Radix sort's performance is influenced by three things. First, it is influenced by the size of our input which we'll identify as n:

Second, it is influenced by the number of digits in our largest number, and this is identified by d:

Third and lastly, it is influenced by the range of digits that are possible. For decimal (base-10) numbers, the range of numbers goes from 0 to 9 for a total of 10 possible digits:

We can represent this range of digits as k.

It's Big-O Notation Time

If we take n, d, and k together, the running time for radix sort can be represented as O(d ⋅ (n + k)). When we look at how radix sort operates, we can see why this is the running time representation:

  1. The number of items we have to sort (n) sets the bar for how long our other operations will take
  2. The sorting subroutine is counting sort (which runs in linear time as well), with its range being the digits (k) that exist in our base-10 world
  3. We call counting sort on all (n) items once for each digit in the largest value (d) we are trying to sort

Putting it all together, radix sort runs in linear time provided the values for n, d, and k are in the same very large ballpark. The following table provides the best, worst, and average case results:

Scenario Time Complexity Memory Complexity
Best case O(d ⋅ (n + k)) O(n + k)
Worst case O(d ⋅ (n + k)) O(n + k)
Average case O(d ⋅ (n + k)) O(n + k)

The time complexity does not change no matter what combination of inputs we throw in there.

For the memory complexity, Radix Sort requires additional space for the:

  1. Counting Array: To count the occurrences of each digit (size k)
  2. Output Array: To store the sorted numbers temporarily (size n)

That brings the total memory complexity to O(n + k). That's not too shabby at all.

Conclusion

And there you have it—a detailed dive into radix sort and how it works! In many ways, radix sort is a very efficient algorithm for sorting numbers:

Name Average Memory
Quicksort n log n log n (average)
Mergesort n log n n (worst case)
Heapsort n log n 1
Timsort n log n n
Bubble sort n2 1
Selection sort n2 1
Insertion sort n2 1
Counting sort O(n + k) O(n + k)
Radix sort O(d ⋅ (n + k)) O(n + k)

It even improves on its nearest frenemy, counting sort, greatly. With counting sort, if the range of numbers from our smallest number to the largest number is large, we'll find ourselves creating massive arrays to store all of our intermediate steps. Radix sort solves this problem where the largest array we would have, outside of the input itself, is the range of digits. For decimal base-10 values, that would be an array size of 10. Using the example of sorting 1 and 1,000,000 that we started this tutorial off with, radix sort would have a k value of 10 (base-10 integer) and a n value of 2. The d value would be 7 which corresponds to the number of digits in 1,000,000. At no point will any intermediate array be greater than 10 digits, and this intermediate array would be iterated on exactly d times or 7. This is why we classify radix sort as being an efficient sorting algorithm.

Just a final word before we wrap up. If you have a question and/or want to be part of a friendly, collaborative community of over 220k other developers like yourself, post on the forums for a quick response!

Kirupa's signature!

The KIRUPA Newsletter

Thought provoking content that lives at the intersection of design 🎨, development 🤖, and business 💰 - delivered weekly to over a bazillion subscribers!

SUBSCRIBE NOW

Creating engaging and entertaining content for designers and developers since 1998.

Follow:

Popular

Loose Ends

:: Copyright KIRUPA 2024 //--