Removing Duplicate Arrays from an Array of Arrays

by kirupa   |    filed under Arrays

In the Useful Array Tricks article, one of the tricks described how to ensure the contents of our array are unique and no duplicate items exist. That technique really only works when the array items are primitives like text or numbers. Here, we will go one step further and look at another common case. What if the array items with potential duplicates are themselves arrays? How do we identify the duplicate arrays and remove them so that we end up with an array made up only of unique arrays? In this article, we’ll find out how!

Onwards!

The Problem Visualized

The word array came up a whole bunch of times in our intro, and I’m pretty sure not all uses of that word made sense. At least, it certainly didn’t make sense to me! To better understand what we are trying to do, let’s take a step back and look at the problem in greater detail. Meet bigArray, the star of this article:

let bigArray = [["a", "b", "c"], 
                [1, 2, 3], 
                [true, true, false], 
                [":)", ":P", ":X"],
                [true, false, false],
                [1, 2, 3],
                ["foo", "zorb", "blarg"],
                ["a", "b", "c"]];

Our bigArray is a two-dimensional array: an array whose contents are also arrays. Each of its eight items is itself an array of three values.
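To make the two-dimensional part a little more concrete, here is a small sketch that pokes at bigArray with ordinary index lookups. Nothing here is specific to our problem. It just shows that each item we pull out of bigArray is an array in its own right:

```javascript
let bigArray = [["a", "b", "c"],
                [1, 2, 3],
                [true, true, false],
                [":)", ":P", ":X"],
                [true, false, false],
                [1, 2, 3],
                ["foo", "zorb", "blarg"],
                ["a", "b", "c"]];

// The outer array holds eight inner arrays
console.log(bigArray.length); // 8

// The first index picks an inner array, the second picks a value inside it
console.log(bigArray[1][2]); // 3
console.log(bigArray[6][0]); // foo

// Every item in bigArray is itself an array
console.log(bigArray.every(Array.isArray)); // true
```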

The big thing to note is that some of the arrays in our bigArray aren’t unique. They are...repeated. Both [1, 2, 3] and ["a", "b", "c"] show up twice.

What we want to do is come up with a mechanism for removing these duplicate arrays so that our bigArray’s contents are unique, with each of those arrays appearing only once.

So...that is the problem we are trying to solve. In the next section, we are going to make a big jump and look at how we will go about solving this problem.

Our Approach

There are a few approaches we can take to remove the duplicate content arrays from our parent array. While we can take a more brute-force approach and compare each array’s contents with every other array’s contents to identify duplicates, we can take another approach and rely on the Set object and its natural ability to filter out duplicate values.
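For comparison, here is a rough sketch of what the brute-force route could look like. The helper names arraysMatch and removeDuplicatesBruteForce are made up for this illustration, and the sketch assumes our inner arrays are flat and hold only primitive values:

```javascript
// Hypothetical helper: true when two flat arrays have the same length
// and the same primitive value at every position.
function arraysMatch(first, second) {
  return first.length === second.length &&
         first.every((value, index) => value === second[index]);
}

// Hypothetical helper: keep an inner array only if no array we have
// already kept has the same contents.
function removeDuplicatesBruteForce(arrays) {
  const unique = [];

  for (const candidate of arrays) {
    const alreadyKept = unique.some((kept) => arraysMatch(kept, candidate));

    if (!alreadyKept) {
      unique.push(candidate);
    }
  }

  return unique;
}

const sample = [[1, 2, 3], ["a", "b"], [1, 2, 3]];
console.log(removeDuplicatesBruteForce(sample).length); // 2
```

This works, but every candidate is compared against every array kept so far, and we end up writing the comparison logic ourselves. That is the work we are hoping a set can take off our hands.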

Sets 101

If you aren’t familiar with sets and their quirks, I recommend you read the Diving into Sets article first. Otherwise, some of the things we’ll be looking at may seem a bit strange.


On the surface, what we are trying to do with sets seems like it has a simple solution. When creating a new set, we can pass in a collection of data (like our arrays!) and rely on our Set object's default behavior where only the unique values from that collection are stored:

let bigArray = [["a", "b", "c"],
                [1, 2, 3],
                [true, true, false],
                [":)", ":P", ":X"],
                [true, false, false],
                [1, 2, 3],
                ["foo", "zorb", "blarg"],
                ["a", "b", "c"]];
                
let uniqueArray = new Set(bigArray);
console.log(uniqueArray);

As it turns out, this won’t work. One set-related quirk that is highlighted in the Diving into Sets article is that sets compare primitive values by their value, but they compare objects by reference. Our bigArray contents are arrays, and every array literal creates its own separate object. Two arrays with identical contents are still two different objects as far as our set is concerned, so the default set filtering behavior leaves all eight arrays in place.
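We can see this behavior for ourselves with a small experiment. The variable names below are just for illustration:

```javascript
// Two array literals with identical contents are still two separate objects
const first = [1, 2, 3];
const second = [1, 2, 3];

console.log(first === second); // false
console.log(first === first);  // true

// A set follows the same rule, so both literals are kept...
const twoLiterals = new Set([first, second]);
console.log(twoLiterals.size); // 2

// ...while the exact same object added twice is stored only once
const sameReference = new Set([first, first]);
console.log(sameReference.size); // 1
```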

Is that really a problem? Probably...but not for us! We have a way of tricking our set into thinking our arrays are worth filtering, and the way we do that is by turning each array into a string. For example, the array ["a", "b", "c"] becomes the string '["a","b","c"]'.

By turning our arrays into strings, we are now dealing with primitive values that are part of a set’s natural diet. Our set is no longer allergic to filtering out duplicate values, because two arrays with the same contents turn into the exact same string.

Once our duplicate values have been removed, we can unstringify our content and turn our strings back into arrays.

At this point, this gets us to the destination we wanted to get to from the very beginning. We started off with an array whose contents are arrays with some duplicate content. We ended up with an array whose contents are still arrays, but these arrays are unique!

The Code

We just got the hard parts out of the way. Now that we have a better idea of the problem we are trying to solve and a general approach for solving it, it is time to turn all of those words and pictures into code!

Turning our Arrays into Strings

The first thing we’ll do is convert our current array contents into strings. Add the last two lines shown below just after our bigArray declaration:

let bigArray = [["a", "b", "c"],
                [1, 2, 3],
                [true, true, false],
                [":)", ":P", ":X"],
                [true, false, false],
                [1, 2, 3],
                ["foo", "zorb", "blarg"],
                ["a", "b", "c"]];
                
let stringArray = bigArray.map(JSON.stringify);
console.log(stringArray);

The way we do this conversion is by using the map function. The Mapping, Filtering, and Reducing Things article goes into more detail on how map works, but the elevator pitch version is that map goes through each item in our array and calls a function on each item. The function we are calling is JSON.stringify, which...stringifies data. Once our map function has fully run, what gets stored by stringArray is our existing array data turned into string form.

Notice that this string conversion is fairly direct. Each inner array becomes a single string, which the console displays wrapped in quotation marks. Text values inside an array keep their own double quotes, so depending on the console they may show up with escape characters, while numbers and booleans are written out as-is. The logic behind how exactly to stringify and what items to leave as-is is part of the JSON.stringify function’s internals. That logic is something we don’t have a direct hand in defining, but the default behavior is good enough.
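For a closer look at what JSON.stringify produces, here is a quick sketch that runs it on a few of our inner arrays one at a time. The variable names are made up for this example, and the typeof check is only there to confirm that we end up with a primitive string:

```javascript
const letters = JSON.stringify(["a", "b", "c"]);
const numbers = JSON.stringify([1, 2, 3]);
const booleans = JSON.stringify([true, true, false]);

console.log(letters);  // ["a","b","c"]
console.log(numbers);  // [1,2,3]
console.log(booleans); // [true,true,false]

// The result is a primitive string, not an array
console.log(typeof letters); // string

// Two arrays with the same contents produce identical strings,
// which is what lets a set treat them as duplicates
console.log(JSON.stringify([1, 2, 3]) === numbers); // true
```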

Creating the Set

With our array of strings ready to go, it’s time to create our set and filter out duplicate values. Add the following two lines to what you have already:

let uniqueStringArray = new Set(stringArray);
console.log(uniqueStringArray);

With these lines, we create a new Set object called uniqueStringArray and pass in our earlier stringArray as part of creating it. Despite the Array in its name, uniqueStringArray is a set, and it stores only the unique values from our stringArray collection.

Notice that our collection went from having eight items to just six after the two duplicate items were filtered out!
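If we want to confirm those numbers in code instead of counting items in the console, one option is to compare the array’s length with the set’s size property. This sketch repeats the declarations from earlier so that it can run on its own:

```javascript
let bigArray = [["a", "b", "c"],
                [1, 2, 3],
                [true, true, false],
                [":)", ":P", ":X"],
                [true, false, false],
                [1, 2, 3],
                ["foo", "zorb", "blarg"],
                ["a", "b", "c"]];

let stringArray = bigArray.map(JSON.stringify);
let uniqueStringArray = new Set(stringArray);

// Arrays have a length, sets have a size
console.log(stringArray.length);     // 8
console.log(uniqueStringArray.size); // 6

// has() tells us whether a particular string made it into the set
console.log(uniqueStringArray.has("[1,2,3]")); // true

// Sets remember insertion order, so the first copy of each value keeps its place
console.log([...uniqueStringArray][0]); // ["a","b","c"]
```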

Going Back to Arrayville!

The last step is for us to go from a set of strings back to an array of arrays. There are a few steps involved here. Our first step will be to use Array.from and turn our set into an array. Add these two lines to make that happen:

let uniqueArray = Array.from(uniqueStringArray);
console.log(uniqueArray);

When we do this, our set containing string values turns into an array containing those same string values. It all gets stored in uniqueArray.

We are almost done here. The last step is to turn these string values back into their non-string equivalents. We could use map again and provide the JSON.parse function (the opposite of JSON.stringify) to turn our stringified arrays (and content) back into regular arrays. There is an easier way. The Array.from method takes a second argument, and this is a function that gets called on each array item. Let’s just use that! We can modify our uniqueArray code to look as follows:

let uniqueArray = Array.from(uniqueStringArray, JSON.parse);
console.log(uniqueArray);

When this code runs, we are back to having an array of arrays since JSON.parse unstringifies each array item back into its original array form. The important change from our starting point is that all duplicate values have been removed.
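To compare the two routes we just talked about, here is a small sketch that runs both on a tiny set of strings. The names uniqueStrings, viaMap, and viaArrayFrom are made up for this example:

```javascript
const uniqueStrings = new Set(['["a","b","c"]', "[1,2,3]"]);

// Route 1: turn the set into an array first, then map over it
const viaMap = Array.from(uniqueStrings).map((text) => JSON.parse(text));

// Route 2: let Array.from do the conversion and the parsing in one go
const viaArrayFrom = Array.from(uniqueStrings, JSON.parse);

// Both routes give us real arrays with the same contents
console.log(Array.isArray(viaMap[0]));       // true
console.log(Array.isArray(viaArrayFrom[0])); // true
console.log(viaArrayFrom[1][2]);             // 3
console.log(JSON.stringify(viaMap) === JSON.stringify(viaArrayFrom)); // true
```

One thing worth knowing is that JSON.parse builds brand new arrays. The arrays we get back have the same contents as the originals, but they are not the same objects we started with.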

Conclusion

The full code from earlier with some of the console.log statements removed is as follows:

let bigArray = [["a", "b", "c"],
                [1, 2, 3],
                [true, true, false],
                [":)", ":P", ":X"],
                [true, false, false],
                [1, 2, 3],
                ["foo", "zorb", "blarg"],
                ["a", "b", "c"]];
                
let stringArray = bigArray.map(JSON.stringify);
let uniqueStringArray = new Set(stringArray);
let uniqueArray = Array.from(uniqueStringArray, JSON.parse);

console.log(uniqueArray);

If you want to go all compact (and potentially impair code readability), all of these statements can be put into just one line:

let bigArray = [["a", "b", "c"],
                [1, 2, 3],
                [true, true, false],
                [":)", ":P", ":X"],
                [true, false, false],
                [1, 2, 3],
                ["foo", "zorb", "blarg"],
                ["a", "b", "c"]];
                
let uniqueArray = Array.from(new Set(bigArray.map(JSON.stringify)), JSON.parse);
console.log(uniqueArray);
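If we find ourselves doing this in more than one place, the one-liner can be wrapped in a small function. The name removeDuplicateArrays is made up for this sketch. It also leans on two assumptions: the inner arrays contain only values that survive a round trip through JSON (like strings, ordinary numbers, booleans, null, and nested arrays of those), and order matters, so [1, 2] and [2, 1] count as different arrays:

```javascript
// Hypothetical helper wrapping the stringify, Set, and parse steps
function removeDuplicateArrays(arrays) {
  const asStrings = arrays.map((item) => JSON.stringify(item));
  return Array.from(new Set(asStrings), (text) => JSON.parse(text));
}

const result = removeDuplicateArrays([[1, 2], [2, 1], [1, 2], [[3], 4], [[3], 4]]);

// [1, 2] and [2, 1] are both kept because their order differs
console.log(result.length); // 3

// Values JSON can't represent don't come back the same way.
// For example, undefined inside an array is written out as null.
console.log(JSON.stringify([undefined])); // [null]
```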

To reiterate what we saw earlier, there are other approaches we can take for accomplishing a similar end result. Most of those approaches will be more manual with us replicating a lot of the functionality provided by Set, Array.from, JSON.stringify, and JSON.parse. There is nothing wrong with us defining all of the logic for how to filter out duplicate array values manually, but the potential impact is performance. The various JavaScript engines continuously optimize the internals of built-in objects like our Set, JSON, or Array objects. If we replicated some of this functionality ourselves, there is a chance our code may miss out on some of these optimizations. Just something to keep in mind!

Just a final word before we wrap up. If you have a question and/or want to be part of a friendly, collaborative community of over 220k other developers like yourself, post on the forums for a quick response!
