/

How to Remove Duplicates in LibreOffice Calc List

150901 views
25

This quick guide explains how you can remove duplicates in LibreOffice calc list. And it’s super easy.

If you used Microsoft Excel, you might have already known that it provides an option in the ribbon to remove duplicates.

However, there is no direct option to remove duplicates in LibreOffice Calc. You have to use the advanced filter and a trick to achieve this. Here’s how.

I’m going to explain two different methods to do this. The first method uses a helper column to remove duplicates. And the second method is to hide the duplicates and filter the rest.

The second method is easy and recommended.

Method 1: Remove Duplicates via helper column

For example, you have the following set of data. I want to remove the duplicates from Column A. For better understanding, I have highlighted them in RED.

Sample data for duplicate explanation
Sample data for duplicate explanation

Follow the below steps.

  • Add a helper column – say column E. Put value 1 as the first value and drag it till the end of your data set. That means this helper column should fill up 1, 2, 3…. and so on. The column D is the helper column in this example.
Helper column added
Helper column added
  • Now, select the entire range (A1 to D10 in the above image) and Sort (Data > Sort) by the column where you want to find a duplicate. So, I have selected A1 to D10 and sorted by column A in this example.
Sort by column
Sort by column
  • After sorting, you should see that column A is sorted. Now add another helper column and add the following formula by skipping the first row. For this example, I have added Column E and the formula at E3, skipping E2. You need to change the formula based on your target column.
=IF(A3=A2,1,0)
  • Explanation: Since the column A is sorted, the same values should appear together! Hence, I am putting an IF function to find what are those same values. If they are same, let’s put 1 in the cell, otherwise 0.
  • So, after adding the helper column, it should look like this. You may notice the 1 values are the duplicate ones.
After adding the formula
After adding the formula
  • Now filter only 1 from column E. And you have your duplicate data. Now select the rows, right click and choose delete rows. This will delete the duplicate data.
Duplicate values
Duplicate values
  • Remove the filter from all the columns. And you have the cleaned data without duplicates.
  • Now, you might be thinking, what about the helper column 1, which we have added? Well, since you sorted the data, the actual order is not present. So, to go back to your original order of data, simply sort in ascending order on the helper column 1 (i.e. column D). Since it has chronological numbers, you should get back the original order data without duplicates!
After removing duplicates
After removing duplicates

That’s it. It’s easy, isn’t it? Checkout the video for the above method.

Video – Method 1

However, there is another way, see below.

Method 2: Remove Duplicates via autofilter (Recommended)

It’s easier to understand with an example. We have this list of items with the following data. The duplicate values are colour-coded as below.

data for removing duplicates
data for removing duplicates

To remove duplicates, select the list. And then, from the menu, click on Data -> More Filters -> Advanced Filters.

advanced filter option
advanced filter option

On the Advanced Filter window, select the range for removing the duplicates. It can be a single column or multiple columns of data of any size.

Then select the option "No duplicates".

no duplicate option
no duplicate option

Press OK.

after removing the duplicates
after removing the duplicates

And there you have it. Your data set without duplicates. Now you can copy the unique data to some other sheet or workbook for further analysis and processing.

Video – method 2

If you still have doubts, I have put up a quick video below. And don’t forget to subscribe to us on YouTube!

Common Errors and Usage Notes

  • The above method 2 shows you the unique data items by hiding duplicate rows. If you look at the above image, you can see rows 3 and 5. That means row 4 is hidden. Unfortunately, LibreOffice doesn’t have the functionality of the current version to delete everything duplicates and show you unique values in chronological order.
  • So, once you have the above list, you need to copy them to a different sheet or workbook for your further processing.
  • If you are receiving any of the below two errors, then make sure to select the range and the range is visible in the text box under ‘read the filter criteria from’.
This range does not contain a valid query error
This range does not contain a valid query error
Invalid sheet reference
Invalid sheet reference

Wrapping Up

As the LibreOffice Calc doesn’t provide a simple enough quick option, you have to follow this way to remove the duplicates. Do let me know which method worked out for you.

Join our Telegram channel and stay informed on the move.

Creator of libreofficehelp.com. The aim is to help as many people as possible with easy-to-understand tutorials. Connect with me via Telegram, Twitter, LinkedIn, or send us an email.

Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

25 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Johanna Suffern

I am doing this exact process but get the error “This range does not contain a valid query.” Can you help me resolve this error?

Arctia

Excel:
1. Select the range of cells
2. Click Data > Remove Duplicates.
3. Click OK.

And this shows exactly why open source is where it is…

Greg Ferraz

Do not Work!

My version is:

Version: 7.1.3.2 / LibreOffice Community
Build ID: 10(Build:2)
CPU threads: 8; OS: Linux 5.11; UI render: default; VCL: gtk3
Ubuntu package version: 1:7.1.3-0ubuntu0.21.04.1
Calc: threaded

But a simple spreadsheet, where I just want to remove duplicate rows, this filter tool doesn’t Work! When I perform the procedure, all records are gone. When I apply the filter again, the duplicate records come back.

Jeremy Hill

This doesn’t seem to actually remove the duplicate rows. It only reduces the height of the rows containing the duplicates to a very low value so it appears like they aren’t there. However if you expand the height back up the values are there, so if you are counting the data by formula the duplicates remain?

Re Du

How come you do not respond to the users that say these steps do not work? It appears there are many users that receive this message: “This range does not contain a valid query.”

son

It would be easier if you could explain what you mean exacly by “the list”. I have exactly 1 column with values in my spreadsheet and after choosing this column as a range it still return above message/error.

Saquib mansoor

Even after selecting the list and then trying to trace out the duplicate, there appears a dialogue box saying “This range does not contain a valid query.”

BGFuryk

(note: I did figure a way out that works, you can skip to the end to see how)

Hey, just tried doing this, and I noticed if I have 3 entries that are identical, this process will remove the ones where the ENTIRE row is identical, but not when the values of the column are identical.

example I have
A B
1 – bob $7
2 – bob
3 – bob

If I use this filter (i understand how filters work, not getting into that here) it will hide either row 2 or 3, but not both. If I run the filter a second time, no change.

filter is:
A = not empty
no duplicates

Some testing I did and results:

B = not empty
no duplicates

it hid all the rows that had b as empty, even rows that had a unique A column entry. So still not what I need.

——————————–

A = not empty
and
B = empty
no duplicates

hides all duplicates, but only shows rows where B is empty.

——————————–

A = not empty
or
B = empty
no duplicates

does the same as only having A = not empty (shows duplicates still, if the entire row is unique)

——————————–

using the Advanced filter I get success!
data –> more filters –> advanced filter
first click the range selector, then select the cells to be checked (only column A), you can include the title row at the top, as long as you know it is a unique value, then simply check the no duplicates box.

Success!

Erwin

And it still does not work with my LibreOffice Calc on Ubuntu …

Scruffyhammer

arindam,

Thank you for the help, video and pictures. Your writeup is very informative and this method, while not as easy as M$, gave me the results I needed.

Much appreciated.

Ricardo

Hello, I have followed the instructions and what it does is hide the rows but it does not delete them.
Can you tell me how to remove duplicate rows?

Gregor

It has to be semicolons not commas at the IF example =IF(A3=A2,1,0)

Gregor

Really? This is confusing! Indeed, I am using German locale where comma is the decimal point.

(…and my thoughts were like: “Why the heck are they publishing wrong tutorials, or did the syntax change with a recent Libre Office update but why?”)

PS: To also mention this, removing duplicates worked great. I didn’t need the helper column because the order of my data didn’t matter.

Last edited 11 months ago by Gregor
Keh

Has no effect on the list at all in Calc 7.4.12 Linux. Followed directions and all ells remained visible and unaffected through multiple trials.

Eamonn Doyle

You don’t need to copy and paste the results in method 2. Just select the copy results to checkbox and in the reference box click where you want the results to be shown, which can be another sheet.

Yes this is a bit more cumbersome than Excel but not overly so.

More generally the main disadvantage is not being able to removes duplicates on multiple columns easily. I suspect the helper columns would be possible however it would be prone to error in setting it up. It’s an extremely flexible feature in Excel. In fact when I’m working with data as a one off process and don’t need to do the same thing over and over again I almost never write code to do this and just dump the data in Excel and get it to remove the duplicates. It’s almost instantaneous and extremely flexible. It really is quite impressive. Credit where credit is due.

I use LO as I’m on Linux however I do resort to Excel Online when I need to remove data on multiple columns which is actually pretty often. If I have a single column of data then I use method 2 with copy results to. I never use this option without using copy results to as the workflow is just wrong in my mind with the hidden rows.