Excel Clean-Up?

Ron Eggers

SPFG, Supreme Picture Framing God
Forum Donor
Joined
Jul 6, 2001
Posts
16,932
Loc
Wisconsin
Your mission, should you decide to accept it, is to help me delete duplicate entries in a very large Excel file (which is pretending to be a database.)

As always, The Grumble will disavow all knowledge of your activities if you are captured.

(This message will self-destruct in a year-or-two.)
 
You’re lucky we’re having a guest preacher today.

Okay, send 'er over. It should be easy (he said, naively) if one does a multiple sort.

What would you like to do, however, if there are different phone numbers at the same address? Keep them both or take a guess and vaporize one of them? Or just flag them both so you can make the decision?
 
There is one unique column/field.

Here's what I did, and it was fairly painless.

I sorted the entire file by that unique field. Then I made a new column/field with a formula like =if(b2=b1, "dupe","") with column B being the unique field.

Since I can't sort on a conditional column, I copied that entire column, then "special-pasted" the values (which were either blank or "dupe") into another new column and sorted by that one.

Then all the duplicate records were at one end of the file and I could highlight and delete them pretty easily. Then I deleted the two new columns.

Since I'll only have to do this rarely, at most, it wasn't too bad, but I'd be happy to hear about simpler procedures.
 
Flagging those duplicate columns is, I think, the easiest way to go … except instead of “dupe”, I’d use a “?” mark.


Nuts, that started out to be a check mark! How do make hidden "Alt/Opt" characters for Windozes based machines?
 
If you have duplicate rows where all the values are duplicated, there is an advanced filter that will cut out the duplicates. Here's the link to instructions:
Duplicate Rows in Excel
It sounds like you had other columns that may not have been duplicated so this wouldn't work in that case.

Otherwise, the ways given would be best but I would also filter on the value you chose for duplicates and then delete the filtered rows. No sorting needed.

Access is the best option though since you could save the query for future use.
 
I second Lisa's final opinion. Convert it to Access and you can sort, edit, find dupes, create lists by specific parameters, and do anything you like all in one place. Moreover, save them for future use on the same list or different list.

Why a spreadsheet for data management?
 
Back
Top