Below is the original post from Richard French at Axiom Systems Inc.
From: Richard S. French Sent: Wednesday, October 10, 2001 1:10 PM To: robelle-l Subject: [robelle-l] Suprtool - Deleting duplicates in detail dataset Using Suprtool, I inadvertently created duplicate records in my detail dataset. Is there an easier way to delete these duplicate records, other than what I have described below? Note - currently using Suprtool Version 4.3 Thanks. Open database Get detail dataset records Sort records Duplicate none keys Output flat file Delete Xeq Input flat file Put to detail dataset Xeq Richard S. French
Richard's code above deletes all the records from the dataset and then re-puts them, but without the duplicates. Glenn Cole suggested using Adager to erase the dataset, which would be faster. But Hans Hendriks from Robelle support came up with another way to do the entire job, only much faster:
Sent: Wednesday, October 10, 2001 5:35 PM
To: robelle-l
Subject: Re: Suprtool - Deleting duplicates in detail dataset
Richard,
You example deletes *all* records, then writes back *all* "originals". This
could be expensive if you have a large number of records, and only a few
duplicates.
If this is the case, I suggest you do the following:
Pass 1 - find duplicated records
get dataset
define longfield,1,80 (enough to make records unique)
sort longfield
duplicate only keys
output dupfile,link
xeq
Pass 2, Delete all records that have duplicates
get dataset
table duptab,longfield,sorted,dupfile
if $lookup(duptab,longfield)
delete
Output $null
xeq
Pass 3, Add back 1 of each previously duplicated record:
input dupfile
sort longfield
duplicate none keys {in case there were 3 or more of some....}
put dataset
xeq
/Hans
And Richard confirmed the beauty of the solution:
Subject: Re: Suprtool - Deleting duplicates in detail dataset To: robelle-l Thanks, Hans. That was just what I was looking for - as my detail dataset had 1.5 million records and I had roughly 20,000 duplicates. I just tested this and it worked great.!
This is exactly the way a discussion list is supposed to work.
|
|
|---|