Home » SQL & PL/SQL » SQL & PL/SQL » Select duplicate Folders in a table....
Select duplicate Folders in a table.... [message #195553] Fri, 29 September 2006 07:55 Go to next message
pw1975
Messages: 24
Registered: June 2006
Junior Member
Hellow Ora-gurus Wink


i have the following problem:

a table with folders, filesizes (of all files and subfolders), filecount (of all files and subfolders)

FOLDER | FILESIZE | FILECOUNT


I need to get all folders, which are duplicates by filesize and filecount (not by name!)


I made a query which retrieves all duplicates, BUT:

this query also receives the following result, which is NOT wanted:

\\folder\subfolder (123 MB, 45 files)

\\folder\subfolder\sub2 (123 MB, 45 files)

this happens, when subfolder has no files in it, just the Subfolder "Sub2". But i don't need the \\folder\subfolder in the result. So in this occasion, there really is no "real" subfolder.

allright would be
\\folder\subfolder\sub2 (123 MB, 45 files)

\\folder\subfolderAAA\sub3 (123 MB, 45 files)

or
\\folder\subfolderBBB\subfolderXYZ\sub5 (123 MB, 45 files)

\\folder\subfolderAAA\sub2 (123 MB, 45 files)

...


I hope i could explain this understandable Wink
Thanks in advance...

Cheers,
--ph

[Updated on: Fri, 29 September 2006 08:03]

Report message to a moderator

Re: Select duplicate Folders in a table.... [message #195559 is a reply to message #195553] Fri, 29 September 2006 08:09 Go to previous messageGo to next message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
Well, I seem to be here first, so I'll say it.

Can we see the query, a create table script and some data?
Re: Select duplicate Folders in a table.... [message #195563 is a reply to message #195553] Fri, 29 September 2006 08:17 Go to previous messageGo to next message
pw1975
Messages: 24
Registered: June 2006
Junior Member
sure:

Table:
CREATE TABLE t_tab (
FOLDER varchar2(4000)
, FILESIZE_AGGR number(22,3)
, FILECOUNT_AGGR integer
);


INSERT INTO t_tab VALUES ('\\folder\subfolder','123','45');
INSERT INTO t_tab VALUES ('\\folder2\subfolder','123','45');
INSERT INTO t_tab VALUES ('\\folder\subfolder\abc','113','45');
INSERT INTO t_tab VALUES ('\\folder\subfolder2\abc','113','45');
INSERT INTO t_tab VALUES ('\\folder\subfolder2','123','45');


the result should be:

same:
\\folder\subfolder
\\folder2\subfolder
\\folder\subfolder2

same:
\\folder\subfolder\abc
\\folder\subfolder2\abc


sql:

SELECT
FOLDER
, FILESIZE_AGGR
, FILECOUNT_AGGR
FROM t_tab
WHERE (FILECOUNT_AGGR, FILESIZE_AGGR) IN
(SELECT FILECOUNT_AGGR, FILESIZE_AGGR
FROM
(
SELECT
FILESIZE_AGGR
, FILECOUNT_AGGR
FROM
t_tab
WHERE
FILENAME IS NULL
)
HAVING COUNT(FILECOUNT_AGGR) > 1
GROUP BY FILESIZE_AGGR
, FILECOUNT_AGGR
)
Re: Select duplicate Folders in a table.... [message #195565 is a reply to message #195563] Fri, 29 September 2006 08:26 Go to previous messageGo to next message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
Well, the filename column isn't in the table, but after fixinf that, the output I get is:

FOLDER                  FILESIZE_AGGR FILECOUNT_AGGR
\\folder\subfolder\abc  113           45
\\folder\subfolder2\abc 113           45
\\folder\subfolder      123           45
\\folder\subfolder2     123           45
\\folder2\subfolder     123           45


This looks a lot like the output you were wanting.
What is the problem?
Re: Select duplicate Folders in a table.... [message #195566 is a reply to message #195565] Fri, 29 September 2006 08:30 Go to previous messageGo to next message
pw1975
Messages: 24
Registered: June 2006
Junior Member
Hi,

ok, then the data i posted does not show the problematic.

Please TRUNC the table and insert:

INSERT INTO T_TAB VALUES ('\\abc\def','123','45');
INSERT INTO T_TAB VALUES ('\\abc\def\ghi','123','45');


These to folders are not duplicate, just \\abc\def has no file in it an therefore it "looks" the same as \\abc\def\ghi.
But it is no duplicate...


INSERT INTO T_TAB VALUES ('\\abc\xyz','123','45');
would be a duplicate to: \\abc\def and \\abc\def\ghi, but it should only be a duplicate to the lowest Dirlevel, here "\\abc\def".




Re: Select duplicate Folders in a table.... [message #195572 is a reply to message #195566] Fri, 29 September 2006 08:51 Go to previous message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
Ok, so assuming that what you want is a way to remove all the rows where the field FOLDER is the leading part of the FOLDER of one or more other rows which share the same values of FILECOUNT_AGGR and FILESIZE_AGGR, then this should do you:

SQL> CREATE TABLE t_tab 
  2  (FOLDER         VARCHAR2(50)
  3  ,FILESIZE_AGGR  NUMBER(22,3)
  4  ,FILECOUNT_AGGR INTEGER);

Table created.

SQL> INSERT INTO t_tab VALUES ('\\folder\subfolder','123','45');
SQL> INSERT INTO t_tab VALUES ('\\folder2\subfolder','123','45');
SQL> INSERT INTO t_tab VALUES ('\\folder\subfolder\abc','113','45');
SQL> INSERT INTO t_tab VALUES ('\\folder\subfolder2\abc','113','45');
SQL> INSERT INTO t_tab VALUES ('\\folder\subfolder2','123','45');
SQL> INSERT INTO T_TAB VALUES ('\\abc\def','123','45');
SQL> INSERT INTO T_TAB VALUES ('\\abc\def\ghi','123','45');

SQL> SELECT t1.folder,t1.filesize_aggr,t1.filecount_aggr
  2  FROM   t_tab t1
  3  WHERE  NOT EXISTS (SELECT NULL
  4                     FROM   t_tab t2
  5                     WHERE  t1.filesize_aggr = t2.filesize_aggr
  6                     AND    t1.filecount_aggr= t2.filecount_aggr
  7                     AND    SUBSTR(t2.folder,1,LENGTH(t1.folder)) = t1.folder
  8                     AND    t1.ROWID != t2.ROWID);

FOLDER                                             FILESIZE_AGGR FILECOUNT_AGGR
-------------------------------------------------- ------------- --------------
\\folder2\subfolder                                          123             45
\\folder\subfolder\abc                                       113             45
\\folder\subfolder2\abc                                      113             45
\\folder\subfolder2                                          123             45
\\abc\def\ghi                                                123             45

SQL> 
Previous Topic: PLS-00306: wrong number or types of arguments in call
Next Topic: input truncated to 13 characters (CR??)
Goto Forum:
  


Current Time: Sat Dec 03 11:44:07 CST 2016

Total time taken to generate the page: 0.11553 seconds