LISTAGG DISTINCT in Oracle 19c
Oracle Database 19c introduces a significant enhancement to the LISTAGG function by supporting the DISTINCT keyword directly within the aggregation clause. This simplifies query logic and improves readability when deduplicating values in aggregated string lists.
Problem Statement
The traditional LISTAGG function aggregates values into a delimited string but does not inherently remove duplicates, potentially leading to data inflation in reports and dashboards.
INSERT INTO emp VALUES (9998,'MILLER','ANALYST',7782,to_date('23-1-1982','dd-mm-yyyy'),1600,NULL,10);
INSERT INTO emp VALUES (9999,'MILLER','MANADER',7782,to_date('23-1-1982','dd-mm-yyyy'),1500,NULL,10);
COMMIT;
SQL> SELECT deptno, LISTAGG(ename, ',') WITHIN GROUP (ORDER BY ename) AS employees
FROM emp GROUP BY deptno;
DEPTNO EMPLOYEES
---------- ----------------------------------------
10 CLARK,KING,MILLER,MILLER,MILLER
20 ADAMS,FORD,JONES,SCOTT,SMITH
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
3 rows selected.
SQL>
Issue: Repeated names such as “MILLER” appear multiple times if they exist as distinct rows in the source dataset.
Pre-19c Workarounds
Prior to 19c, developers employed various techniques to achieve distinct aggregation, such as:
ROW_NUMBER Filtering
SELECT e2.deptno, LISTAGG(e2.ename, ',') WITHIN GROUP (ORDER BY e2.ename) AS employees
FROM (SELECT e.*,
ROW_NUMBER() OVER (PARTITION BY e.deptno, e.ename ORDER BY e.empno) AS myrank
FROM emp e) e2
WHERE e2.myrank = 1
GROUP BY e2.deptno
ORDER BY e2.deptno;
DEPTNO EMPLOYEES
---------- ----------------------------------------
10 CLARK,KING,MILLER
20 ADAMS,FORD,JONES,SCOTT,SMITH
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
3 rows selected.
SQL>
Inline View with DISTINCT
SELECT e2.deptno, LISTAGG(e2.ename, ',') WITHIN GROUP (ORDER BY e2.ename) AS employees
FROM (SELECT DISTINCT e.deptno, e.ename
FROM emp e) e2
GROUP BY e2.deptno
ORDER BY e2.deptno;
DEPTNO EMPLOYEES
---------- ----------------------------------------
10 CLARK,KING,MILLER
20 ADAMS,FORD,JONES,SCOTT,SMITH
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
3 rows selected.
SQL>
These solutions, while effective, introduce additional complexity and may incur performance overhead depending on dataset size and indexing.
19c Enhancement: Native DISTINCT in LISTAGG
Oracle 19c streamlines this process with native syntax support:
SQL> SELECT deptno, LISTAGG(DISTINCT ename, ',') WITHIN GROUP (ORDER BY ename) AS employees
FROM emp
GROUP BY deptno
ORDER BY deptno;
DEPTNO EMPLOYEES
---------- ----------------------------------------
10 CLARK,KING,MILLER
20 ADAMS,FORD,JONES,SCOTT,SMITH
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
3 rows selected.
SQL> SELECT deptno, LISTAGG(ALL ename, ',') WITHIN GROUP (ORDER BY ename) AS employees
FROM emp
GROUP BY deptno
ORDER BY deptno;
DEPTNO EMPLOYEES
---------- ----------------------------------------
10 CLARK,KING,MILLER,MILLER,MILLER
20 ADAMS,FORD,JONES,SCOTT,SMITH
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
3 rows selected.
SQL>
Strategic Implications for Developers and Architects
From a solution architecture perspective, adopting LISTAGG DISTINCT in Oracle 19c:
- Enhances code clarity and auditability.
- Minimizes technical debt in legacy codebases.
- Aligns with modern SQL standards for expressive query formulation.
This minor syntactic upgrade yields major benefits in terms of operational efficiency and long-term maintainability, especially in complex data marts, ETL pipelines, and reporting layers.
No Comments