Skip to content Skip to sidebar Skip to footer

Sql: How To Get All The Distinct Characters In A Column, Across All Rows

Is there an elegant way in SQL Server to find all the distinct characters in a single varchar(50) column, across all rows? Bonus points if it can be done without cursors :) For exa

Solution 1:

Here's a query that returns each character as a separate row, along with the number of occurrences. Assuming your table is called 'Products'

WITH ProductChars(aChar, remain) AS (
   SELECTLEFT(productName,1), RIGHT(productName, LEN(productName)-1) 
      FROM Products WHERE LEN(productName)>0UNIONALLSELECTLEFT(remain,1), RIGHT(remain, LEN(remain)-1) FROM ProductChars
      WHERE LEN(remain)>0
)
SELECT aChar, COUNT(*) FROM ProductChars
GROUPBY aChar

To combine them all to a single row, (as stated in the question), change the final SELECT to

SELECT aChar AS [text()] FROM
  (SELECTDISTINCT aChar FROM ProductChars) base
FOR XML PATH('')

The above uses a nice hack I found here, which emulates the GROUP_CONCAT from MySQL.

The first level of recursion is unrolled so that the query doesn't return empty strings in the output.

Solution 2:

Use this (shall work on any CTE-capable RDBMS):

select x.v into prod from (values('product1'),('widget2'),('nicknack3')) asx(v);

Test Query:

with a as 
(select v, ''as x, 0as n from prod 
    union all
    select v, substring(v,n+1,1) as x, n+1 as n from a where n < len(v)
)
select v, x, n from a -- where n > 0
order by v, n
option (maxrecursion 0)

Final Query:

with a as 
(
    select v, '' as x, 0 as n from prod 
    union all
    select v, substring(v,n+1,1) as x, n+1as n from a where n < len(v)
)
selectdistinct x from a where n > 0orderby x
option (maxrecursion 0)

Oracle version:

witha(v,x,n) as 
(select v, ''as x, 0as n from prod 
    union all
    select v, substr(v,n+1,1) as x, n+1 as n from a where n < length(v)
)
select distinct x from a where n > 0

Solution 3:

Given that your column is varchar, it means it can only store characters from codes 0 to 255, on whatever code page you have. If you only use the 32-128 ASCII code range, then you can simply see if you have any of the characters 32-128, one by one. The following query does that, looking in sys.objects.name:

with cteDigits as (
    select0as Number
    unionallselect1as Number
    unionallselect2as Number
    unionallselect3as Number
    unionallselect4as Number
    unionallselect5as Number
    unionallselect6as Number
    unionallselect7as Number
    unionallselect8as Number
    unionallselect9as Number)
, cteNumbers as (
    select U.Number + T.Number*10+ H.Number*100as Number
    from cteDigits U
    crossjoin cteDigits T
    crossjoin cteDigits H)
, cteChars as (
    selectCHAR(Number) asCharfrom cteNumbers 
    where Number between32and128)
select cteChars.Char as [*]
from cteChars
cross apply (
    select top(1) *from sys.objects
    where CHARINDEX(cteChars.Char, name, 0) >0) as o
for xml path('');

Solution 4:

If you have a Numbers or Tally table which contains a sequential list of integers you can do something like:

SelectDistinct''+Substring(Products.ProductName, N.Value, 1)
From dbo.Numbers As N
    CrossJoin dbo.Products
Where N.Value <= Len(Products.ProductName)
For Xml Path('')

If you are using SQL Server 2005 and beyond, you can generate your Numbers table on the fly using a CTE:

With Numbers As
    (
    Select Row_Number() Over ( OrderBy c1.object_id ) As Value
    From sys.columns As c1
        Cross Join sys.columns As c2
    )
SelectDistinct'' + Substring(Products.ProductName, N.Value, 1)From Numbers As N
    Cross Join dbo.Products
Where N.Value <= Len(Products.ProductName)
For Xml Path('')

Solution 5:

Building on mdma's answer, this version gives you a single string, but decodes some of the changes that FOR XML will make, like & -> &amp;.

WITH ProductChars(aChar, remain) AS (
   SELECTLEFT(productName,1), RIGHT(productName, LEN(productName)-1) 
      FROM Products WHERE LEN(productName)>0UNIONALLSELECTLEFT(remain,1), RIGHT(remain, LEN(remain)-1) FROM ProductChars
      WHERE LEN(remain)>0
)
SELECT STUFF((
  SELECT N''+ aChar AS [text()]
  FROM (SELECTDISTINCT aChar FROM Chars) base
  ORDERBY aChar
FOR XML PATH, TYPE).value(N'.[1]', N'nvarchar(max)'),1, 1, N'')
-- Allow for a lot of recursion. Set to 0 for infinite recursion
OPTION (MAXRECURSION 365)

Post a Comment for "Sql: How To Get All The Distinct Characters In A Column, Across All Rows"