Skip to content Skip to sidebar Skip to footer

Find All Matches In A Varchar2()

I've a column that is a varchar2 that contains data like: ....[A1]...[A2]... I want to take out all data that is encapsulated by [] that starts with an alphabetic letter and can ha

Solution 1:

Great question! Here's a Fiddle showing how to query the matches into a result set.

And here's the long explanation in case the query in the Fiddle doesn't make sense :)

I'm using a table named RegEx_Test with a column MyVal. Here's the table's contents:

MyVal
------------------------------
[A1][abc][B23][D123]a33[bx5][Z15][ax0][B0][F13]R3
[X215][A3A][J99]F33F33G24[43][R3][Z99][c1][F3][d33]3x24[Y3][f13][9a][D41][Q39][XX12]B27[T03][J12]

Your regexp throughout is this: \[[[:alpha:]][[:digit:]]{1,2}\]. It's the same as in the other answer except with the POSIX :alpha: and :digit: indicators, which are safer in case of international charactersets.

First, you need to know the maximum number of matches on any line. Use REGEXP_COUNT for this:

SELECTMAX(REGEXP_COUNT(MyVal, '\[[[:alpha:]][[:digit:]]{1,2}\]'))
  FROM Regex_Test

MAX(REGEXP_COUNT(My...
----------------------6

Use that maximum count to get a "counter" table (that's the SELECT ... FROM DUAL below) and cross-join the counter table with a query that will pull your values using REGEXP_SUBSTR. REGEXP_SUBSTR has an "occurrence" parameter and that will use the Counter:

SELECT
  MyVal,
  Counter,
  REGEXP_SUBSTR(MyVal, '\[[[:alpha:]][[:digit:]]{1,2}\]', 1, Counter) Matched
FROM Regex_Test
CROSSJOIN (
   SELECT LEVEL Counter
   FROM DUAL
   CONNECTBY LEVEL <= (
     SELECTMAX(REGEXP_COUNT(MyVal, '\[[[:alpha:]][[:digit:]]{1,2}\]'))
     FROM Regex_Test)) Counters

Here's a sample run against my table (partial results):

MyVal                              Counter Matched
---------------------------------- ------- -------
[9a][D41][Q39][XX12]B27[T03][J12]1[D41][9a][D41][Q39][XX12]B27[T03][J12]2[Q39][9a][D41][Q39][XX12]B27[T03][J12]3[T03][9a][D41][Q39][XX12]B27[T03][J12]4[J12][9a][D41][Q39][XX12]B27[T03][J12]5[9a][D41][Q39][XX12]B27[T03][J12]6[A1][abc][B23][D123]a33[bx5]1[A1][A1][abc][B23][D123]a33[bx5]2[B23][A1][abc][B23][D123]a33[bx5]3
... and so on - total is 30 rows

At this point you have a result set of individual matches, plus nulls where a row had less than the maximum matches. The matches still have their surrounding brackets. Surround the whole thing with an outer query that will filter out the nulls and remove the brackets, and you have your final list:

SELECT SUBSTR(Matched, 2, LENGTH(Matched)-2) FROM (
  SELECT
    MyVal,
    Counter,
    REGEXP_SUBSTR(MyVal, '\[[[:alpha:]][[:digit:]]{1,2}\]', 1, Counter) Matched
  FROM Regex_Test
  CROSSJOIN (
     SELECT LEVEL Counter
     FROM DUAL
     CONNECTBY LEVEL <= (
       SELECTMAX(REGEXP_COUNT(MyVal, '\[[[:alpha:]][[:digit:]]{1,2}\]'))
       FROM Regex_Test)) Counters
) WHERE Matched ISNOTNULL

This is the query that's on the Fiddle, and it can be used in another query.

Solution 2:

You could use REGEXP_LIKE to find values matching your regular expression:

[...] WHERE REGEXP(col_name, '\[[A-Z][0-9]{1,2}\]');

Post a Comment for "Find All Matches In A Varchar2()"