 |
|
Advanced Oracle SQL: NLS Functions
Oracle Tips by Laurent Schneider
|
Laurent
Schneider is considered one of the top Oracle SQL experts, and
he is the author of the book "Advanced SQL Programming" by Rampant TechPress. The following
is an excerpt from the book.
NLS_UPPER, NLS_LOWER
and NLS_INITCAP use the setting NLS_SORT to define what is the uppercase of a
specific character.
SELECT
NLS_UPPER
(
'Ich wei?es nicht',
'NLS_SORT=XSWISS'
),
NLS_INITCAP
(
'??hant',
'NLS_SORT=XFRENCH'
)
FROM
DUAL;
NLS_UPPER('ICHWEI?NLS_INIT
------------------ --------
ICH WEISZ ES NICHT ??hant
The locale uppercase of a Swiss-German sentence and initial
capitalization of a French word are returned. It is possible to modify an
existing linguistic sort or add a new one using the Oracle Locale Builder
graphical tool:
SELECT
*
FROM
TABLE
(
SYS.ODCIVARCHAR2LIST
(
'priv?,
'priver',
'privation'
)
)
ORDER BY
NLSSORT(COLUMN_VALUE,'NLS_SORT=Xfrench');
COLUMN_VALUE
------------
privation
priv?br>
priver
In French, the sort is accent insensitive:
CREATE INDEX
ENAME_CI
ON
EMP(NLSSORT(ENAME,'NLS_SORT=BINARY_CI'));
ALTER SESSION
SET
NLS_COMP=LINGUISTIC
NLS_SORT=BINARY_CI;
SELECT
EMPNO,
ENAME
FROM
EMP
WHERE
ENAME='Scott';
EMPNO ENAME
---------- ----------
7788 SCOTT
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 37 | 2 (0)|
00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 37 | 2 (0)|
00:00:01 |
|* 2 | INDEX RANGE SCAN | ENAME_CI | 1 | | 1 (0)|
00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
2 - access(NLSSORT("ENAME",'nls_sort=''BINARY_CI''')=HEXTORAW('73636F747400'))
The comparison is done linguistically and the sort is binary
case insensitive (10g). The function-based index is scanned to find the
appropriate rows that match the sorting pattern.
National character set
Unicode and multi-byte functions are available to do
conversions and use different length semantics. As INSTR, LENGTH and SUBSTR work
best with single-byte character sets, a multi-byte character may use bytes or
char semantics.
SELECT
LENGTHB(UNISTR('X')) LENGTHB,
LENGTHC(UNISTR('X')) LENGTHC
FROM
DUAL;
LENGTHB LENGTHC
---------- ----------
2 1
UNISTR returns a Unicode string and is the inverse function
of ASCIISTR. LENGTHB returns the length in bytes, where LENGTHC returns the
length in characters. LENGTH2 and LENGTH4 are for fixed width character sets.
LENGTH, INSTR and SUBSTR also have Unicode variations INSTRB, INSTRC, INSTR2,
INSTR4 and SUBSTRB, SUBSTRC, SUBSTR2 and SUBSTR4.
SELECT
CONVERT(UPPER('Chrys??hantin'),'US7ASCII')
FROM
DUAL;
CONVERT(UPPER('
---------------
CHRYSELEPHANTIN
The string is converted to a non-accentuated string.
NCHR is similar to CHR, but it returns a national character
(NCHAR). TO_CHAR and TO_NCHAR convert characters to and from the national
character set. TO_NCLOB converts LOB or strings to Unicode large objects.
Logical functions with true or false values
There is no support for the Boolean datatype in Oracle;
nevertheless, Oracle implements a few functions to perform logical operations.
SELECT
ENAME,
SAL,
DECODE
(
GREATEST(SAL,2800),
SAL,
DECODE
(
JOB,
'PRESIDENT',
TO_NUMBER(NULL),
1
)
) DECODE,
CASE
WHEN SAL>=2800 AND JOB!='PRESIDENT'
THEN 1
END CASE
FROM
EMP;
ENAME
SAL DECODE CASE
---------- ---------- ---------- ----------
SMITH 800
ALLEN 1600
WARD 1250
JONES 2975 1 1
MARTIN 1250
BLAKE 2850 1 1
CLARK 2450
SCOTT 3000 1 1
KING 5000
TURNER 1500
ADAMS 1100
JAMES 950
FORD 3000 1 1
MILLER 1300
DECODE is checking for equivalency between the first and the
second argument. GREATEST returns the greatest of its arguments and LEAST the
smallest.
CASE is more powerful as it does check for any kind of
comparison. As in the query above, CASE appears to be more readable. CASE is
more of a SQL expression than a SQL function.
Nevertheless, CASE has an additional interesting property
regarding indexing. The CASE SQL expression could be used in a function-based
index; therefore, any condition could be indexed.
CREATE INDEX
CLERK_OR_LOWSAL
ON
EMP
(
CASE WHEN JOB='CLERK' OR SAL<1260 THEN 1 END
);
SELECT
ENAME,
JOB,
SAL
FROM
EMP
WHERE
CASE WHEN JOB='CLERK' OR SAL<1260 THEN 1 END = 1;
ENAME
JOB SAL
---------- --------- ----------
SMITH CLERK 800
WARD SALESMAN 1250
MARTIN SALESMAN 1250
ADAMS CLERK 1100
JAMES CLERK 950
MILLER CLERK 1300
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost
(%CPU)| Time |
--------------------------------------------------------------------------------------------|
0 | SELECT STATEMENT | | 5 | 100 | 2 (0)|
00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMP | 5 | 100 | 2
(0)|
00:00:01 |
|* 2 | INDEX RANGE SCAN | CLERK_OR_LOWSAL | 5 | | 1
(0)|
00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
2 - access(CASE WHEN ("JOB"='CLERK'
OR "SAL"<1260) THEN 1 END =1)
The clerks and employees with a salary lower than 1260 are
returned. The condition is integrated in a case statement and indexed.