Introduced in Oracle 10g, the UTL_COMPRESS package can be used to
compress and uncompress large objects (raw, blob or bfile). For all
practical purposes, think of UTL_COMPRESS as PL/SQL’s means of
compressing or zipping files. When looking at the various procedures
and functions, many are prefixed with LZ. The algorithm used to
perform the compression is based on the Lempel-Ziv algorithm and
when used, the code is based on the LZW or Lempel-Ziv-Welch
implementation.
Oracle’s documentation for this package is sparse in terms of showing
a full-scale example. Sparse is not even the correct word here.
Nonexistent is a better descriptor. A MOSC note (249974.1) shows
two examples and various pages on the Internet can be searched.
The actual implementation of this package is pretty easy to use once
an example is seen. Many of the examples shown elsewhere include a
length comparison among the input, the compressed, and the
uncompressed lengths. Ideally, the input and uncompressed lengths
should be the same. You may find that the compressed length is longer
than the input length. This occurs when the input length is small or
short. The overhead of building a dictionary can make the compressed
length longer than the input length.
In this simple example, create input, compress it,
uncompress it and evaluate the lengths of each.
SET SERVEROUTPUT ON
DECLARE
l_in_blob
BLOB;
l_compressed_blob
BLOB;
l_uncompressed_blob
BLOB;
BEGIN
-- Set some values
l_in_blob
:= TO_BLOB(UTL_RAW.CAST_TO_RAW
('This is a long string of words used for this
example'));
l_compressed_blob
:= TO_BLOB('0');
l_uncompressed_blob := TO_BLOB('0');
-- Compress the string
UTL_COMPRESS.lz_compress
(src => l_in_blob, dst =>
l_compressed_blob);
-- Uncompress the string
UTL_COMPRESS.lz_uncompress
(src => l_compressed_blob,
dst => l_uncompressed_blob);
-- Compare the results with the input
DBMS_OUTPUT.put_line('Input length is
: ' || LENGTH(l_in_blob));
DBMS_OUTPUT.put_line('Compressed length
: ' || LENGTH(l_compressed_blob));
DBMS_OUTPUT.put_line('Uncompressed length: ' ||
LENGTH(l_uncompressed_blob));
-- Caller responsibility to free up temporary
LOBs
-- See Operational Notes in the documentation
DBMS_LOB.FREETEMPORARY(l_in_blob);
DBMS_LOB.FREETEMPORARY(l_compressed_blob);
DBMS_LOB.FREETEMPORARY(l_uncompressed_blob);
END;
/
Input length is
: 52
Compressed length
: 67
Uncompressed length: 52
PL/SQL procedure successfully completed.
Note that the compressed length is longer than the input length.
Adding a few more characters to the input string yields the following:
SET SERVEROUTPUT ON
DECLARE
l_in_blob
BLOB;
l_compressed_blob
BLOB;
l_uncompressed_blob
BLOB;
BEGIN
-- Set some values
l_in_blob
:= TO_BLOB(UTL_RAW.CAST_TO_RAW
('This is a long string of words used for this
example.
Now is the time for all good men to come to the
aid of their country'));
l_compressed_blob
:= TO_BLOB('0');
l_uncompressed_blob := TO_BLOB('0');
-- Compress the string
UTL_COMPRESS.lz_compress
(src => l_in_blob, dst =>
l_compressed_blob);
-- Uncompress the string
UTL_COMPRESS.lz_uncompress
(src => l_compressed_blob,
dst => l_uncompressed_blob);
-- Compare the results with the input
DBMS_OUTPUT.put_line('Input length is
: ' || LENGTH(l_in_blob));
DBMS_OUTPUT.put_line('Compressed length
: ' || LENGTH(l_compressed_blob));
DBMS_OUTPUT.put_line('Uncompressed length: ' ||
LENGTH(l_uncompressed_blob));
-- Caller responsibility to free up temporary
LOBs
-- See Operational Notes in the documentation
DBMS_LOB.FREETEMPORARY(l_in_blob);
DBMS_LOB.FREETEMPORARY(l_compressed_blob);
DBMS_LOB.FREETEMPORARY(l_uncompressed_blob);
END;
/
Input length is
: 122
Compressed length
: 113
Uncompressed length: 122
PL/SQL procedure successfully completed.
That little bit extra for the input string pushed it over the top in
terms of having the compression take any real effect.
Note that the LZ_COMPRESS subprogram is overloaded since more than one
signature method can be used to invoke it. The quality parameter is
set to a default of 6. This parameter provides a trade-off between
speed of compress and quality of compression. It takes quite a few
more words or length of input before there is a difference in what the
quality input does.