Introduced in Oracle 10g, the UTL_COMPRESS package can be used to
compress and uncompress large objects (raw, blob or bfile). For all
practical purposes, think of UTL_COMPRESS as PL/SQL's means of
compressing or zipping files. When looking at the various procedures
and functions, many are prefixed with LZ.
The algorithm used to perform the compression is in
UTL_COMPRESS is based on the Lempel-Ziv XE "Lempel-Ziv"
algorithm and when used, the code is based on the LZW or
Lempel-Ziv-Welch implementation.
Oracle's documentation for
the
UTL_COMPRESS package is sparse in terms of showing a
full-scale example. Sparse is not even the correct word here.
Nonexistent is a better descriptor for
UTL_COMPRESS documentation. A MetaLink note (249974.1) shows
two examples and various pages on the Internet can be searched.
The actual implementation of the
UTL_COMPRESS package is pretty easy to use once an example is
seen. Many of the examples shown elsewhere include a length comparison
among the input, the compressed, and the uncompressed lengths.
Ideally, the input and uncompressed lengths should be the same. You
may find that the compressed length is longer than the input length.
This occurs when the input length is small or short. The overhead of
building a dictionary can make the compressed length longer than the
input length.
In this simple example, we create one's own
input, compress it, uncompress it, and evaluate the lengths of each.
SET SERVEROUTPUT ON
DECLARE
l_in_blob
BLOB;
l_compressed_blob
BLOB;
l_uncompressed_blob
BLOB;
BEGIN
-- Set some values
l_in_blob
:= TO_BLOB(UTL_RAW.CAST_TO_RAW
('This is a long string of words used for this
example'));
l_compressed_blob
:= TO_BLOB('0');
l_uncompressed_blob := TO_BLOB('0');
-- Compress the string
UTL_COMPRESS.lz_compress
(src => l_in_blob, dst => l_compressed_blob);
-- Uncompress the string
UTL_COMPRESS.lz_uncompress
(src => l_compressed_blob, dst =>
l_uncompressed_blob);
-- Compare the results with the input
DBMS_OUTPUT.put_line('Input length is
: ' || LENGTH(l_in_blob));
DBMS_OUTPUT.put_line('Compressed length
: ' || LENGTH(l_compressed_blob));
DBMS_OUTPUT.put_line('Uncompressed length: ' ||
LENGTH(l_uncompressed_blob));
-- Caller responsibility to free up temporary LOBs
-- See Operational Notes in the documentation
DBMS_LOB.FREETEMPORARY(l_in_blob);
DBMS_LOB.FREETEMPORARY(l_compressed_blob);
DBMS_LOB.FREETEMPORARY(l_uncompressed_blob);
END;
/
Input length is
: 52
Compressed length
: 67
Uncompressed length: 52
PL/SQL procedure successfully completed.
Note that the compressed length is longer than the
input length. Adding a few more characters to the input string yields
the following:
SET SERVEROUTPUT ON
DECLARE
l_in_blob
BLOB;
l_compressed_blob
BLOB;
l_uncompressed_blob
BLOB;
BEGIN
-- Set some values
l_in_blob
:= TO_BLOB(UTL_RAW.CAST_TO_RAW
('This is a long string of words used for this
example.
Now is the time for all good men to come to the aid
of their country'));
l_compressed_blob
:= TO_BLOB('0');
l_uncompressed_blob := TO_BLOB('0');
-- Compress the string
UTL_COMPRESS.lz_compress
(src => l_in_blob, dst => l_compressed_blob);
-- Uncompress the string
UTL_COMPRESS.lz_uncompress
(src => l_compressed_blob, dst =>
l_uncompressed_blob);
-- Compare the results with the input
DBMS_OUTPUT.put_line('Input length is
: ' || LENGTH(l_in_blob));
DBMS_OUTPUT.put_line('Compressed length
: ' || LENGTH(l_compressed_blob));
DBMS_OUTPUT.put_line('Uncompressed length: ' ||
LENGTH(l_uncompressed_blob));
-- Caller responsibility to free up temporary LOBs
-- See Operational Notes in the documentation
DBMS_LOB.FREETEMPORARY(l_in_blob);
DBMS_LOB.FREETEMPORARY(l_compressed_blob);
DBMS_LOB.FREETEMPORARY(l_uncompressed_blob);
END;
/
Input length is
: 122
Compressed length
: 113
Uncompressed length: 122
PL/SQL procedure successfully completed.
That little bit extra for the input string pushed it over the top
in terms of having the compression take any real effect.
Note
that the LZ_COMPRESS subprogram is overloaded since more than one
signature method can be used to invoke it. The quality parameter is
set to a default of 6. This parameter provides a trade-off between
speed of compress and quality of compression. It takes quite a few
more words or length of input before there is a difference in what the
quality input does.
As an example, take the text about quality
in the documentation and use that as the input string, and vary the
quality from 1 to 9. If this block of text is used as an example, be
sure to remove the single quotes around quality near the end.
"Quality is an optional compression tuning value. It allows the
UTL_COMPRESS user to choose between speed and compression quality,
meaning the percentage of reduction in size.
A faster compression
speed in UTL_COMPRESS will result in less compression of the data. A slower
compression speed will result in more compression of the data. Valid
values are [1..9], with 1=fastest and 9=slowest. The default
'quality' value is 6."
At a setting of 1, the compressed length is 248, and at 9, the length
becomes 245. The strength or utility of using UTL_COMPRESS comes into
play when dealing with truly large objects and not just experimenting
with contrived strings. For this use case, use a document, but then, a
table will also be needed.
The setup steps are to create a table to hold the BLOB (Binary Large
Object), create a directory with read/write for the user, put a file
into the directory, initialize the record, insert a BLOB, compress it
with
UTL_COMPRESS,
and compare the lengths.
set serveroutput on
create table compress_blob (indx integer, y
blob);
create directory MYDIR as 'C:\temp';
--We assume the user doing this will have
read/write on MYDIR
--Copy a file into the directory, e.g.,
A_57KB_Word_doc.doc
--This block will take care of the insert for
you, or you
--could create a separate procedure to do this
--This is simple code, does not address other
PL/SQL errors
DECLARE
ablob blob;
abfile bfile := bfilename('MYDIR',
'A_57KB_Word_doc.doc');
-- Gets a pointer to the file.
a_compressed_blob blob;
amount
integer;
asize
integer;
quality integer := 9;
cursor blob_cur is select * from
compress_blob;
BEGIN
--
-- compress_blob table is initialized with one
record because
-- the PL/SQL BLOB locator (ablob) must point to
a specific
-- EXISTING NON-NULL database BLOB.
--
-- initialize the blob locator
insert into compress_blob values (1,
empty_blob());
select y into ablob from compress_blob
where indx = 1;
-- open the bfile and get the initial
file size
dbms_lob.fileopen(abfile);
asize := dbms_lob.getlength(abfile);
dbms_output.put_line('Size of input
file: ' || asize);
-- load the file and get the size
dbms_lob.loadfromfile(ablob, abfile,
asize);
dbms_output.put_line('After
loadfromfile');
asize := dbms_lob.getlength(ablob);
dbms_output.put_line('Size of blob: '
|| asize);
-- compress the blob
-- you can experiment with varying the
quality
a_compressed_blob :=
utl_compress.lz_compress(ablob, quality);
-- insert the compressed blob
insert into compress_blob values (2,
a_compressed_blob);
-- compare the sizes of the blobs in
the table
dbms_output.put_line
('Sizes before and after
insertion/compression -->');
for c1_rec in blob_cur
loop
asize := dbms_lob.getlength(c1_rec.y);
dbms_output.put_line(asize);
end loop;
end;
/
Size of input file: 57856
After loadfromfile
Size of blob: 57856
Sizes before and after insertion/compression -->
18722
57856
PL/SQL procedure successfully completed.
When using a quality value of 1, the compressed size was 20868, so a
value of 9 represents a 10% or so improvement. As file sizes increase,
so will the amount of compression.
For production-type code, the developer will want to handle
UTL_COMPRESS errors
such as TOO_MANY_ROWS, NO_DATA_FOUND and DUP_VAL_ON_INDEX. For
BLOB-related errors, a set of exception handlers, as shown in
documentation like what follows, could be packaged up for reuse.
EXCEPTION
when UTL_COMPRESS.INVALID_ARGUMENT
then
dbms_output.put_line('An argument was
an invalid type or value.');
when UTL_COMPRESS.BUFFER_TOO_SMALL
then
dbms_output.put_line('Compressed
representation is too big.');
when UTL_COMPRESS.DATA_ERROR then
dbms_output.put_line('Input or output
data stream has invalid format.');
when UTL_COMPRESS.STREAM_ERROR then
dbms_output.put_line('Error during
compression/uncompression of the data
stream');
when others then
dbms_output.put_line('An exception
occurred');
dbms_output.put_line(sqlcode ||
sqlerrm);
Preprocessing checks such as checking if open (then closing) and vice
versa could also be incorporated. Do not forget to use
DBMS_LOB.FREETEMPORARY to release locked items, including the source
file. For more information on
UTL_COMPRESS from the DBA's perspective, please refer to this
utility in Chapter 6.
|
|
|
Get the Complete
Oracle Utility Information
The landmark book
"Advanced Oracle
Utilities The Definitive Reference" contains over 600 pages of
filled with valuable information on Oracle's secret utilities.
This book includes scripts and tools to hypercharge Oracle 11g
performance and you can
buy it
for 30% off directly from the publisher.
|
|
|