data storage for Oracle
In traditional relational theory, the internal
representation of the data on the physical blocks is not supposed to matter,
but in the real world, the placement of the data on blocks is critical.
Oracle provides tools like sorted hash clusters to group related rows
together and row-sequencing can dramatically improve the performance of SQL
queries by placing all information on a single data block.
Using Oracle cluster table, you can even group related
tables together on the same data block.
For example, if you have a busy OLTP database where millions of
people query customer and related
order rows all day long:
An un-clustered table has high I/O overhead
use Oracle cluster tables to put the customer and order rows together on a
single data block, greatly reducing the number of trips to the database to
fetch the desired result set.
The placement of rows on physical data blocks makes
a huge performance difference
See my note here on
Oracle 11g table compression and note the tradeoff between run-time
performance of the SQL, vs. the processing overhead of compressing and
de-compressing the rows.
Source: Oracle Corporation
point is that this type of compression requires overhead, and the less
volatile the table, the better the overall performance.
11gR2 column level data storage for Exadata servers
These column oriented database have another significant
advantage, because they store adjacent column data, they can use compression
algorithms to detect patterns in the columns and achieve very high rates of
data compression. This packs
more data onto each data block, making data warehouse queries run even
But what about applications like DSS that want data
stored with related columns on the data blocks?
While traditional database system wants to group related data items
together, data warehouse applications prefer to see related columns of data
grouped together on the data blocks.
In a OLAP or DSS system, we are analyzing "facts",
individual column values, and we rarely needs row-oriented data display.
Hence, it does not make sense to physically store data in rows format.
Instead, we break-apart the rows and store the column values adjacent to one
another on the data blocks.
Guy Harrison of Quest Software has
this great illustration of rows storage vs. column storage on Oracle data
Source: Guy Harrison
This is very foreign to most relational databases, and
column oriented database are similar to indexes, in that they store ROWID
and column value pairs together on the data blocks.
These column oriented
databases were first implemented of guru's like professor Michael
Stonebreaker at MIT with their
While the 11g compression is row-level compression we
now see a Oracle 11g Release 2 new feature called 11g column-level
This option is only available on the Oracle/HP Exadata
storage devices, million dollar firmware disks that are optimized for data
The column level compression for a table has syntax like this:
Note the archive
level syntax. According to
the Oracle documentation, the archive
level arguments s[specifies the amount of compression, and this is
directly related to the processing overhead at compress-decompress time.
Guy Harrison of Quest Software performed
a benchmark on the 11g Release 2 compression with these results:
Today, it appears that this new column-level data storage is only
available on the million dollar Exadata storage boxes being sold by Oracle
and HP. If you attempt to define column-level compression on a non
Exadata server to get this error message:
ORA-64307: hybrid columnar compression is
only supported in tablespaces residing on Exadata storage
Cause: An attempt was made to use hybrid columnar compression on
Action: Create this table in a
tablespace residing on Exadata storage or use a different compression type.