A virtual column is an expression based
on one or more existing columns in the table. While a
virtual column is only stored as metadata and does not
consume physical space, it can be indexed and also contain
optimizer statistics and histograms. Oracle 11g has
included the support for partitioning a table using a
partitioning key on a virtual column. Prior to 11g,
the partitioning key was limited to using physical columns.
Therefore, in order to partition a table by using a derived
value, a DBA would have been required to create and populate
an additional physical column in order to achieve the same
result. The derived value would then either need to be
populated by the application or by a trigger that evaluates
the expression before insertion. In either case,
achieving this goal in 10g required additional overhead and
increased disk space for the physical column.
The ability to use an expression as a
partitioning key provides a more efficient way to meet
comprehensive business requirements without incurring
unnecessary overhead. This new feature can be very
useful when a table cannot be partitioned by the existing
data columns. Consider a table that contains a list of
accounts such as:
desc accounts
Name
Null? Type
------------------------------------------------------------------
ACCOUNT_NUMBER
VARCHAR2(30)
ACCOUNT_NAME
VARCHAR2(30)
CONTACT_PERSON
VARCHAR2(30)
select
*
from
accounts;
ACCOUNT_NUMBER
ACCOUNT_NAME
CONTACT_PERSON
---------------------
--------------------
-----------------
3983-9339-1232-1292
N-JOHNS-INDUSTRIALS
JOHN
8778-5435-5345-5223
E-MATTEL-AUTOMOTIVE
MIKE
2432-6543-2244-0877
S-SOUTHERN-TRANSPORTS DOUG
4333-3424-6564-1322
W-GLOBAL-DISTRIBUTION GERRY
Consider also that the business requires
this table be partitioned by region. The
account_name
column in this table is prefixed by a single character
representing the account's region, followed by a dash and
the account holder's name (i.e. N-JOHNS-INDUSTRIALS is North
region). Prior to 11g, in order to partition this
table by region, an additional physical column representing
the region would have needed to be added to this table.
However, in Oracle 11g, a virtual column can be used to
create an expression that represents the region such as
'substr(account_name,1,1)?.
In order to partition this table by
region, the table can be defined with region as a
virtual column and partition by this derived column:
create table
accounts_v (
account_number varchar2(30),
account_name varchar2(30),
contact_person varchar2(30),
region AS (case
when
substr(account_name,1,1) = 'N' then 'NORTH'
when
substr(account_name,1,1) = 'E' then 'EAST'
when
substr(account_name,1,1) = 'S' then 'SOUTH'
when
substr(account_name,1,1) = 'W' then 'WEST'
end)
)
partition by
list (region)
(
partition pN values ('NORTH'),
partition pE values ('EAST'),
partition pS values ('SOUTH'),
partition pW values ('WEST')
);
After inserting the same four rows with
values for the
account_number,
account_name, and
contact_person, the virtual column region is
automatically derived when querying the table.
insert all
into accounts_v (account_number, account_name,
contact_person)
VALUES ('3983-9339-1232-1292', 'N-JOHNS-INDUSTRIALS',
'JOHN')
into accounts_v (account_number, account_name,
contact_person)
VALUES ('8778-5435-5345-5223', 'E-MATTEL-AUTOMOTIVE',
'MIKE')
into accounts_v (account_number, account_name,
contact_person)
VALUES ('2432-6543-2244-0877', 'S-SOUTHERN-TRANSPORTS',
'DOUG')
into accounts_v (account_number, account_name,
contact_person)
VALUES ('4333-3424-6564-1322', 'W-GLOBAL-DISTRUBTION',
'GERRY')
select 1 from dual;
select
*
from
accounts_v;
ACCOUNT_NUMBER
ACCOUNT_NAME
CONTACT_PERSON REGION
---------------------
--------------------
---------------- -------
3983-9339-1232-1292
N-JOHNS-INDUSTRIALS
JOHN
NORTH
8778-5435-5345-5223
E-MATTEL-AUTOMOTIVE
MIKE
EAST
2432-6543-2244-0877
S-SOUTHERN-TRANSPORTS DOUG
SOUTH
4333-3424-6564-1322
W-GLOBAL-DISTRIBUTION
GERRY
WEST
In order to verify the partitioning of
this table, use the following query:
SELECT
TABLE_NAME, PARTITION_NAME, PARTITION_POSITION, HIGH_VALUE
FROM
DBA_TAB_PARTITIONS
WHERE
TABLE_NAME='ACCOUNTS_V'
ORDER BY
PARTITION_NAME;
TABLE_NAME PARTITION_
PARTITION_POSITION HIGH_VALUE
--------------- ---------- ------------------ ----------
ACCOUNTS_V PE
2
'EAST'
ACCOUNTS_V PN
1
'NORTH'
ACCOUNTS_V PS
3
'SOUTH'
ACCOUNTS_V PW
4
'WEST'
To verify the placement of records in
the appropriate partitions, query a specific partition.
The following query is for the accounts in the east region:
select
*
from
accounts_v partition (PE);
ACCOUNT_NUMBER
ACCOUNT_NAME
CONTACT_PERSON REGION
---------------------
------------------------- ---------------- -------
8778-5435-5345-5223
E-MATTEL-AUTOMOTIVE
MIKE
EAST
In order to demonstrate the disk space
advantages of using a virtual column instead of a physical
column, define a table that is synonymous to
accounts_v and
uses a physical column for region:
create table
accounts_p (
account_number varchar2(30),
account_name varchar2(30),
contact_person varchar2(30),
region varchar2(5)
)
partition by
list (region)
(
partition pN values ('NORTH'),
partition pE values ('EAST'),
partition pS
values ('SOUTH'),
partition pW values ('WEST')
);
Notice that region column has been
changed from a virtual column to a varchar2(5) physical
column. To compare the disk space consumption between
accounts_p and
accounts_v,
populate each table with 100,000 rows. Before doing
so, truncate accounts_v
to reset the high watermark.
truncate table accounts_v;
DECLARE
i NUMBER := 1;
BEGIN
LOOP
insert into accounts_v (account_number, account_name,
contact_person)
VALUES ('3983-9339-1232-1292', 'N-JOHNS-INDUSTRIALS',
'JOHN');
insert into accounts_p (account_number, account_name,
contact_person, region)
VALUES ('3983-9339-1232-1292', 'N-JOHNS-INDUSTRIALS',
'JOHN', 'NORTH');
i := i+1;
EXIT WHEN i>100000;
END LOOP;
COMMIT;
END;
In order to compare disk space
consumption, gather statistics on both of these tables:
exec dbms_stats.gather_table_stats('VJ', 'ACCOUNTS_V');
exec dbms_stats.gather_table_stats('VJ', 'ACCOUNTS_P');
Now, query the data dictionary to
compare how many blocks each table is using:
select
table_name,
num_rows,
blocks
from
dba_tables
where
table_name in ('ACCOUNTS_V', 'ACCOUNTS_P');
TABLE_NAME NUM_ROWS BLOCKS
--------------- -------- ------
ACCOUNTS_V 100000
748
ACCOUNTS_P 100000
874
Notice that both tables have identical
data and partitions. However, the table using a virtual
column to derive the region column consumes significantly
less disk space than the table that uses a physical column
for storing the region. This is attributed to the fact
that virtual columns are part of the metadata and do not
consume disk space.
Virtual column-based partitioning
improves the capability to partition tables that do not have
an obvious partition key. This new feature avoids
suboptimal alternatives to derive the partition key.
This being, using triggers or modifying application code to
evaluate and insert additional physical columns is not
necessary. Use of virtual columns, instead of physical
columns, also improves the disk space efficiency.
Virtual column based partitioningallows data partitioning to be based on virtual
columns as the partitioning key. Virtual columns are a new feature
in 11g that allow derivation from a function or expression
evaluation results. As such, values for virtual columns are not
stored within the table, but rather the values for the virtual
columns are evaluated on demand when calculated. Indexes and
tables can be partitioned as such on new virtual columns.
For instance, to add a new virtual column to an existing table,
SCOTT.EMP, one would execute the following statement.
SQL> ALTER TABLE EMP ADD (yearly_sal
as (sal*comm));
Table altered.
SQL> desc emp
Name Null? Type
----------------------------------------- --------
----------------
EMPNO NOT NULL NUMBER(4)
ENAME VARCHAR2(10)
JOB VARCHAR2(9)
MGR NUMBER(4)
HIREDATE DATE
SAL NUMBER(7,2)
COMM NUMBER(7,2)
DEPTNO NUMBER(2)
YEARLY_SAL NUMBER
SQL>
select ename, sal, comm, yearly_sal
2 from emp;
ENAME SAL COMM
YEARLY_SAL
---------- ---------- ---------- ----------
SMITH 800
ALLEN 1600 300 480000
WARD 1250 500 625000
JONES 2975
MARTIN 1250 1400 1750000
Now when the above query is issued against the
SCOTT.EMP table, there will be the new virtual column, yearly_sal,
that can be used in calculations.
If a new table needs to be created that uses virtual columns,
issue a CREATE TABLE like that found in the following example.
SQL> CREATE TABLE virtual_emp (
2 empno NUMBER PRIMARY KEY,
3 ename VARCHAR2(20) NOT NULL,
4 social_id NUMBER(9),
5 sal NUMBER(7,2),
6 hrly_pay NUMBER(7,2) GENERATED ALWAYS AS (sal/2000));
Table created.
Note that the key syntax to use when creating tables that make
use of virtual columns is the generated always phrase. An optional
syntax parameter, virtual, can
be used in addition when creating virtual columns, as shown in the
example below.
SQL> CREATE TABLE v_emp
2 (name VARCHAR2(20),
3 hr_rate NUMBER(7,2),
4 income NUMBER(7,2) generated always as (hr_rate*2080)
5 virtual
6 );
Virtual column partitioning limitations
Virtual columns have the following limitations:
- They cannot be created on user defined types LOB or RAW
- Delete or insert operations cannot be performed on virtual
columns. If one attempts to do so, Oracle will generate an
ORA-54013 error.
- They cannot be created on index-organized tables(IOT), external table, temporary tables, objects, or
clusters
- All of the virtual columns used in expressions must belong to
the same table
- They cannot be updated by the set clause in an UPDATE statement
Next to be given is an example of how to create a virtual
column based partition table in the following exercise. First,
create a new table called workers.
SQL> create table workers
2 (
3 emp_id number not null,
4 job_no number,
5 fname varchar2(20),
6 mname varchar2(20),
7 lname varchar2(20),
8 salary number,
9 bonus number,
10 position varchar2(50),
11 tcomp as (salary+bonus) virtual
12 )
13 partition by range(tcomp)
14 (partition p_10g values less than (20000),
15 partition p_20g values less than (30000),
16 partition p_50g values less than (60000),
17 partition p_100g values less than (200000),
18 partition p_500g values less than (600000),
19 partition p_more values less than (maxvalue));
Table created.
Now that the new virtual column based partition table has been
created, insert some sample test data to see how row placement in
the partitions is evaluated.
SQL> insert into workers
2 (emp_id,job_no,fname,mname,lname,salary,bonus,position)
3 values
4 (1,1,'Scott','Smith','Henry',30000,1200,'DBA');
1 row created.
SQL> insert into workers
2 (emp_id,job_no,fname,mname,lname,salary,bonus,position)
3 values
4 (2,2,'Howard','M','Ostro',250000,25000,'Director');
1
row created.
SQL>
commit;
Commit complete.
Virtual column based partitions have the same functionality as
that in range or list based partitions. Now take a peek at how the
optimizer views virtual based partitions and row position.
SQL> select fname, lname, salary,
bonus, tcomp
2
from workers partition (p_50g);
FNAME
LNAME SALARY BONUS TCOMP
-------------------- -------------------- ---------- ----------
----------
Scott Henry 30000
1200 31200
SQL> select fname, lname, salary, bonus, tcomp
2 from workers partition (p_500g);
Howard Ostro 250000 25000
27500
Here it can be seen that the partitions? return values based on
the virtual column based partitions. By viewing the execution plan
from dbms_xplan, one can view
the use of the virtual column based partitions as shown below.
SQL> explain plan for
2 select fname, lname, salary, bonus, tcomp
3 from workers partition (p_500g);
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------Plan
hash value:
475732617
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |
Cost (%CPU)| Time
| Pstart| Pstop |
-------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------|
0 | SELECT
STATEMENT | | 1 | 63 | 3 (0)| 00:00:01
| | |
| 1 | PARTITION RANGE SINGLE| | 1 | 63 | 3
(0)| 00:00:01 | 5 | 5 |
| 2 | TABLE ACCESS FULL | WORKERS | 1 | 63 | 3
(0)| 00:00:01 | 5 | 5 |
-------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
9 rows selected.
The nice thing about virtual column based partitions
is that they allow the development staff to create partitions
based on business requirements when other types of partitioning
are limited in terms of flexibility and when business rules call
for partitioning based on specific needs for the application.