 |
|
Introduction to Oracle Scalability Features
Oracle Database Tips by Donald Burleson |
By Steve Karam, the world's youngest Oracle ACE
and Oracle certified Master.
Introduction to Oracle Scalability Features
There is a great children's classic by Watty Piper called
The Little Engine That Could. In this story, oodles of toys for all the
good boys and girls need to get up a huge hill, but no engine is up to the
task. Finally, a little blue engine comes along and takes up the load, getting
up the hill while chanting the hopeful mantra ?I think I can, I think I can I
think I can.?
Unfortunately, Oracle databases don't work this way. You
can chant ?I think I can? until you?re blue in the face, but the database will
continue to burst, your performance will continue to drop, and the anxious pack
of managers hovering behind you awaiting a fix will continue to grow. In other
words, the engine will fail before the top of the hill and your data will come
to a screeching halt with naught but a puff of steam. To keep your database
running smoothly through any obstacle, you must tune.
This concept is known as scalability. A scalable database
is one that has room to grow, that has been tuned for growth, and one that will
not buckle and break when unanticipated growth occurs. All of these
requirements together may sound near impossible, but they can be achieved with
proper monitoring techniques.
The real key to scalability is to stay on top of how your
database is going to grow. If you know that many developers will be creating
new queries against existing data, you can tune for additional queries and take
action to make sure all new queries are properly tuned before going into
production. If you know that new products will be created on a regular
schedule, and that these new products will require new tables and indexes, you
can manage your storage in preparation for additions and develop scripts to
monitor the growth. Plan, plan, and plan some more.
Planning For Growth
But what makes a good plan? It is not enough to look at a
bunch of tables, and then add indexes to every column you find. By doing this,
you could hurt your database more than help.
In order to plan, there are a few tools that will help us.
?
Statspack or the Automatic Workload Repository (AWR) available in
Oracle 10g
?
Monitoring scripts
?
Oracle Advisors
?
Excel or another spreadsheet program, especially one with graphing
capabilities
Chapter 2 will go over storage and server growth planning,
as well as implementing benchmarking tools such as AWR or Statspack. We will
also talk about how to take the information you gain from these reports and plug
them into Excel for great graphing capabilities for trend planning.
This is one of the most important things you can do. Saving
and charting your important statistics not only helps you plan the future of
your database, it also helps you with one of the chief concerns of the DBA:
accountability.
Let's say you have a 100GB filesystem made for your Oracle
datafiles, and that filesystem is at 70%. Based on trends, you?ve seen that the
filesystem grows at a rate of 5GB every two months.
You are now in a position to submit early reports and
requests to management for new disks. Just keeping metrics such as your current
size at regular intervals allows you to proactively plan for the future.
Your management may ask you if you can simply clean up some
old data; reorganize some indexes, shrink a tablespace or three. At this point,
you should be able to refer to your chart and show where you had done that in
the past and carefully benchmarked it, and that on average it only frees up
1-3GB worth of space.
Now for accountability. If you are not able to get the
hardware you requested due to budget constraints or any other hang-ups, in eight
months when your filesystem reaches 90% full, you can not only show your
previous request for more disk space, but the research that you had and have
kept up to date showing why the necessity was so great. Being a good DBA means
being on top of your game, and with a fast growing database, keeping growth
trend information is a key component to that.
But what do we do when we have a high amount of data, high
amounts of users and their queries, and performance starts to degrade?
Unfortunately, buying new hardware isn't always the solution, as in the case
above where we were running short on space. In this case, we must tune for
future eventualities, furthering our goal of not only being on top of our game,
but mastering it.
Chapter 3 will go over what to do about high amounts of
users, queries, and transactions. The following chapters will all cover the
Oracle features you can use to tune for this growth and put your plans into use.
Scalability Features
There are several features in Oracle that can really
optimize your database for high amounts of data and transactions. These
features include:
?
Transaction Scaling
o
Shared Server
o
PGA Initialization Parameters
o
Profiles
o
System Resource Manager
?
Data Scaling
o
Materialized Views
o
Partitioning
o
Parallel Query
o
Segment Advisor
But what can we do with these features? What exactly are
transaction or data scaling, when are they required, and why?
The great thing is, we will know when by carefully planning
the future of our database. With the information we will be storing for future
reference, we will also know why it needs to occur.
For instance, we may know from our metrics that users have
been steadily increasing the amount of batch jobs they are running, thereby
running up the load on the database. This may tell us that if we don't impose
limits on the CPU consumption of these end users in the next two months, the
server will no longer be able to handle the load.
In this single example, we see the what, the where, and the
why. In the future chapters we will flesh out the scripts and methods that can
be used to collect extremely pertinent information that can be used to calculate
all these things.
Most importantly, we will learn exactly how to use all the
features at our disposal in order to make our database hum.
Summary of Scaling Features
Oracle's many features can be very daunting to some DBAs.
When a DBA is placed in charge of a data warehouse or otherwise large database,
this feeling can grow even worse. For scalability purposes we can narrow down
our feature list quite a bit.
Transaction Scaling
Transaction scaling is something a DBA must do in order to
accommodate large numbers of users, transactions, and/or queries.
On a UNIX or Linux system, Oracle generates processes for
every client connection made to the database. These are called 'server
processes,? and they are your agent for getting things done. When a SQL*Plus
connection is made, for instance, a server process will spawn on the server
itself and carry out your queries, inserts, updates, etc.
Every server process requires a certain amount of RAM and
CPU in order to start up and run. When too many processes exist on a server,
the server's loads may rise. In addition, the server may run out of physical
memory and start swapping constantly, producing horrible response times.
In addition, these end users are performing queries, some
good and some bad, that could be doing hundreds of 'sort operations.? When a
sort occurs, Oracle must define a space either on RAM or on disk where the sort
will take place; it is your job as a DBA to ensure that it happens in RAM if at
all possible.
Some of these users may be running batch jobs, or large long
running transactions that hog CPU, processors, and more. In order to combat
this, we may want to impose limits on the amount of CPU users can attain, as
well as the amount of users that perform batch jobs logging in at any given
time.
Shared Server
Oracle Shared Server was once called Multi-Threaded Server
(MTS) because of its ability to 'thread? many processes into one. In a shared
server environment, you will utilize dispatchers and shared server
processes for end user connections. Instead of having a separate process
for each end user connection, many connections will be able to conduct their
work through a single process on the server.
This method is not optimal for batch processes or DBA
actions. If you are trying to do a database backup, you would not want it to be
behind dozens of end user queries; you want to be dedicated. Likewise for the
batch users performing long running processes. They would end up clogging a
dispatcher with their process, when they could have been dedicated instead and
saved the resources.
In a Shared Server environment, processes must stand in
line. Process control is load balanced between your dispatchers and shared
servers, which helps keep waiting down for the end users.
A dispatcher is responsible for taking the request and
adding it to a queue, as well as keeping the connection to the end user open. A
shared server process will pick up the queued request and execute it against the
database, then insert the result into a queue that the dispatcher will read.
Finally, the end result will be returned to the user. Oracle manages the
process seamlessly to keep latency down and performance high.
PGA Initialization Parameters
PGA stands for Program Global Area, and if you?re not using
Shared Server as mentioned above, this is where all of your in-memory sorting
will take place. Sort operations occur in a variety of situations, such as
index reorganizations, order by clauses, and group by clauses.
When these actions occur, Oracle will attempt to perform all
row sorting in RAM. The location of this sorting is called the PGA. If the
sort operation can't occur here, it will go to disk in a Temporary Tablespace.
Performance on disk is, of course, much worse than the performance of RAM;
therefore we will want to tune away from disk whenever possible.
This task isn't as simple as it sounds. There are many
internal variables within Oracle that control just how much RAM a user can take,
even if they need more. We will go over these and show how they can be used for
optimal performance.
Profiles
Profiles allow the DBA to control how much resource an end
user can consume. In addition, they add password management into Oracle, but
that is a topic for another book.
With profiles, you can limit the amount of sessions a user
is allowed to have, thereby cutting back on a single user's ability to drive CPU
and RAM needs through the roof. You can also limit the amount of time a user
can be connected to your instance, or how long they can be idle, for instance
sitting at the prompt.
The rest of the performance-restricting measures in profiles
are duplicated into two categories: per session and per call.
For instance, you can limit the amount of CPU time a user
may have per connected session before they must exit and re-login, or instead
the amount of CPU time a user may have per transaction they perform. You can
also limit the amount of logical reads in this manner.
System Resource Manager
Resource Manager is Profiles on steroids. While profiles
can create simple limits for single users, a resource manager plan can tell the
Oracle instance exactly how to manage all your users, groups of users, system
resources, and much more.
Using resource manager plans, a DBA can define exact
priorities for different users by assigning them to groups. A group can be
assigned CPU priority, and the exact percentage of CPU they are allowed to take.
Resource manager can be even more advanced. For instance,
you are able to create very dynamic plans that will change a user's priorities
based on their actions. If a user is performing a task that the resource
manager estimates will take longer than a pre-specified time, it can lower the
user's CPU priority, or even kill the process outright. This can help immensely
when you have many ad hoc queries or huge reports.
Data Scaling
Data scaling is what a DBA must do in order to prepare for
high disk usage. Tables, indexes, CLOBs and BLOBs, they all add up. If you?re
not careful, you will be left with data volumes that are 99% full at all times
and so much data you can't even index it without taking your applications down
for a couple days.
Where it gets real tricky is in the situation I mentioned in
the Preface of the book. When you manager asks, ?It's only a billion rows, why
is it slow?? what will you say?
Hopefully you can say that you will improve it, no problem,
because you read a fantastic book on scalability by Steve Karam. Let's see our
tools.
Materialized Views (MVs)
Materialized views play a big part in tuning your complex
queries. A materialized view is made to actually store query results for later
use, making your run-time data retrieval faster.
For instance, if you have to join multiple tables, perform
sum or count aggregations, run functions against your query columns, pre-compute
complex mathematical functions, or all of the above, materialized views are for
you! You can perform all these functions once and refresh the view
periodically, and your reports or ad hoc queries will run much smoother.
This isn't even the best part about materialized views.
With MVs, you are able to instruct Oracle to rewrite queries on the fly to use
the view instead of the base tables they are querying.
This means that if you carefully plan your materialized
views, you can create a small set of views to cover large numbers of queries.
Whenever Oracle feels that it could use the pre-aggregated or pre-computed data
in the view instead of the base table, it will do so, even if it can't use the
entire view.
Not only do you get a mechanism for faster queries, but you
also get on the fly rewrites, meaning your developers don't have to recreate all
those reports they?ve spent years on! If your company has been around over
three years, I guarantee you have a few reports that were designed by someone
who's no longer with the company and therefore unsupported. Materialized views
take care of this problem for you
Partitioning
Partitioning allows a DBA to split a single table into
several ?mini tables? called partitions. A partitioned table will be split
apart by some column, called a partition key. This key is usually a date,
allowing you to do a range, or some sort of list such as a list of states.
However, there are more ways to split a table that can be very beneficial to
your querying.
The beauty of a partitioned table is that when your query
only needs data from a specific partition, it will only select data from that
partition. Instead of querying the entire table, you can seriously cut down on
your data ranges; with good indexing, you can use this to your advantage in many
ways. Your partitions can also be put into multiple tablespaces so your can
spread your data across different volumes.
In addition to these features, Oracle has the capability to
form partitions on the fly in some cases where appropriate. There are also
extremely beneficial joining capabilities that are possible with partitions that
may not be possible otherwise.
Taking your large data and breaking it into smaller chunks
is an extremely valuable asset in your Scalability Toolbox. With proper
planning, we can use it to achieve great results.
Parallel Query
Oracle's parallel query gives you the ability to split
queries up into chunks much like partitioning splits data. However, instead of
taking the data and putting it in different places on disk, you are taking your
query and dividing its tasks onto multiple CPUs.
This can be especially beneficial when you are using
partitioning to divide your data as well. More query power is great, more
storage diversity is great, but both together allows you to tune more in depth.
Let's say you have a table with a year of historical data,
each month having its own partition. A report is going to pull the last three
months of data from this table.
Partitioning can help us to begin with because we have only
three small chunks of the table to retrieve instead of twelve; for those of you
who are not mathematically inclined, this means we can save ourselves from
querying 75% of the table.
With parallel query and partitioning combined, we can use
multiple CPUs to query those three different partitions. We?re splitting our
query into pieces, each piece using its own CPU resource and hitting its own
disk resource.
Segment Advisor
In Oracle 10g, we have the Segment Advisor. This advisor
assists us in our capacity planning tremendously.
Not only does it show us a history of growth trends (though
not as in depth as we will be able to achieve on our own, as we will see later),
but it also helps you find tables that are using up too much space.
Tables are dynamic things. They are constantly being
written, overwritten, deleted, you name it. Over time, a table becomes
fragmented from all the different actions being performed against it. If you
don't frequently truncate your tables or drop and recreate them, the data will
become 'split up? and be less accessible.
In the past, you were able to reorganize this data by
clearing it all out as mentioned above or using online reorganization with the
DBMS_REDEFINITION package. However, the segment advisor will help us find the
tables that need help and give us the commands to make it happen.
Using the segment advisor, we can give a list of tables to
analyze. The advisor will tell us which tables are in need of reorganizing.
Once we know this, there are new commands available in 10g to move rows and
shrink space, thereby saving us precious disk space and assisting in row
reorganization at the same time.
Conclusion
If you?re a DBA managing large data or user volumes and feel
overwhelmed, I hope that this chapter has given you some hope. In the chapters
to come we will go into detail on how to use these features, both technically
and in practice.
In addition, helpful scripts are present for every feature
so you can hit the ground running. Scalability is just around the corner!
 |
If you like Oracle tuning, see the book "Oracle
Tuning: The Definitive Reference", with 950 pages of tuning tips and
scripts.
You can buy it direct from the publisher for 30%-off and get
instant access to the code depot of Oracle tuning scripts. |