Warehouse Table Partitioning And
Distributed Oracle
In data warehouses that allow cross-database
access, a very common method of distribution uses horizontal
partitioning. For example, customer service organizations commonly
allow their remote sites to maintain customer information while
maintaining a location-transparent access mode to every customer,
regardless of their physical location. Horizontal partitioning is
achieved by taking a subset of each remote site's customer table and
populating a master look-up table that is accessible from any node
in the distributed data warehouse, as shown in Figure 9.10.
Figure 9.10 Horizontal data
partitioning.
In a Unix-based distributed data warehouse,
the CRON utility can be used to schedule a periodic refresh of a
master table. CRON is a time-dependent task activation utility that
starts tasks at predetermined dates and times. An SQL script
automatically extracts customer_name from the remote site and
repopulates the master customer table, leaving the customer details
at the remote site. The Oracle SQL might look like this:
/*
Delete remote rows in the master table... */
DELETE
FROM customer@master
WHERE
location = :OUR_SITE;
/*
Re-populate the master table... */
SELECT
customer_name, ‘:our_site’
FROM customer@:OUR_SITE
AS
INSERT INTO customer@master
VALUES customer_name, site_name;
Once populated, the master look-up can be
accessed by any node and used to redirect the database query to the
appropriate remote database for customer detail, as shown in Figure
9.11.
Figure 9.11 Dynamic location
transparency.
Because of the dynamic substitution in the
SQL, a common application can be made to access any customer in a
federation, regardless of location, without making any changes to
the application code.
Dynamic transparency is especially useful
for situations where remote locations have "ownership" of data, but
a corporate entity requires access to the data at a central
location.
The Internals Of Oracle's SQL*Net
In its most basic form, SQL*Net is a
software tool that allows a network of Oracle clients and servers to
communicate transparently on top of any underlying network topology
or protocol using SQL. Although SQL*Net is a very robust and
sophisticated tool, you must appreciate the inherent complexity that
goes along with the flexibility of SQL*Net. This section provides a
no-nonsense overview of the SQL*Net architecture. All of the
examples are based on Unix.
Due to its sophisticated architecture, it is
not trivial to install SQL*Net on a client or server. For Unix
systems, the following files are necessary to operate SQL*Net 2.0:
* etc/tnsnames.ora--Used
for outgoing database requests, this file contains all of the
database names (sids) running on the processor, as well as the
domain name, protocol, host, and port information. When a new
database is added to a box, you must update this file (changes to
tnsnames.ora become effective instantly). Note that the SQL*Net
version 1.0 file equivalent is etc/oratab.
* etc/listener.ora--This
file contains a list of local databases for use by incoming
connections. When you add a new destination database to a Unix host,
you must also add it to this file.
* etc/hosts--This
file lists all of your network addresses.
* etc/services--This
file lists all of the SQL*Net services.
In version 2.0, Oracle has added several
important enhancements to SQL*Net. Aside from the badly needed bug
fixes, SQL*Net now allows multiple community access. A community is
a group of computers that shares a common protocol (such as TCP/IP
to LU6.2). In addition, the Oracle database engine now defines a
multithreaded server (MTS) for servicing incoming data requests. In
the MTS, all communications to the database is handled through a
single dispatcher. In SQL*Net version 1.0, a separate process is
spawned for each connection. These connections are easily viewed by
using the Unix ps command.
When upgrading from SQL*Net 1.0 to SQL*Net
2.0, you should be aware of subtle differences between how the two
versions handle communications (see Figure 9.12). SQL*Net version
1.0 uses an orasrv component on the destination database to listen
for incoming requests, while SQL*Net 2.0 uses a process called
tnslsnr (TNS listener). In addition, SQL*Net 1.0 cannot use the
multithreaded server.
Figure 9.12 The two versions of
SQL*Net.
When a connection is made to SQL*Net, it
passes the request to its underlying layer--the transparent network
substrate (TNS)--where the request is transmitted to the appropriate
server. At the server, SQL*Net receives the request from TNS and
passes the SQL to the database. Transparent network substrate is a
fancy phrase meaning a single, common interface to all protocols
that allows you to connect to databases in physically separate
networks. At the lowest level, TNS communicates to other databases
with message-level send and receive commands.
On the client side, the User Programmatic
Interface (UPI) converts SQL to associated PARSE, EXECUTE, and FETCH
statements. The UPI parses the SQL, opens the SQL cursor, binds the
client application, describes the contents of returned data fields,
executes the SQL, fetches the rows, and closes the cursor. Oracle
attempts to minimize messages to the server by combining UPI calls
whenever possible. On the server side, the Oracle Programmatic
Interface (OPI) responds to all possible messages from the UPI and
returns requests.
No UPI exists for server-to-server
communication. Instead, a Network Programmatic Interface (NPI)
resides at the initiating server, and the responding server uses its
OPI.
SQL*Net supports network transparency such
that the network structure may be changed without affecting the
SQL*Net application. Location transparency is achieved with database
links and synonyms.
Let's trace a sample data request through
SQL*Net. Essentially, SQL*Net will look for the link name in the
database link table (DBA_DB_LINKS) and extract the service name. The
service name is then located in the tnsnames.ora file, and the host
name is extracted. Once again, we have a two-stage process beginning
with the link name referencing the service name, then the service
name referencing the host name.
In Unix environments, the host name is found
in a host file (etc/hosts), and the Internet Protocol (IP) address
is gathered. In the following example, london_unix might translate
into an IP address of 143.32.142.3. The following four steps
illustrate how SQL*Net takes a remote request and translates it into
the IP address of a destination database.
1. Issue a remote request.
Check the database link called LONDON.