Another one of our clients is
looking to put some hardware in place for a new data warehousing
project. The two hardware options are:
- Buy a single, large Unix server (probably a HP box running
HP-UX) with 8 CPUs
- Buy two or more commodity "Lintel" boxes, with 2 or 4
processors, running Red Hat Advanced Server, and use RAC to
create a single database instance.
But just how suitable is Real Application Clusters for data
warehousing? The reason I started doubting RAC's suitability was
an article I came across entitled
"Oracle Real Application Clusters and Data Warehouse
Applications" which stated the following:
"Oracle has marketed the Real Application Clusters (RAC)
option as appropriate for all types of applications, but there
is one very important exception, Oracle data warehouses.
Oracle RAC is used primarily for continuous availability of
mission-critical systems (i.e. banking applications) and for
transparent scalability of massive online transactions systems
(i.e. Amazon). However, it is not always the best architecture
for data warehouse that require large materialized view
rollups because they often require large, cohesive RAM regions
(large data sorts) and tightly-coupled CPU’s (fast parallel
query) ... In sum, Oracle Real Application Clusters is a
wonderful tool for mission-critical database that must have
continuous availability (using RAC with Transparent
Application Failover (TAF)) and for scalability of super-large
OLTP systems, Oracle RAC is not the best choice for data
warehouse applications that require high-speed table scan
performance."
Now I know Don and Mike have spent quite a bit of time
working with RAC, so I thought it wise to look into this in a
bit more detail. What I did in the end was
post a link to the article on the
Dizwell Forum, and ask the forum participants for their
experiences with RAC and data warehousing. I won't repeat the
whole thread (which went on for three pages, so it's obviously
an area of interest) but if you're considering RAC for a data
warehousing project,
take a look at the thread and
Don's article, as they're both useful bits of background
reading. My conclusion? A large, single server and a collection
of clustered servers aren't interchangeable concepts; there
could be issues with RAC with "edge cases", for instance (as
mentioned in the forum comments) where a parallel query needs to
run over multiple nodes, and the "large cohesive memory area"
issue I still think could be an problem if you need more memory
than a single small server can provide. The lesson I guess
though is that there's no "black and white" rule for this sort
of thing (not that the original article was suggesting this) and
it's worth getting a bit of peer advice if you're looking to get
an answer to a question such as this, with presumably big
implications if you get it wrong.