Neil Raden, whose article on
Model-Driven Approaches For BI Projects I linked to last
year, dropped me a line to tell me about some new real-time ETL
articles he'd written:
"Mark,
I've written some articles and white papers about real-time
data warehousing:
http://ww.hiredbrains.com/knowout.html
...but I think the really interesting part of it is not data
warehousing, per se, but abstraction and real-time analytics.
Abstraction can provide the logical-to-physical layer between
a data warehouse and a BI app, but it can also provide the
kind of rich meaning we need for our machines to do some
reasoning, something all current data warehousing and BI
concepts lack.
For that, I'm investigating the competing approaches of the
Semantic Web and Emergent Semantics, though I'm leaning toward
the latter.
With good semantics and Moore's Law, much of data warehousing
becomes irrelevant. As for BI, most of it is parlor tricks.
I'm hoping to see a new batch of tools that can reason and
learn, at least in the limited domain commerce, supply chain,
CRM, etc. That's where real time will show some returns."
Apart from the list of Neil's
ETL and data warehousing articles, there are a couple of good
(free) e-Books linked to on Neil's site, including one on
ETL and Data Integration (including articles on Kimball vs.
Inmon, the model-driven BI project and two on real-time data
warehousing), and another one specifically about
real-time data warehousing. You have to go through an
annoying registration process to get the books, and they're
Windows-only executables, but it looks like there's some
interesting content there. See also
"Implementing Real-Time Data Warehousing Using Oracle 10g"
on DBAZine.
Real-time data warehousing is an interesting area, and one
that's addressed by some of the new features in OWB "Paris" -
the ability to accept data from Advanced Queues and web
services, and the ability to publish out to the same, such that
you can publish an OWB mapping that "listens" for ETL data and
then transforms and hands it off in real time. The reality,
though, at least as far as I've experienced in the UK, is that
the market isn't really clamouring for this at the moment, at
least not in any volume. What is of interest, though, is reducing
the ETL load time to as close to zero as possible, with as
little impact as possible on users who are accessing the system,
and it's this requirement that's driving my interest in this
area. Whilst it will probably be a while before a
significant number of OWB users use technologies such as web
services and AQs to process their ETL jobs, RDBMS technologies
such as external tables, table functions/pipelining and change
data capture are already getting take-up and are starting to
become a normal feature of OWB projects.
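To make the "listening" idea a bit more concrete, here's a minimal
sketch in Python of a process that waits on a queue, transforms each
message as it arrives and hands it straight on, rather than waiting
for a nightly batch. This is not OWB or Advanced Queuing code: the
in-memory queues stand in for whatever messaging layer you use, and
the function names, field names and the transformation itself are all
made up for illustration.

```python
import queue
import threading

_STOP = object()  # sentinel telling the listener to shut down


def transform(row):
    """Hypothetical transformation: tidy the name, convert pence to pounds."""
    return {
        "customer": row["customer"].strip().upper(),
        "amount_gbp": row["amount_pence"] / 100,
    }


def listen(source, sink):
    """Block on the source queue and hand each transformed row to the sink."""
    while True:
        row = source.get()  # waits here until a message arrives
        if row is _STOP:
            break
        sink.put(transform(row))


def run_demo(rows):
    """Push some rows through a listener thread and collect the results."""
    source, sink = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=listen, args=(source, sink))
    worker.start()
    for row in rows:
        source.put(row)
    source.put(_STOP)
    worker.join()
    results = []
    while not sink.empty():
        results.append(sink.get())
    return results


if __name__ == "__main__":
    print(run_demo([{"customer": " acme ", "amount_pence": 1250}]))
```

The point of the sketch is just the shape: each row is transformed
the moment it turns up, so there is no separate load window for users
to work around.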
I must also take a proper look at this "semantics" stuff -
it's been cropping up a lot recently and I need to get a
better understanding of what it's all about...