Chapter 7. Primary test scenarios 125
If using WebSphere is not an option, this functionality can be implemented in
other ways; for example, by using DB2 tables. In that case, however, the control
of, and responsibility for, transaction (message) delivery and staging rests with
the user. We tested that methodology in Scenario 2, as an example.
In this test we read the transactions (messages) that performed either inserts or
updates to the data warehouse tables from a DB2 staging table. We applied
them to the data warehouse star schema tables by using continuously running
SQL stored procedures (SPs). We used one SQL SP to read the update/insert
records from the customer dimension staging table and make the appropriate
change to the data warehouse customer dimension table. We used another SP
to read the insert records from the fact-item staging table and insert the records
into the data warehouse fact-item table. As with all the test scenarios, we also
had the continuously running query workload executing concurrently with the
data warehouse update workloads.
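The apply logic described above can be sketched as a DB2 SQL PL stored procedure. The following is an illustrative sketch only: the table and column names (CUST_STG, CUST_DIM, cust_key, cust_name) and the commit-count parameter are assumptions for this example, not the objects used in the test.

```sql
-- Illustrative sketch of the customer dimension apply procedure.
-- All table and column names here are hypothetical.
CREATE PROCEDURE APPLY_CUST_DIM (IN p_commit_count INTEGER)
LANGUAGE SQL
BEGIN
  DECLARE SQLSTATE CHAR(5) DEFAULT '00000';
  DECLARE v_key INTEGER;
  DECLARE v_name VARCHAR(40);
  DECLARE v_applied INTEGER DEFAULT 0;
  -- WITH HOLD keeps the cursor open across intermediate commits
  DECLARE c1 CURSOR WITH HOLD FOR
    SELECT cust_key, cust_name FROM CUST_STG;

  OPEN c1;
  fetch_loop: LOOP
    FETCH c1 INTO v_key, v_name;
    IF SQLSTATE = '02000' THEN
      LEAVE fetch_loop;              -- no more staged records
    END IF;

    -- Apply as an update first; if the dimension row is not there, insert it
    UPDATE CUST_DIM SET cust_name = v_name WHERE cust_key = v_key;
    IF SQLSTATE = '02000' THEN       -- no row was updated
      INSERT INTO CUST_DIM (cust_key, cust_name) VALUES (v_key, v_name);
    END IF;

    -- Remove the applied record from the staging table
    DELETE FROM CUST_STG WHERE cust_key = v_key;

    SET v_applied = v_applied + 1;
    IF MOD(v_applied, p_commit_count) = 0 THEN
      COMMIT;                        -- commit every p_commit_count records
    END IF;
  END LOOP fetch_loop;
  CLOSE c1;
  COMMIT;
END
```

The fact-item SP would follow the same pattern, but with inserts only and no update step.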
7.2.1 Test definition
The test here is to demonstrate that DB2 tables can support continuous update
of the data warehouse along with a concurrent continuously running query
workload. We measured the impact of both the update streams and the query
streams in terms of throughput, and watched how that throughput would change
as we varied the commit count and the isolation level. We then analyzed the test
results to see whether we could achieve a good balance between the impact to
the throughput while applying the realtime update workload and the impact to
the continuously running query workload.
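On the query side, one way to vary the isolation level in DB2 is with a per-statement isolation clause. The following queries are for illustration only; the table and column names (FACT_ITEM, sale_amt) are assumptions, not the query workload used in the test.

```sql
-- Uncommitted Read (UR): the query does not wait on row locks held by
-- the concurrently running apply procedures, but can see uncommitted rows
SELECT SUM(sale_amt) FROM FACT_ITEM WITH UR;

-- Cursor Stability (CS): the query sees only committed data,
-- but can be delayed behind locks taken by the update workload
SELECT SUM(sale_amt) FROM FACT_ITEM WITH CS;
```

The trade-off these two settings expose is exactly the balance measured in the test: UR favors query throughput at the cost of read consistency, while CS favors consistency at the cost of potential lock waits.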
The test consisted of several baseline runs and test runs. The baseline runs were to
obtain a base elapsed time for the query workload, while running without any
concurrent data warehouse updates. We also developed base update workloads
to obtain base elapsed times, without any concurrent running query workload.
One thing to note about the query times is that some of the queries are sensitive
to the size of the database and, as a result of adding data to the fact table, the
times will increase during a run due to having to read a greater volume of data.
Therefore, some of the impact to the elapsed time of the query stream could be
attributed to this. To better understand this impact, we ran several query stream
baselines against the warehouse at different size levels that we expected to
create. We measured at the starting base size, after adding 150K rows, after
adding 250K rows, and after adding 800K rows. Up through the 250K
measurement, there was no significant change in the elapsed time of the query
streams. However, at the 800K level, the elapsed time roughly doubled, with the
increase isolated to the larger queries that might pick up some of the newly