Some Surprising Benefits of Using Storage Areas

Related presentation given at Exchange 2003 in Boston: some-surprises.ppt

Overview

Progress Storage Areas are usually viewed as an administrative tool rather than as a means for directly improving performance. Faster and more selective index rebuild and dump/load, the ability to easily drop "temp areas", running dbanalys much more quickly and other benefits are all great reasons to use storage areas. Less well known is that many of the same factors that enable improved manageability also impact performance in interesting ways.

Recently I was involved in a project to migrate an existing v9 database to a Storage Area configuration. The results were in some ways quite surprising.

Creating A New Database

The database had been converted from v8 to v9 several years ago using conv89. The source database at the time of conversion consisted of 110GB in 400 tables. The largest table was around 15GB and consisted of 180 million records. Several other tables were between 2GB and 6GB.

The target database design consists of 80 storage areas. Every table with more than 1,000,000 records is allocated two storage areas: one for data and one for the indexes. Other tables go into one of two "miscellaneous" storage area data/index pairs for the "small stuff" (the separation is based on a natural partitioning of the data within the application's view of the data).

(1,000,000 records was an arbitrary decision based on where the line between big, heavily used tables and smaller, less frequently used tables seemed to be.)
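The allocation rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not the script used in the project: the table names, the area naming scheme (`<table>_data` / `<table>_idx`) and the `misc_group` mapping are hypothetical.

```python
from typing import Dict, Tuple

# Cut-off described in the text: tables with MORE than this many records
# get their own dedicated data/index area pair.
BIG_TABLE_THRESHOLD = 1_000_000


def plan_areas(record_counts: Dict[str, int],
               misc_group: Dict[str, str]) -> Dict[str, Tuple[str, str]]:
    """Map each table to a (data area, index area) pair.

    record_counts: table name -> number of records
    misc_group:    table name -> which "miscellaneous" pair a small table
                   belongs to (hypothetical names such as "misc1", "misc2")
    """
    plan = {}
    for table, count in record_counts.items():
        if count > BIG_TABLE_THRESHOLD:
            # Big table: dedicated pair of areas (naming is an assumption).
            plan[table] = (f"{table}_data", f"{table}_idx")
        else:
            # Small table: shared pair, defaulting to the first misc group.
            group = misc_group.get(table, "misc1")
            plan[table] = (f"{group}_data", f"{group}_idx")
    return plan


if __name__ == "__main__":
    counts = {"order_line": 180_000_000, "state": 50}  # hypothetical tables
    for table, areas in plan_areas(counts, {"state": "misc2"}).items():
        print(table, areas)
```

The output of such a plan would still have to be turned into a structure file and a set of table and index moves by hand or by a separate script.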

I used 24 dump processes against the old db with a -B of 180,000 (8k blocks) and nothing else special. The source db was on EMC 8830 storage, split between 2 4-way "meta volumes" (EMC speak for striped disks). The .bd files were dumped to another 4-way meta.

Extra large bi extents were created and grown ahead of time (on their own 4-way meta). No table approaches 64GB in size and there are no pathological record update situations to accommodate, so all storage areas are set to 256 Records Per Block. Extents are alternated between 2 filesystems (one for each of 2 4-way metas). Large files are enabled.

(Note: In this particular case 256 RPB was used across the board. At the time that was thought to be a reasonable approach -- things have changed and it is no longer the recommended best practice. Current best practice is to optimize RPB and the data within each Storage Area to each other.)
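The 64GB figure and the RPB trade-off can be made concrete with a little arithmetic. The sketch below rests on two assumptions that are not stated in the text: that a v9 storage area can address 2^31 rowids, and that a reasonable first guess for RPB is the smallest power of two that covers the number of mean-sized records fitting in a block (block and record header overhead is ignored). The function names are hypothetical.

```python
# Assumption: a v9 storage area addresses at most 2**31 rowids.
ROWIDS_PER_AREA = 2 ** 31

# Records-per-block settings are powers of two from 1 to 256.
VALID_RPB = (1, 2, 4, 8, 16, 32, 64, 128, 256)


def max_area_bytes(block_size: int, rpb: int) -> int:
    """Largest area size implied by the rowid limit for a given RPB."""
    return (ROWIDS_PER_AREA // rpb) * block_size


def suggest_rpb(block_size: int, mean_record_bytes: int) -> int:
    """Smallest valid RPB that covers the records fitting in one block.

    Rule-of-thumb only: ignores block and record overhead and record growth.
    """
    fit = block_size // mean_record_bytes
    for rpb in VALID_RPB:
        if rpb >= fit:
            return rpb
    return VALID_RPB[-1]  # very small records: cap at 256


if __name__ == "__main__":
    gib = 1024 ** 3
    print(max_area_bytes(8192, 256) // gib, "GB max area at 256 RPB, 8k blocks")
    print(suggest_rpb(8192, 90), "RPB suggested for 90-byte mean records")
```

Under these assumptions, 256 RPB with 8k blocks caps an area at 2^36 bytes, which is the 64GB limit mentioned above; halving RPB doubles the cap.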

To perform the load, a server was started against the target with -i -B 20000. 47 load processes then waited for binary dumps to complete (1 for each dedicated area plus half a dozen for each of the "misc" areas). They pointed their SRT files at yet another 4-way meta. Most of the time only half a dozen or so loads were actually processing data. The binary loads were performed with the index build option enabled.


The server is an HP rp7410 with 8 x 875MHz CPUs and 16GB of RAM.

The dump started just before midnight Friday/Saturday. Most tables were complete by 6am and the "big 6" were running alone by noon. (Interesting note: their IO rate stayed pretty constant and fairly low after Saturday morning, at 50 or 60 IO ops per second, which strikes me as an area open for improvement. This seems to be a consistent feature of binary dumps across many platforms and many disk configurations.) The last 3 tables completed within seconds of each other at around 2:30am Sunday. By 3:15am the load processes were done.

Total time to dump and load was less than 27.5 hours. Another hour was spent verifying the results, ensuring that everything that was dumped was also loaded.

Some results were as expected:

                             OLD        NEW
    db size                  110 GB      64 GB
    tabanalys               10.0 hr     1.0 hr
    data warehouse extract    90 min     15 min

The database has many small records and when it was first converted to 8k blocks it grew by about 50% (the performance improvement was well worth the growth in size), so we expected to reclaim a great deal of that space, and we did.

The improvement in tabanalys performance is a result of data blocks being more efficiently used -- both the increased number of records per block and the homogeneous nature of the data blocks help here. The same benefits are seen in the data warehouse extract, where a series of 15 parallel processes extracts the data and then merges the result (the merge phase uses UNIX shell scripts). One of the expected benefits of the reorganization was better performance of data extracts and reporting, both because of the usual benefits from dumping and loading and because of the gains from concentrating most of the data into "pure" data blocks.

These were the main reasons for doing the conversion. It was expected that it would impact background and reporting types of tasks in this way. The benefits were predictable and easy to measure.
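For readers who prefer ratios to raw numbers, the OLD/NEW figures in the table above can be restated as speed-up factors and percentage reductions. The snippet below is just that arithmetic; the dictionary keys and the `improvement` helper are illustrative names, and the values are copied from the table.

```python
# (old, new) pairs copied from the results table above.
RESULTS = {
    "db size (GB)": (110, 64),
    "tabanalys (hr)": (10.0, 1.0),
    "data warehouse extract (min)": (90, 15),
}


def improvement(old: float, new: float) -> tuple:
    """Return (factor, percent reduction) going from old to new."""
    factor = old / new
    reduction_pct = 100.0 * (old - new) / old
    return factor, reduction_pct


if __name__ == "__main__":
    for name, (old, new) in RESULTS.items():
        factor, pct = improvement(old, new)
        print(f"{name}: {factor:.1f}x, {pct:.0f}% reduction")
```

In these terms the database shrank by roughly 42%, tabanalys ran 10 times faster, and the extract ran 6 times faster.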

Surprising Results

After the dump and load the workload as measured by record reads, commits per second and application metrics was at normal levels and quite consistent with those prior to the d&l. This provided an excellent opportunity to study the before and after impact of the conversion.



A surprising benefit developed. Prior to the conversion the database had been averaging approximately 400 "db reads" per second over the day with sustained periods of 700 to 1,000 reads/sec or more. The buffer hit ratio varied between 94% and 98%. Response time on the EMC was reasonable, although not stellar, but that level of IO was a concern, particularly when taking into account the possibility of growth.


Post conversion the number of db reads dropped dramatically to a much more consistent average of around 30 per second!

Some improvement was expected. Increasing the records per block combined with a small average record size would mean more records can be held in the same -B buffer. Thus each IO would be that much more efficient in much the same way that the database size is reduced. But concentrating the heavily used records into fewer blocks seems to have had a much larger benefit than was expected.
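To put the drop in perspective, the before and after read rates can be related through the buffer hit ratio. The sketch below is a back-of-the-envelope estimate, not a measurement: it assumes the usual definition of hit ratio (the fraction of logical block requests satisfied from the -B buffer pool), takes 96% as the midpoint of the reported 94% to 98% range, and assumes the logical request rate was unchanged after the conversion, in line with the consistent workload noted earlier. The post-conversion hit ratio it produces is implied, not reported.

```python
def logical_rate(db_reads_per_sec: float, hit_ratio: float) -> float:
    """Logical block requests/sec implied by physical reads and hit ratio."""
    return db_reads_per_sec / (1.0 - hit_ratio)


def implied_hit_ratio(db_reads_per_sec: float, logical_per_sec: float) -> float:
    """Hit ratio implied by a physical read rate at a given logical rate."""
    return 1.0 - db_reads_per_sec / logical_per_sec


if __name__ == "__main__":
    # Before: about 400 db reads/sec at an assumed 96% hit ratio.
    before_logical = logical_rate(400, 0.96)
    # After: about 30 db reads/sec, assuming the same logical request rate.
    after_hit = implied_hit_ratio(30, before_logical)
    print(f"logical requests/sec (estimated): {before_logical:.0f}")
    print(f"implied hit ratio after conversion: {after_hit:.1%}")
```

With those assumptions the estimate is on the order of 10,000 logical requests per second, and an implied hit ratio of about 99.7% afterwards. A couple of percentage points of hit ratio at this level correspond to more than a tenfold change in physical IO.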

Latch timeouts also decreased dramatically -- from a wildly varying average of 12 per second over the whole day (with much higher spikes) to less than 1 per second.


The degree of improvement in these two key database performance indicators was a very pleasant outcome. It isn't often in the technology world that a surprise takes the form of good news!
