Getting the Best Performance out of DEC/ORACLE DBMS
This paper discusses two objectives: achieving high yield read write (update) transactions and fast read (retrieval) transactions.
Rightsizing Databases. The database sizing methodology is clearly the most important of the database administration and management tasks. Databases which are created too small fall victim to poor performance due to database extents. Databases which are created too large may "waste" disk space and cause unnecessary delays when running lists and reports. The optimal configuration is to create storage areas that are maintained at a constant size by regular data archiving. As older data is deleted, free space is made available for the storage of new records.
Areas containing static records (which seldom change) should not exceed 75% fullness, and areas containing dynamic records (which change frequently) should not exceed 65% fullness. The size of the storage area is determined by the size of the database page. Under most CA-MANMAN configurations, the administrator can approximate the database page size by using the following formula: a database page should hold 3 or more of the most frequently stored records, at least two of the larger (least stored) records and 1 of the smaller record types. For most configurations, this page sizing formula works extremely well and, when combined with the SPAM Thresholds formula (later in this paper), provides for high yield updates and fast end-user response times.
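One way to read this rule of thumb is as a simple calculation. The sketch below (Python) estimates a page size in blocks from hypothetical record lengths, using the 60 bytes of per-page Snapshot overhead cited in the example later in this paper and rounding up to whole 512-byte blocks; the record lengths are illustrative, not from any particular CA-MANMAN schema.

    # One reading of the page sizing rule of thumb: a page should hold
    # 3 frequent records, 2 of the largest records and 1 of the smallest,
    # plus per-page overhead.  Record lengths here are hypothetical.
    BLOCK = 512               # bytes per disk block
    PAGE_OVERHEAD = 60        # per-page overhead with Snapshots enabled

    def approximate_page_blocks(frequent_len, largest_len, smallest_len):
        needed = 3 * frequent_len + 2 * largest_len + smallest_len + PAGE_OVERHEAD
        return -(-needed // BLOCK)          # round up to whole blocks

    print(approximate_page_blocks(frequent_len=120, largest_len=300, smallest_len=40))  # -> 3 blocks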
What SPAM Thresholds are and how to configure them:
A SPAM (space area management) page is like an index: it contains an inventory of data pages and a number (a percentage of fullness) for each page it manages. The threshold value for a page guarantees that space is available for new records, provided that the threshold has not been reached. By checking the SPAM page first, the database control system (DBCS) can locate a data page with adequate free space in as few as two I/Os and store the record. When threshold values are incorrectly set, the DBCS may bypass pages regardless of the free space on the page.
EXAMPLE:
Assume that the SPAM Thresholds for a storage area are 70%, 85% and 95%. The physical size of the database page is 2 blocks/page (2 x 512 bytes, or 1024 bytes). If Snapshots are enabled for the database, 60 bytes of overhead is required; if Snapshots are not enabled, 38 bytes of overhead is required. The calculated free space on any data page in the area is therefore (1024 - 60) = 964 bytes or (1024 - 38) = 986 bytes. Actual free space for each data page can be obtained from the $DBO/DUMP/OPTION=DEBUG log. Locate the MAX_FREE_LEN fields in the dump log to determine the amount of free space (bytes) for the page.
The maximum record sizes guaranteed by the default SPAM Threshold percentages are listed below. (Note that there are 964 bytes of free space on the page.)
Table #1
Threshold 0 (0-70%):    can guarantee space for a 295 byte record
Threshold 1 (71-85%):   can guarantee space for a 147 byte record
Threshold 2 (85-95%):   can guarantee space for a 98 byte record
Threshold 3 (95-100%):  guarantees no free space on the page; these pages are never checked
A 300 byte record could not be stored using the SPAM Thresholds in the table above; all pages in the storage area would be scanned until free space was found.
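To make the guarantee concrete, here is a minimal Python sketch, assuming the guarantee for a threshold t is the page's free space times (1 - t); the 964-byte free space figure and the 70/85/95 thresholds come from the example above.

    # Assumed relationship: a page whose SPAM entry shows it below threshold t
    # is guaranteed at least FREE_SPACE * (1 - t) bytes of free space.
    FREE_SPACE = 964                     # free bytes on a 2-block page, Snapshots enabled
    THRESHOLDS = (0.70, 0.85, 0.95)      # the thresholds from the example

    def spam_can_guarantee(record_len):
        """True if at least one threshold still guarantees room for record_len bytes."""
        return any(record_len <= FREE_SPACE * (1 - t) for t in THRESHOLDS)

    print(spam_can_guarantee(147))   # True  - small records fit under the lowest threshold
    print(spam_can_guarantee(300))   # False - no guarantee; every page would be scanned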
CALCULATING THRESHOLDS:
Obtain the average length of the most frequently stored record, the longest record and the shortest record. This information can be found in the DBO/ANALYZE log files. Do not include system records. (A worked sketch of the underlying arithmetic follows these steps.)
1) Set the lowest threshold value so that it guarantees that pages which haven't reached this percent fullness can still store the largest record. The goal is to be able to store 1 largest record on most data pages in the storage area.
2) Set the middle threshold value so that it guarantees that pages which haven't reached this value can store at least a largest record and one more of the smaller record types.
3) Set the last threshold value to a value at which none of the frequent record types can be stored.
4) Increase the storage area blocks/page size if the page cannot store at least two frequent records and a large record. This ensures that data pages contain enough room to store the majority of the records without fragmenting.
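The sketch below shows the arithmetic that underlies these steps, assuming (as in Table #1) that a threshold t guarantees roughly free_space * (1 - t) bytes; the record lengths are hypothetical stand-ins for figures read from a DBO/ANALYZE log.

    # Largest threshold (percent) that still guarantees room for a record of
    # record_len bytes, assuming guarantee = free_space * (1 - threshold).
    FREE_SPACE = 964                   # free bytes per page, from the earlier example

    def max_threshold_for(record_len, free_space=FREE_SPACE):
        return int(100 * (1 - record_len / free_space))

    # Hypothetical record lengths (take real values from DBO/ANALYZE logs):
    largest, frequent, smallest = 310, 120, 40
    print(max_threshold_for(largest))    # ~67: pages below this can still hold the largest record
    print(max_threshold_for(frequent))   # ~87: pages below this can still hold a frequent record
    print(max_threshold_for(smallest))   # ~95: above this, not even the smallest record is guaranteed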
MODIFYING THRESHOLDS:
SPAM thresholds can be altered by using the
DBO/MOD/THRES=(x1,x2,x3) command. All new records stored to the database
immediately begin to use the new threshold values. By default, the
DBO/MOD/THRESHOLD command executes in EXCLUSIVE UPDATE mode and must run alone
in the database.
VERIFYING PERFORMANCE:
The DBO/Show Statistic utility displays the performance efficiency of database thresholds for a given database. Review the Record I/O Statistic screen and change the screen to Numbers mode. Review the "pages checked" and the "records stored" statistics. Divide the "pages checked" value by the "records stored" value. The result reveals the database's I/O efficiency for store operations (the PC/RS ratio).
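A one-line worked example of the PC/RS calculation, using hypothetical counts in place of values read from the Record I/O Statistic screen:

    # PC/RS: pages checked per record stored (hypothetical statistics).
    pages_checked = 45000
    records_stored = 25000

    pc_rs = pages_checked / records_stored
    print(f"PC/RS ratio: {pc_rs:.2f}:1")   # close to 1:1 is efficient; 250:1 means heavy scanning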
ACTUAL Case Studies:
Case Study #1: An unload and reload was scheduled for a company over a two-day weekend. The system configuration was a VAX 6320 processor with 168 Mbytes of memory and 4 Gbytes of disk. The database contained 2.3 million records and was 1.5 million blocks in size. Prior to the unload/resize operation, the "pages checked/records stored" (PC/RS) ratio was observed as high as 250:1 (250 pages checked for each record stored) for production jobs. PC/RS ratios were observed between 25:1 and 30:1 for interactive processes. The database storage areas had extended, and system and database performance was severely degraded. An initial attempt to unload and reload this (MANDB) database was aborted 85 hours into the load operation. Two months later another attempt was made to unload and reload the same database. Prior to the second unload, an analysis was performed using the Database ToolKit, and optimized SPAM Thresholds and storage area resize calculations were obtained. Storage areas were modified to the ToolKit's recommended sizes and SPAM Threshold values. The reload operation yielded an impressive PC/RS ratio of 1.648:1. The elapsed time for the entire unload and reload was just over 65 hours. Eight weeks later, the PC/RS ratios continued to perform efficiently at 1.7:1 for interactive and 1.9:1 for batch processes.
Case Study #2: A DEC/ORACLE DBMS Administrator could not take the time for an unload/reload of the database to correct degraded transaction response times caused by record fragmentation and database extents. This database encountered PC/RS ratios of 20:1 for interactive processes and up to 40:1 for batch processes. Storage areas whose pages exceeded 70% full were restored with a larger blocks/page value. The storage areas were restored with sizes recommended by the eNVy Database Toolkit (a software program), new Thresholds were calculated, and interactive DBQ was used to correct the record fragmentation. The result: an immediate improvement in transaction response time for store performance was reported by the users (A/P and Shipping). The DBO/Show Statistic utility quantified the improved performance with sustained PC/RS ratios of 1.5-2:1 and 0 fragmented fetches.
Configuring Local and Global Buffers:
Local Buffers. The local buffer pool is that memory which is accessible only to a single process. The size of the pool is determined by the product of the buffer size and the number of buffers. The buffer length (/length_buffers) should be a common multiple of the page sizes of all area files in the database. Common multiple values ensure that full-sized database pages are copied into the buffer pool during a single I/O. Common multiple buffer sizes can reduce the pool overflow rate as displayed by DBO/SHOW STATISTICS (PIO).
$DBO/MOD/LENGTH_BUFFER=n <Database>
$DEFINE/LOG DBM$BIND_BUFFERS n
Cautionary note: If there is insufficient memory to
support the number of buffers selected, the system will perform virtual
paging to the system page file. Virtual paging significantly impacts process
and system performance and should therefore be avoided. Rule of thumb:
When in doubt, create smaller buffers (/Length) and increase the number of
buffers (DBM$BIND_BUFFERS).
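As an illustration of the common multiple rule, the sketch below picks a buffer length that is a common multiple of some assumed storage area page sizes (2, 3 and 4 blocks) and estimates the resulting per-process pool size; the page sizes and buffer count are illustrative only.

    # Choose a buffer length (in blocks) that is a common multiple of the
    # storage area page sizes, then estimate per-process pool memory.
    from math import lcm

    area_page_sizes = [2, 3, 4]              # hypothetical page sizes, in blocks
    buffer_length = lcm(*area_page_sizes)    # 12 blocks: holds whole pages from every area
    number_of_buffers = 200                  # value given to DBM$BIND_BUFFERS

    pool_kbytes = buffer_length * 512 * number_of_buffers // 1024
    print(buffer_length, "blocks per buffer,", pool_kbytes, "Kbytes per process")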
Global Buffers. The global buffer pool is that memory which can be simultaneously accessed by multiple users on a single node. This facility is only available in DBMS versions 4.3a and above. Global buffers can vastly reduce disk I/O to the database by retaining previously read data from all database users in a common buffer pool. For databases where most transactions are READ, global buffers are effective in improving response time. Global buffers are not effective on databases which have a higher mix of READ WRITE (update) and READ transactions; frequent buffer locking and backing store flushes reduce shared buffer efficiency. Additional global pages and global sections may be required to implement Global Buffers.
Cautionary note: Memory-poor system configurations should NOT implement Global Buffers.
Rule of Thumb for Optimizing DBMS Global Buffers:
Step #1 (total global buffers):
Memory:                        128 Mbytes
Reserved for Global Buffers:   10 percent
Buffer size:                   .003 Mbytes (6 * 512 bytes)
Number of Buffers:             (128 * .10) / .003 = 12 / .003 = 4000
Step #2 (maximum buffers per user):
Number of Buffers from Step #1:  4000
Number of Users:                 50
Maximum Buffers Per User:        4000 / 50 = 80
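A minimal sketch of the two-step rule of thumb above; the 128 Mbyte memory size, 10 percent reservation, 6-block buffer size and 50 users are the figures from the example, and the truncation of 12.8 Mbytes to 12 Mbytes follows the arithmetic shown in Step #1.

    # Sketch of the global buffer rule of thumb (figures from the example above).
    memory_mbytes = 128
    reserved_fraction = 0.10                 # reserve 10 percent of memory for global buffers
    buffer_size_mbytes = 0.003               # 6 blocks * 512 bytes, roughly .003 Mbytes
    users = 50

    reserved_mbytes = int(memory_mbytes * reserved_fraction)    # 12 Mbytes, as in Step #1
    total_buffers = reserved_mbytes / buffer_size_mbytes         # about 4000 buffers
    buffers_per_user = total_buffers / users                     # about 80 buffers per user

    print(round(total_buffers), round(buffers_per_user))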
Tune buffers by increasing the SYSUAF DIOLM to be
greater than or equal to the number of database buffers in use by the process.
ASTLM should be 12 greater than DIOLM. ASTLM needs to be increased because more
locks may be required for additional buffers. PGFLQUOTA and BYTLM may need to
be increased to facilitate the increased virtual and physical memory
requirements.
Adjustable Lock Granularity. The Adjustable Lock Granularity (ALG) capability enables the DBCS to request as few locks as possible for the database. Depending on process contention for database records, the DBCS will adjust the level of locking from the greatest number of pages to the fewest until all selected pages or the single record can be locked. Lock granularity can be configured for the database using the command DBO/MOD/ADJUST=(level 1, level 2, level 3,...). Level 1 signifies the fewest number of pages for which a single lock could be promoted over a range of pages in the database. Level 2 covers (level 1 * level 2) pages, level 3 covers (level 1 * level 2 * level 3) pages, and so on. Adjustable locking significantly reduces the amount of locking required for databases where the workload is mostly reads. For multi-user databases, adjustable locking can severely impact database performance when there is a fair mix of read and read write transactions. Although there may be a greater number of locks with /NOADJUST, lock contention is minimized as only the accessed records are locked. The following figure describes the adjustable lock tree.
When ALG is configured with the default values of (10,10,10), the transaction initially requests a lock at the highest level (level 4). If there is no contention for the area (all locks are compatible), the DBCS promotes a single lock for the transaction. If another transaction with an incompatible lock request needs access to the same records, the lock for the first transaction is modified to a lock which is compatible with the new transaction and de-escalated to level 3. If another transaction requires access to a record in the range previously locked by the level 3 transaction, the level 3 lock is again modified to a lock compatible with the new transaction and de-escalated to level 2, and so on. When ALG is enabled, the above transaction may promote, demote or convert a minimum of 5 locks to access a single record. When ALG is disabled, locks are promoted, demoted and converted on individual records as needed.
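As a small illustration of the level arithmetic described above, the sketch below prints the number of pages a single lock covers at each level for the default (10,10,10) setting.

    # Pages covered by one lock at each adjustable-lock level for
    # DBO/MOD/ADJUST=(10,10,10), as described above.
    levels = (10, 10, 10)

    span = 1
    for depth, fanout in enumerate(levels, start=1):
        span *= fanout
        print(f"level {depth}: one lock covers {span} pages")
    # level 1: 10, level 2: 100, level 3: 1000 pages; the highest level
    # (level 4 in the example) covers the entire range for the area.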
To analyze a locking problem, use the DBO/Show Statistic command. Look at the number of Blocking AST messages. If this number is greater than 20-25% of the number of locks requested, the database may perform better using /NOADJUSTABLE locking.
If deadlocks occur (and they usually do if the database is heavily used for read and read write transactions) and BATCH RETRIEVAL is not used, you can lower the SYSGEN parameter DEADLOCK_WAIT to detect deadlock situations more quickly. The default is 10 seconds; it is a dynamic parameter and can be changed at any time on the active system. Lower the parameter to no less than the number of active CPUs in the cluster. For example, in a 4-node cluster the DEADLOCK_WAIT parameter can be set no lower than 4 seconds. For the same cluster, if a cluster member has two CPUs, the DEADLOCK_WAIT parameter could be set no lower than 5.
BATCH AND CONCURRENT RETRIEVAL. Significant transaction performance
gains have been observed when utilizing DEC DBMS Snapshots and BATCH RETRIEVAL
(BR) in place of CONCURRENT RETRIEVAL (CR). This section quantifies the effect
on system performance of DBMS Snapshots using metrics and observations.
A quick note about BR: Implementing BR may require significant
programming effort as CA-MANMAN is not written to fully utilize DBMS Snapshots
for all commands. Depending on your environment, heavy transaction demands may
force snapshot files to grow and require 10-15% more disk space.
A company running CA-MANMAN upgraded DEC DBMS from 4.2c
to 5.1-1. An undocumented feature was added to 5.1 whereby CR
transactions are treated by the DBCS as
READ/WRITE "CONCURRENT UPDATE" (CU) transactions. The
additional CPU, I/O and lock overhead significantly degraded cluster
performance. We identified the degradation to be attributed to RUJ creation,
root file contention [Transaction Sequence Number (TSN) and Run-Time Unit
Process Block (RTUPB)] and I/O to access the after image journal files (.AIJ).
It was evident that the slowness was caused by the additional system resources and overhead required by v5.1-1 on a CPU that was already moderately loaded.
To quantify our conclusions and identify corrective
action options, a program was written to establish a baseline metric for
general response times and identify overhead for various parts of the transaction. We felt
that by analyzing the transaction at various stages, we might be able to
optimize it. The test program ran against a single disk (JBOD), Striped and SSD
drives. It was configured for CONCURRENT and BATCH retrieval modes. Transaction activation (the elapsed time for a database bind "INVOKE" and a READY to complete) and rundown (the elapsed time for a ROLLBACK or a COMMIT to complete) took longer using CR than BR.
Table #2  Media Comparison Table

                         Batch      Concur     Concur       Concur
                         No RUJ     RUJ:SSD    RUJ:STRIPE   RUJ:RF72
READY
  elapsed (sec)          0.8        4          5            5
  CPU time (sec)         0.1        0.1        0.12         0.12
  Direct I/O (opns)      7          20         20           20
  Buffered I/O (opns)    7          20         22           19
ROLLBACK
  elapsed (sec)          0.1        0.5        0.5          0.4
  CPU time (sec)         <.01       <.01       <.01         <.01
The typical MANMAN list and report command may execute
between two and five transactions (READY statements). UDMS RPW performs one
transaction for each subschema that is referenced.
The impact of CR vs BR on the user doing the
transaction can be calculated by taking the above figures and multiplying by
the number of commands executed per day,
times the number of transactions per command. The impact on other users
is harder to calculate, but will be much greater. CA-MANMAN's MM$TIMER logs can
report the frequency at which commands are executed.
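To make the impact calculation concrete, here is a small sketch using the READY elapsed times from the media comparison table; the command count and transactions-per-command figures are hypothetical placeholders for values you would obtain from the MM$TIMER logs.

    # Per-user daily impact of CR versus BR READY times (table values above;
    # command counts are hypothetical).
    cr_ready_sec = 4.0                 # CONCURRENT retrieval, RUJ on SSD
    br_ready_sec = 0.8                 # BATCH retrieval, no RUJ
    commands_per_day = 200             # e.g. taken from CA-MANMAN MM$TIMER logs
    transactions_per_command = 3       # typical list/report: two to five READY statements

    extra_sec = (cr_ready_sec - br_ready_sec) * transactions_per_command * commands_per_day
    print(f"extra wait per user per day: {extra_sec / 60:.0f} minutes")   # about 32 minutes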
The time spent for one user to complete a CR READY (at
least three extra seconds) is also spent by other users who are waiting to
start their transactions, since they are also locked out of the database for
that duration. This and other delays were noted throughout the CR transaction.
At ROLLBACK and image rundown, the DBCS promotes a lock on the RTUPB in the
root file to delete the process header information, deletes the associated RUJ
and updates the AIJ. This overhead is compounded when the next user is free to proceed with his own activation or rundown, causing the others to wait further, a geometric delay. This activity occurs regardless of ready mode. This
can be observed in DBO/SHOW STATISTICS as RWROOT stall time. Details about this
statistic can be referenced in the "Locking (one stat field)" screen,
Blocking AST screen.
When users press CONTROL-C and abort any CA-MANMAN transaction, all other transactions in progress are frozen for the duration of the ROLLBACK (an extra 1/3 second over BR). This delay adds linearly per user. This statistic was observed using DBO/SHOW STATISTICS "Freeze stall time."
* The BR statistics were uniform regardless of RUJ
medium. This is the expected behavior, as BR creates no RUJ.
We further measured transaction delays associated with root file locking by observing the effects on the RTUPB and RWROOT lock statistics when creating RUJs on different media. We used the DBO/SHOW STATISTIC utility "Locking (one stat field)" screen for our test. We used the Blocking AST screen for this test since VMS locks are promoted and demoted via blocking ASTs. Focusing on the RWROOT and RTUPB lock statistics, we measured the relative delay attributed to locks on the root file. By dividing the "total count" column by the "average per transaction" column, we found that transactions spent 30% more time on Blocking ASTs when RUJs were placed on striped volume sets versus the solid state drive. Transactions spent 35% more time on Blocking ASTs when RUJs were created on RF72s versus the striped volume sets. We did not obtain Blocking AST statistics for BR transactions.
As a "corrective action" performance
enhancement, we defined DBM$RUJ such that all RUJs are created n one of two
solid state drives (1 per node). This significantly reduced delays in RUJ
creation for all transactions. For complex transactions, elapsed time for
activation and rundown was markedly reduced from a minute plus to just seconds.
There was also a reduction in the number of Blocking ASTs and the frequent
"Freeze lock" stall message caused by CONTROL-C rollbacks.
Tim Peer is the Senior Software Engineer of eNVy Systems, San Jose, CA. Voice: (408) 363-8896, Fax: (408) 363-8389, Internet: peert@envysys.com