Getting the Best Performance out of DEC/ORACLE DBMS
This paper discusses two objectives: achieving high yield read write (update) transactions and fast read (retrieval) transactions.
Rightsizing Databases. The database sizing methodology is clearly the most important of the database administration and management tasks. Databases which are created too small fall victim to poor performance due to database extents. Databases which are created too large may "waste" disk space and cause unnecessary delays when running lists and reports. The optimal configuration is to create storage areas that are maintained at a constant size by regular data archiving. As older data is deleted, free space is made available for the storage of new records.
Areas containing static records (which seldom change) should not exceed 75% fullness, and areas containing dynamic records (which change frequently) should not exceed 65% fullness. The size of the storage area is determined by the size of the database page. Under most CA-MANMAN configurations, the administrator can approximate the database page size by using the following formula: a database page should hold 3 or more of the most frequently stored records, at least two of the larger (least stored) records and 1 of the smaller record types. For most configurations, this page sizing formula works extremely well and, when combined with the SPAM Thresholds formula (later in this paper), provides for high yield updates and fast end-user response times.
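One way to read this rule of thumb is as a simple calculation. The sketch below (Python) estimates a page size in blocks from hypothetical record lengths, using the 60 bytes of per-page Snapshot overhead cited in the example later in this paper and rounding up to whole 512-byte blocks; the record lengths are illustrative, not from any particular CA-MANMAN schema.

    # One reading of the page sizing rule of thumb: a page should hold
    # 3 frequent records, 2 of the largest records and 1 of the smallest,
    # plus per-page overhead.  Record lengths here are hypothetical.
    BLOCK = 512               # bytes per disk block
    PAGE_OVERHEAD = 60        # per-page overhead with Snapshots enabled

    def approximate_page_blocks(frequent_len, largest_len, smallest_len):
        needed = 3 * frequent_len + 2 * largest_len + smallest_len + PAGE_OVERHEAD
        return -(-needed // BLOCK)          # round up to whole blocks

    print(approximate_page_blocks(frequent_len=120, largest_len=300, smallest_len=40))  # -> 3 blocks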
What SPAM Thresholds are and how to configure them:
A SPAM (space area management) page is like an index: it contains an inventory of data pages and a number (a percentage of fullness) for each page it manages. The threshold value for a page guarantees that space is available for new records, provided that the threshold has not been reached. By checking the SPAM page first, the database control system (DBCS) can locate a data page with adequate free space in as few as two I/Os and store the record. When threshold values are incorrectly set, the DBCS may bypass pages regardless of the free space on the page.
EXAMPLE:
Assume that the SPAM Thresholds for a storage area are 70%, 85% and 95%. The physical size of the database page is 2 blocks/page (2 x 512 bytes, or 1024 bytes). If Snapshots are enabled for the database, 60 bytes of overhead is required; if Snapshots are not enabled, 38 bytes of overhead is required. The calculated free space on any data page in the area is therefore (1024 - 60) = 964 bytes or (1024 - 38) = 986 bytes. Actual free space for each data page can be obtained from the $DBO/DUMP/OPTION=DEBUG log. Locate the MAX_FREE_LEN fields in the dump log to determine the amount of free space (bytes) for the page.
The maximum record sizes guaranteed by the default SPAM Threshold percentages are listed below. (Note that there are 964 bytes of free space on the page.)
Table #1
Threshold 0 (0-70%):    can guarantee space for a 295 byte record
Threshold 1 (71-85%):   can guarantee space for a 147 byte record
Threshold 2 (85-95%):   can guarantee space for a 98 byte record
Threshold 3 (95-100%):  guarantees no free space on the page; these pages are never checked
A 300 byte record could not be stored using the SPAM Thresholds in the table above; all pages in the storage area would be scanned until free space was found.
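To make the guarantee concrete, here is a minimal Python sketch, assuming the guarantee for a threshold t is the page's free space times (1 - t); the 964-byte free space figure and the 70/85/95 thresholds come from the example above.

    # Assumed relationship: a page whose SPAM entry shows it below threshold t
    # is guaranteed at least FREE_SPACE * (1 - t) bytes of free space.
    FREE_SPACE = 964                     # free bytes on a 2-block page, Snapshots enabled
    THRESHOLDS = (0.70, 0.85, 0.95)      # the thresholds from the example

    def spam_can_guarantee(record_len):
        """True if at least one threshold still guarantees room for record_len bytes."""
        return any(record_len <= FREE_SPACE * (1 - t) for t in THRESHOLDS)

    print(spam_can_guarantee(147))   # True  - small records fit under the lowest threshold
    print(spam_can_guarantee(300))   # False - no guarantee; every page would be scanned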
CALCULATING THRESHOLDS:
Obtain the average length of the most frequently stored record, the longest record and the shortest record. This information can be found in the DBO/ANALYZE log files. Do not include system records. (A worked sketch of the underlying arithmetic follows these steps.)
1) Set the lowest threshold value so that it guarantees that pages which haven't reached this percent fullness can still store the largest record. The goal is to be able to store 1 largest record on most data pages in the storage area.
2) Set the middle threshold value so that it guarantees that pages which haven't reached this value can store at least a largest record and one more of the smaller record types.
3) Set the last threshold value to a value at which none of the frequent record types can be stored.
4) Increase the storage area blocks/page size if the page cannot store at least two frequent records and a large record. This ensures that data pages contain enough room to store the majority of the records without fragmenting.
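The sketch below shows the arithmetic that underlies these steps, assuming (as in Table #1) that a threshold t guarantees roughly free_space * (1 - t) bytes; the record lengths are hypothetical stand-ins for figures read from a DBO/ANALYZE log.

    # Largest threshold (percent) that still guarantees room for a record of
    # record_len bytes, assuming guarantee = free_space * (1 - threshold).
    FREE_SPACE = 964                   # free bytes per page, from the earlier example

    def max_threshold_for(record_len, free_space=FREE_SPACE):
        return int(100 * (1 - record_len / free_space))

    # Hypothetical record lengths (take real values from DBO/ANALYZE logs):
    largest, frequent, smallest = 310, 120, 40
    print(max_threshold_for(largest))    # ~67: pages below this can still hold the largest record
    print(max_threshold_for(frequent))   # ~87: pages below this can still hold a frequent record
    print(max_threshold_for(smallest))   # ~95: above this, not even the smallest record is guaranteed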
MODIFYING THRESHOLDS:
SPAM thresholds can be altered by using the
DBO/MOD/THRES=(x1,x2,x3) command. All new records stored to the database
immediately begin to use the new threshold values. By default, the
DBO/MOD/THRESHOLD command executes in EXCLUSIVE UPDATE mode and must run alone
in the database.
VERIFYING PERFORMANCE:
The DBO/Show Statistic utility displays the performance efficiency of database thresholds for a given database. Review the Record I/O Statistic screen and change the screen to Numbers mode. Review the "pages checked" and the "records stored" statistics. Divide the "pages checked" value by the "records stored" value. The result reveals the database's I/O efficiency for store operations (the PC/RS ratio).
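A one-line worked example of the PC/RS calculation, using hypothetical counts in place of values read from the Record I/O Statistic screen:

    # PC/RS: pages checked per record stored (hypothetical statistics).
    pages_checked = 45000
    records_stored = 25000

    pc_rs = pages_checked / records_stored
    print(f"PC/RS ratio: {pc_rs:.2f}:1")   # close to 1:1 is efficient; 250:1 means heavy scanning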
ACTUAL Case Studies:
Case Study #1: An unload and reload was scheduled for a company over a two-day weekend. The system configuration was a VAX 6320 processor with 168 Mbytes of memory and 4 Gbytes of disk. The database contained 2.3 million records and was 1.5 million blocks in size. Prior to the unload/resize operation, the "pages checked/records stored" (PC/RS) ratio was observed as high as 250:1 (250 pages checked for each record stored) for production jobs. PC/RS ratios were observed between 25:1 and 30:1 for interactive processes. The database storage areas had extended, and system and database performance was severely degraded. An initial attempt to unload and reload this (MANDB) database was aborted 85 hours into the load operation. Two months later another attempt was made to unload and reload the same database. Prior to the second unload, an analysis was performed using the Database ToolKit, and optimized SPAM Thresholds and storage area resize calculations were obtained. Storage areas were modified to the ToolKit's recommended sizes and SPAM Threshold values. The reload operation yielded an impressive PC/RS ratio of 1.648:1. The elapsed time for the entire unload and reload was just over 65 hours. Eight weeks later, the PC/RS ratios continued to perform efficiently at 1.7:1 for interactive and 1.9:1 for batch processes.
Case Study #2: A DEC/ORACLE DBMS Administrator could not take the time for an unload/reload of the database to correct degraded transaction response times caused by record fragmentation and database extents. This database encountered PC/RS ratios of 20:1 for interactive processes and up to 40:1 for batch processes. Storage areas whose pages exceeded 70% full were restored with a larger blocks/page value. The storage areas were restored with sizes recommended by the eNVy Database Toolkit (a software program), new Thresholds were calculated, and interactive DBQ was used to correct the record fragmentation. The result: an immediate improvement in transaction response time for store performance was reported by the users (A/P and Shipping). The DBO/Show Statistic utility quantified the improved performance with sustained PC/RS ratios of 1.5-2:1 and 0 fragmented fetches.
Configuring Local and Global Buffers:
Local Buffers. The local buffer pool is that memory which is accessible only to a single process. The size of the pool is determined by the product of the buffer size and the number of buffers. The buffer length (/length_buffers) should be a common multiple of the page sizes of all area files in the database. Common multiple values ensure that full-sized database pages are copied into the buffer pool during a single I/O. Common multiple buffer sizes can reduce the pool overflow rate as displayed by DBO/SHOW STATISTICS (PIO).
$DBO/MOD/LENGTH_BUFFER=n <Database>
$DEFINE/LOG DBM$BIND_BUFFERS n
Cautionary note: If there is insufficient memory to
support the number of buffers selected, the system will perform virtual
paging to the system page file. Virtual paging significantly impacts process
and system performance and should therefore be avoided. Rule of thumb:
When in doubt, create smaller buffers (/Length) and increase the number of
buffers (DBM$BIND_BUFFERS).
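As an illustration of the common multiple rule, the sketch below picks a buffer length that is a common multiple of some assumed storage area page sizes (2, 3 and 4 blocks) and estimates the resulting per-process pool size; the page sizes and buffer count are illustrative only.

    # Choose a buffer length (in blocks) that is a common multiple of the
    # storage area page sizes, then estimate per-process pool memory.
    from math import lcm

    area_page_sizes = [2, 3, 4]              # hypothetical page sizes, in blocks
    buffer_length = lcm(*area_page_sizes)    # 12 blocks: holds whole pages from every area
    number_of_buffers = 200                  # value given to DBM$BIND_BUFFERS

    pool_kbytes = buffer_length * 512 * number_of_buffers // 1024
    print(buffer_length, "blocks per buffer,", pool_kbytes, "Kbytes per process")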
Global Buffers. The global buffer pool is that memory which can be simultaneously accessed by multiple users on a single node. This facility is only available in DBMS versions 4.3a and above. Global buffers can vastly reduce disk I/O to the database by retaining previously read data from all database users in a common buffer pool. For databases where most transactions are READ, global buffers are effective in improving response time. Global buffers are not effective on databases which have a higher mix of READ WRITE (update) and READ transactions; frequent buffer locking and backing store flushes reduce shared buffer efficiency. Additional global pages and global sections may be required to implement Global Buffers.
Cautionary note: Memory-poor system configurations should NOT implement Global Buffers.
Rule of Thumb for Optimizing DBMS Global Buffers:
Step #1 (total global buffers):
Memory:                        128 Mbytes
Reserved for Global Buffers:   10 percent
Buffer size:                   .003 Mbytes (6 * 512 bytes)
Number of Buffers:             (128 * .10) / .003 = 12 / .003 = 4000
Step #2 (maximum buffers per user):
Number of Buffers from Step #1:  4000
Number of Users:                 50
Maximum Buffers Per User:        4000 / 50 = 80
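A minimal sketch of the two-step rule of thumb above; the 128 Mbyte memory size, 10 percent reservation, 6-block buffer size and 50 users are the figures from the example, and the truncation of 12.8 Mbytes to 12 Mbytes follows the arithmetic shown in Step #1.

    # Sketch of the global buffer rule of thumb (figures from the example above).
    memory_mbytes = 128
    reserved_fraction = 0.10                 # reserve 10 percent of memory for global buffers
    buffer_size_mbytes = 0.003               # 6 blocks * 512 bytes, roughly .003 Mbytes
    users = 50

    reserved_mbytes = int(memory_mbytes * reserved_fraction)    # 12 Mbytes, as in Step #1
    total_buffers = reserved_mbytes / buffer_size_mbytes         # about 4000 buffers
    buffers_per_user = total_buffers / users                     # about 80 buffers per user

    print(round(total_buffers), round(buffers_per_user))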
Tune buffers by increasing the SYSUAF DIOLM to be
greater than or equal to the number of database buffers in use by the process.
ASTLM should be 12 greater than DIOLM. ASTLM needs to be increased because more
locks may be required for additional buffers. PGFLQUOTA and BYTLM may need to
be increased to facilitate the increased virtual and physical memory
requirements.
Adjustable Lock Granularity. The Adjustable Lock Granularity (ALG) capability enables the DBCS to request as few locks as possible for the database. Depending on process contention for database records, the DBCS will adjust the level of locking from the greatest number of pages to the fewest until all selected pages or the single record can be locked. Lock granularity can be configured for the database using the command DBO/MOD/ADJUST=(level 1, level 2, level 3,...). Level 1 signifies the fewest number of pages for which a single lock could be promoted over a range of pages in the database. Level 2 covers (level 1 * level 2) pages, level 3 covers (level 1 * level 2 * level 3) pages, and so on. Adjustable locking significantly reduces the amount of locking required for databases where the workload is mostly reads. For multi-user databases, adjustable locking can severely impact database performance when there is a fair mix of read and read write transactions. Although there may be a greater number of locks with /NOADJUST, lock contention is minimized as only the accessed records are locked. The following figure describes the adjustable lock tree.
When ALG is configured with the default values of (10,10,10), the transaction initially requests a lock at the highest level (level 4). If there is no contention for the area (all locks are compatible), the DBCS promotes a single lock for the transaction. If another transaction with an incompatible lock request needs access to the same records, the lock for the first transaction is modified to a lock which is compatible with the new transaction and de-escalated to level 3. If another transaction requires access to a record in the range previously locked by the level 3 transaction, the level 3 lock is again modified to a lock compatible with the new transaction and de-escalated to level 2, and so on. When ALG is enabled, the above transaction may promote, demote or convert a minimum of 5 locks to access a single record. When ALG is disabled, locks are promoted, demoted and converted on individual records as needed.
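As a small illustration of the level arithmetic described above, the sketch below prints the number of pages a single lock covers at each level for the default (10,10,10) setting.

    # Pages covered by one lock at each adjustable-lock level for
    # DBO/MOD/ADJUST=(10,10,10), as described above.
    levels = (10, 10, 10)

    span = 1
    for depth, fanout in enumerate(levels, start=1):
        span *= fanout
        print(f"level {depth}: one lock covers {span} pages")
    # level 1: 10, level 2: 100, level 3: 1000 pages; the highest level
    # (level 4 in the example) covers the entire range for the area.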
To analyze a locking problem, use the DBO/Show Statistic command. Look at the number of Blocking AST messages. If this number is greater than 20-25% of the number of locks requested, the database may perform better using /NOADJUSTABLE locking.
If deadlocks occur (and they usually do if the database is heavily used for read and read write transactions) and BATCH RETRIEVAL is not used, you can lower the SYSGEN parameter DEADLOCK_WAIT to detect deadlock situations more quickly. The default is 10 seconds; it is a dynamic parameter and can be changed at any time on the active system. Lower the parameter to no less than the number of active CPUs in the cluster. For example, in a 4-node cluster the DEADLOCK_WAIT parameter can be set no lower than 4 seconds. For the same cluster, if a cluster member has two CPUs, the DEADLOCK_WAIT parameter could be set no lower than 5.
BATCH AND CONCURRENT RETRIEVAL. Significant transaction performance
gains have been observed when utilizing DEC DBMS Snapshots and BATCH RETRIEVAL
(BR) in place of CONCURRENT RETRIEVAL (CR). This section quantifies the effect
on system performance of DBMS Snapshots using metrics and observations.
A quick note about BR: Implementing BR may require significant
programming effort as CA-MANMAN is not written to fully utilize DBMS Snapshots
for all commands. Depending on your environment, heavy transaction demands may
force snapshot files to grow and require 10-15% more disk space.
A company running CA-MANMAN upgraded DEC DBMS from 4.2c
to 5.1-1. An undocumented feature was added to 5.1 whereby CR
transactions are treated by the DBCS as
READ/WRITE "CONCURRENT UPDATE" (CU) transactions. The
additional CPU, I/O and lock overhead significantly degraded cluster
performance. We identified the degradation to be attributed to RUJ creation,
root file contention [Transaction Sequence Number (TSN) and Run-Time Unit
Process Block (RTUPB)] and I/O to access the after image journal files (.AIJ).
It was evident that the slowness was caused by the additional system resources and overhead required by v5.1-1 on a CPU that was already moderately loaded.
To quantify our conclusions and identify corrective
action options, a program was written to establish a baseline metric for
general response times and identify overhead for various parts of the transaction. We felt
that by analyzing the transaction at various stages, we might be able to
optimize it. The test program ran against a single disk (JBOD), Striped and SSD
drives. It was configured for CONCURRENT and BATCH retrieval modes. Transaction activation (the elapsed time for a database bind "INVOKE" and a READY to complete) and rundown (the elapsed time for a ROLLBACK or a COMMIT to complete) took longer using CR than BR.
Table #2  Media Comparison Table

                         Batch      Concur     Concur       Concur
                         No RUJ     RUJ:SSD    RUJ:STRIPE   RUJ:RF72
READY
  elapsed (sec)          0.8        4          5            5
  CPU time (sec)         0.1        0.1        0.12         0.12
  Direct I/O (opns)      7          20         20           20
  Buffered I/O (opns)    7          20         22           19
ROLLBACK
  elapsed (sec)          0.1        0.5        0.5          0.4
  CPU time (sec)         <.01       <.01       <.01         <.01
The typical MANMAN list and report command may execute
between two and five transactions (READY statements). UDMS RPW performs one
transaction for each subschema that is referenced.
The impact of CR vs BR on the user doing the
transaction can be calculated by taking the above figures and multiplying by
the number of commands executed per day,
times the number of transactions per command. The impact on other users
is harder to calculate, but will be much greater. CA-MANMAN's MM$TIMER logs can
report the frequency at which commands are executed.
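To make the impact calculation concrete, here is a small sketch using the READY elapsed times from the media comparison table; the command count and transactions-per-command figures are hypothetical placeholders for values you would obtain from the MM$TIMER logs.

    # Per-user daily impact of CR versus BR READY times (table values above;
    # command counts are hypothetical).
    cr_ready_sec = 4.0                 # CONCURRENT retrieval, RUJ on SSD
    br_ready_sec = 0.8                 # BATCH retrieval, no RUJ
    commands_per_day = 200             # e.g. taken from CA-MANMAN MM$TIMER logs
    transactions_per_command = 3       # typical list/report: two to five READY statements

    extra_sec = (cr_ready_sec - br_ready_sec) * transactions_per_command * commands_per_day
    print(f"extra wait per user per day: {extra_sec / 60:.0f} minutes")   # about 32 minutes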
The time spent for one user to complete a CR READY (at
least three extra seconds) is also spent by other users who are waiting to
start their transactions, since they are also locked out of the database for
that duration. This and other delays were noted throughout the CR transaction.
At ROLLBACK and image rundown, the DBCS promotes a lock on the RTUPB in the
root file to delete the process header information, deletes the associated RUJ
and updates the AIJ. This overhead is compounded when the next user is free to proceed with his own activation or rundown, causing the others to wait further, a geometric delay. This activity occurs regardless of ready mode. This
can be observed in DBO/SHOW STATISTICS as RWROOT stall time. Details about this
statistic can be referenced in the "Locking (one stat field)" screen,
Blocking AST screen.
When users press CONTROL-C and abort any CA-MANMAN transaction, all other transactions in progress are frozen for the duration of the ROLLBACK (an extra 1/3 second over BR). This delay adds linearly per user. This statistic was observed using DBO/SHOW STATISTICS "Freeze stall time."
* The BR statistics were uniform regardless of RUJ
medium. This is the expected behavior, as BR creates no RUJ.
We further measured transaction delays associated with root file locking by observing the effects on the RTUPB and RWROOT lock statistics when creating RUJs on different media. We used the DBO/SHOW STATISTIC utility "Locking (one stat field)" screen for our test. We used the Blocking AST screen for this test since VMS locks are promoted and demoted via blocking ASTs. Focusing on the RWROOT and RTUPB lock statistics, we measured the relative delay attributed to locks on the root file. By dividing the "total count" column by the "average per transaction" column, we found that transactions spent 30% more time on Blocking ASTs when RUJs were placed on striped volume sets versus the solid state drive. Transactions spent 35% more time on Blocking ASTs when RUJs were created on RF72s versus the striped volume sets. We did not obtain Blocking AST statistics for BR transactions.
As a "corrective action" performance
enhancement, we defined DBM$RUJ such that all RUJs are created n one of two
solid state drives (1 per node). This significantly reduced delays in RUJ
creation for all transactions. For complex transactions, elapsed time for
activation and rundown was markedly reduced from a minute plus to just seconds.
There was also a reduction in the number of Blocking ASTs and the frequent
"Freeze lock" stall message caused by CONTROL-C rollbacks.
Tim Peer is the Senior Software Engineer of eNVy Systems, San Jose, CA. Voice: (408) 363-8896, Fax: (408) 363-8389, Internet: peert@envysys.com