Friday Fine-Tune: Memory and disk behavior
January 27, 2017
By Jeff Kubler
Along with the relationship between your CPU measurements and overall performance, memory and disk make up the other two components of your HP 3000 performance picture. Main memory is the scratch pad for all the work that the CPU performs. Every item of data that the CPU needs to perform calculations on or updating to must be brought into main memory.
The CPU must manage memory. It must cycle through the memory pages, marking some as Overlay Candidates (this means that new data from disk may be placed here), noting that some are in continued use, and swapping others out to virtual or what is called transient storage. Swapping to disk occurs when data is in continued use but a higher priority process needs room for its data.
To accommodate this higher priority process and its need for memory space, the Memory Manager will swap the memory for the lower priority process out to disk. The more activity the Memory Manager performs, the more CPU it takes to do this. Therefore it is the percentage of CPU used to manage memory that we use as a measurement.
A Page Fault occurs each time a memory object is not found in memory. The threshold for the number of Page Faults per second that can be incurred before a memory problem is indicated varies with the size and the power of the CPU. Larger machines can handle more Page Faults per second while a smaller box will encounter problems with far fewer. We have found that the number of Page Faults per second a system can endure without problem rises with the relative performance rating of the machine.
An exceptional number of Page Faults should never be used as the sole indicator of memory problems but when observed should be tested with the memory manager percentage. If both agree, you have a memory shortage. There are some strange things that I have observed with Page Faults, so it does not stand alone as an indicator of memory shortage.
The number of Page Faults per second and the amount of CPU needed to manage Memory are always evaluated in conjunction with each other. That is to say the high Page Fault Rate will not be considered a problem if the Memory Manager Percentage is not above 4 percent.
The Disk Environment is referred to as Secondary Storage. This is where all the data needed for system use is stored. Since Main Memory is not large enough to store all of the data that will be needed by all the processes, there must be a location for this larger pool of data. Even though the Disk Environment does not have the significance it once had, this area can still be a bottleneck. As the CPU speeds increase, bottlenecks become more significant.
Several different factors can affect the Disk Environment. One of these is data locality. Data locality pertains to two different types. There is data locality within Image datasets and data locality across the disk itself.
Data locality across disk: This refers to the location of separate pieces of files (called extents). When files are placed on the disk, they can be placed in contiguous sectors or sections of files, or they can be placed in non-contiguous locations or even on many different disks. When files are not in contiguous locations they are said to be fragmented. The advantage of contiguous location is that greater efficiencies are allowed in retrieving data. When files need to be read, the head movement of the disk drive is minimal if files are in contiguous locations. The head moves to the location and the retrieval begins.
As the disk fills up the system cannot find one contiguous location to build any new file. Therefore, the system breaks the file up into extents and places the file wherever it can. A system reload will put files back into contiguous location (usually back on the location of the files file label) or products such as Lund Performance Solutions De-Frag/X can be used to put the files back into contiguous location.
Operating systems allocate disk space in chunks as they create and expand files and transient disk space (swap areas, etc.). When files are purged, these chunks are released for reuse. Over time the disc space may end up fragmented into many small pieces, which can slow the performance and the reliability of the system.
To observe and correct MPE fragmentation, you can use the De-Frag/X product from Lund Performance Software or use the Contigvol command of MPE/iX's Volutil program. The latter creates contiguous free disk space on a volume. Contigvol work about as well as VINIT CONDense did -- that is, it's stable and reliable, but requires multiple passes to get the best results.
Data locality within IMAGE datasets is the other area of major concern. There there are two different types of datasets to be concerned with, detail datasets and automatic or master sets.
The Detail Datasets
This type of set holds the day to day data input. Detail sets begin with nothing in them. When records are added 1 is added to something called the high-water-mark, a number that tells how many records have been in the set, and the record is placed in the set.
The problem is that IMAGE automatically reuses space that is given up when a record is deleted. This space is often called the delete chain. New records are placed in the most recent location available on the "delete chain." This means that new records are not in the same physical locality as the rest of the records and may be far removed from the other records.
The ideal state for a detail database is one where the detail entries are sorted by the key field. This allows the data to be retrieved in the smallest amount of IOs making efficient use of the MPE systems pre-fetching of data. When this is not the case we can measure the dataset lack of efficiency with something called the Elongation factor. This is simply a measure of how many more IOs the user must perform to retrieve desired data.
The Master Datasets
These have unique identifiers (field names). There are two types of master sets, a manual master and an automatic master set. Manual masters have user-entered master entries while automatic masters have automatic entries placed in them to accommodate access to detail records. The issue of importance to performance here is something called the hashing algorithm. This is the method used by the database to calculate the location of the next record placed in the database. The intent is to cause the master set to be as equally distributed as possible.
The hashing algorithm uses the size of the set in its calculation. A poor size or a size that is not large enough will result in an unequally distributed database. A poor size is most easily described as one that does not consist of a prime number. This means that when the hashing algorithm calculates a location there is a higher potential that a record will already exist in that location. When this happens a secondary position must be calculated. When secondaries are placed in another block within the database, another IO must occur to retrieve needed data. Since IO to disk is the slowest type of access, we want to avoid this at all costs.