Page 48 - The results presented here are derived from the analysis of seven benchmark jobs written at SLAC. Except for one (LINSY2) they were all production jobs written for purposes other than performance evaluation. To avoid biasing the results with artifacts from specific languages or programs, we purposely chose the three most used language compilers and programs compiled by them. (1) FORTC is a compilation by the IBM Fortran-H optimizing compiler. (2) FORTGO is the execution of the FORTRAN program compiled...
Page 49 - Model validation. Verification basically consists of comparing the time predicted by our model for each benchmark job with the corrected real execution time. The time predicted for each benchmark, Tpred, consists of the following terms: Tins, the total time predicted from the timing formulas, which does not include the cache miss penalty. M * Tmiss, where M is the number of cache misses as reported by the cache simulator, and Tmiss is the cache miss penalty. The number of cache misses includes the...
Page 36 - ... of non-linear formulas are sufficiently infrequent to justify this special treatment, but the effect on timing values is too important to ignore them. A simpler approach would assume that the product of the averages is a sufficient estimate of the average product, but the potential error is great. The formulas are encoded as a string of records, each corresponding to the coefficient of a term in a subcase of a timing formula for a particular instruction; there are a total of 3200 variable names...
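The gap between the average product and the product of the averages can be illustrated with a minimal sketch. The operand-length pairs below are hypothetical and are not taken from the benchmark data; the function names are likewise illustrative.

```python
from statistics import mean

# Hypothetical (length1, length2) pairs for a decimal instruction whose
# duration depends on the product of its operand lengths. Not source data.
LENGTH_PAIRS = [(1, 1), (2, 2), (15, 15), (16, 16)]

def average_product(pairs):
    """Average of the per-execution products (the quantity the timing needs)."""
    return mean(a * b for a, b in pairs)

def product_of_averages(pairs):
    """Simpler estimate: multiply the two average lengths."""
    return mean(a for a, _ in pairs) * mean(b for _, b in pairs)

if __name__ == "__main__":
    print("average product:    ", average_product(LENGTH_PAIRS))
    print("product of averages:", product_of_averages(LENGTH_PAIRS))
```

With lengths that vary together, as in this made-up sample, the simpler estimate falls well short of the true average product, which is the kind of error the excerpt warns about.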
Page 50 - ... and Tmiss is the cache miss penalty. The number of cache misses includes the effect of SVC execution on the cache contents. Tcross, the time penalty (for Amdahl only) paid when references to the cache cross a line boundary. The penalty is two cycles (.065 usec) for reads and three cycles (.0975 usec) for writes, and is computed using numbers provided by the cache simulator. Virtually all the penalty arises from instruction fetch, since none of the programs access unaligned data. There is no equivalent...
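The terms named so far can be combined in a short sketch. It covers only Tins, M * Tmiss, and Tcross, because the excerpt is cut off before the list of terms ends, so the sum is partial. The cycle time of .0325 usec is inferred from "two cycles (.065 usec)"; the function names and any crossing counts passed in are assumptions.

```python
# Amdahl cycle time implied by "two cycles (.065 usec)".
CYCLE_USEC = 0.0325
READ_CROSS_CYCLES = 2   # penalty for a read that crosses a cache line boundary
WRITE_CROSS_CYCLES = 3  # penalty for a write that crosses a cache line boundary

def t_cross(read_crossings, write_crossings):
    """Line-crossing penalty in usec (Amdahl only, per the excerpt)."""
    cycles = (read_crossings * READ_CROSS_CYCLES
              + write_crossings * WRITE_CROSS_CYCLES)
    return cycles * CYCLE_USEC

def t_pred_partial(t_ins, misses, t_miss, t_cross_usec=0.0):
    """Partial predicted time: Tins + M * Tmiss + Tcross, all in usec.

    Any further terms listed after the truncation point are not included.
    """
    return t_ins + misses * t_miss + t_cross_usec
```

For an IBM run, `t_cross_usec` would stay at its default of zero, since the excerpt describes the crossing penalty as applying to Amdahl only.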
Page 34 - ... register-to-register arithmetic or logical instructions.
    ADD REGISTER (AR)     IBM: .080 usec            Amdahl: .065 usec
Many formulas have a simple linear dependency on execution variables. An example is a Load Multiple (LM) instruction which can be expressed as
    Load Multiple (LM)    IBM: .520+.080*R usec     Amdahl: .065+.065*R usec
where R is the number of registers loaded. Some formulas may involve variables which are concerned with the general environment of the instruction. These are often measures of the effect...
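The two formulas quoted above can be evaluated with a small sketch. The coefficients come from the excerpt; the table layout and the function name are illustrative assumptions, not the report's own encoding.

```python
# (constant usec, usec per register) for each opcode and machine.
# Coefficients are from the excerpt; the structure is an assumption.
TIMING_USEC = {
    "AR": {"IBM": (0.080, 0.0), "Amdahl": (0.065, 0.0)},
    "LM": {"IBM": (0.520, 0.080), "Amdahl": (0.065, 0.065)},
}

def instruction_time(opcode, machine, r=0):
    """Execution time in usec; r is the number of registers loaded (LM only)."""
    constant, per_register = TIMING_USEC[opcode][machine]
    return constant + per_register * r

if __name__ == "__main__":
    for r in (1, 4, 16):
        ibm = instruction_time("LM", "IBM", r)
        amdahl = instruction_time("LM", "Amdahl", r)
        print(f"LM R={r}: IBM {ibm:.3f} usec, Amdahl {amdahl:.3f} usec")
```

A constant-time instruction such as AR is just the special case with a zero per-register coefficient, which keeps both kinds of formula in one table.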
Page 84 - For FORTC, for example, the 6% overlapped MVCs account for 52% of the MVC time. Table 15 is the distribution of operand length for the MVC instruction in FORTC. It is representative of the other distributions in the presence of large peaks for small values, and an overall average of 10.06 bytes. Since the startup overhead for these instructions is large, there is almost always a less expensive way to do the equivalent operation for a small number of bytes. For one byte, an IC/STC combination takes less...
Page 30 - ... the time penalty for the misses is too large to be neglected. If the miss ratio is 5%, with a 480 nsec penalty for a miss, 2 memory requests per instruction, and an average instruction execution time of 300 nsec (reasonable values for the 370/168) then the time for the cache misses represents 16% of the execution time. Two other cache organization features must be considered in the cache penalty correction. For IBM, stores always access main memory ("store-through"), which may cause extra delays....
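The 16% figure can be checked with a few lines of arithmetic. The sketch assumes the miss time is expressed as a fraction of the 300 nsec average instruction time, which is the reading that reproduces the quoted value; the function names are illustrative.

```python
def miss_time_per_instruction(miss_ratio, penalty_nsec, requests_per_instruction):
    """Average cache-miss time charged to one instruction, in nsec."""
    return miss_ratio * requests_per_instruction * penalty_nsec

def miss_fraction(miss_ratio, penalty_nsec, requests_per_instruction,
                  avg_instruction_nsec):
    """Miss time as a fraction of the average instruction time (assumed reading)."""
    per_instruction = miss_time_per_instruction(
        miss_ratio, penalty_nsec, requests_per_instruction)
    return per_instruction / avg_instruction_nsec

if __name__ == "__main__":
    # Values quoted in the excerpt for the 370/168.
    print(miss_time_per_instruction(0.05, 480, 2))   # nsec per instruction
    print(miss_fraction(0.05, 480, 2, 300))          # fraction of execution time
```

Under this reading each instruction carries about 48 nsec of miss time, and 48 / 300 gives the 16% in the text. If the 300 nsec were instead taken to exclude misses, the share would be 48 / 348, or about 14%.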
Page 82 - ... among the most frequent, they contribute much more to the CPU time than their frequency would suggest because of their long execution time. For the FORTGO program, for example, the 0.67% of instructions which are STM account for 6.66% of the IBM execution time and 4.59% of the Amdahl execution time. Character Instructions. The second group of storage-to-storage (SS) instructions are those which specify a source and destination location for a character string and a single length for both operands...
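The STM figures imply how much more expensive an STM is than the average instruction in FORTGO. A minimal sketch of that ratio follows; the function name is an assumption, and the inputs are the percentages quoted above.

```python
def relative_cost(time_share_pct, frequency_pct):
    """Cost of one instruction type relative to the average instruction.

    A type that makes up frequency_pct of executed instructions and
    time_share_pct of execution time costs this many average instructions.
    """
    return time_share_pct / frequency_pct

if __name__ == "__main__":
    # STM in FORTGO: 0.67% of instructions, per the excerpt.
    print("IBM:   ", round(relative_cost(6.66, 0.67), 2))
    print("Amdahl:", round(relative_cost(4.59, 0.67), 2))
```

On these numbers an STM costs roughly ten average instructions on the IBM machine and roughly seven on the Amdahl, which is why its time share so far exceeds its frequency.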
Page 35 - ... linear in their variables. Typical examples are the decimal arithmetic instructions, where the duration depends on the product of the lengths or the average value of the digits used. For these we compute the appropriate products of variables at the time the program is analyzed, and average these values for use by the other programs in an equivalent linear form. These cases of non-linear formulas are sufficiently infrequent to justify this special treatment, but the effect on timing values is...