Open64 (mfef90, whirl2f, and IR tools)  TAG: version-openad; SVN changeset: 916
config_cache.h File Reference
This graph shows which files directly or indirectly include this file:

Go to the source code of this file.

Classes

struct  MHD_LEVEL
struct  MHD

Defines

#define MHD_MAX_LEVELS   4

Typedefs

typedef enum MHD_TYPE MHD_TYPE
typedef struct MHD_LEVEL MHD_LEVEL
typedef struct MHD MHD

Enumerations

enum  MHD_TYPE { MHD_TYPE_NONE = 222, MHD_TYPE_CACHE, MHD_TYPE_MEM }

Variables

MHD Mhd
MHD Mhd_Options

Define Documentation

#define MHD_MAX_LEVELS   4

Definition at line 392 of file config_cache.h.

Referenced by MHD::First(), MHD::Merge_Options(), MHD::Next(), and MHD::Print().


Typedef Documentation

typedef struct MHD MHD
typedef struct MHD_LEVEL MHD_LEVEL
typedef enum MHD_TYPE MHD_TYPE

Description:

Unique prefix

MHD_ (meaning "memory hierarchy description")

MHD_TYPE

MHD_TYPE_NONE (not modelling anything) MHD_TYPE_CACHE (cache: misses to further out cache or main memory) MHD_TYPE_MEM (main memory: misses to a disk)

MHD_LEVEL

The parameters that describe this level of the memory hierarchy.

MHD_TYPE Type

Are we modelling cache, main memory, or nothing

INT64 Size

Number of bytes the cache or main memory holds.

INT64 Effective_Size

How many bytes the memory hierarcy level can effectively hold

INT32 Line_Size

Number of bytes in a cache line or page.

INT32 Clean_Miss_Penalty

Processor cycles to next level of memory hierarcy to replace a clean line.

INT32 Dirty_Miss_Penalty

Processor cycles to next level of memory hierarcy to replace a dirty line.

INT32 Miss_Penalty

Processor cycles to next level of memory hierarcy to replace a clean or dirty line. Used only to set the distinct clean/dirty variables with a single option -- not to be used outside option processing.

INT32 Associativity

Only for MHD_CACHE: 1 for direct mapped, etc

INT32 TLB_Entries

-1 if no TLB; otherwise, the number of entries

INT32 Page_Size

How many bytes are mapped by one TLB entry

INT32 TLB_Clean_Miss_Penalty INT32 TLB_Dirty_Miss_Penalty INT32 TLB_Miss_Penalty

How many cycles a TLB miss costs. The third is used only during option processing to allow setting the first two with a single option -- not to be used outside option processing.

double Typical_Outstanding

How many loads may be outstanding at the same time typically. 1 means no overlap. If a cache's max is 4, then 3 might be a reasonable number to use here. Don't use anything less than 1.

double Load_Op_Overlap_1 double Load_Op_Overlap_2

How much of a load may be overlapped by machine ops. 0.9 is reasonable for a non-blocking cache, and 0.0 for a blocking cache. If these are different, then _1 is how much of the first miss cycle is overlapped, and _2 is how much of the last miss cycle that's overlappable is overlapped. Thus 1.0 >= _1 >= _2 >= 0. It's linear inbetween those. Thus if the cache miss cycles are half the estimated machine cycles m, then the actual cycles lost to cache miss would have to be (1/2 m) [(3/4)(1-_1) + (1/4)(1-_2)]

INT32 Pct_Excess_Writes_Nonhidable

If a dirty line costs more than a clean line, let's call the additional cost the excess. E.g. clean miss=10 cycles, dirty miss=18 cycles. Excess 8 cycles. If this parameter is 0, then all 8 additional cycles are treated just as the first 10: it can be hidden. Realistically, this is not the case. The processor has to wait longer, and might run out of work. Dirty misses also take up extra cache resource. If this value is greater than 0, the part that isn't treated has hidable is added directly to the cost (after division by Typical_Outstanding). A perfectly reasonable value is 100. Because of the extra resources used, one can even imagine a number >100. We actually allow up to 1000 in the options processing (for testing purposes), but that doesn't make sense.

BOOL Prefetch_Level

Prefetching desired at this memory level?

The following are used only for option processing, and should not be referenced elsewhere:

INT32 Miss_Penalty INT32 TLB_Miss_Penalty

Described above.

char * CS_string

Cache size in string form for option processing.

BOOL <various>_Set

The relevant option was set from the command line.

Several of the values are specified by the constructor, which from the cache size, line size, and associativity computes the effective cache size. Other field values are initialized to -1, NULL, or FALSE as appropriate. If any are inappropriately 0 or -1 after initialization, then the cache specification is invalid, and Valid() returns false.

MHD_LEVEL()

Initialize fields to -1, NULL, or FALSE.

MHD_LEVEL(MHD_TYPE, INT64 cs, INT32 ls, INT32 mp, INT32 assoc, INT32 tlb_entries, INT32 pagesz, INT32 tlb_mp)

Initialize fields and compute Effective_Cache_Size.

~MHD_LEVEL() void operator = (const MHD_LEVEL&); void Print(FILE* f) const;

destruct, assign, print

void Merge_Options(const MHD_LEVEL& o)

Alter specification by merging in defined values from o.

BOOL Valid() const;

Returns true if and only if all cache fields >= 1 and type not MHD_TYPE_NONE.

BOOL TLB_Valid() const;

TLB_Entries, Page_Size and TLB_Miss_Penalty all >=1 and Valid()

MHD_MAX_LEVELS

The maximum number of memory hierarcy levels we will model. Currently, 4. If this changes, change lnodriver.c as well.

MHD

The memory hierarcy description, which contains MHD_LEVEL for each level of the memory hierarcy to be modeled by the compiler, and other system information useful for cache modelling.

BOOL Non_Blocking_Loads

TRUE if processor continues after a cache miss.

INT32 Loop_Overhead_Base INT32 Loop_Overhead_Memref

The loop overhead, in processor cycles, is Loop_Overhead_Base + memrefs*Loop_Overhead_Memref where memrefs is the number of non-cse'd memrefs in the inner loop. For example, do i a(i,j), a(i,j), a(i+1,j) all are cse'd into one memref in the inner loop, but do i a(i,j) a(i,j+1) have memref=2. The base is lower for T5 than for others because pipeline startup/winddown should be less when the fp pipelines are shorter. Also, multiple int/cycle. The memref is the cost to load an address into a register.

INT TLB_Trustworthiness

0 to ignore tlb, 100 to fully trust it. Even if it's fully trusted, when there is no blocking and a low cache miss rate, we still reduce the TLB penalty if TLB_NoBlocking_Model is set.

BOOL TLB_NoBlocking_Model

If we are not blocking and there is a low cache miss rate, should we back off on the TLB penalty? If true it favors not blocking. It's there because the TLB model is inaccurate and especially can overestimate the TLB miss rate when we are not blocking.

MHD_LEVEL L[MHD_MAX_LEVELS]

E.g. L[0] might be the primary cache.

INT First()

The first valid level. Returns -1 if none.

INT Next(INT l)

The valid level after l. Returns -1 when done.

void Merge_Options(const MHD& o);

Alter specification by merging in defined values from o.

void Initialize()

Use information about the target to set default values for these cache parameters.

void Print(FILE*) const MHD() ~MHD()

Exported Variables:

MHD Mhd;

The memory hierarchy description of we are compiling for

MHD Mhd_Options;

User specified options


Enumeration Type Documentation

enum MHD_TYPE

Description:

Unique prefix

MHD_ (meaning "memory hierarchy description")

MHD_TYPE

MHD_TYPE_NONE (not modelling anything) MHD_TYPE_CACHE (cache: misses to further out cache or main memory) MHD_TYPE_MEM (main memory: misses to a disk)

MHD_LEVEL

The parameters that describe this level of the memory hierarchy.

MHD_TYPE Type

Are we modelling cache, main memory, or nothing

INT64 Size

Number of bytes the cache or main memory holds.

INT64 Effective_Size

How many bytes the memory hierarcy level can effectively hold

INT32 Line_Size

Number of bytes in a cache line or page.

INT32 Clean_Miss_Penalty

Processor cycles to next level of memory hierarcy to replace a clean line.

INT32 Dirty_Miss_Penalty

Processor cycles to next level of memory hierarcy to replace a dirty line.

INT32 Miss_Penalty

Processor cycles to next level of memory hierarcy to replace a clean or dirty line. Used only to set the distinct clean/dirty variables with a single option -- not to be used outside option processing.

INT32 Associativity

Only for MHD_CACHE: 1 for direct mapped, etc

INT32 TLB_Entries

-1 if no TLB; otherwise, the number of entries

INT32 Page_Size

How many bytes are mapped by one TLB entry

INT32 TLB_Clean_Miss_Penalty INT32 TLB_Dirty_Miss_Penalty INT32 TLB_Miss_Penalty

How many cycles a TLB miss costs. The third is used only during option processing to allow setting the first two with a single option -- not to be used outside option processing.

double Typical_Outstanding

How many loads may be outstanding at the same time typically. 1 means no overlap. If a cache's max is 4, then 3 might be a reasonable number to use here. Don't use anything less than 1.

double Load_Op_Overlap_1 double Load_Op_Overlap_2

How much of a load may be overlapped by machine ops. 0.9 is reasonable for a non-blocking cache, and 0.0 for a blocking cache. If these are different, then _1 is how much of the first miss cycle is overlapped, and _2 is how much of the last miss cycle that's overlappable is overlapped. Thus 1.0 >= _1 >= _2 >= 0. It's linear inbetween those. Thus if the cache miss cycles are half the estimated machine cycles m, then the actual cycles lost to cache miss would have to be (1/2 m) [(3/4)(1-_1) + (1/4)(1-_2)]

INT32 Pct_Excess_Writes_Nonhidable

If a dirty line costs more than a clean line, let's call the additional cost the excess. E.g. clean miss=10 cycles, dirty miss=18 cycles. Excess 8 cycles. If this parameter is 0, then all 8 additional cycles are treated just as the first 10: it can be hidden. Realistically, this is not the case. The processor has to wait longer, and might run out of work. Dirty misses also take up extra cache resource. If this value is greater than 0, the part that isn't treated has hidable is added directly to the cost (after division by Typical_Outstanding). A perfectly reasonable value is 100. Because of the extra resources used, one can even imagine a number >100. We actually allow up to 1000 in the options processing (for testing purposes), but that doesn't make sense.

BOOL Prefetch_Level

Prefetching desired at this memory level?

The following are used only for option processing, and should not be referenced elsewhere:

INT32 Miss_Penalty INT32 TLB_Miss_Penalty

Described above.

char * CS_string

Cache size in string form for option processing.

BOOL <various>_Set

The relevant option was set from the command line.

Several of the values are specified by the constructor, which from the cache size, line size, and associativity computes the effective cache size. Other field values are initialized to -1, NULL, or FALSE as appropriate. If any are inappropriately 0 or -1 after initialization, then the cache specification is invalid, and Valid() returns false.

MHD_LEVEL()

Initialize fields to -1, NULL, or FALSE.

MHD_LEVEL(MHD_TYPE, INT64 cs, INT32 ls, INT32 mp, INT32 assoc, INT32 tlb_entries, INT32 pagesz, INT32 tlb_mp)

Initialize fields and compute Effective_Cache_Size.

~MHD_LEVEL() void operator = (const MHD_LEVEL&); void Print(FILE* f) const;

destruct, assign, print

void Merge_Options(const MHD_LEVEL& o)

Alter specification by merging in defined values from o.

BOOL Valid() const;

Returns true if and only if all cache fields >= 1 and type not MHD_TYPE_NONE.

BOOL TLB_Valid() const;

TLB_Entries, Page_Size and TLB_Miss_Penalty all >=1 and Valid()

MHD_MAX_LEVELS

The maximum number of memory hierarcy levels we will model. Currently, 4. If this changes, change lnodriver.c as well.

MHD

The memory hierarcy description, which contains MHD_LEVEL for each level of the memory hierarcy to be modeled by the compiler, and other system information useful for cache modelling.

BOOL Non_Blocking_Loads

TRUE if processor continues after a cache miss.

INT32 Loop_Overhead_Base INT32 Loop_Overhead_Memref

The loop overhead, in processor cycles, is Loop_Overhead_Base + memrefs*Loop_Overhead_Memref where memrefs is the number of non-cse'd memrefs in the inner loop. For example, do i a(i,j), a(i,j), a(i+1,j) all are cse'd into one memref in the inner loop, but do i a(i,j) a(i,j+1) have memref=2. The base is lower for T5 than for others because pipeline startup/winddown should be less when the fp pipelines are shorter. Also, multiple int/cycle. The memref is the cost to load an address into a register.

INT TLB_Trustworthiness

0 to ignore tlb, 100 to fully trust it. Even if it's fully trusted, when there is no blocking and a low cache miss rate, we still reduce the TLB penalty if TLB_NoBlocking_Model is set.

BOOL TLB_NoBlocking_Model

If we are not blocking and there is a low cache miss rate, should we back off on the TLB penalty? If true it favors not blocking. It's there because the TLB model is inaccurate and especially can overestimate the TLB miss rate when we are not blocking.

MHD_LEVEL L[MHD_MAX_LEVELS]

E.g. L[0] might be the primary cache.

INT First()

The first valid level. Returns -1 if none.

INT Next(INT l)

The valid level after l. Returns -1 when done.

void Merge_Options(const MHD& o);

Alter specification by merging in defined values from o.

void Initialize()

Use information about the target to set default values for these cache parameters.

void Print(FILE*) const MHD() ~MHD()

Exported Variables:

MHD Mhd;

The memory hierarchy description of we are compiling for

MHD Mhd_Options;

User specified options

Enumerator:
MHD_TYPE_NONE 
MHD_TYPE_CACHE 
MHD_TYPE_MEM 

Definition at line 317 of file config_cache.h.


Variable Documentation

Definition at line 250 of file config_cache.cxx.

Definition at line 251 of file config_cache.cxx.

Referenced by LNO_Configure(), LNO_Init_Config(), and LNO_Push_Config().

 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Defines