TY - JOUR
T1 - Thrifty BTB
T2 - A comprehensive solution for dynamic power reduction in branch target buffers
AU - Kahn, Roger
AU - Weiss, Shlomo
PY - 2008/11
Y1 - 2008/11
N2 - We propose Thrifty BTB, a mechanism to reduce the dynamic power dissipated by the BTB. We studied two mechanisms that reduce dynamic power dissipation. The first is a serial-BTB configuration. The second is the filter-BTB, a low-power counting Bloom filter placed in front of a conventional BTB. We also studied the effect of placing a small 32-entry direct-mapped BTB, functioning as a bypass, in parallel with either of the first two mechanisms. The filter-BTB reduces the number of lookups relative to a conventional BTB, and with it the dynamic power dissipated. The serial-BTB variant accesses the data array of the BTB only upon a hit; therefore, for most accesses the power dissipated is only that of accessing the tag array. The bypass is used in parallel with either the filter-BTB or the serial-BTB and reduces the performance cost by providing a low-latency response in case of a hit. By integrating these mechanisms into a BTB design, we achieve an average reduction of 51% in the dynamic power dissipation of the BTB. These benefits come at a small performance cost that is on average slightly less than 1.2%. The energy-delay product was reduced by an average of 50%.
AB - We propose Thrifty BTB, a mechanism to reduce the dynamic power dissipated by the BTB. We studied two mechanisms that reduce dynamic power dissipation. The first is a serial-BTB configuration. The second is the filter-BTB, a low-power counting Bloom filter placed in front of a conventional BTB. We also studied the effect of placing a small 32-entry direct-mapped BTB, functioning as a bypass, in parallel with either of the first two mechanisms. The filter-BTB reduces the number of lookups relative to a conventional BTB, and with it the dynamic power dissipated. The serial-BTB variant accesses the data array of the BTB only upon a hit; therefore, for most accesses the power dissipated is only that of accessing the tag array. The bypass is used in parallel with either the filter-BTB or the serial-BTB and reduces the performance cost by providing a low-latency response in case of a hit. By integrating these mechanisms into a BTB design, we achieve an average reduction of 51% in the dynamic power dissipation of the BTB. These benefits come at a small performance cost that is on average slightly less than 1.2%. The energy-delay product was reduced by an average of 50%.
KW - Branch prediction
KW - Branch target buffer
KW - Dynamic power
KW - Microarchitecture
UR - http://www.scopus.com/inward/record.url?scp=54549086764&partnerID=8YFLogxK
U2 - 10.1016/j.micpro.2008.05.004
DO - 10.1016/j.micpro.2008.05.004
M3 - Article
AN - SCOPUS:54549086764
SN - 0141-9331
VL - 32
SP - 425
EP - 436
JO - Microprocessors and Microsystems
JF - Microprocessors and Microsystems
IS - 8
ER -