TY - JOUR
T1 - Branch target buffer design for embedded processors
AU - Levison, Nadav
AU - Weiss, Shlomo
PY - 2010/10
Y1 - 2010/10
N2 - The demand for embedded application processors that support multi-tasking operating systems and can execute complex applications brings them closer to general purpose processors. These strong processors have a limited power source because they are usually found in portable devices such as smartphones and other PDAs, and are powered by batteries. The Branch Target Buffer (BTB), which is commonly used in general purpose processors, is becoming prevalent in high-end embedded processors in order to support long pipelines and mitigate high miss penalties. However, the BTB is a major power consumer because it is a large SRAM structure that is accessed almost every cycle. We propose two BTB designs that fit the tight power budgets of embedded processors. In the first design, the power consumption of a single BTB access is reduced by reading only the lower part of the predicted target address bits. This design has dynamic power savings of up to 25%, with effectively no performance degradation. In the second design, we avoid redundant BTB accesses to the same set by using a small buffer that holds the most recently accessed set. This design results in 75% dynamic power savings at the cost of up to 0.64% system slowdown in a 2-way BTB, and 80% dynamic power savings at the cost of up to 0.58% system slowdown in a 4-way BTB.
AB - The demand for embedded application processors that support multi-tasking operating systems and can execute complex applications brings them closer to general purpose processors. These strong processors have a limited power source because they are usually found in portable devices such as smartphones and other PDAs, and are powered by batteries. The Branch Target Buffer (BTB), which is commonly used in general purpose processors, is becoming prevalent in high-end embedded processors in order to support long pipelines and mitigate high miss penalties. However, the BTB is a major power consumer because it is a large SRAM structure that is accessed almost every cycle. We propose two BTB designs that fit the tight power budgets of embedded processors. In the first design, the power consumption of a single BTB access is reduced by reading only the lower part of the predicted target address bits. This design has dynamic power savings of up to 25%, with effectively no performance degradation. In the second design, we avoid redundant BTB accesses to the same set by using a small buffer that holds the most recently accessed set. This design results in 75% dynamic power savings at the cost of up to 0.64% system slowdown in a 2-way BTB, and 80% dynamic power savings at the cost of up to 0.58% system slowdown in a 4-way BTB.
KW - Branch address prediction
KW - Branch target buffer
KW - Embedded processors
KW - Portable devices
KW - Power consumption
UR - http://www.scopus.com/inward/record.url?scp=77954196777&partnerID=8YFLogxK
U2 - 10.1016/j.micpro.2010.04.005
DO - 10.1016/j.micpro.2010.04.005
M3 - Article
AN - SCOPUS:77954196777
SN - 0141-9331
VL - 34
SP - 215
EP - 227
JO - Microprocessors and Microsystems
JF - Microprocessors and Microsystems
IS - 6
ER -