TY - GEN
T1 - Utilizing the IOMMU Scalably
AU - Peleg, Omer
AU - Morrison, Adam
AU - Serebrin, Benjamin
AU - Tsafrir, Dan
N1 - Publisher Copyright:
© 2015 USENIX Annual Technical Conference.
PY - 2015
Y1 - 2015
N2 - IOMMUs provided by modern hardware allow the OS to enforce memory protection controls on the DMA operations of its I/O devices. An IOMMU translation management design must scalably handle frequent concurrent updates of IOMMU translations made by multiple cores, which occur in high throughput I/O workloads such as multi-Gb/s networking. Today, however, OSes experience performance meltdowns when using the IOMMU in such workloads. This paper explores scalable IOMMU management designs and addresses the two main bottlenecks we find in current OSes: (1) assignment of I/O virtual addresses (IOVAs), and (2) management of the IOMMU's TLB. We propose three approaches for scalable IOVA assignment: (1) dynamic identity mappings, which eschew IOVA allocation altogether, (2) allocating IOVAs using the kernel's kmalloc, and (3) per-core caching of IOVAs allocated by a globally-locked IOVA allocator. We further describe a scalable IOMMU TLB management scheme that is compatible with all these approaches. Evaluation of our designs under Linux shows that (1) they achieve 88.5%-100% of the performance obtained without an IOMMU, (2) they achieve similar latency to that obtained without an IOMMU, (3) scalable IOVA allocation and dynamic identity mappings perform comparably, and (4) kmalloc provides a simple solution with high performance, but can suffer from unbounded page table blowup.
AB - IOMMUs provided by modern hardware allow the OS to enforce memory protection controls on the DMA operations of its I/O devices. An IOMMU translation management design must scalably handle frequent concurrent updates of IOMMU translations made by multiple cores, which occur in high throughput I/O workloads such as multi-Gb/s networking. Today, however, OSes experience performance meltdowns when using the IOMMU in such workloads. This paper explores scalable IOMMU management designs and addresses the two main bottlenecks we find in current OSes: (1) assignment of I/O virtual addresses (IOVAs), and (2) management of the IOMMU's TLB. We propose three approaches for scalable IOVA assignment: (1) dynamic identity mappings, which eschew IOVA allocation altogether, (2) allocating IOVAs using the kernel's kmalloc, and (3) per-core caching of IOVAs allocated by a globally-locked IOVA allocator. We further describe a scalable IOMMU TLB management scheme that is compatible with all these approaches. Evaluation of our designs under Linux shows that (1) they achieve 88.5%-100% of the performance obtained without an IOMMU, (2) they achieve similar latency to that obtained without an IOMMU, (3) scalable IOVA allocation and dynamic identity mappings perform comparably, and (4) kmalloc provides a simple solution with high performance, but can suffer from unbounded page table blowup.
UR - http://www.scopus.com/inward/record.url?scp=84975223139&partnerID=8YFLogxK
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:84975223139
T3 - Proceedings of the 2015 USENIX Annual Technical Conference, USENIX ATC 2015
SP - 549
EP - 562
BT - Proceedings of the 2015 USENIX Annual Technical Conference, USENIX ATC 2015
PB - USENIX Association
T2 - 2015 USENIX Annual Technical Conference, USENIX ATC 2015
Y2 - 8 July 2015 through 10 July 2015
ER -