Nomad: Non-Exclusive Memory Tiering via Transactional Page Migration
Highlights
-
Note that this differs from inclusive tiering, which strictly uses the performance tier as a cache for the capacity tier. Non-exclusive memory tiering allows a subset of pages on the performance tier to have shadow copies on the capacity tier.
-
This paper was well put together and offers a distinctive way of dealing with tiered memory. I found it difficult to identify any major flaws. The authors were forthcoming about which workloads NOMAD works best on and which workloads it may actually hurt, which I believe is important for the paper's credibility. Lastly, I'm surprised that non-exclusive tiered page management hasn't been explored more. I would have given this paper an accept.
Summary
Tiered memory systems that use CXL memory, persistent memory, and storage-class memory are present in modern computing; however, they traditionally follow an exclusive approach: a page may reside in fast or slow memory, but not both. This can degrade performance when fast memory is under pressure. In response, the authors propose a non-exclusive memory tiering strategy that features transactional page migration and page shadowing. This has the added benefit of moving page migration off the critical path, making it asynchronous.
Key Contributions
-
The paper proposes transactional page migration (TPM), which, in contrast to existing systems, keeps a page accessible while it is being migrated. After copying, TPM checks whether the page was dirtied in the meantime. If it was, the copy is discarded and the migration is retried later. If it was not, the new page is mapped in the page table and the old page is unmapped, becoming a shadow copy of the new page.
-
NOMAD also has safeguards to prevent out-of-memory errors due to page shadowing. Under pressure, NOMAD prioritizes reclamation of shadow pages before evicting ordinary pages. Page shadowing exists to enable fast demotion without the overhead of copying.
-
NOMAD works asynchronously to minimize the time pages are inaccessible, but this only works in certain cases: NOMAD aborts a migration if the page is dirtied during copying, and such aborted migrations may be a significant source of overhead on certain workloads. Further, it is disabled for memory mapped by multiple processes.
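To check my own understanding of the contributions above, here is a toy Python sketch of the promote/abort/shadow flow. It is only a mental model, not the authors' kernel implementation: the names `TieredMemory`, `promote`, `demote`, `reclaim_one`, and the `during_copy` hook are all hypothetical, and I am assuming that a write to a promoted page invalidates its shadow copy so that the shadow never goes stale.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Page:
    data: bytearray
    dirty: bool = False


class TieredMemory:
    """Toy model of non-exclusive tiering (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.fast: Dict[int, Page] = {}    # pages mapped on the performance tier
        self.slow: Dict[int, Page] = {}    # pages mapped on the capacity tier
        self.shadow: Dict[int, Page] = {}  # unmapped copies left on the capacity tier

    def write(self, pid: int, offset: int, value: int) -> None:
        if pid in self.fast:
            page = self.fast[pid]
            # Assumption: writing to a promoted page drops its shadow copy.
            self.shadow.pop(pid, None)
        else:
            page = self.slow[pid]
        page.data[offset] = value
        page.dirty = True

    def promote(self, pid: int,
                during_copy: Optional[Callable[[], None]] = None) -> bool:
        """Transactional promotion: copy first, commit only if still clean."""
        src = self.slow[pid]
        src.dirty = False                  # start of the "transaction"
        copy = Page(bytearray(src.data))   # page stays mapped and accessible
        if during_copy is not None:
            during_copy()                  # simulates accesses racing with the copy
        if src.dirty:
            return False                   # abort: discard the copy, retry later
        self.fast[pid] = copy              # commit: map the new page
        self.shadow[pid] = self.slow.pop(pid)  # old page becomes the shadow
        return True

    def demote(self, pid: int) -> str:
        """Demotion is a remap if a shadow exists, otherwise a full copy."""
        if pid in self.shadow:
            self.slow[pid] = self.shadow.pop(pid)
            del self.fast[pid]
            return "remap"
        self.slow[pid] = Page(bytearray(self.fast.pop(pid).data))
        return "copy"

    def reclaim_one(self) -> Optional[Tuple[str, int]]:
        """Under pressure, free a shadow page before evicting an ordinary page."""
        if self.shadow:
            pid = next(iter(self.shadow))
            del self.shadow[pid]
            return ("shadow", pid)
        if self.slow:
            pid = next(iter(self.slow))
            del self.slow[pid]
            return ("ordinary", pid)
        return None
```

In this model, a clean `promote` leaves the page in both `fast` and `shadow`, so a later `demote` needs no copy. Passing a `during_copy` callback that writes to the page makes `promote` return `False`, which is the aborted-migration case the authors flag as a potential source of overhead.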
Strengths
-
The authors were honest and forthcoming in the main body of the paper about situations where NOMAD's method is detrimental and can lead to sub-optimal performance (particularly when the WSS was comparable to or exceeded the performance tier capacity).
-
Evaluating on three real-world applications strengthened the paper's credibility compared with relying only on micro-benchmarks against TPP. The real-world applications revealed some non-obvious insights too.
Weaknesses / Questions
-
Minor: I'm familiar with CXL devices from interfacing with a DMA-capable NIC/accelerator. It's hard to see how their page migration strategy would work in that setting, especially for in-progress DMA transfers.
-
Question: Is this orthogonal to cached pages in the Linux kernel?
-
Question: How does this perform when used with huge pages?
Related Work
-
Tiered Memory Systems
-
Page Management