For security reasons, the operating system must zero-fill memory pages before passing them from one application to another. Similarly, operating systems often have to copy data between buffers. For both operations, the cost of first reading the data that is about to be overwritten can in many cases dominate performance.
NUMAchine minimizes the overhead of zeroing or copying data by allowing these operations to be performed without loading the data that will be overwritten into the processor cache. To copy data from a source page to a target page, the operating system: (1) makes a single request to the affected memory module to invalidate any cached lines of the target page, mark their state as dirty, and set the routing mask (or processor mask) to the processor performing the copy, (2) creates the cache lines of the target page in the secondary cache by modifying the tag and state information of the secondary cache, and (3) copies the data from the source page to the target page. Zero-filling a page is identical to copying one, except for the final stage, in which the created cache lines are zero-filled instead.
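The three steps above can be illustrated with a small software model. The following Python sketch is only a toy simulation under stated assumptions, not NUMAchine's actual hardware interface: the class and function names (`MemoryModule`, `SecondaryCache`, `claim_page`, `create_line`, `copy_page`, `zero_page`), the line size, and the page size are all hypothetical. It counts memory reads to show that the target page is never fetched.

```python
from dataclasses import dataclass, field

LINE_SIZE = 64        # bytes per cache line (assumed value)
LINES_PER_PAGE = 4    # kept tiny for illustration (assumed value)


def page_lines(page):
    """Line numbers that make up one page."""
    start = page * LINES_PER_PAGE
    return range(start, start + LINES_PER_PAGE)


@dataclass
class DirectoryEntry:
    state: str = "valid"                          # memory module's view of the line
    routing_mask: set = field(default_factory=set)


class MemoryModule:
    """Hypothetical model of a memory module with per-line directory state."""

    def __init__(self, num_pages):
        n = num_pages * LINES_PER_PAGE
        self.data = [bytes(LINE_SIZE) for _ in range(n)]
        self.directory = [DirectoryEntry() for _ in range(n)]
        self.line_reads = 0                       # lines fetched by caches

    def read_line(self, line):
        self.line_reads += 1
        return self.data[line]

    def claim_page(self, page, cpu, caches):
        """Step 1: a single request that covers every line of the page."""
        for line in page_lines(page):
            for cache in caches:
                cache.lines.pop(line, None)       # invalidate cached copies
            entry = self.directory[line]
            entry.state = "dirty"
            entry.routing_mask = {cpu}            # only the copying processor


class SecondaryCache:
    """Hypothetical model of one processor's secondary cache."""

    def __init__(self, cpu, memory):
        self.cpu = cpu
        self.memory = memory
        self.lines = {}                           # line number -> [state, data]

    def load(self, line):
        """Ordinary read-miss path: fetch the line from memory."""
        if line not in self.lines:
            data = bytearray(self.memory.read_line(line))
            self.lines[line] = ["valid", data]
        return self.lines[line][1]

    def create_line(self, line):
        """Step 2: install tag and state only; no data is fetched."""
        # The model fills the line with zeros; real contents are undefined.
        self.lines[line] = ["dirty", bytearray(LINE_SIZE)]


def copy_page(memory, caches, cache, src, dst):
    memory.claim_page(dst, cache.cpu, caches)             # step 1
    for line in page_lines(dst):
        cache.create_line(line)                           # step 2
    for s, d in zip(page_lines(src), page_lines(dst)):
        cache.lines[d][1][:] = cache.load(s)              # step 3


def zero_page(memory, caches, cache, dst):
    memory.claim_page(dst, cache.cpu, caches)             # step 1
    for line in page_lines(dst):
        cache.create_line(line)                           # step 2
    for line in page_lines(dst):
        cache.lines[line][1][:] = bytes(LINE_SIZE)        # step 3: zero-fill


# Usage: copy page 0 to page 1 while another processor caches a target line.
memory = MemoryModule(num_pages=2)
memory.data[0] = b"\x2a" * LINE_SIZE              # recognizable source pattern
cache0 = SecondaryCache(cpu=0, memory=memory)
cache1 = SecondaryCache(cpu=1, memory=memory)
cache1.load(4)                                    # copy of a target line on cpu 1
reads_before = memory.line_reads
copy_page(memory, [cache0, cache1], cache0, src=0, dst=1)
reads_for_copy = memory.line_reads - reads_before
```

In this model `reads_for_copy` counts only the source lines: the target lines are created directly in the secondary cache in the dirty state, so a conventional fetch of the data about to be overwritten never occurs.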