Introduction
In the Pentium manuals, there are at least nine references to 4MB pages. The Pentium Family User's Manual, Volume 1 (P/N 241428) mentions 4MB pages in sections 2.0, 3.7.2, and 3.7.4. Volume 3 refers to 4MB pages in sections 10.1.3, 11.3.3, 11.3.4, 16.5.3, 23.2.10.2, and 23.2.18.1. The Intel 860 XP processor documentation claims the i860 XP is page-level compatible with the Intel386, Intel486, and Pentium processors. This compatibility is noteworthy, as the i860 XP also supports 4MB pages, and its documentation provides a complete description of the 4MB paging mechanism(1). All that's needed to obtain an Appendix H description of 4MB pages is a few references from the Pentium manuals and the description of 4MB pages from the i860 XP manual.
Making the jump to 4MB pages
With an understanding of the 4KB paging mechanism, it's not difficult to deduce the 4MB paging mechanism. Recall that each page directory entry controls 4MB of memory. Now imagine how Figure 111 would change if the page table lookup were eliminated. The page frame index (the offset into the page) would increase from 12 bits to 22 bits, thus allowing direct control of a 4MB page size. The 20-bit pointer in the page directory would be reduced to a 10-bit pointer that points directly to the 4MB page frame of memory. This describes how 4MB pages are implemented in the i860 XP(1). But the question remains: are i860 XP 4MB pages compatible with Pentium 4MB pages? To answer that question, we need to compare the i860 and Pentium manuals.
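The change in field widths can be made concrete with a short sketch. The function names below are hypothetical; only the bit widths (10 + 10 + 12 for 4KB pages, 10 + 22 for 4MB pages) come from the description above.

```python
# Hypothetical helper names; the field widths are taken from the text.

def split_4kb(linear):
    """Split a 32-bit linear address for 4KB paging: 10 + 10 + 12 bits."""
    directory_index = (linear >> 22) & 0x3FF   # selects the PDE
    table_index = (linear >> 12) & 0x3FF       # selects the PTE
    offset = linear & 0xFFF                    # 12-bit offset into a 4KB page
    return directory_index, table_index, offset


def split_4mb(linear):
    """Split a 32-bit linear address for 4MB paging: 10 + 22 bits."""
    directory_index = (linear >> 22) & 0x3FF   # selects the PDE
    offset = linear & 0x3FFFFF                 # 22-bit offset into a 4MB page
    return directory_index, offset


if __name__ == "__main__":
    linear = 0x00401234
    print([hex(field) for field in split_4kb(linear)])
    print([hex(field) for field in split_4mb(linear)])
```

Both splits share the same 10-bit directory index; in the 4MB case the bits that used to select a page table entry are simply absorbed into the offset.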
The Pentium manual, volume 3, states that CR4.PSE enables page-size extensions and 4MB pages, but refers the reader to Appendix H for more information(4,5). Later in the Pentium manual, Intel shows that bit 7 of the page directory entry is the Page Size (PS) bit(3). Without CR4.PSE=1, the Pentium will always use Intel486-compatible (4KB) paging, regardless of the setting of the PDE.PS bit. Similarly, when CR4.PSE=1 and PDE.PS=0, the Pentium still uses 4KB pages. But when CR4.PSE=1 and PDE.PS=1, the Pentium uses an i860 XP-compatible 4MB page translation mechanism.
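The three cases above reduce to a single test on two bits. The following is a minimal sketch of that decision, not processor microcode; the function name is hypothetical, and the position of PSE within CR4 (bit 4) is taken from Intel's published register layout rather than from the text above.

```python
CR4_PSE = 1 << 4   # page-size extensions enable (bit position assumed from Intel's CR4 layout)
PDE_PS = 1 << 7    # Page Size bit in the page directory entry (bit 7, per the text)

PAGE_4KB = 4 * 1024
PAGE_4MB = 4 * 1024 * 1024


def page_size(cr4, pde):
    """Return the page size selected by a PDE, given the current CR4 value."""
    # A 4MB page needs both the global enable and the per-entry PS bit.
    if (cr4 & CR4_PSE) and (pde & PDE_PS):
        return PAGE_4MB
    # Every other combination falls back to Intel486-compatible 4KB paging.
    return PAGE_4KB
```

Because the PS bit is per-entry, 4KB and 4MB pages can coexist in the same page directory once CR4.PSE is set.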
The linear address for a 4MB page is converted to a physical address in much the same manner as for 4KB pages. In this case, however, the access to the page table is omitted. The high-order 10 bits form an index into the page directory. The page directory entry no longer contains a 20-bit pointer to a page table, but instead contains a 10-bit pointer to the 4MB page frame of memory. This convention mandates that all 4MB pages reside on 4MB boundaries. The 10-bit pointer in the page directory entry is then combined with the low-order 22 bits of the linear address to form the 32-bit physical address.
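As a worked illustration of that last step, the sketch below combines the 10-bit frame pointer with the 22-bit offset. The function name and the sample PDE value are made up for the example; the masks follow directly from the bit widths given above.

```python
FRAME_MASK_4MB = 0xFFC00000    # high-order 10 bits of the PDE: the 4MB frame pointer
OFFSET_MASK_4MB = 0x003FFFFF   # low-order 22 bits of the linear address


def translate_4mb(pde, linear):
    """Form the 32-bit physical address for a linear address mapped by a 4MB PDE."""
    frame_base = pde & FRAME_MASK_4MB    # always a multiple of 4MB
    offset = linear & OFFSET_MASK_4MB
    return frame_base | offset


if __name__ == "__main__":
    # Hypothetical PDE: frame at 12MB, with the PS bit (bit 7) and low flag bits set.
    pde = 0x00C00083
    print(hex(translate_4mb(pde, 0x00401234)))   # prints 0xc01234
```

Masking the PDE down to its top 10 bits is what forces every 4MB page onto a 4MB boundary: the low 22 bits of the frame address are never taken from the entry.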
Figure 1 shows a pictorial description of the 4MB and 4KB paging translation mechanisms. Given all of the officially documented references to 4MB pages in the Pentium manuals, all one needs to complete an understanding of 4MB pages is to study this picture. Ironically, the 1993 edition of the Pentium manual, volume 3, contained a virtually identical picture(6). Intel obviously recognized the significance of this pictorial representation of 4MB pages, and substantially modified it in subsequent editions of the Pentium manual to remove the visual representation of the 4MB paging mechanism.
Figure 1-- Page Translation for 4MB and 4KB Page Sizes

Side-effects and caveats of 4MB pages
(Their existence is worth mentioning here. However, the details will be reserved for the magazine article.)
* Page fault error codes
* TLB Translation
* TLB Invalidation
Testing our hypothesis
After formulating our understanding of 4MB paging, it should be quite straightforward to write characterization code which would confirm our hypothesis. To detect whether or not 4MB pages are implemented in Pentium as they are in the i860 XP, we could do the following:
* Write the software assuming 4MB compatibility with the i860 XP.
* Enable paging.
* Before enabling 4MB paging, modify the second PDE (the PDE which controls memory from 4MB-8MB) to point to the 0MB-4MB page frame, and mark it as a 4MB page. Install a PTE in the first entry of the page table that the modified PDE points to when it is interpreted as a 4KB PDE. This PTE should point back to the first page of memory at 4MB (which contains a signature of some sort).
* Read from the signature in memory. If 4MB paging works as expected, instead of getting the signature, you will retrieve the PTE installed during the previous step. If 4MB paging does not work as expected, all is well: because the PTE is correctly formed, you will retrieve your memory signature.
The key to this technique is to read from one location in memory if 4MB pages work, but from another location if they don't (so we don't page fault). The source code listing 4MPAGES.ASM demonstrates this approach and shows that 4MB pages work as described herein.
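The logic of the probe can also be modelled entirely in software. The sketch below is a simulation under stated assumptions, not a translation of 4MPAGES.ASM: physical memory is a dictionary of 32-bit words, the `honor_4mb` flag stands in for a processor that implements i860 XP-style 4MB pages, the signature value is arbitrary, and the Present and Read/Write bits (bits 0 and 1) are the standard x86 entry flags rather than anything stated above. All names are hypothetical.

```python
PRESENT = 1 << 0      # standard x86 Present bit (assumed, not from the text)
WRITABLE = 1 << 1     # standard x86 Read/Write bit (assumed, not from the text)
PDE_PS = 1 << 7       # Page Size bit in the PDE

SIGNATURE = 0x5A5AA5A5         # arbitrary marker value
SIGNATURE_PHYS = 0x00400000    # first page of memory at 4MB
PAGE_TABLE_PHYS = 0x00000000   # where the modified PDE points if read as a 4KB PDE


def read_linear(memory, page_directory, linear, honor_4mb):
    """Translate a linear address and return the 32-bit word stored there."""
    pde = page_directory[(linear >> 22) & 0x3FF]
    if honor_4mb and (pde & PDE_PS):
        # 4MB page: the PDE points directly at the page frame.
        physical = (pde & 0xFFC00000) | (linear & 0x003FFFFF)
    else:
        # 4KB page: the PDE points at a page table, which is consulted first.
        table_base = pde & 0xFFFFF000
        pte = memory[table_base + 4 * ((linear >> 12) & 0x3FF)]
        physical = (pte & 0xFFFFF000) | (linear & 0xFFF)
    return memory[physical]


def run_probe(honor_4mb):
    """Set up the experiment and report which translation path was taken."""
    pte = SIGNATURE_PHYS | PRESENT | WRITABLE
    memory = {SIGNATURE_PHYS: SIGNATURE, PAGE_TABLE_PHYS: pte}
    page_directory = [0] * 1024
    # Second PDE (linear 4MB-8MB) points at the 0MB-4MB frame, marked as 4MB.
    page_directory[1] = 0x00000000 | PDE_PS | PRESENT | WRITABLE
    value = read_linear(memory, page_directory, 0x00400000, honor_4mb)
    return "4MB pages active" if value == pte else "4KB paging only"


if __name__ == "__main__":
    print(run_probe(True))    # prints "4MB pages active"
    print(run_probe(False))   # prints "4KB paging only"
```

Either path reads a valid, mapped word, which is the property the technique relies on: the result distinguishes the two behaviours without ever risking a page fault.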
Now that we have demonstrated that 4MB pages work as expected, we could write more characterization code to prove other behavioral characteristics of enabling CR4.PSE. Distributed with this article is source code to demonstrate the page faulting behavior of PSE. Another program is included to detect the TLB size and associativity. Finally, another program will demonstrate that writing any value to CR4.PSE will not invalidate the TLB.
Conclusion
Since the Pentium was introduced, Intel has withheld the architectural details of 4MB pages. Only by signing a 15-year NDA would you be given access to the documents that describe their implementation and use. The earliest Pentium manuals documented enough details of 4MB pages to allow anyone to reverse-engineer the rest. As newer Pentium manuals were introduced, Intel removed the most expository details. Unknown to most people outside of Intel, the full implementation details are documented in the i860 XP data sheet, which is readily available -- no NDA required.