Basic concepts
Overall architecture
Merom, the mobile processor of the Core architecture, is indeed powerful. The best proof is that in a number of tests the 2GHz T7200 beats the 2.33GHz T2700. Still, although Merom performs strongly on the mobile platform, it brings few surprises. It is better than Yonah, but not by a wide margin, and in some test items the lower-clocked T7200 even loses to the T2700. The advantages of the Core micro-architecture are therefore less pronounced on the mobile platform than on the desktop, where even the lowest-clocked E6300 can beat a high-frequency Pentium D. The reason is that Yonah itself was already far better than the failed NetBurst. Moreover, the Core micro-architecture is itself an improvement on the Yonah micro-architecture, so it is reasonable that the results do not differ dramatically.
The Core micro-architecture is a new-generation micro-architecture that Intel's Israeli design team developed on the basis of the Yonah micro-architecture. Its most significant changes are enhancements to each key part. To make data exchange between the two cores more efficient, it adopts a shared second-level cache design in which the two cores share up to 4MB of second-level cache. Each core uses a shorter 14-stage effective pipeline and has a built-in 32KB first-level instruction cache and 32KB first-level data cache, and data can be transferred directly between the first-level data caches of the two cores. Each core has four instruction decoders, supports micro-op fusion and macro-op fusion, can decode up to five x86 instructions per clock cycle, and has improved branch prediction. Each core also has five execution unit subsystems, so execution efficiency is quite high. Support for EM64T and for the SSSE3 instruction set (called SSE4 in some early reports) has been added. EM64T gives the processor a larger memory addressing space, which makes up for a shortcoming of Yonah. Once the memory-hungry Vista operating system becomes widespread, this advantage will give the Core micro-architecture a longer life cycle. It also uses five of Intel's latest technologies to improve performance and reduce power consumption, including better power management, support for hardware virtualization and hardware-based virus protection, a built-in digital temperature sensor, and power and temperature reporting. These energy-saving technologies are especially significant for mobile platforms.
In addition, Core supports 64-bit.
Processors based on the Core architecture are divided by target market: Conroe is used in desktops, Merom in notebooks, and Woodcrest in servers. All three are based on the Core micro-architecture.
Intel processors, including the Core series for desktops and notebooks, Xeon processors, and even embedded processors, will move to the 32-nanometer process one after another, gradually replacing the current 45-nanometer process. With CES approaching, Intel has revealed that it will release a number of Core i3 and Core i5 desktop and notebook processors at the show, including the 32nm Arrandale for laptops and Clarkdale for desktops, which emphasize smaller size and lower power consumption. On December 23, 2009, Intel disclosed that the embedded Xeon processor to be launched in the first quarter of 2010 will also use the new process. Compared with the 45-nanometer process of the end of 2008, the 32-nanometer process that entered production at the end of 2009 uses second-generation high-k metal gate transistors and immersion lithography to improve control of the transistors inside the processor. It is also about 30% smaller than the 45nm process, which simplifies system design. According to Intel's roadmap, a 32-nanometer embedded Xeon processor code-named Jasper Forest will be launched for the embedded market in the first quarter of 2010. It offers 30% to 70% higher performance per watt than processors built on the old process and supports PCI Express 2.0 and I/O virtualization. As for the Xeon server processors used by enterprises, with the introduction of the desktop processor Clarkdale in 2010, the entry-level Xeon 3000, which is closely related to the high-end desktop market, will also move to the new 32-nanometer process in 2010.
The Xeon 5000, which adopted the Nehalem-EP architecture in 2009, will stay on the Nehalem architecture but move to the new 32-nanometer process in the first half of 2010 with the introduction of the Westmere-EP processor. The Xeon 7000, originally a 6-core processor, will gain Nehalem-EX with up to 8 cores in the first half of 2010 and will move to the new Westmere-EX in the second half of 2010.
With embedded, server, laptop, and desktop processors all moving to the new manufacturing process, only the low-power Atom processor has not yet made the move and still uses the 45-nanometer process.
While Intel moves to the new process in 2010, AMD will begin its move to the 32-nanometer process in 2011 and will adopt the new Bulldozer core architecture, including the performance-oriented Interlagos with 12 to 16 cores and Valencia, which emphasizes energy efficiency, with 6 to 8 cores.
Current motherboards cannot yet accommodate 8-core CPUs, so such CPUs are not being heavily promoted. The cheapest 8-core CPU is probably the Cell in Sony's PS3, whose 8 cores deliver many times the floating-point performance of a Core Duo. Even 4-core CPUs are not yet popular, so AMD and Intel will not rush to mass-produce 8-core CPUs. Intel's current quad-core parts simply package two dual-core dies in one processor, with no direct communication between the two dies. AMD has a true quad-core design, but it is not selling well and has not become mainstream. To sum up, in about 5 years quad-core CPUs will probably have replaced today's dual-core CPUs as the mainstream, and 8-core or even 16-core CPUs will be the high-end products of that time.
Brief description of development
1, X86
Although processors are classified here by architecture, four terms are currently in common use: IA-32, IA-64, x86-32, and x86-64. In fact they fall into two categories. IA-32 and x86-32 both refer to x86, that is, Intel's 32-bit x86 architecture. x86-64 is the new architecture AMD uses in its latest Athlon 64 processor series, but its underlying structure is still IA-32 (because Intel did not apply for patent protection on the x86 architecture, most processor manufacturers adopt it in order to stay compatible with mainstream processors). x86-64 only extends this architecture to support 64-bit programs and further improve the processor's computing performance. Compared with Intel's 64-bit server processors, the Itanium and Itanium 2 series, the biggest advantage of x86-64 is full compatibility with existing 32-bit x86 applications, which protects users' previous investments. Intel's Itanium and Itanium 2 processors need additional software or hardware to run earlier 32-bit programs.
Therefore, whenever terms such as IA-32, x86-32, and x86-64 appear, keep in mind that they all belong to the same family, the x86 architecture. For example, Intel's 32-bit Xeon server processor series, AMD's full range, and VIA's full range of processors all belong to the x86 architecture.
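The grouping above can be sketched in a few lines of Python. This is only an illustration: the name sets and the helper `classify_architecture` are assumptions made for this example, not an exhaustive or official list, while `platform.machine()` and `struct.calcsize` are standard-library calls.

```python
import platform
import struct

# Machine strings commonly reported for each family.
# These sets are illustrative assumptions, not an exhaustive list.
X86_32_NAMES = {"i386", "i486", "i586", "i686", "x86"}
X86_64_NAMES = {"x86_64", "amd64"}
IA64_NAMES = {"ia64"}


def classify_architecture(machine: str) -> str:
    """Map a machine string to the families described in the text."""
    name = machine.strip().lower()
    if name in X86_32_NAMES:
        return "x86 (32-bit, IA-32)"
    if name in X86_64_NAMES:
        return "x86 (64-bit extension, x86-64)"
    if name in IA64_NAMES:
        return "IA-64 (not x86)"
    return "other"


def pointer_width_bits() -> int:
    """Pointer width of the running Python build, in bits."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print(classify_architecture(platform.machine()), pointer_width_bits())
```

Note that both 32-bit and 64-bit x86 names land in the x86 family, while IA-64 is reported separately, which mirrors the two categories described above.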
2, IA64
Intel created the IA-64 architecture to comprehensively improve on the computing performance of its earlier IA-32 processors. It is a 64-bit CPU architecture that Intel and HP developed jointly over 6 years, a brand-new processor architecture designed specifically for the server market. It abandons the earlier x86 architecture on the grounds that x86 seriously limits processor performance. It was first used in Intel's Itanium series of server processors, and the latest Itanium 2 processors of 2009 also adopt it. Because it does not solve compatibility with earlier 32-bit applications, its use is significantly restricted. Although Intel has adopted various software and hardware methods to make up for this shortcoming, with the full rollout of AMD's Opteron processors the prospects for Intel's two IA-64 processor lines are not optimistic.
3, RISC
In addition to the two types of IA-architecture server processors introduced above, there is another mainstream processor architecture, usually called "RISC" (strictly speaking, this is a classification by how the processor executes instructions). It is still used by IBM, Sun, and HP. In recent years, however, because the standards for this architecture have never been fully unified and processor development and adoption have been very slow, most of the mid-to-high-end server market it originally held has been lost to the IA architecture. Even these server vendors have now begun to switch to IA and are launching more and more IA-based servers in order to survive.
The main server processors currently using this architecture include IBM's Power4, Compaq's Alpha 21364, HP's PA-8X00, Sun's UltraSPARC III, and SGI's MIPS64 20Kc.
4, Intel
Introduction
This section classifies common Intel server CPUs. Processor technology changes rapidly: before users have learned to tell one generation of products apart, the next generation replaces it. The following classification is based on the author's own understanding.
One, Xeon
At present, all dual-socket and quad-socket servers of the Intel IA architecture use Xeon CPUs, which are x86-architecture CPUs dedicated to servers. Early processors had numeric names ending in "86", including the Intel 8086, 80186, 80286, 80386, 80486, 80586, and the Pentium series, so the architecture is called "x86". All Xeon processors to date, including dual-core and quad-core models, are based on the x86 architecture.
Two, Itanium
The Itanium processor is often called the IA-64 processor. It is a pure 64-bit product with 64-bit addressing and 64-bit-wide registers. Its features, such as EPIC instructions, are designed for the most demanding computing and enterprise-level requirements. For companies or applications that need high-performance computing (including secure electronic transaction processing, very large databases, computer-aided mechanical engineering, and cutting-edge scientific computing), Itanium processors meet user requirements well.
Intel server processor list
Series | Xeon3000 | Xeon3200 | Xeon3300 | Xeon5000 | Xeon5100 | Xeon5300 | Xeon5200 | Xeon5400 | Xeon7100 | Xeon7300 | Itanium9000 | Itanium9100 |
CPU code name | - | - | - | Dempsey | Woodcrest | Clovertown | Wolfdale-DP | Harpertown | Tulsa | Tigerton | Montecito | Montvale |
Manufacturing process | 65nm | 65nm | 45nm | 65nm | 65nm | 65nm | 45nm | 45nm | 65nm | 65nm | 90nm | 90nm |
Instruction set | x86 | x86 | x86 | x86 | x86 | x86 | x86 | x86 | x86 | x86 | EPIC | EPIC |
Core micro-architecture | √ | √ | √ | × | √ | √ | √ | √ | × | √ | × | × |
Maximum number of processors in the system | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 32 | 32 | 512 | 512 |
Main frequency (GHz) | 1.86/2.13/2.33/2.4/2.66/3.0 | 2.13/2.4/2.66 | 2.5/2.83/3.0 | 2.67/3.0/3.2/3.73 | 1.6/1.86/2.0/2.33/2.66/3.0 | - | 1.86/3.4/3.33 | 2.0/2.33/2.5/2.66/2.8/2.83/3.0/3.16/3.2 | 2.5/2.6/3.0/3.16/3.2/3.33/3.4/3.5 | 1.6/1.86/2.13/2.4/2.93 | 1.4/1.42/1.6 | 1.42/1.6/1.66 |
Second-level cache (MB) | 2/4 | 8 | 6/12 | 4 | 4 | 8 | 6 | 12 | 2×1 | 8 | - | - |
Third-level cache (MB) | - | - | - | - | - | - | - | - | 4/8/16 | - | 6/8/12/18/24 | 8/12/18/24 |
Front-side bus (MHz) | 1066/1333 | 1066 | 1333 | 667/1066 | 1066/1333 | 1066/1333 | 1066/1333/1600 | 1333/1600 | 667/800 | 1066 | 400/533 | 400/533/667 |
Power consumption (W) | 65 | 95 | 95 | 95/130 | 40/65/80 | 50/80/120 | 65/80 | 80/120/150 | 95/150 | 80/130 | 75/104 | 75/104 |
Dual core | √ | - | - | √ | √ | - | √ | - | √ | - | √ | √ |
Quad core | - | √ | √ | - | - | √ | - | √ | - | √ | - | - |
Hyper-Threading | × | × | × | √ | × | × | × | × | √ | × | √ | √ |
64-bit operations | EM64T | EM64T | EM64T | EM64T | EM64T | EM64T | EM64T | EM64T | EM64T | EM64T | Pure 64-bit | Pure 64-bit |
Three, processor comments
1, Single-socket processors include the Xeon3000, 3200, and 3300 series. The 3000 and 3200 series both adopt the Core micro-architecture, and their performance and power consumption are very good; choose the clock frequency and a dual-core or quad-core model according to the application. The 3300 series uses the latest 45nm manufacturing process and an enhanced Core micro-architecture, offering stronger performance and lower power consumption.
2, Dual-socket processors: the Xeon5000 series has high power consumption and poor performance and has basically disappeared. The 5100 and 5300 series began to use the Core micro-architecture; their performance and power consumption are very good, and they can be called a hugely successful Intel product. Compared with the previous generation, performance improved several times while power consumption fell, and for a long time competitors had no product that could match them. The newly launched 5200 and 5400 series use the 45nm manufacturing process and an enhanced Core micro-architecture. Compared with the 5100 and 5300 series, performance is about 20% higher on average and power consumption is nearly 38% lower. What is more, the price is still very low, which makes them the best choice for a server CPU at this stage.
3, Multi-socket Xeon processors: on Intel's official list, the Xeon7100 and 7300 processors are marked as supporting 32 processors in a single system, but in the domestic market you will usually see only 4-socket Xeon servers. Because the Xeon 7100 did not adopt the advanced Core micro-architecture, four 7100-series CPUs together cannot match two dual-socket 5300-series processors, and the price is still very high, so it is not recommended; the Xeon7100 will soon disappear from the market. The new Xeon 7300 series is a very good multi-socket Xeon CPU. It uses the Core micro-architecture with 4 cores per CPU. Four such CPUs combined with large-capacity memory deliver very strong performance, sufficient for high-performance computing on large volumes of data.
4, Itanium processors: the main competitors of Itanium are the high-end minicomputer CPUs from vendors such as IBM and Sun. If you have been using high-end minicomputers, for example ones with IBM Power CPUs, it is worth learning about Itanium, a new generation of open high-end CPU products. You may find that high stability and high performance do not necessarily require high cost. In addition, Itanium can deliver unexpectedly good results in some scientific computing workloads.
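As a rough worked example of the figures quoted for the 5200 and 5400 series (about 20% more performance at nearly 38% less power), the combined effect on performance per watt can be estimated as follows. This is a back-of-the-envelope sketch that takes the quoted percentages at face value; the function name is hypothetical.

```python
def performance_per_watt_gain(performance_gain: float, power_reduction: float) -> float:
    """Ratio of new to old performance per watt.

    performance_gain: fractional speed-up, e.g. 0.20 for +20%.
    power_reduction: fractional power saving, e.g. 0.38 for -38%.
    """
    if not 0.0 <= power_reduction < 1.0:
        raise ValueError("power_reduction must be in [0, 1)")
    return (1.0 + performance_gain) / (1.0 - power_reduction)


if __name__ == "__main__":
    # Figures quoted in the text for the 5200/5400 series versus 5100/5300.
    gain = performance_per_watt_gain(0.20, 0.38)
    print(f"performance per watt: about {gain:.2f}x the previous series")
```

Under these assumptions the two modest-looking percentages combine to nearly double the performance per watt, which is why the comparison matters for server purchasing.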
5, CORE
In early March 2006, Intel held its Spring 2006 IDF (Intel Developer Forum) in San Francisco, USA. One announcement drew particular attention: Intel unveiled the Core micro-architecture to be used in its next-generation processors, which made this IDF the most exciting one in recent years. In the opening keynote of the fall 2005 IDF, Intel chief executive Paul Otellini had pointed out that future processor development would focus on "performance per watt". The theme of this IDF was even clearer: Power-Optimized Platforms, which is closely tied to the Core micro-architecture. According to Intel, processors based on the new Core microarchitecture will make a huge leap in integer performance and commercial computing and will surpass the products of its competitor AMD. Better still, despite such performance, the Core microarchitecture will consume significantly less power than its predecessor, which perfectly reflects the theme of this IDF.
The Core microarchitecture was designed by Intel's R&D team in Haifa, Israel. As early as 2003, this team became famous for designing the Banias processor, which combined high performance with low power consumption. The Core microarchitecture is their latest work after the Yonah microarchitecture. It had been in Intel's plans for a long time: as early as the summer of 2003, Intel vaguely mentioned it, originally intending it for processors on Napa, the third-generation Centrino platform, and Santa Rosa, the fourth-generation platform. Unexpectedly, because of the failure of the NetBurst micro-architecture, Intel changed its plans and brought the Core micro-architecture to the forefront, giving it the historic mission of replacing NetBurst and unifying the desktop, mobile, and server platforms.
As Intel's new flagship, the Core microarchitecture is dual-core, supports a 64-bit instruction set, and uses a 4-issue superscalar design with out-of-order execution. It is produced on a 65nm manufacturing process, supports 36-bit physical addressing and 48-bit virtual addressing, and supports all of Intel's extended instruction sets. Each core has a 32KB first-level instruction cache and a 32KB dual-ported first-level data cache, and the two cores share a 4MB second-level cache. The highest frequency released for the Core microarchitecture in 2006 will be the 3.33GHz of Conroe XE. Each product has its own maximum TDP: up to 35W for Merom, up to 65W for Conroe, and up to 80W for Woodcrest. Low-power versions can also be provided for different customer requirements. For example, the low-voltage version of Woodcrest is aimed at blade systems and uses lower frequencies and other measures to bring the TDP down to 40W.
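To put the 36-bit physical and 48-bit virtual address widths mentioned above into perspective, the short sketch below converts an address width into the amount of memory it can address, with 32 bits shown for comparison. The helper names are illustrative, and byte-addressable memory is assumed.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


def human_readable(num_bytes: int) -> str:
    """Format a power-of-two byte count using binary units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = num_bytes
    index = 0
    while value >= 1024 and index < len(units) - 1:
        value //= 1024
        index += 1
    return f"{value} {units[index]}"


if __name__ == "__main__":
    for label, bits in [("32-bit", 32), ("36-bit physical", 36), ("48-bit virtual", 48)]:
        print(f"{label}: {human_readable(addressable_bytes(bits))}")
```

With these widths, 36 physical address bits cover 64 GiB and 48 virtual address bits cover 256 TiB, compared with 4 GiB for a plain 32-bit address.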
Intel claims that the Core microarchitecture has a 14-stage "effective" pipeline. Since it comes from the same design team as Banias, it is not surprising that the Core microarchitecture has an integer pipeline of only 14 stages. But what exactly is a 14-stage "effective" pipeline?
In the past few years, several concepts related to pipeline depth have often been confused. First, the number of pipelines and the number of pipeline stages are completely different concepts. A series of functional units that can execute instructions from start to finish forms one pipeline. The number of stages can be understood as follows: the functional units in a pipeline can generally be divided into several parts, and a pipeline divided into N parts is an N-stage pipeline. Next comes the definition of the "effective pipeline", which has also often been misunderstood. In short, the effective pipeline is the number of pipeline stages that must be re-executed when a branch is mispredicted. Among processors using the NetBurst microarchitecture, the Willamette, Northwood, and Prescott cores have 20, 20, and 31 effective pipeline stages respectively, while the original P6 microarchitecture has 10.
However, for modern x86 processors, which generally use out-of-order execution, the number of effective pipeline stages does not represent the true pipeline depth. In NetBurst processors, building a trace for the Trace Cache alone takes at least 10 stages; the complete pipeline of the P6 microarchitecture should be 12 to 15 stages (the 10-stage effective pipeline plus the retire step after execution and any Reorder Buffer delay). As out-of-order execution engines grow more complex, the notion of pipeline stages in x86 processors becomes increasingly blurred. In other words, the true pipeline depth of the Core microarchitecture is more than 14 stages.
Comparing the 14-stage effective pipeline of the Core micro-architecture with the 31-stage effective pipeline of the Prescott core is therefore useful only as a reference. The claim, based on this comparison alone, that the Core microarchitecture can only reach very low frequencies is not convincing. The existence of the 3.33GHz Conroe XE has surprised many users who believed it. In fact, some enthusiasts claim that Conroe processors can exceed 4GHz with air cooling. It remains to be seen how high the Core micro-architecture can clock.
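The effective pipeline defined above is the number of stages redone after a branch misprediction, so it can be used in a first-order estimate of the cycles lost per instruction. The sketch below is a deliberately simplified model, not a description of how any of these processors actually behaves: the branch fraction and misprediction rate are hypothetical inputs chosen for illustration, and only the stage counts come from the text.

```python
# Effective pipeline stages quoted in the text.
EFFECTIVE_STAGES = {
    "P6": 10,
    "Willamette": 20,
    "Northwood": 20,
    "Prescott": 31,
    "Core": 14,
}


def misprediction_cycles_per_instruction(
    effective_stages: int, branch_fraction: float, mispredict_rate: float
) -> float:
    """First-order estimate of cycles lost to branch mispredictions per instruction.

    Assumes each misprediction costs exactly `effective_stages` cycles.
    """
    return effective_stages * branch_fraction * mispredict_rate


if __name__ == "__main__":
    # Hypothetical workload: 20% branches, 5% of them mispredicted.
    for name, stages in EFFECTIVE_STAGES.items():
        lost = misprediction_cycles_per_instruction(stages, 0.20, 0.05)
        print(f"{name}: {lost:.2f} cycles lost per instruction")
```

Under the same hypothetical workload, a 31-stage effective pipeline loses more than twice as many cycles per instruction as a 14-stage one, which is the sense in which the two numbers are comparable at all.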
The difference between Core and Conroe
Core is the microarchitecture that Intel's next-generation processor products will adopt across the board, whereas Conroe is only the code name for Intel's next-generation desktop products based on the Core micro-architecture. Besides Conroe, the Core microarchitecture also covers a mobile processor code-named Merom and a server processor code-named Woodcrest. Processors using Core will share a unified naming scheme. Because the previous generation of processors, based on the Yonah microarchitecture, was named Core Duo, Intel's next-generation desktop processor Conroe and notebook processor Merom will both be called Core 2 Duo to distinguish them from the previous generation of dual-core processors. In addition, Intel's top desktop processor is named Core 2 Extreme to set it apart from mainstream products.
A total of 10 Conroe/Merom models are released this time: 5 models starting with E or X for desktop computers and 5 models starting with T for notebooks.
Intel's initial release of Core micro-architecture processors includes the E6000 desktop series and the T7000 and T5000 mobile series. E6000-series processors have an external frequency of 266MHz, a front-side bus frequency of 1066MHz, and 2MB (E6300, E6320, E6400) or 4MB (E6600, E6550, E6700) of second-level cache, and they target the high-performance market. The E4000 series, to be launched later, has a lower external frequency of 200MHz and an 800MHz front-side bus; it is positioned below the E6000 series, and its release is postponed to the first quarter of 2007. In addition to the regular Conroe, Intel will also release the Conroe XE processor, the X6800, to replace the existing flagship Pentium XE.
Although Conroe on the desktop platform has a 1066MHz front-side bus, Merom, the mobile processor that is the protagonist here, has a 667MHz front-side bus. Merom was originally intended for the next-generation mobile platform, Santa Rosa, but it has to reach the market before Santa Rosa launches and must drop smoothly into the current Napa platform. To run on the Intel 945 chipset, it therefore keeps the 667MHz front-side bus that this chipset supports. On the future Santa Rosa platform, Merom's front-side bus will change to 800MHz. The situation is very similar to that of the 400MHz Dothan, which was introduced earlier to fit the Intel 855 chipset. The second-level cache grows to 4MB (the low-end T5000 series stays at 2MB), so more data awaiting processing can be held in the cache. This reduces the data-transfer bottleneck between the processor and memory and peripheral devices, improves the cache hit rate, and greatly improves execution efficiency.
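The relationship between the external frequency and the front-side bus figures quoted above (266MHz and 1066MHz, 200MHz and 800MHz) can be sketched as follows. The sketch assumes four data transfers per bus clock and a 64-bit (8-byte) data bus; neither figure is stated in this article, so treat both constants and the helper names as assumptions. The marketed ratings round up the nominal clocks (266.67MHz gives 1066, 166.67MHz gives 667).

```python
TRANSFERS_PER_CLOCK = 4  # assumption: quad-pumped front-side bus
BUS_WIDTH_BYTES = 8      # assumption: 64-bit data bus


def fsb_transfer_rate(base_clock_mhz: float) -> float:
    """Effective front-side bus rating (million transfers/s) for a base clock in MHz."""
    return base_clock_mhz * TRANSFERS_PER_CLOCK


def peak_bandwidth_mb_per_s(fsb_rating: float) -> float:
    """Theoretical peak bandwidth in MB/s for a given front-side bus rating."""
    return fsb_rating * BUS_WIDTH_BYTES


if __name__ == "__main__":
    # Front-side bus ratings quoted in the text.
    for name, rating in [("Merom on Napa", 667), ("Merom on Santa Rosa", 800), ("Conroe E6000", 1066)]:
        print(f"{name}: {rating} MHz FSB, peak {peak_bandwidth_mb_per_s(rating):.0f} MB/s")
```

Under these assumptions, moving Merom from a 667MHz to an 800MHz front-side bus raises the theoretical peak from about 5.3 GB/s to 6.4 GB/s, still below the roughly 8.5 GB/s of the desktop 1066MHz bus.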
With the Yonah processor on the Napa platform being replaced by Merom, Intel mobile processors have entered the era of 64-bit dual-core technology, and Yonah, the pioneer of dual-core mobile processors, will begin to retire to the second line.
Book title
Basic information
Book title: Processor architecture
Author: Intel Asia Pacific Research and Development Co., Ltd.
Publisher: Shanghai Jiaotong University Press
Publishing time: January 1, 2011
ISBN: 9787313068699
Format: 16mo
Price: 29.50 yuan
Introduction
"Processor Architecture" has five chapters that introduce the technology and development of processor architecture in detail, covering the instruction system, CPU composition, new CPU technologies, and CPU examples. The book combines theory with practical examples and is simple and easy to understand, making it suitable for computer science students and IT beginners.
Table of contents
1 Introduction to Computer System
2 Instruction System
3 CPU Composition
4 CPU New Technology
5 CPU examples