The CPU – Part 2

In my previous post about CPUs, I explained the basics of how they work. Because of all the technological progress we make (AI, laptops, tablets, mobile phones), the demands on the CPU are ever-increasing. Luckily, both Moore’s Law and the creative tinkering of CPU architects have made it possible to keep up with the never-stopping technology speed train.

In this post, I will discuss the CPU enhancements that architects have introduced over the years so that the CPU could keep up with technology.

Chip architects

Before I jump into the enhancements, I would first like to give more information about the CPU architects that are responsible for the design (and also its enhancements). There are two key players: Intel and AMD (Advanced Micro Devices). 

In 1981 IBM launched its IBM PC. The launch of this PC on the personal computer market was a great success, and the specifications of the IBM PC became one of the most popular computer design standards in the world. Intel was awarded the contract for its CPU, which resulted in the Intel 8088 being built into the IBM PC.

After the launch of this CPU, Intel never looked back and became one of the most popular chip manufacturers in the world, responsible for a big part of the development of CPU architecture as we know it today.

AMD (Advanced Micro Devices) cloned the functionality of the Intel CPUs. I’m sure you think that Intel would never have allowed this, but the two companies had signed a cross-license agreement in 1976 that gave AMD the right to do so. The reason Intel went along with this was that it could not keep up with the demand for the Intel 8088 and needed AMD to help supply enough CPUs to satisfy IBM’s orders. After roughly 20 years (and many legal battles, because Intel of course had regrets), Intel and AMD ended the licensing agreements in 1995. After the settlement, AMD chips were no longer compatible with sockets or motherboards produced for Intel CPUs. In other words: you can’t put an AMD chip in an Intel-compatible motherboard. You need an AMD motherboard for an AMD chip and an Intel motherboard for an Intel chip.

Today, AMD does not clone the Intel chips anymore but it is a very strong competitor of Intel with state-of-the-art CPU architecture that is close to or even better than the Intel CPUs.

Both Intel and AMD present a new CPU design at least every 3 years to keep up with technology and to prevent lagging behind the competition.

AMD and Intel are not major players in the CPU business for phones and tablets. Mobile devices like the iPhone and iPad, and also most Android devices, use a CPU architecture that is developed by ARM Holdings (ARM). These processors use a simpler and more energy-efficient design. Traditionally they could not match the raw performance of Intel and AMD chips, but lower production cost and long battery life make ARM-based processors perfect for mobile devices. Big companies like Qualcomm license the design and then build their own CPU versions around it.


Nowadays, cybersecurity is more important than ever. Chip architects have anticipated this as well and introduced NX bit technology in the CPU. The No-eXecute (NX) feature helps the CPU protect the system against malware attacks. It lets the operating system mark regions of memory as data-only, so that code which malware manages to place in those regions cannot be executed. This is not watertight: it will not protect you from every piece of malware in the world, but it is nice to have an extra weapon against cybercrime.

If you want to check whether your CPU has NX and whether the NX bit is enabled, you can enter the BIOS when you boot up your system. On many systems you then go to the Advanced tab, where there should be an entry named something like Execute Disable Bit or NX Bit.
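If you would rather not reboot into the BIOS, here is a minimal sketch of another way to look, assuming a Linux system on an x86 CPU, where the kernel lists CPU features on the "flags" lines of /proc/cpuinfo and reports this feature as "nx". The function names are my own, and on other operating systems the file simply will not exist.

```python
def has_nx_flag(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in the given cpuinfo text lists 'nx'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "nx" in value.split():
            return True
    return False


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read the kernel's CPU description; return '' if the file is missing."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    # Assumption: Linux on x86. Elsewhere this prints False because the
    # file cannot be read, not because the CPU lacks the feature.
    print("NX flag reported:", has_nx_flag(read_cpuinfo()))
```

A result of False on a non-Linux machine says nothing about the CPU itself, so treat this as a convenience check only.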

Clock multipliers

To process a single command, a CPU needs at least two clock cycles (two pulses of voltage on the clock wire). Because programs today are far more demanding than the programs of the past, the number of clock cycles needed to run a program properly is much higher now. Originally, a CPU ran at the same speed as the External Data Bus (EDB) and the address bus. But the CPU was mainly occupied with processing commands, not with the buses. Therefore, CPU architects sped up only the internal operations of the CPU: the CPU takes the clock signal of the bus and multiplies it by a fixed factor, the clock multiplier. This way the CPU was no longer stuck at the speed of the other hardware and the whole computing process became faster. This improvement gave an enormous boost to the computation speed of the CPU.
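The idea behind a clock multiplier comes down to one multiplication: the internal CPU clock is the bus clock times a fixed factor. Here is a small sketch of that arithmetic. The function name and the example numbers are only illustrative.

```python
def core_clock_mhz(bus_clock_mhz: float, multiplier: float) -> float:
    """Internal CPU clock = external bus clock x clock multiplier."""
    return bus_clock_mhz * multiplier


# Illustrative values: a 100 MHz bus with a multiplier of 36
# gives an internal clock of 3600 MHz (3.6 GHz).
print(core_clock_mhz(100, 36))
```

The bus keeps running at its own pace. Only the work inside the CPU happens at the multiplied speed.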

64-bit processing

Through the years the External Data Bus (EDB) grew from 8 to 16 to 32 to 64 bits wide. This means that 64 parallel wires run between the CPU and the rest of the system. Each wire carries one binary digit (0 or 1) at a time. One such digit is a “bit”, so the 64 wires carry a total of 64 bits side by side. Because 8 bits is 1 byte, 64 bits are 8 bytes. However, the computer industry commonly talks in bits, so you will hear “64-bit”. In addition to the EDB, addresses have also grown to 64 bits, which means the CPU can tell far more memory locations apart. The biggest benefit of moving from 32-bit to 64-bit is that computers can pass the threshold of 4 GB of memory, which was the limit at 32-bit. With the move to 64-bit, the theoretical threshold is raised to roughly 16 exabytes (EB). An exabyte is 1,000,000 terabytes, so the increase in available memory compared to 32-bit is huge!
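The 4 GB and 16 EB figures follow directly from the number of address bits. The sketch below assumes that every address points at exactly one byte and uses binary units (1 GiB = 2^30 bytes, 1 EiB = 2^60 bytes). The names are my own.

```python
GIB = 2**30  # bytes in one gibibyte
EIB = 2**60  # bytes in one exbibyte


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses that fit in the given number of bits."""
    return 2**address_bits


print(addressable_bytes(32) // GIB)  # 4  -> the 4 GB limit of 32-bit
print(addressable_bytes(64) // EIB)  # 16 -> the 16 EB limit of 64-bit
```

Every extra address bit doubles the reachable memory, which is why going from 32 to 64 bits is such an enormous jump.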

Parallel execution

Before parallel execution, only one command at a time could be processed. To process a command, the CPU needs to take four steps: 

  1. Fetching: getting the data from the EDB
  2. Decoding: finding out what command needs to be executed
  3. Executing: performing the calculation
  4. Writing: sending the data back to the EDB

Most of the time each step takes one clock cycle. In the past, a command went from station 1 to station 2 to station 3 to station 4, and while one station was performing its task, the other stations were idle. Parallel execution changed this by integrating the technique of pipelining into the process. With pipelining, the steps of several commands are executed simultaneously, just like on an assembly line. The conveyor belt carries several orders at once, and when an order moves to the next step, the previous step gets a new order to process. This means that every step is processing orders all the time.

Sometimes a complex command requires more than one clock cycle at a station. This forces the pipeline to stop because that station is occupied for a longer period. This hiccup is called a pipeline stall. Over the years, the architects found a solution for that as well. Instead of having one pipeline, they were able to integrate multiple pipelines into the CPU to keep the work flowing.
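To make the assembly line a bit more tangible, here is a toy sketch of the four stations. It assumes that every step takes exactly one clock cycle and that nothing ever stalls, which is a simplification. The names are hypothetical.

```python
STAGES = ["Fetch", "Decode", "Execute", "Write"]


def cycles_without_pipeline(commands: int) -> int:
    """Each command walks through all four stations before the next starts."""
    return commands * len(STAGES)


def cycles_with_pipeline(commands: int) -> int:
    """A new command enters every cycle once the line is running."""
    if commands == 0:
        return 0
    return len(STAGES) + commands - 1


def pipeline_schedule(commands: int) -> list[list[str]]:
    """One row per clock cycle, one column per station; '-' means idle."""
    rows = []
    for cycle in range(cycles_with_pipeline(commands)):
        row = []
        for stage_index in range(len(STAGES)):
            command = cycle - stage_index  # which command sits at this station
            row.append(f"C{command + 1}" if 0 <= command < commands else "-")
        rows.append(row)
    return rows


if __name__ == "__main__":
    print(cycles_without_pipeline(5), "cycles without pipelining")
    print(cycles_with_pipeline(5), "cycles with pipelining")
    for row in pipeline_schedule(3):
        print(" ".join(f"{cell:>3}" for cell in row))
```

For five commands the count drops from 20 cycles to 8, and the printed schedule shows the stations filling up one by one.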

When you want to run an application on Windows, you start the program by double-clicking its icon. But one program normally consists of multiple smaller programs, and by double-clicking the icon all of these are sent on their way to the CPU. These small programs are broken down into little pieces of code, which are stored temporarily in memory until they can be handed to the CPU. Because a CPU is much faster than RAM, you are bound to get pipeline stalls: the pipelines sit waiting because RAM cannot deliver the next pieces quickly enough. To prevent stalling, CPU architects built a small amount of very fast RAM directly into the CPU. This type of memory is called SRAM (Static Random Access Memory). The SRAM preloads as many pieces of commands as possible and keeps copies of commands that were already processed in case the CPU needs to work on them again. This use of SRAM is called the cache.

There are multiple caches on a CPU: 

  • L1 cache: the first cache that is checked by the CPU. It is the quickest cache and also the smallest.
  • L2 cache: the second cache that is checked by the CPU. It is slower than the L1 cache but bigger.
  • L3 cache: the third cache that is checked by the CPU. It is the slowest of the three but also the biggest.
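The order in which the caches are checked can be sketched as a simple lookup loop. This is a toy model, not how real hardware is built. The capacities and the latencies in clock cycles are made-up numbers, the eviction rule (throw out the least recently used entry) is an assumption, and all names are hypothetical.

```python
from collections import OrderedDict

RAM_LATENCY = 200  # made-up cost in clock cycles


class CacheLevel:
    def __init__(self, name: str, capacity: int, latency: int) -> None:
        self.name = name
        self.capacity = capacity
        self.latency = latency
        self.lines = OrderedDict()  # address -> None, oldest entry first

    def contains(self, address: int) -> bool:
        if address in self.lines:
            self.lines.move_to_end(address)  # mark as most recently used
            return True
        return False

    def store(self, address: int) -> None:
        self.lines[address] = None
        self.lines.move_to_end(address)
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used


def make_hierarchy() -> list:
    # Small and fast first, big and slow last (illustrative numbers).
    return [CacheLevel("L1", 2, 4), CacheLevel("L2", 4, 12), CacheLevel("L3", 8, 40)]


def load(levels: list, address: int) -> tuple:
    """Check L1, then L2, then L3, then RAM; return (source, total cycles)."""
    cost = 0
    source = "RAM"
    for level in levels:
        cost += level.latency
        if level.contains(address):
            source = level.name
            break
    else:
        cost += RAM_LATENCY  # missed every cache
    for level in levels:
        level.store(address)  # keep a copy close to the CPU for next time
    return source, cost
```

The first load of an address pays for all three misses plus the trip to RAM. Loading the same address again right away is answered by L1 at a fraction of the cost.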

Multicore processing

I think most people who have worked with computers know the terms single-core, dual-core, triple-core, quad-core, etc.

“Multicoring” is the integration of two or more CPUs (or cores) into one single chip. This way you create a multiple core architecture. Using multiple cores has a very big advantage. You can compare it with working on multiple calculators at the same time. If you take a dual-core chip you will have 2 CPUs, which means 2 execution units and 2 sets of pipelines. The cores share the RAM and usually the largest cache. On most current designs each core does have its own small L1 cache.

I’m not sure what the highest number of cores currently is, but I have seen CPUs with up to 64 cores (AMD’s Ryzen Threadripper).
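The calculator comparison can be sketched in code: split one big job into pieces and give each piece to a separate worker process, which the operating system can then place on a separate core. This is only a sketch using Python's standard library. The function names are my own, and whether the pieces really run on different cores is up to the operating system.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def sum_of_squares(bounds: tuple) -> int:
    """The job for one core: add up n*n for start <= n < stop."""
    start, stop = bounds
    return sum(n * n for n in range(start, stop))


def split_range(total: int, parts: int) -> list:
    """Cut range(total) into `parts` nearly equal (start, stop) chunks."""
    step, extra = divmod(total, parts)
    chunks, start = [], 0
    for index in range(parts):
        stop = start + step + (1 if index < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def parallel_sum_of_squares(total: int, workers: int = 0) -> int:
    """Spread the work over `workers` processes (0 = one per reported core)."""
    workers = workers or os.cpu_count() or 1
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, split_range(total, workers)))
```

When you try this yourself, call parallel_sum_of_squares from under an `if __name__ == "__main__":` guard in a script, because some platforms start each worker by importing the script again.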

Final thoughts

I hope this post showed the importance of CPU architecture. There are more improvements/enhancements that have been implemented but the ones I named in this post are the key items that had the most impact on the evolution of the CPU in my opinion. 

Without all the upgrades that CPU architects applied, we would not have progressed as much in technology as we have. I hope this information gives a better understanding of how important CPU architecture is, because without it we would not be able to keep moving forward.

I’m very curious what new improvements/enhancements the CPU architects will show us in the near future and I will certainly keep an eye on Intel, AMD and ARM. With all these upgrades in the past, I am of the opinion that it won’t stop there and that our next-generation chips might lead to massive technological breakthroughs. 

One of the latest developments is chips that are designed by AI. This development is still in its infancy but I think that AI will be of great help in assisting CPU architects in continuously upgrading CPUs in order to be able to meet the ever-increasing demands of the technology industry.

Feel free to contact me if you have any questions or any additional advice or tips about this subject. If you want to stay in the loop when I upload a new post, don’t forget to subscribe to receive a notification by e-mail.
