PCIe 6.0 and CXL 3.0 on all your CPUs

Tom Henry


Intel Arrow Lake will be disruptive for many reasons within what Intel hopes to do in PCs. Meteor Lake is getting closer and will debut in just over a month, and as we have seen, the first laptops are already listed with their CPUs and architecture. But what Intel has prepared with Arrow Lake is better than we could have imagined: it has just been leaked that these CPUs will be the first in history, ahead of even server CPUs, to use PCIe 6.0 and CXL 3.0. What does this mean for PC and gaming?

It will mean better performance in several areas. The novelty is so significant that not even the upcoming Granite Rapids or Sierra Forest will have these features, and from what it seems, the way Intel plans to use them will be really disruptive in tasks like gaming.

Intel Arrow Lake will debut PCIe 6.0 and CXL 3.0 for the first time in CPU history


Unless AMD responds, and for now nothing suggests it will despite the leak we saw months ago indicating that the red team wanted to bring these standards to its CPUs, Intel will be the first to include both of these widely anticipated standards. Curiously, AMD spoke about this topic almost exactly a year ago through Leah Schoeb, something we discussed in depth following brief statements from many industry players, but so far there have been no further statements from the red team.

Today's leak is therefore extremely relevant, because Arrow Lake with PCIe 6.0 and CXL 3.0 will put pressure on AMD in a segment where it did not expect it. Given the nature of these standards and the bottlenecks that server platforms face, they should have debuted in servers first, but no, they will arrive on PC. And of course, the most obvious question is: why?

Latency and memory coherence, a problem that CPUs suffer from


We have said it and we will maintain it until the tables turn: the main problem of a CPU for gaming today (and for almost any workload in general, except for a few cases) is latency. It is what kills performance, and it is the hardest thing to scale and the most complex to reduce.

Intel therefore wants to improve this, and the answer was clear given the leap to MCM and true 3D packaging with Foveros: Arrow Lake would have to include PCIe 6.0 and CXL 3.0 to alleviate these issues.

AMD has been suffering from access times between its chiplets for years, and faster RAM helps offset the Infinity Fabric deficit in both frequency and latency. The red team's dependence on memory is well known: the increase in memory performance is almost proportional to the improvement we get in tasks such as gaming, and all of it is subject to the latency problem.

Intel did not have this problem until now because its dies were monolithic, but with the jump to MCM in Meteor Lake and Arrow Lake it will suffer from a very similar one. The way to alleviate it is undoubtedly CXL 3.0.

Why CXL 3.0 and not 2.0 when the latter is mature?


For several reasons. The first and most obvious is that CXL 3.0 is built on PCIe 6.0, which means a new high-performance PHY on the Arrow Lake SoC Tile. Since that requirement is non-negotiable for CXL 3.0, Intel is making a statement for next year with its new platform: it not only adds the new PCI-SIG standard for the future RTX 50 and RTX 60, as well as the RX 8000 and RX 9000, but will also run PCIe 6.0 for the GPU and PCIe 5.0 for the SSD, a mix whose coherence can only be guaranteed with CXL 3.0.

Let's set aside the 128 GB/s that the standard will offer, since it is totally secondary on PC. So what is Intel looking for? First of all, FEC (Forward Error Correction) and CRC (Cyclic Redundancy Check), that is, low-latency error correction.

The objective is precisely this: to reduce latency between devices by having them share a coherent cache in a totally transparent way. CXL 3.0 brings more access modes, much larger topologies and, in this case, much more flexible memory sharing through what the consortium calls Enhanced Coherency, and here is the key to the whole matter for Arrow Lake CPUs with PCIe 6.0 and CXL 3.0.

Enhanced coherency, the key to Arrow Lake CPUs with CXL 3.0


Now that it is clear that including CXL 3.0 in Arrow Lake requires PCIe 6.0 and a new PHY in its SoC Tile, we are going to explain why Intel needs it before servers do. As we said, Enhanced Coherency is the key.

This new term is an evolution of memory coherence and debuts in the new standard. To simplify the idea: up to CXL 2.0, invalidating cached data required a host or a device in charge of accessing and controlling it.

This not only consumed bandwidth and cycles on those devices and hosts to manage the whole exchange, but also carried a latency penalty dating back to CXL 1.0. Enhanced Coherency now allows the device itself to invalidate the data stored in the caches.

That is, the host caches the data and the device itself can now invalidate it if necessary. Both sender and receiver can therefore invalidate it, which is the definitive step toward sharing this extremely fast memory without reducing the performance of the busiest device.

In short, if the SSD sends data to the CPU cache in the Arrow Lake architecture, the SSD itself can now invalidate it if necessary, without wasting cycles on the CPU acting as host. The consortium has termed this Direct Memory Access for Peer-to-Peer, and it is the key to alleviating the latency deficit involved in moving to MCM as a conceptual architecture.
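The idea of device-initiated invalidation can be pictured with a minimal Python sketch. It is a toy model, not the CXL protocol: the class names HostCache and Device, the sharers list and the address used are all hypothetical and chosen only for illustration. The device tracks which caches hold copies of its data and invalidates them itself when the data changes, so the host does not have to spend cycles coordinating.

```python
from dataclasses import dataclass, field


@dataclass
class HostCache:
    """Toy host-side cache (hypothetical): address -> cached value."""
    lines: dict = field(default_factory=dict)

    def invalidate(self, addr):
        # Drop the cached copy, if any; the next read must refetch it.
        self.lines.pop(addr, None)

    def read(self, addr, device):
        # On a miss, fetch the current value from the device's memory.
        if addr not in self.lines:
            self.lines[addr] = device.memory[addr]
        return self.lines[addr]


@dataclass
class Device:
    """Toy device (hypothetical) that owns memory and knows who caches it."""
    memory: dict = field(default_factory=dict)
    sharers: list = field(default_factory=list)

    def write(self, addr, value):
        # The device updates its own data and invalidates stale copies
        # itself, without asking the host to do it.
        self.memory[addr] = value
        for cache in self.sharers:
            cache.invalidate(addr)


ssd = Device(memory={0x40: "old"})
cpu = HostCache()
ssd.sharers.append(cpu)

first = cpu.read(0x40, ssd)   # miss, filled from the device
ssd.write(0x40, "new")        # device invalidates the CPU's copy
second = cpu.read(0x40, ssd)  # miss again, refetched
print(first, second)
```

Without the invalidation step in write, the second read would return the stale value, which is the situation a coherence mechanism exists to prevent.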

The Tiles as one


This is the big key: with a coherent memory system such as the one Intel proposes for the Tiles within Arrow Lake, the host can be bypassed, but for this the architecture would need request switches to come into play and complete the strategy.

This will most likely not be implemented on PC, although it will be implemented on servers with Diamond Rapids. The reason is simple: a PC does not currently need switches of any kind (barring the surprise of them being an essential requirement of the standard, which, although it mentions them, talks about devices, not Tiles).

Latency will be alleviated, yes, at the cost of bandwidth, but here the aforementioned 128 GB/s in each direction will be more than enough. In servers this will not be the case, and switches will be necessary to avoid losing bandwidth, which obviously also makes the final product and the platform more expensive.

The important thing is that Enhanced Coherency will bring true memory sharing to Arrow Lake CPUs with PCIe 6.0 and CXL 3.0: multiple hosts within the PC can hold a shared copy of the data and invalidate it from one side, so that synchronization is always maintained while reducing the latency introduced by the jump to MCM.
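For reference, the 128 GB/s figure matches the raw rate of a 16-lane PCIe 6.0 link in one direction. The short Python sketch below reproduces that arithmetic under simplifying assumptions: it uses the published per-lane rates (32 GT/s for PCIe 5.0, 64 GT/s for PCIe 6.0), treats each transfer as one bit and ignores encoding and FLIT overhead, so real throughput will be somewhat lower. The function name, the table and the x4 SSD link width are hypothetical, chosen for illustration.

```python
# Per-lane transfer rates in GT/s (published figures for each generation).
GT_PER_S = {"PCIe 5.0": 32, "PCIe 6.0": 64}


def raw_bandwidth_gb_s(gen: str, lanes: int) -> float:
    """Raw GB/s in one direction, ignoring protocol overhead (assumption)."""
    # One bit per transfer per lane, 8 bits per byte.
    return GT_PER_S[gen] * lanes / 8


gpu_link = raw_bandwidth_gb_s("PCIe 6.0", 16)  # x16 slot for the GPU
ssd_link = raw_bandwidth_gb_s("PCIe 5.0", 4)   # x4 link assumed for the SSD
print(gpu_link, ssd_link)  # prints "128.0 16.0"
```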


That is, the Tiles will work with a latency equal to, or in certain cases even lower than, that of a traditional monolithic design such as the i9-14900K. With this, Intel will put an end to the main problem that would otherwise reduce performance. Since Granite Rapids uses chiplets and relies on EMIB to alleviate latency, much as AMD does with its Infinity Fabric, servers will have to wait until the aforementioned Diamond Rapids to get all the new features that we will have on PC.
