Monday, August 22, 2016

Two days to go until AMD's Hot Chips presentation - how about Zen's core size?

TL;DR: Zen's many integer schedulers might be related to dedicated dependency chain handling. And the Zen core might be just 4.9 mm² incl. L2 cache.

Here are some last speculative thoughts before we hear about Zen at Hot Chips 28, from the guy who first told you about Bulldozer's and Zen's microarchitectures, AMD's upcoming 32-core server chip, and some other interesting things. I can say this now, as AMD presented a first view of Zen's microarchitecture just a day after my last blog post. Again, I was pretty close. How close simply depends on the amount of data found in patches and patents.

However, some things are different: the speculated Zen microarchitecture shown here on this blog had a unified scheduler for the 4 ALUs and a second one for the AGUs, while Zen actually has 6 separate schedulers instead. My assumption was based on the cat core heritage. But of course, a unified scheduler for 4 execution units holding lots of µOps would still be a big step up from a scheduler which only has to issue to two units (cat cores).

Now what's the reason for this configuration? First, there are many possible reasons related to typical design trade-offs, like area, delays, buffer sizes, and power efficiency. But there are also some interesting concepts, like dependency chain oriented handling of instruction streams. If some code has an instruction level parallelism greater than one, there are groups of instructions which at some point could be executed independently of the main flow, until their result gets fed back. These groups and sections of the main flow could also be called dependency chains: within a chain it is not possible to execute a newer instruction before an older one, as each of them depends on some result of its preceding operation. Here is an example:
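mov eax, [edi]        ; load eax from memory at [edi]
add eax, ecx          ; needs the loaded eax
imul edi, eax         ; needs the result of the add
cmp [ebp+08h], edi    ; needs the result of the imul
jnz out               ; needs the flags set by the cmp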
This code actually can't be executed out-of-order. All the logic put into an out-of-order scheduler would be a waste of energy in this case. And multiple execution units of the same type for parallel issue wouldn't help to speed this up either. A single scheduler with an integer execution unit (IEU) and an address generation unit (AGU) would be enough. The latter wouldn't even need a separate scheduler, similar to the K7, K8, and K10 series of CPUs. This could be one reason for Zen's individual schedulers: one identified dependency chain could be sent to a single integer scheduler, plus one AGU scheduler if there are memory operands. The other schedulers might even be clock gated then.
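To make the dependency chain idea a bit more concrete, here is a small Python sketch that walks the five instructions above and computes how deep each one sits in its chain. The read/write register sets are written by hand for this example (nothing is decoded from machine code), only read-after-write dependencies are tracked (assuming register renaming takes care of the rest), and each instruction is treated as a single operation. The names `PROGRAM`, `chain_depths` and `average_ilp` are made up for this illustration.

```python
# Each entry: (text, registers read, registers written).
# "flags" stands in for the x86 flags register. The sets are hand-written
# assumptions for this example, not the output of a real decoder.
PROGRAM = [
    ("mov eax, [edi]",     {"edi"},        {"eax"}),
    ("add eax, ecx",       {"eax", "ecx"}, {"eax", "flags"}),
    ("imul edi, eax",      {"edi", "eax"}, {"edi", "flags"}),
    ("cmp [ebp+08h], edi", {"ebp", "edi"}, {"flags"}),
    ("jnz out",            {"flags"},      set()),
]


def chain_depths(program):
    """Return, per instruction, the length of the longest
    read-after-write chain that ends at that instruction."""
    last_writer_depth = {}  # register -> depth of its most recent writer
    depths = []
    for _text, reads, writes in program:
        # An instruction can start only after all of its sources exist.
        depth = 1 + max((last_writer_depth.get(r, 0) for r in reads), default=0)
        depths.append(depth)
        for r in writes:
            last_writer_depth[r] = depth
    return depths


def average_ilp(program):
    """Instruction count divided by the length of the longest chain."""
    return len(program) / max(chain_depths(program))


if __name__ == "__main__":
    for (text, _reads, _writes), depth in zip(PROGRAM, chain_depths(PROGRAM)):
        print(f"depth {depth}: {text}")
    print("average ILP:", average_ilp(PROGRAM))
```

In this simplified model every instruction lands one level deeper than the one before it, so the average ILP comes out as 1.0. A model that splits memory operands into separate load operations would show a little more freedom.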

U.S. Patent No. 8769539 covers a scheduler which can be switched between out-of-order and in-order operation. One of the inventors is Zen project leader Suzanne Plummer, while Dan Hopper is also an important member of the Zen x86 core design team. In combination with many other patents (for example US20120023314) which cover dependency chain related logic, there might be such a scheduler in Zen.
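As a rough illustration of what switching between the two modes could mean for the pick logic, here is a toy Python model. It is only a sketch under my own assumptions and not the mechanism described in the patent: the function `pick`, the queue layout and the `in_order` flag are all hypothetical.

```python
def pick(queue, ready, in_order):
    """Choose the next operation to issue from a scheduler queue.

    queue    -- list of operation ids, oldest first
    ready    -- set of operation ids whose source operands are available
    in_order -- True: only the oldest entry may issue;
                False: the oldest *ready* entry may issue
    Returns the chosen id, or None if nothing can issue this cycle.
    """
    if in_order:
        # In-order mode: no searching, just look at the head of the queue.
        if queue and queue[0] in ready:
            return queue[0]
        return None
    # Out-of-order mode: scan for the oldest entry that is ready.
    for op in queue:
        if op in ready:
            return op
    return None


if __name__ == "__main__":
    queue = ["a", "b", "c"]
    ready = {"b", "c"}  # "a" is still waiting for an operand
    print(pick(queue, ready, in_order=True))
    print(pick(queue, ready, in_order=False))
```

With "a" blocked at the head, the in-order call returns None while the out-of-order call returns "b". For a strict dependency chain both modes would pick the same operation anyway, which is why the cheaper in-order mode would be enough there.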

Knowing the dependency chain also enables several efficiency measures. One patent covered different latencies for "far source operands" and close ones, i.e. operands coming from a different "lane" or from the same one. Bypass networks could be implemented in a somewhat more relaxed way, which improves delay and power efficiency.

A Zen core size estimation

After talking about Zen's die size just days ago, there is another size which will likely be revealed at Hot Chips: the size of a Zen core. Earlier this year I created a table to estimate that size based on Excavator module components and some scaling factors. Based on AMD's statement about a density optimized process, one of the unknowns in this calculation just became a bit smaller. Their statement could mean either dense metal layers or high-density standard cell libraries. For simplicity and for lack of further data, assuming no density-related scaling should do. Process-related scaling is a different story, though.

Using die photos, it is possible to measure the size of a graphics CU. On a Polaris 10 die, a graphics CU is about 2.96 mm², while a Carrizo CU is 7.21 mm². This results in a scaling factor of about 41%. Putting this all together with some individual scaling factors based on design changes (e.g. more ALUs, smaller multiplier arrays, 64 KB L1 I$, already included in the "area 1C" number) results in a Zen core size incl. L2 cache of about 4.9 mm².
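The arithmetic behind this kind of estimate can be sketched in a few lines of Python. Only the two CU areas are taken from the text above. The helper names are made up for this sketch, and the per-component input format of `estimate_core_area` (old area plus a design-change factor) is an assumption about how such a table could be organized, not a copy of the actual table.

```python
# CU areas measured from die photos, as quoted in the post.
CU_AREA_POLARIS10_MM2 = 2.96
CU_AREA_CARRIZO_MM2 = 7.21


def process_scaling_factor(new_area_mm2, old_area_mm2):
    """Area ratio of the same kind of block on the new vs. the old process."""
    return new_area_mm2 / old_area_mm2


def estimate_core_area(components, process_factor):
    """Sum up scaled component areas.

    components     -- iterable of (old_area_mm2, design_factor) pairs, where
                      design_factor models design changes (1.0 = unchanged)
    process_factor -- process-related area scaling applied to every component
    """
    return sum(area * design * process_factor for area, design in components)


factor = process_scaling_factor(CU_AREA_POLARIS10_MM2, CU_AREA_CARRIZO_MM2)

if __name__ == "__main__":
    print(f"process scaling factor: {factor:.0%}")
```

Feeding the Excavator component areas and their individual design factors into `estimate_core_area` together with this factor is the kind of calculation that leads to the core size estimate.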

Zen core size estimation based on Excavator data

21 comments:

Arnawa Widagda said...

http://www.pcgameshardware.de/AMD-Zen-Codename-261795/Specials/Architekturdetails-Benchmark-IPC-1205041/galerie/2625556/

Dresdenboy said...

Yep, just hours after my posting. It seems only German media published them, BTW.

Arnawa Widagda said...

Probably cause it's early in the week.
I do wish there were more details about the front end, but I can understand if AMD decided not to go into details at this moment.
Care to write up an analysis on the front end? Looks far more interesting than the execution parts.

Unknown said...

Matthias,
can you do an overall post with all the info we know about Zen so far? Where Zen will fall short and where it will shine against Intel CPUs.
I think the lack of 256-bit SIMD will have a big impact on power consumption and the fast cache will deliver great performance, but I fear the expectations are going sky high.

Tacit Murky said...

«This code actually can't be executed out-of-order» — except for loading this [ebp+08h] argument.

Generally, I think 6 RS's is a bad idea. A core would have less opportunity to issue ready mops to execution, while power saving can be attained with less sectioning of the RS's. In practice, an upgraded version of the Cat design would've been fine: data RS (4 ports), AG/stack RS (2 ports), flags/jumps RS (1 port). The latter two can be unified in a 3-ported RS. Then it'll be possible to execute float/vector code without waking up the large data RS.

128-bit paths for vectors seem like a step back. This will have a negative effect on both speed and power for all AVX code. It's just repeating one of many BD mistakes…
