# AMD's processor lines belonging to the Intermediate Families (Families 11h (Griffin) and 12h (Llano))

# Dezső Sima

# October 2018

(Ver. 1.1)

© Sima Dezső, 2018

## AMD's processor lines belonging to the Intermediate Families (Families 11h (Griffin) and 12h (Llano))

- 1. Overview of AMD's Intermediate families
- 2. The Family 11h (Griffin) Family
- 3. The Family 12h (Llano) Family
- 4. References

# 1. Overview of AMD's Intermediate Families




#### **Brand names of AMD's Intermediate families**

| | Launched in | 2008-2009 | 2011 |
|---|---|---|---|
| | | Family 11h<br>(Griffin) | Family 12h<br>(Llano) |
| Servers | 4P servers | | |
| | 2P servers | | |
| | 1P servers | | |
| | (85-140 W) | | |
| Desktops | <b>High perf.</b><br>(~95-125 W) | | |
| | Mainstream<br>(~65-100 W) | | Llano A8/A6/A4/E2<br>Sempron X2 |
| | <b>Entry level</b><br>(40-60 W) | | |
| Notebooks | High perf.<br>(~30-60 W) | Turion X2 Ultra (ZM-xx)<br>Turion X2 (RM-xx) | Llano A8 M |
| | Mainstream/Entry<br>(~20-30 W) | Athlon X2 (QL-xx)<br>Sempron (SI-xx) | Llano A6/A4/E2 M |
| | Ultra portable<br>(~10-15 W) | Turion Neo X2 (L6xx)<br>Turion X2 (RM-xx)<br>Athlon Neo X2 (L3xx)<br>Sempron (200U/210U) | |
| Tablet (~5 W) | | | |
| <b>Embedded</b><br>(~10 - 20 W) | | Turion Neo X2 (L6xx)<br>Athlon Neo X2 (L3xx)<br>Sempron (200U/210U) | |

#### Main features of Intermediate, Cat and Zen-based mainstream mobile lines

| Base arch./<br>stepping | Intro | High perf.<br>mobile family<br>name | Series | Techn. | Core<br>count<br>(up to) | L2<br>(up to) | L3 | Memory<br>(up to) | TDP<br>[W] | Socket |
|---|---|---|---|---|---|---|---|---|---|---|
| Family 11h<br>(K11)<br>(Griffin) | 6/2008 | Lion (no APU)<br>(not SoC) | Turion X2<br>Ultra | 65 nm | 2 | 2x512 KB/<br>2x1 MB<sup>2</sup> | - | DDR2-800 | 31-35 | S1g2 |
| Family 12h<br>(K12)<br>(Llano) | 6/2011 | Llano (APU)<br>(not SoC) | Fusion<br>A8 M | 32 nm | 4 | 4x1 MB | - | DDR3-1600 | | FM1 |
| Family 14h<br>(00h-0Fh)<br>(Bobcat) | 1/2011 | Zacate (APU)<br>(not SoC) | Fusion<br>E | 40 nm | 2 | 2x512 KB | - | DDR3L-1333 | | FT1<br>(BGA) |
| Family 16h<br>(10h-1Fh)<br>(Jaguar) | 5/2013 | Kabini APU<br>(SoC) | A<br>Series | 28 nm | | 2 MB<br>shared | - | DDR3L-1866 | | FT3 |
| Family 16h<br>(30h-3Fh)<br>(Puma+) | 4/2014 | Beema APU<br>(SoC) | A4<br>Series | 28 nm | 4 cores<br>with a<br>shared L2<br>cache | 2 MB<br>shared | - | DDR3L-1866 | | FT3b |
| Family 16h<br>(30h-3Fh)<br>(Puma+) | 5/2015 | Carrizo-L APU<br>(SoC) | A8/A6/<br>A4<br>Series | 28 nm | | 2 MB<br>shared | - | DDR3L-1866 | | FP4 |
| Family 17h<br>(00h-0Fh)<br>(Zen) | 10/2017 | Raven Ridge<br>APU<br>(SoC) | Ryzen<br>7/5/3 | 14 nm | 4-core<br>CCX with<br>private L2<br>caches and<br>shared L3 | 512 KB/<br>core | 1 MB/<br>core | DDR4-2400 | | AM4 |

APU: Accelerated Processing Unit (CPU + GPU)

CCX: Core CompleX

<sup>2</sup>: 2\*512 KB for Turion X2, 2\*1 MB for Turion X2 Ultra

## 2. The 11h Griffin Family

- 2.1 Overview of the 11h Griffin Family
- 2.2 Main enhancements of the 11h Griffin Family
- 2.3 The 11h Griffin mobile lines

# 2.1 Overview of the 11h Griffin Family


- Introduced in 6/2008; it preceded the K10.5 Shanghai-based mobiles
- Processor family designed solely for mobiles
- Derived from the 65 nm K8 Brisbane-based Turion 64 X2 (Tyler die) [4]
- Design emphasis on power saving
- Until now, it is AMD's only design with per-core power planes, i.e. with fully independent per-core P-state control
- AMD's first processor family supporting HyperTransport 3.0
- 2C, 65 nm technology
- 160 mm<sup>2</sup>, 225.6 million transistors (mtrs)

#### AMD's 11h Griffin-based mobile lines – Overview-1 [2]



#### AMD's 11h Griffin-based mobile lines – Overview-2 [2]



## Brand names of Family 11h (Griffin)-based mobile lines

| | Launched in | 2008-2009 | 2011 |
|---|---|---|---|
| | | Family 11h<br>(Griffin) | Family 12h<br>(Llano) |
| Servers | 4P servers | | |
| | 2P servers | | |
| | 1P servers | | |
| | (85-140 W) | | |
| Desktops | <b>High perf.</b><br>(~95-125 W) | | |
| | Mainstream<br>(~65-100 W) | | Llano A8/A6/A4/E2<br>Sempron X2 |
| | <b>Entry level</b><br>(40-60 W) | | |
| Notebooks | High perf.<br>(~30-60 W) | Turion X2 Ultra (ZM-xx)<br>Turion X2 (RM-xx) | Llano A8 M |
| | Mainstream/Entry<br>(~20-30 W) | Athlon X2 (QL-xx)<br>Sempron (SI-xx) | Llano A6/A4/E2 M |
| | Ultra portable<br>(~10-15 W) | Turion Neo X2 (L6xx)<br>Turion X2 (RM-xx)<br>Athlon Neo X2 (L3xx)<br>Sempron (200U/210U) | |
| Tablet (~5 W) | | | |
| <b>Embedded</b><br>(~10 - 20 W) | | Turion Neo X2 (L6xx)<br>Athlon Neo X2 (L3xx)<br>Sempron (200U/210U) | |

## Use of the 11h Griffin-based mobile lines in AMD's mobile platforms [5]



(Source slide: AMD Financial Analyst Day, November 13, 2008; roadmap subject to change without notice.)



#### Main features of AMD's 11h Griffin-based high performance Turion Ultra mobile line

| Base arch. | Stepping | Intro | High perf.<br>mobile<br>family | Series | Techn. | Core<br>count<br>(up to) | L2<br>(up to) | L3 | Memory<br>(up to) | HT/ dir.<br>(up to) | Socket |
|---|---|---|---|---|---|---|---|---|---|---|---|
| K8 | C0, CG | 9/2003 | Clawhammer | Mobile<br>Athlon 64 | 130 nm | 1 | 512 KB | - | DDR-400 | HT 1.0:<br>3.2 GB/s | 754 |
| | E5 | 3/2005 | Lancaster | Turion 64 | 90 nm | 1 | 1 MB | - | DDR-400 | HT 1.0:<br>3.2 GB/s | 754 |
| | F2 | 5/2006 | Trinidad | Turion 64<br>X2 | 90 nm | 2 | 2*512 KB | - | DDR2-667 | HT 1.0:<br>3.2 GB/s | S1 |
| K10.5 | DA-C2 | 9/2009 | Caspian | Turion II | 45 nm | 2 | 2*512 KB/<br>2*1 MB<sup>1</sup> | - | DDR2-800 | HT 3.0:<br>7.2 GB/s | S1g3 |
| | DA-C3 | 5/2010 | Champlain | Turion X4 | 45 nm | 4 | 4*512 KB | - | DDR3-1066 | HT 3.0:<br>7.2 GB/s | S1g4 |
| Fam. 11h<br>(Griffin) | B1 | 6/2008 | Lion | Turion X2<br>Ultra | 65 nm | 2 | 2*512 KB/<br>2*1 MB<sup>2</sup> | - | DDR2-800 | HT 3.0:<br>10.4 GB/s | S1g2 |
| Fam. 12h<br>(Llano) | B0 | 6/2011 | Llano<br>APU | Fusion A8 | 32 nm | 4 | 4*1 MB | - | DDR3-1600 | - | FM1 |
| Family 15h<br>Mod. 10h-1Fh<br>(Piledriver) | | 5/2012 | Trinity<br>APU | A10-A4 M | 32 nm | 2 CM<br>(4C) | 2 MB/CM | - | DDR3-1600 | PCIe 2.0 x16<br>UMI | FS1r2 |
| Family 15h<br>Mod. 20h-2Fh<br>(Piledriver) | | 2/2013 | Richland APU | A10-A4 M | 32 nm | 2 CM<br>(4C) | 2 MB/CM | - | DDR3-1866 | PCIe 2.0 x16<br>UMI (?) | FS1r2 |
| Family 16h<br>(Jaguar) | | H1/2013 | Kabini APU<br>(SoC) | A/E Series | 28 nm | 2 CM<br>(4C) | 2 MB<br>shared | - | DDR3-1866 | PCIe (?) | BGA |

<sup>1</sup>: 2\*512 KB for Turion II, 2\*1 MB for Turion II Ultra

<sup>2</sup>: 2\*512 KB for Turion X2, 2\*1 MB for Turion X2 Ultra

UMI: Unified Media Interface

#### **Block diagram of Griffin** [29]



# 2.2 Main enhancements of the 11h Griffin Family

## 2.2 Main enhancements of the 11h Griffin Family [3]



#### Main power saving techniques used in Griffin [6]

Cool'n'Quiet 2.0, as introduced in the then-current K10 Barcelona-based desktops (Phenom line), with a few enhancements intended for mobile use.



#### a) Independent Dynamic Core Technology

This means that each core has both its own voltage plane and its own clock domain [3].



### Number of available frequency and voltage levels [28]

There are up to 8 frequency levels and 4 voltage levels available for each core.

## Increased battery life with reduced power consumption

- Separate voltage planes for each core
- Each core can operate at independent frequency and voltage

## Increased performance with instantaneous frequency transitioning

- Optimized for Windows Vista<sup>™</sup>
- Minimize power consumption by always running at optimal power state
- Lower operating minimum power state
- Reduced processor utilization with simplified transitions



## Increased battery life with advanced power management features

## **Instantaneous frequency transitions** [28]

Achieved with a single shared high-speed PLL and programmable dividers for each core. The PLL frequency is fixed at the part's maximum frequency.

The dividers divide the maximum frequency by n/2, where n is any integer of a given set [30].
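The divider scheme described above can be illustrated with a short calculation. This is only a sketch: the 2400 MHz maximum frequency and the set of `n` values are hypothetical, chosen for illustration, since the slides do not list the actual divider set.

```python
from fractions import Fraction


def core_frequencies(f_max_mhz, n_values):
    """Return {n: core clock in MHz} for a divider of n/2 applied to f_max."""
    freqs = {}
    for n in n_values:
        divider = Fraction(n, 2)  # the divider is n/2, so n = 2 means "no division"
        freqs[n] = float(Fraction(f_max_mhz) / divider)
    return freqs


# Hypothetical values: a 2400 MHz part and an illustrative divider set.
F_MAX_MHZ = 2400
N_VALUES = [2, 3, 4, 6, 8, 16]

levels = core_frequencies(F_MAX_MHZ, N_VALUES)
for n, f in levels.items():
    print(f"n={n:2d}  divider={n / 2:4.1f}  f_core={f:7.1f} MHz")
```

Because the PLL itself never changes frequency, switching a core to another level only means selecting a different divider, which is why no PLL relock time is involved.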




Percentage power savings resulting from independent core voltage and frequency planes [30]



Figure 4. Griffin's percentage power savings resulting from independent CPU voltage and frequency planes.

### b) CoolCore Technology

- Introduced already in K10 Barcelona-based processors (2007).
- It turns off blocks of logic (e.g. execution units) if not in use in order to save power.

## c) Power optimized HT 3.0 [28]

Dynamic scaling of HT link width down to 0-bit ("disconnected") in both directions from and to the chipset.
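The effect of narrowing the link can be estimated with a small calculation. The sketch below assumes double data rate signalling (two transfers per link clock) and a 16-bit maximum link width per direction; with the 2200 MHz HT clock quoted for some Turion X2 Ultra models this reproduces the 8.8 GB/s figure used elsewhere in these slides. The list of intermediate widths is illustrative only.

```python
def ht_bandwidth_gb_s(link_clock_mhz, width_bits):
    """Peak one-way HyperTransport bandwidth in GB/s.

    Assumption: double data rate, i.e. two transfers per link clock.
    """
    transfers_per_s = 2 * link_clock_mhz * 1e6
    return transfers_per_s * width_bits / 8 / 1e9


LINK_CLOCK_MHZ = 2200  # HT clock quoted for some Turion X2 Ultra models

# Illustrative widths; 0 bits corresponds to the "disconnected" state.
for width in (16, 8, 4, 2, 0):
    bw = ht_bandwidth_gb_s(LINK_CLOCK_MHZ, width)
    print(f"{width:2d}-bit link: {bw:4.1f} GB/s per direction")
```

Halving the width halves the peak bandwidth and the number of active lanes, which is where the power saving comes from when traffic is light.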



## d) Mobile optimized Memory Controller [28]




## e) Multi-point on-die thermal sensors on the Griffin die [30]



# 2.3 The 11h Griffin mobile lines

#### 2.3 The 11h Griffin mobile lines



#### AMD's Family 11h mobile lines – Overview-2

| | | | 6/08 |
|---|---|---|---|
| Mainstream mobile | Turion X2<br>Ultra | 2C<br>2*1 MB | Lion<br>160 mm<sup>2</sup>/225.6 mtrs<br>Up to DDR2-800 (Dual channel)<br>Up to HT 3.0 8.8 GB/s<br>32-35 W/S1g2<br>ZM-80 to ZM-86 |
| | Turion X2 | 2C<br>2*512 KB | RM-70 to RM-77 |
| | Athlon X2<br>mobile | 2C<br>2*512 KB | QL-60 to QL-67 |
| | Sempron<br>mobile | 1C<br>512 KB | 6/08<br>Sable<sup>1</sup><br>160 mm<sup>2</sup>/225.6 mtrs<br>Up to DDR2-667 (Dual channel)<br>Up to HT 3.0 7.2 GB/s<br>25 W<br>S1g2, No AMD-V<br>SI-40/42 |
| Ultra portable | Turion<br>Neo X2 | 2C<br>2*512 KB | Conesus (Lion based?)<br>? mm<sup>2</sup>/? mtrs<br>Up to DDR2-667 (Dual channel)<br>Up to HT 3.0 6.4 GB/s<br>18 W/ASB1<br>L625 |
| | Turion X2<br>mobile | 2C<br>2*512 KB | L510 |
| | Athlon<br>Neo X2 | 2C<br>2*512 KB | Yukon platform<br>L325 |
| | Sempron | 1C<br>256 KB | Huron (Sable based?)<br>? mm<sup>2</sup>/? mtrs<br>Up to DDR2-800 (Dual channel)<br>200U: 8 W/ASB1<br>210U: 15 W/S1g2<br>2009 |

## Example: Main features of the Lion-based Turion X2 Ultra ZM-8x line [1]

| Model Number | Frequency | L2-Cache | HT | Multiplier | Voltage | TDP | Socket | Release date |
|---|---|---|---|---|---|---|---|---|
| Turion X2 Ultra ZM-80 | 2100 MHz | 2 × 1 MB | 1800 MHz | 10.5x | 0.75-1.2 V | 32 W | Socket S1g2 | June 4, 2008 |
| Turion X2 Ultra ZM-82 | 2200 MHz | 2 × 1 MB | 1800 MHz | 11x | 0.75-1.2 V | 35 W | Socket S1g2 | June 4, 2008 |
| Turion X2 Ultra ZM-84 | 2300 MHz | 2 × 1 MB | 1800 MHz | 11.5x | 0.75-1.2 V | 35 W | Socket S1g2 | Q3 2008 |
| Turion X2 Ultra ZM-85 | 2300 MHz | 2 × 1 MB | 2200 MHz | 11.5x | 0.75-1.2 V | 35 W | Socket S1g2 | Q3 2008 |
| Turion X2 Ultra ZM-86 | 2400 MHz | 2 × 1 MB | 1800 MHz | 12x | 0.75-1.2 V | 35 W | Socket S1g2 | June 4, 2008 |
| Turion X2 Ultra ZM-87 | 2400 MHz | 2 × 1 MB | 2200 MHz | 12x | 0.75-1.2 V | 35 W | Socket S1g2 | Q3 2008 |
| Turion X2 Ultra ZM-88 | 2500 MHz | 2 × 1 MB | 1800 MHz | 12.5x | 0.75-1.2 V | 35 W | Socket S1g2 | Q3 2008 |
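The relation between the Frequency and Multiplier columns can be checked with a few lines of code. The 200 MHz reference clock is not stated in the table; it is inferred here from the first row (2100 MHz / 10.5) and should be read as an assumption of this sketch.

```python
REF_CLOCK_MHZ = 200  # assumption: inferred from 2100 MHz / 10.5 in the table

# (core frequency in MHz, multiplier) as listed in the table above
MODELS = {
    "ZM-80": (2100, 10.5),
    "ZM-82": (2200, 11.0),
    "ZM-84": (2300, 11.5),
    "ZM-85": (2300, 11.5),
    "ZM-86": (2400, 12.0),
    "ZM-87": (2400, 12.0),
    "ZM-88": (2500, 12.5),
}


def core_clock_mhz(multiplier, ref_clock_mhz=REF_CLOCK_MHZ):
    """Core clock = multiplier x reference clock."""
    return multiplier * ref_clock_mhz


# Collect every model whose listed frequency does not match the product.
mismatches = [
    name
    for name, (freq_mhz, mult) in MODELS.items()
    if core_clock_mhz(mult) != freq_mhz
]
print("rows that do not match:", mismatches)
```

The half-step multipliers (10.5x, 11.5x, 12.5x) are what give the line its 100 MHz frequency granularity.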

#### The Puma platform [29]



#### **Overview of the Congo and Yukon platforms** [31]




#### **Example for the Conesus-based Congo platform**

Block diagram of the Lenovo ThinkPad Edge 0221-RY6 notebook [32]





#### Remark

So far, no detailed specifications are available for the Huron and Conesus processors that would allow deciding whether they are Griffin-based or not.

Nevertheless, it can be assumed that the processors aimed at ultra portable notebooks,

- Conesus (2C, 2x512 KB, 18 W) and
- Huron (1C, 256 KB, 8-15 W),

are lower-clocked, downsized versions of the processors aimed at value notebooks,

- Lion (Athlon X2 mobile QL-6x: 2C, 2x512 KB, 35 W) and
- Sable (1C, 512 KB, 25 W).

#### **Block diagrams of the Conesus and Huron dies**





## 3. The 12h Llano Family

- 3.1 Introduction to the 12h Llano Family
- 3.2 The microarchitecture of the Family 12h Llano lines
- 3.3 Microarchitecture enhancements aiming at increasing performance
- 3.4 Microarchitecture enhancements aiming at reducing power consumption
- 3.5 Family 12h (Llano)-based Fusion desktop lines
- 3.6 AMD's Family 12h (Llano)-based Fusion mobile lines

# 3.1 Introduction to the 12h Llano Family

## 3.1 Introduction to the 12h Llano Family [4]

- Introduced: 6/2011.
- This processor family belongs to the Fusion APU (Accelerated Processing Unit) series. Fusion APUs include two to four x86 CPU cores and a GPU to accelerate vision computing (graphics and media).
- Family 12h Llano processors do not include an L3 cache, but typically include a GPU.
- Family 12h Llano covers desktop and mobile lines, as shown later.
- Processors of the Llano lines have up to 4 CPU cores and a GPU.
  Nevertheless, AMD also sells Llano-based desktop lines with disabled GPUs.
  These lines are branded as Athlon II X4/X2 or Sempron lines.
- 32 nm technology, 228 mm<sup>2</sup>, 1450 mtrs.

#### **Overview of AMD's K10 - Family 16h based processor lines** [2]



#### AMD's Family 12h Llano-based desktop and mobile lines – Overview-2 [2]



## Brand names of Family 12h (Llano)-based desktop and notebook lines

|                                 | Launched in                      | 2008-2009                                                                                | 2011                            |
|---------------------------------|----------------------------------|------------------------------------------------------------------------------------------|---------------------------------|
|                                 |                                  | Family 11h<br>(Griffin)                                                                  | Family 12h<br>(Llano)           |
| Servers                         | 4P servers                       |                                                                                          |                                 |
|                                 | 2P servers                       |                                                                                          |                                 |
|                                 | 1P servers                       |                                                                                          |                                 |
|                                 | (85-140 W)                       |                                                                                          |                                 |
| Desktops                        | <b>High perf.</b><br>(~95-125 W) |                                                                                          |                                 |
|                                 | Mainstream<br>(~65-100 W)        |                                                                                          | Llano A8/A6/A4/E2<br>Sempron X2 |
|                                 | <b>Entry level</b><br>(40-60 W)  |                                                                                          |                                 |
| Notebooks                       | High perf.<br>(~30-60 W)         | Turion X2 Ultra (ZM-xx)<br>Turion X2 (RM-xx)                                             | Llano A8 M                      |
|                                 | Mainstream/Entry<br>(~20-30 W)   | Athlon X2 (QL-xx)<br>Sempron (SI-xx)                                                     | Llano A6/A4/E2 M                |
|                                 | Ultra portable<br>(~10-15 W)     | Turion Neo X2 (L6xx)<br>Turion X2 (RM-xx)<br>Athlon Neo X2 (L3xx)<br>Sempron (200U/210U) |                                 |
| Tablet (~5 W)                   |                                  |                                                                                          |                                 |
| <b>Embedded</b><br>(~10 - 20 W) |                                  | Turion Neo X2 (L6xx)<br>Athlon Neo X2 (L3xx)<br>Sempron (200U/210U)                      |                                 |


### K12 (Llano)-based processor lines – Overview-3 [11]



#### **Desktop/mobile Fusion APU models**

• Lynx desktop platform (65/100 W)

**Fusion A/E2 series** (6/2011-12/2011) 2/4 C, 512 KB/1 MB per C, GPU, 65/100 W

Turbo Core on select desktop models

• Sabine mobile platform (35-45 W)

**Fusion A/E2 M/MX series** (6/2011-12/2011) 2/3/4 C, 512 KB/1 MB L2 per C, GPU, 35/45 W

Turbo Core on all mobile models

**GPU-less desktop models** 

Lynx desktop platform (65/100 W)

- Athlon II X4 6xx (8/2011-2/2012): 4C, 1 MB/C, 65/100 W
- Athlon II X2 221 (8/2011): 2C, 512 KB/C, 65 W
- Sempron 198 (8/2011): 2C, 512 KB/C, 65 W

GPU disabled, no Turbo Core

Socket FM1

## The Vision branding [4]

Desktop and mobile lines based on Fusion APUs are categorized into four Vision brands (E2, A4, A6 and A8 series) according to their multimedia and graphics capabilities, as follows:

- E2 series: Skip-free HD playback (smooth 1080p High-Definition playback)
- A4 series: Easy photo editing
- A6 series: Fast video editing
- A8 series: Ultra-realistic 3D gaming

### **Example: AMD's Llano-based A-series mobile lines** [9]

| Model<br>(35 W) | x86<br>Cores | L2<br>Total | Radeon™<br>Cores | Base<br>Frequency | Boost<br>Frequency |
|---|---|---|---|---|---|
| A8 | 4 | 4 MB | 400 | 1.5 GHz | 2.4 GHz |
| A6 | 4 | 4 MB | 320 | 1.4 GHz | 2.3 GHz |
| A4 | 2 | 2 MB | 260 | 1.9 GHz | 2.5 GHz |

## Remark [10]

In fact, AMD introduced the Vision branding already in 2009 to characterize the visual capabilities of their mainstream laptops, differentiating between the

- Vision Basic,
- Vision Premium,
- Vision Ultimate and
- Vision Black

brands.

To avoid consumer confusion AMD redefined their Vision branding in 6/2011, along with the introduction of the Llano based desktops and laptops, as stated above.

# 3.2 The microarchitecture of the Family 12h Llano lines


## 3.2 The microarchitecture of the Family 12h Llano lines [12]

- Up to 4 Stars-32nm x86 Cores
  - 1MB L2 cache/core
- Integrated Northbridge
- 2 Chan of DDR3-1866 memory
- 24 Lanes of PCIe® Gen2
  - x4 UMI (Unified Media Interface)
  - x4 GPP (General Purpose Ports)
  - x16 Graphics expansion or display
- 2 x4 Lanes dedicated display
- 2 Head Display Controller
- UVD (Unified Video Decoder)
- 400 AMD Radeon<sup>™</sup> Compute Units
- GMC (Graphics Memory Controller)
- FCL (Fusion Control Link)
- RMB (AMD Radeon<sup>™</sup> Memory Bus)
- 227mm<sup>2</sup>, 32nm SOI
- 1.45BN transistors




### **The microarchitecture of Llano's CPU cores** [9]



## The block diagram of the GPU [12]




### **Core numbers in different Llano-based A-series mobile lines** [9]



### The AMD Radeon VLIW-5 core [12]

- Includes
  - 4 Stream Cores
  - 1 Special Functions Stream Core
  - Branch Unit
  - General Purpose Registers
- 4 Stream Cores are capable of
  - 4 32-bit FP MULADD per clock
  - 4 24-bit Int MUL or ADD per clock
  - 2 64-bit FP MUL or ADD per clock
  - 1 64-bit FP MULADD per clock
- Additional special function core
  - 1 32b-FP MULADD per clock
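The per-clock figures above translate into a peak single-precision throughput estimate. The sketch below assumes that the quoted count of 400 Radeon cores includes all five lanes of each VLIW-5 unit (4 stream cores plus the special function core) and counts one MULADD as two floating-point operations; the 400 and 600 MHz clocks are the ends of the GPU clock range quoted elsewhere in these slides.

```python
LANES_PER_VLIW5 = 5      # 4 stream cores + 1 special function stream core
FLOPS_PER_MULADD = 2     # one multiply plus one add


def peak_fp32_gflops(radeon_cores, gpu_clock_mhz):
    """Peak FP32 throughput in GFLOPS.

    Assumption: every lane issues one 32-bit FP MULADD per clock.
    """
    vliw5_units = radeon_cores // LANES_PER_VLIW5
    muladds_per_clock = vliw5_units * LANES_PER_VLIW5
    return muladds_per_clock * FLOPS_PER_MULADD * gpu_clock_mhz / 1000


for clock_mhz in (400, 600):
    gflops = peak_fp32_gflops(400, clock_mhz)
    print(f"400 Radeon cores @ {clock_mhz} MHz: {gflops:.0f} GFLOPS peak")
```

This is an upper bound: it is reached only if the compiler manages to fill all five lanes of every VLIW-5 instruction in every clock.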




## I/O and display capabilities of the Llano processor [12]



- 5 x 8 Controller (5 devices and 8 Lanes)
  - x4 for UMI, 4 lanes for GPP
- 2 x 16 Controller (2 devices and 16 Lanes)
  - x16 for GFX expansion -or-
  - Up to 4 x4 DP links
- PHY lanes can be bifurcated into multiple engines/links
- Each 'engine' has independent link-frequency & linkwidth control
- Highly configurable lane allocation to support varied platform topologies



GPP: General Purpose Port

UMI: Unified Media Interface (Connection to Fusion Control Hub)

## Llano's microphotograph [13]

### Llano

- 32 nm technology,
- 228 mm<sup>2</sup>, 1450 mtrs.

## Die plot of the Llano processor [12]




# Die plot of a single Llano core [14]

(L2 cache not included)





# 3.3 Microarchitecture enhancements aiming at increasing performance

### 3.3 Microarchitecture enhancements aiming at increasing performance [15]

# 32NM "STARS" CPU CORE

#### 32nm

- Up to four x86 cores and 4x1MB L2
- >6% IPC improvement from previous x86 generation
  - Larger L2
  - Improved HW prefetcher (IP-based)
  - Bigger window size (larger reorder and load/store buffers)
  - Hardware divider
  - More...
- AMD Turbo Core support



35 & 45 W notebooks, 65 & 100 W desktops; 1.4-2.9 GHz CPU; 400-600 MHz GPU

### a) Improved HW prefetcher (IP-based)

Enhanced vs. the K10.5 data prefetcher. It will not be discussed here; details can be found in [12].

### b) Larger instruction window sizes-1 [12]

- To increase ILP (Instruction Level Parallelism) Llano has
  - 12 more micro-ops in the reorder buffer (84 micro-ops in total) and
  - 6 more micro-ops in the reservation stations (30 micro-ops in total)




## b) Larger instruction window sizes-2 [12]

- To compensate for the lack of an L3 cache, Llano has
  - an increased number of load/store queue entries.

### Remark

There is some confusion about the number of additional load/store queue entries:

- according to [12] there are 6 more load/store queue entries, whereas
- according to [16] Llano has 44 more load/store queue entries.




### c) Hardware FX divider (IDIV)




### Performance increase of AMD's A8 APU vs. Phenom II X4 [12]

resulting from

- Improved HW prefetcher
- Larger instruction windows
- Hardware FX divider




### Conceptual difference between AMD's Fusion APUs and Intel's Sandy Bridge CPUs [17]

- Llano shows that AMD puts more emphasis on GPU performance than on CPU performance and accordingly devotes more silicon area to graphics and media than to x86 computing.
- This is in sharp contrast to Intel's position, as Sandy Bridge devotes more emphasis and silicon area to x86 computing than to graphics and media.



### Note

A comparison of both die plots reveals that for AMD GPU performance is more important than CPU performance, meaning that the CPU is "fast enough" for the vast majority of desktop and mobile configurations.

All in all, Intel's Sandy Bridge offers higher x86 performance, while AMD's Llano offers better graphics and multimedia performance [17].

## d) Turbo Core

## **Principle of operation-1**

Chip-level power consumption varies over time, as both

- the number of active cores and
- the "loading", and thus the actual power consumption, of an active core

vary while running an application, as shown below.

## Number of active cores while running different kinds of applications [18]



Actual power consumption of a core vs TDP while running SPEC 2006 [18] -1




### Actual power consumption of a core vs TDP while running SPEC 2006 [18] -2

- According to [12], common workloads consume no more than 60-70 % of the available TDP.
- It follows that with an accurate real-time power consumption estimation it is feasible to exploit the 20-40 % gap between the TDP and the power consumption of the running application.

### Principle of operation-2 [14]

- When chip-level power consumption, averaged on a ms time scale, is less than the TDP, the remaining chip-level power headroom can be utilized for boosting the frequency of the active cores.

  In this way common workloads with lower power consumption can run faster.

- Improved clock gating and power gating (to be discussed later) increase the available power headroom considerably by reducing the power consumption of inactive cores or inactive parts of the processor.

### Notes

- 1) The Turbo Core technology, as implemented in AMD's Llano processors, is already the second, radically redesigned implementation of AMD's Turbo Core technology, which was introduced in the K10.5 Istanbul-based desktop line (Thuban).
- 2) The implementation of AMD's Turbo Core technology is based on a patent of Naffziger [19].


# Principle of the implementation of the Turbo Core technology in Llano-1 (Simplified) [12]

Power consumption of the chip ($P_{CH}$) is determined essentially by a digital power monitor.

$$P_{CH} = \sum_i P_{CPUi} + P_{GPU} + P_{RCP}$$

with

- $P_{CPUi}$: Power consumption of CPU core i
- $P_{GPU}$: Power consumption of the GPU
- $P_{RCP}$: Power consumption of the rest of the chip




# Principle of the implementation of the Turbo Core technology in Llano-2 (Simplified) [18]



The power consumption of the cores is determined by

$$P_{CPUi} = P_{Leakage} + C_{ac} \times V^2 \times f_c$$

with

- $P_{Leakage}$: Leakage power consumption (determined by silicon testing)
- $C_{ac}$: Switching capacitance (estimated on-line as a sliding average, as described next)
- $V$: Actual core voltage (known value)
- $f_c$: Actual core clock frequency (known value)
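The core power model above can be tried out numerically. The following is a minimal sketch, not AMD's implementation; the function name and all operating-point values (leakage, capacitance, voltage, frequency) are assumptions chosen only to show the quadratic dependence on voltage.

```python
def core_power(p_leakage_w, c_ac_f, v_core, f_core_hz):
    """Estimated core power in watts: leakage plus C_ac * V^2 * f_c."""
    return p_leakage_w + c_ac_f * v_core ** 2 * f_core_hz


# Hypothetical operating point (illustrative values, not Llano data).
P_LEAKAGE_W = 1.0      # leakage power in W
C_AC_F = 1.0e-9        # effective switching capacitance in F
F_CORE_HZ = 2.0e9      # core clock frequency in Hz

p_nominal = core_power(P_LEAKAGE_W, C_AC_F, 1.0, F_CORE_HZ)
p_boosted = core_power(P_LEAKAGE_W, C_AC_F, 1.2, F_CORE_HZ)
print(f"{p_nominal:.2f} W at 1.0 V, {p_boosted:.2f} W at 1.2 V")
```

With these assumed values a 20 % voltage increase raises the dynamic part by a factor of 1.44, which is why frequency boosting that requires a higher voltage consumes power headroom quickly.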

# **Estimation of Cac (Switching capacitance)**

Per core power monitoring circuitry samples a comprehensive set of 95 signals [14].
Most of these signals are clock gater enable signals that determine whether downstream circuitry should be active or not, since the downstream clock gated circuitry either contributes to the switching capacitance of the core or not [19].



Power monitor signal locations and signal count for each core unit [14] (Some circles represent more than one signal in a given location)



# Calculation of the momentary power consumption of a core P<sub>CPUi</sub>

Let's designate the signals belonging to core i by $s_{ij}$, j = 0, 1, ..., 94.

Furthermore, let's assume for a moment that all signals of core i are sampled at the same time at regular time intervals $t_s$, and let's consider all sampled signal values obtained at a time $t^k$: $s_{ij}^{k}$.

Note that all signals are digital, so their sampled values are either 0 or 1.

Then the momentary power consumption of core i at the sampling time $t^k$ can be estimated as

$$P_{CPUi}^{k} = \sum_{j} s_{ij}^{k} \times w_{ij}$$

with

$w_{ij}$: weights representing the contribution of the circuitry associated with the sampled signal $s_{ij}$
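The weighted sum can be illustrated with a short sketch. It assumes 95 signals per core as stated above; the uniform weights (given in mW to keep the arithmetic exact) and the sample pattern are hypothetical, since the real weights come from post-silicon characterization.

```python
NUM_SIGNALS = 95  # signals sampled per core, as stated in the text


def momentary_core_power(samples, weights):
    """Weighted sum of the 0/1 signal samples of one core at one sampling time."""
    if len(samples) != len(weights):
        raise ValueError("samples and weights must have the same length")
    if any(s not in (0, 1) for s in samples):
        raise ValueError("signals are digital; samples must be 0 or 1")
    return sum(s * w for s, w in zip(samples, weights))


# Hypothetical data: every signal contributes 10 mW when active,
# and every second signal is active at this sampling time.
weights_mw = [10] * NUM_SIGNALS
samples = [1 if j % 2 == 0 else 0 for j in range(NUM_SIGNALS)]

p_core_mw = momentary_core_power(samples, weights_mw)
print(p_core_mw, "mW")
```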

### Principle of determining the weights

The weights are determined based on extensive post-silicon characterization rather than relying on pre-silicon power consumption models [14].

The related algorithm is [19]:



In the assumed solution a sliding average of a large enough number of momentary power consumption values (e.g. a few hundred samples) would yield a good enough estimate of the power consumption of the considered core in the given time interval.

- Nevertheless, sampling a large number of signals in parallel would require routing a large number of high-speed signals. As actual thermal time frames are in the order of ms, a much simpler serial sampling of the signals can be implemented instead.
- The implemented Digital Monitor is accurate to within 2 % across a broad range of application types [14].
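The sliding average mentioned above can be sketched as follows. This is only an illustration of the averaging step; the class name is hypothetical, and the window of 4 samples is chosen for readability, whereas the text suggests a few hundred samples.

```python
from collections import deque


class SlidingAverage:
    """Average of the most recent `window` momentary power values."""

    def __init__(self, window):
        if window <= 0:
            raise ValueError("window must be positive")
        self._values = deque(maxlen=window)
        self._total = 0.0

    def update(self, value):
        # Drop the oldest value from the running total once the window is full.
        if len(self._values) == self._values.maxlen:
            self._total -= self._values[0]
        self._values.append(value)  # deque discards the oldest entry itself
        self._total += value
        return self._total / len(self._values)


avg = SlidingAverage(window=4)
estimates = [avg.update(p) for p in [2, 4, 6, 8, 10]]
print(estimates)
```

Keeping a running total avoids re-summing the whole window on every sample, which matters when the window holds several hundred entries.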

# Simplified layout of the digital power monitoring system [12]



# **Operation of the Turbo Core Manager (TCM)** [12]

(Simplified)

TCM calculates the energy margin (EM) both at the chip level ( $EM_{CP}$ ) and at the level of CPUs ( $EM_{CPUi}$ )

$$EM_{CP} = TDP_{CP} - P_{CH}$$

with

 $P_{CH} = \sum P_{CPUi} + P_{GPU} + P_{RCP}$ 

- TDP<sub>CP</sub>: Chip-level TDP
- P<sub>CH</sub>: Power consumption of the chip
- P<sub>CPUi</sub>: Power consumption of the CPUi
- P<sub>GPU</sub>: Power consumption of the GPU
- P<sub>RCP</sub>: Power consumption of the rest of the chip

$$EM_{CPUi} = TDP_{CPUi} - P_{CPUi}$$

with

TDP<sub>CPUi</sub>: CPU-level TDP

#### Interpretation of the energy margins

- Positive margins indicate power headroom.
- Negative margins indicate power overage.
- Power headroom can be utilized to increase the clock frequency.
- Power overage needs to initiate throttling (clock reduction).

If there is available power headroom, CPU clock frequencies can be increased. However, the GPU frequency cannot be boosted, even if the CPU cores are not fully utilized or are inactive.
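The margin calculation can be sketched in a few lines. All power and TDP values below are hypothetical, the function names are illustrative, and treating a margin of exactly zero as "hold" is an assumption not taken from the sources. How the TCM combines the chip-level and core-level margins is not described here, so the sketch only evaluates them separately.

```python
def chip_power(p_cpus, p_gpu, p_rest):
    """P_CH: sum of the core powers plus GPU and rest-of-chip power."""
    return sum(p_cpus) + p_gpu + p_rest


def energy_margin(tdp, power):
    """EM = TDP - P; positive means headroom, negative means overage."""
    return tdp - power


def action_for(margin):
    if margin > 0:
        return "boost"
    if margin < 0:
        return "throttle"
    return "hold"  # assumption: no frequency change at exactly zero margin


# Hypothetical values in watts.
p_cpus = [10, 10, 2, 2]
p_ch = chip_power(p_cpus, p_gpu=20, p_rest=6)
em_chip = energy_margin(65, p_ch)                 # chip-level TDP assumed 65 W
em_cores = [energy_margin(9, p) for p in p_cpus]  # per-core TDP assumed 9 W

print(em_chip, action_for(em_chip))
print([action_for(em) for em in em_cores])
```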

## Example for the operation of the Turbo Core technology [12]

- Within the TDP limit power can be traded between CPU and GPU
  - GPU performance is prioritized
- Can allow combined CPU/GPU thermal allowance which exceeds TDP of the part
  - If temp exceeds thermal limit associated with CPU allowance, the CPU performance is reduced to fall back within TDP envelope
  - Particularly useful in low-ambient conditions




### Utilizing idle CPU cores or an idle GPU as a heat sink [12]

As idle CPU cores or an idle GPU core behave like a heat sink, idle cores increase the TDP limit of active cores by a factor called the Power Density Multiplier (PDM).

Its value depends on the topology of the idle cores and the ambient temperature.

Note: The Turbo Core Manager (TCM) does not raise the TDP limit of the GPU.
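The effect of the PDM on the margin of an active core can be sketched as below. The PDM value of 1.25 and the power figures are hypothetical; the sources only state that the value depends on the topology of the idle cores and the ambient temperature. The check that the PDM is at least 1 is likewise an assumption, based on the statement that idle cores increase the limit.

```python
def effective_cpu_tdp(tdp_cpu_w, pdm):
    """TDP limit of an active core after scaling by the Power Density Multiplier."""
    if pdm < 1.0:
        raise ValueError("assumed here: idle neighbours never lower the limit")
    return tdp_cpu_w * pdm


# Hypothetical case: an active core drawing 9 W against an 8 W core-level limit.
p_core_w = 9.0
margin_all_busy = effective_cpu_tdp(8.0, 1.0) - p_core_w    # no idle neighbours
margin_with_idle = effective_cpu_tdp(8.0, 1.25) - p_core_w  # idle neighbours act as heat sink

print(margin_all_busy, margin_with_idle)
```

In this example the same core goes from a power overage to a power headroom purely because neighbouring units are idle.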

Examples for operating the Turbo Core technology in case of idle CPU/GPU cores

# Using GPU power cooling to boost CPU performance [12]



Performance improvement achieved by AMD's Turbo Core [12]



AMD Turbo Core technology maximizes performance on low-thread count apps

# 3.4 Microarchitecture enhancements aiming at reducing power consumption

### **Overview**

- a) Power-aware clock grid design
- b) Power gating
- c) Core C6 state (CC6 state)
- d) Package C6 state (PC6 state)

# a) Power-aware clock grid design [20], [19]

# Use of clock gating

- Recent processors, like Llano, make use of clock gating instead of a straightforward clocking system.
- Clock gating aims at reducing power consumption by disabling the clocking of circuitry that is temporarily not needed.
- It is implemented by enabling/disabling the clocking of a certain part of a circuitry by a control signal, called the Clock Gater Enable signal (Clk Enable).
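The effect of the Clk Enable signal can be illustrated with a purely behavioural sketch. It models the gater as a simple AND of clock and enable, which is a simplification; real clock gating cells typically include a latch so that the enable cannot produce glitches. The waveforms are hypothetical.

```python
def gated_clock(clk, enable):
    """Behavioural model of a clock gater: the clock passes only while enabled."""
    return [c & e for c, e in zip(clk, enable)]


def count_transitions(wave):
    """Number of signal toggles; each toggle charges or discharges capacitance."""
    return sum(1 for a, b in zip(wave, wave[1:]) if a != b)


clk = [0, 1] * 4                    # free-running clock, 4 cycles
enable = [1, 1, 1, 1, 0, 0, 0, 0]   # downstream circuitry idle in the second half
gated = gated_clock(clk, enable)

print(gated, count_transitions(clk), count_transitions(gated))
```

Fewer toggles on the gated clock mean less switched capacitance downstream, which is the source of the dynamic power saving.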



# The clock grid-1 [14], [20]

- The clock signal has to be connected to every part of a chip with a low skew (say a few %).
- The typical solution for that is using a clock grid implemented as a metal grid.
- This clock grid needs to be combined with the clock gaters that are responsible for disabling momentarily not needed parts of the microprocessor, as shown below. [14]



Furthermore, delay buffers are often needed to align skew [14].



# The clock grid-2 [21], [14]

- In a large microprocessor over 30 % of the total power consumption can be used to drive the clock grid.
- Designers of Llano
  - restricted the worst-case skew to less than 3 % of the cycle time, and
  - by a careful design of the clock gater placement they achieved a large-scale depopulation of the global clock grid metal and of the clock buffers, as shown in the next figure.

# Clock grid design before and after depopulation [21]



By depopulating the clock grid designers achieved

- ~80 % reduction in clock grid metal capacitance and
- ~50 % reduction in the number of clock buffers.

As a result, the clock grid of Llano consumes less than 10 % of the dynamic power.

# b) Power gating

# Principle of power gating [22], [23]

- Power gating means switching off the power supply for dedicated units of a die (like a core) entirely by switching transistors.
- It can virtually eliminate the leakage component of the power consumption.



- In the Llano processor all major units, like CPU cores, the GPU core, memory controller etc. can be power-gated individually.

# Implementation of power gating-1 [24], [22], [14]

- Llano's power gating implementation is based on Kosonocky's patent application [24].
- In the Llano processor a unit, like a core is power-gated by isolating the ground plane (GND) from the rest of the die.

Then for the power-gated unit the ground plane (GND) is substituted by a virtual ground plane (VGND) that supplies the die [22].



### **Implementation of power gating-2** [14]

- The ground plane (GND) carries the voltage VSS, whereas the virtual ground plane carries the voltage VSSCORE.
- Both planes are connected by a ring of NFET switching transistors, that are controlled by the Ring Control.



If switched on, both planes are connected by a small resistance of about 1.1 mΩ. If switched off, both planes are disconnected and effectively no leakage current can flow.

### Remarks on the implementation of power gating in Llano-1 [14], [23]

- 1) Power-gating VSS (ground) is a design decision; VDD could have been chosen for power gating instead.
- 2) In Llano the power gates are arranged on one side (on the left side in the Figure) in zigzag pattern rather than along a line to increase contact density.

#### Note

The switching transistors (power gates) connect/isolate the VSS plane (red squares) to/from the VSSCORE plane (black squares).



### Remarks on the implementation of power gating in Llano-2 [14], [23]

3) The NFET switching transistors in fact consist of two switching transistors each, termed the Small FET and the Large FET.



On power-on, first the Small FETs are enabled allowing the core to power up while limiting the di/dt transient current to reduce effects on neighboring circuits.

Subsequently, the Large FETs are also enabled to provide a low-resistance path (~1.1 mΩ) to VSS for core operation.

# Remarks on the implementation of power gating in Llano-3 [25]

4) AMD's Bulldozer makes use of the same technology to implement power gating [25]



### Use of power gating beyond CPU cores and L2 caches in Llano [12]

In Llano AMD power-gates virtually all units, such as

- the Graphics Core (GFX)
- the Graphics Memory Control (GMC) (when memory is in self-refresh)
- the Unified Video Decoder (UVD)
- the x16 PCIe Graphics Expansion Controller.



AON: Always on – refers to logic that is not power gated

The GFX and GMC units have dynamic (hardware) power gating whereas power gating of the other units is static, i.e. done under driver control.

The effect of power gating the UVD (Unified Video Decoder) and GFX (GPU) units [18]





# c) Core C6 state (CC6 state) [23], [12]

#### **Preconditions for entering the CC6 state-1**

- The CC6 state is initiated when the OS requests the Halt state or a deeper C-state.
- The CC6 state is, however, a deeper C-state with the penalty of long entry and exit times in the order of tens of µs. The reason is that before entering the CC6 state, i.e. before powering down the core, the architectural state of the core and the contents of the L2 cache need to be flushed to main memory, and this context needs to be restored before resuming execution.

### **Preconditions for entering the CC6 state-2**

- There is an energy cost for saving and restoring the states of the core and its L2.
- On the other hand interrupts may wake up the system requesting a restoration of the saved context of the core and L2.

Obviously, activating the CC6 state in the presence of a high interrupt activity may lead to a performance degradation and also to a reduction of the performance/power ratio due to the resulting longer interrupt service times.

- Therefore, the Power Management Controller (PMC) of Llano maintains a set of activity monitors, like the Interrupt Rate Tracker (IRT).
  - The IRT increments its counter when an interrupt arrives and decrements it at regular, presettable time intervals.

    Thus the content of the IRT counter represents the actual interrupt activity.

- The PMC initiates a CC6 state entry if the OS requests the Halt state or a deeper C-state and the interrupt activity is lower than a given threshold; otherwise the PMC initiates a low-power C-state with clock gating.
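The interplay of the IRT counter and the CC6 decision can be sketched as follows. Class, method, and state names are hypothetical, as is the threshold of 3; letting the counter saturate at zero is an assumption, since the sources do not describe the behaviour at the lower bound.

```python
class InterruptRateTracker:
    """Counter that rises with interrupts and decays at regular intervals."""

    def __init__(self):
        self.count = 0

    def on_interrupt(self):
        self.count += 1

    def on_interval_elapsed(self):
        # Assumption: the counter saturates at zero instead of going negative.
        if self.count > 0:
            self.count -= 1


def select_idle_state(os_requests_halt_or_deeper, irt, threshold):
    """Pick CC6 only when the recent interrupt activity is below the threshold."""
    if not os_requests_halt_or_deeper:
        return "running"
    if irt.count < threshold:
        return "CC6"
    return "clock-gated C-state"


irt = InterruptRateTracker()
for _ in range(5):                 # burst of interrupts
    irt.on_interrupt()
busy_choice = select_idle_state(True, irt, threshold=3)

for _ in range(3):                 # quiet period: three decay intervals pass
    irt.on_interval_elapsed()
quiet_choice = select_idle_state(True, irt, threshold=3)

print(busy_choice, quiet_choice)
```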





## The sequences of entering and exiting the CC6 state [14], [12]

#### **CC6 entry sequence**

- L1 and L2 caches are flushed to system memory
- The micro-architectural state of the core is saved in system memory
- CPU clocks are stopped
- The PLL of the CPU is powered down
- The core is powered down by isolating VSSCORE from VSS through power gating.

#### **CC6 exit sequence**

- Power up and lock core PLL
- Power up the core by switching on the power gates (reconnecting VSSCORE to VSS)
- Set up clock frequency
- Run reset microcode
- Restore L1 and L2 from the system memory
- Restore the micro-architectural state of the core from the system memory
- Now the core is ready to resume execution

### d) Package C6 state (PC6 state) [12], [23]

### Entry into the PC6 state-1

- When all cores are already power-gated, the PMC can consider reducing VDD to 0 V.
- As PC6 transitions introduce an extra 100-150 µs interrupt service latency, due to the need to ramp up VDD to the operational level after waking up from the PC6 state, the PMC makes use of another set of activity monitors, called the Power gating monitors or PC6 monitors, to decide whether or not a transition to the PC6 state is likely to be beneficial.

#### State transition diagram for the Package C6 state (PC6 State) [23]



### Power gating monitors [12]

- CC6 exit requires approximately 30µs
- PC6 exit can be >100µs (due to VRM power supply restoration)
- Power Gating Policy Control Logic monitors system activity to identify when power gating entry/exit latency will adversely affect performance







### Entry into the PC6 state-2

- If the PMC decides that a transition to the PC6 state would contribute to power saving, it initiates the entry into the PC6 state and removes VDD.
- Otherwise it lets the core enter the Pmin state.

In the Pmin state PMC reduces VDD in order to further lower leakage in the power gating structures and the remaining AON (Always ON) circuitry.
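The choice between PC6 and Pmin can be sketched as a simple decision function. The sources do not describe how the PC6 monitors judge a transition as beneficial, so the criterion used here (predicted idle time of at least ten times the exit latency) is purely an assumption for illustration, as are the function and state names. The default exit latency of 100 µs follows the figure quoted above.

```python
def select_package_state(all_cores_power_gated, predicted_idle_us,
                         pc6_exit_latency_us=100, benefit_factor=10):
    """Choose the package-level idle state once the cores are power-gated.

    Hypothetical criterion: PC6 pays off only if the predicted idle time is
    much longer than the latency added by ramping VDD back up.
    """
    if not all_cores_power_gated:
        return "not eligible"
    if predicted_idle_us >= benefit_factor * pc6_exit_latency_us:
        return "PC6"   # VDD removed
    return "Pmin"      # VDD reduced, but not removed


long_idle = select_package_state(True, predicted_idle_us=5000)
short_idle = select_package_state(True, predicted_idle_us=200)
cores_busy = select_package_state(False, predicted_idle_us=5000)

print(long_idle, short_idle, cores_busy)
```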

### **PC6 State residency while running applications** [23]

### PC6 (Package C6) APU State Residency

|     | BluRay | Youtube | Win Idle | 3DMark06 |
|-----|--------|---------|----------|----------|
| PC6 | 37.98% | 37.13%  | 98%      | 26.7%    |

### Reduction of power consumption while utilizing the PC6 state [14]



(The measurement is taken with one core active and three with the power gate ring activated. The Figure shows the mean power reduction on more than 1300 die from multiple lots).

### Note [18]

- Introducing power gating has a significant impact on power consumption.
- Without power gating, idle cores or units are clock gated.
   Clock gating, however, does not reduce leakage currents, so with clock gating alone a considerable dissipation remains due to leakage currents.
- Power gating and the associated CC6 and PC6 states are also prerequisites for an efficient Turbo Core technology. Without these power saving techniques the power headroom is typically too small to activate the Turbo Core technology effectively, i.e. in this case the Turbo Core technology cannot contribute a noteworthy performance boost.

# 3.5 Family 12h (Llano)-based desktop lines

|                                 | Launched in                      | 2008-2009                                                                                | 2011                            |
|---------------------------------|----------------------------------|------------------------------------------------------------------------------------------|---------------------------------|
|                                 |                                  | Family 11h<br>(Griffin)                                                                  | Family 12h<br>(Llano)           |
| Servers                         | 4P servers                       |                                                                                          |                                 |
|                                 | 2P servers                       |                                                                                          |                                 |
|                                 | 1P servers<br>(85-140 W)         |                                                                                          |                                 |
| Desktops                        | <b>High perf.</b><br>(~95-125 W) |                                                                                          |                                 |
|                                 | Mainstream<br>(~65-100 W)        |                                                                                          | Llano A8/A6/A4/E2<br>Sempron X2 |
|                                 | <b>Entry level</b><br>(40-60 W)  |                                                                                          |                                 |
| Notebooks                       | High perf.<br>(~30-60 W)         | Turion X2 Ultra (ZM-xx)<br>Turion X2 (RM-xx)                                             | Llano A8 M                      |
|                                 | Mainstream/Entry<br>(~20-30 W)   | Athlon X2 (QL-xx)<br>Sempron (SI-xx)                                                     | Llano A6/A4/E2 M                |
|                                 | Ultra portable<br>(~10-15 W)     | Turion Neo X2 (L6xx)<br>Turion X2 (RM-xx)<br>Athlon Neo X2 (L3xx)<br>Sempron (200U/210U) |                                 |
|                                 | Tablet (~5 W)                    |                                                                                          |                                 |
| <b>Embedded</b><br>(~10 - 20 W) |                                  | Turion Neo X2 (L6xx)<br>Athlon Neo X2 (L3xx)<br>Sempron (200U/210U)                      |                                 |


### Family 12h (Llano)-based desktop lines – Overview [11]



#### **Desktop Fusion APU models**

Lynx desktop platform (65/100 W)

- Fusion A/E2 series (6/2011-12/2011), 2/4 C, 512 KB/1 MB per C, GPU, 65/100 W

#### **GPU-less desktop models**

Lynx desktop platform (65/100 W)

- Athlon II X4 6xx (8/2011-2/2012), 4C, 1 MB/C, 65/100 W
- Athlon II X2 221 (8/2011), 2C, 512 KB/C, 65 W
- Sempron 198 (8/2011), 2C, 512 KB/C, 65 W

GPU disabled, no Turbo Core

Turbo Core on select desktop models

Socket FM1

### Main features of the Family 12h (Llano)-based Fusion A8 high-performance desktop line

| Base arch.                                       | Stepping    | Intro             | High perf. DT family | Series       | Techn. | Core count (up to) | L2 (up to) | L3 (up to) | Memory (up to)         | HT/dir. (up to)   | Socket      |
|--------------------------------------------------|-------------|-------------------|----------------------|--------------|--------|--------------------|------------|------------|------------------------|-------------------|-------------|
| K8                                               | CG          | 9/2003            | ClawHammer           | Athlon 64    | 130 nm | 1                  | 1 MB       | -          | DDR-400                | HT 2.0: 4.0 GB/s  | 754/939     |
|                                                  | E4          | 4/2005            | San Diego            | Athlon 64    | 90 nm  | 1                  | 1 MB       | -          | DDR-400                | HT 2.0: 4.0 GB/s  | 939         |
|                                                  | E6          | 5/2005            | Toledo               | Athlon 64 X2 | 90 nm  | 2                  | 2*1 MB     | -          | DDR-400                | HT 2.0: 4.0 GB/s  | 939         |
|                                                  | E2/E3       | 5/2006            | Windsor              | Athlon 64 X2 | 90 nm  | 2                  | 2*1 MB     | -          | DDR2-800               | HT 2.0: 4.0 GB/s  | AM2         |
| K10                                              | B2<br>B3    | 11/2007<br>3/2008 | Agena                | Phenom X4    | 65 nm  | 4                  | 4*½ MB     | 2 MB       | DDR2-1066              | HT 3.0: 8.0 GB/s  | AM2+        |
| K10.5                                            | C2<br>C2/C3 | 1/2009<br>2/2009  | Deneb                | Phenom II X4 | 45 nm  | 4                  | 4*½ MB     | 6 MB       | DDR2-1066<br>DDR3-1333 | HT 3.0: 8.0 GB/s  | AM2+<br>AM3 |
|                                                  | E0          | 4/2010            | Thuban               | Phenom II X6 | 45 nm  | 6                  | 6*½ MB     | 6 MB       | DDR2-1066<br>DDR3-1333 | HT 3.0: 8.0 GB/s  | AM3         |
| Fam. 11h (Griffin)                               |             | -                 | -                    | -            | -      | -                  | -          | -          | -                      | -                 | -           |
| <b>Fam. 12h</b><br>(Llano)                       |             | 6/2011            | Llano                | Fusion A8    | 32 nm  | 4                  | 4*1 MB     | -          | DDR3-1866              | UMI: 5 GT/s       | FM1         |
| Fam. 14h (Bobcat)                                |             | -                 | -                    | -            | -      | -                  | -          | -          | -                      | -                 | -           |
| <b>Fam. 15h</b><br>Models 00h-0Fh<br>(Bulldozer) |             | 10/2011           | Zambezi              | FX-series    | 32 nm  | 4 CM<br>(8 C)      | 4*2 MB/CM  | 8 MB       | DDR3-1866              | HT 3.1: 12.8 GB/s | AM3+        |
| Fam. 15h<br>Models 10h-1Fh<br>(Piledriver)       |             | 10/2012           | Vishera              | FX-series    | 32 nm  | 4 CM<br>(8 C)      | 4*2 MB/CM  | 8 MB       | DDR3-1866              | HT 3.1: 12.8 GB/s | AM3+        |
| No further Fam. 15h based lines                  |             | -                 | -                    | -            | -      | -                  | -          | -          | -                      | -                 | -           |

### Family 12h Llano-based desktop Fusion APU lines

|          | Platform segment | Series | Core count | L2      | DDR3      | GPU      | Llano (6/11)<br>228 mm<sup>2</sup>, 1450 mtrs<br>Turbo Core on select models<br>FM1 65/100W |
|----------|------------------|--------|------------|---------|-----------|----------|---------------------------------------------------------------------------------------------|
| Desktops | Mainstream       | A8     | 4C         | 4*1MB   | DDR3-1866 | HD 6530D | A8-38xx                                                                                     |
|          |                  | A6     | 4C         | 4*1MB   | DDR3-1866 | HD 6530D | A6-36xx                                                                                     |
|          |                  | A6     | 3C         | 3*1MB   | DDR3-1866 | HD 6530D | A6-35xx                                                                                     |
|          |                  | A4     | 2C         | 2*512KB | DDR3-1600 | HD 6410D | A4-3xxx                                                                                     |
|          |                  | E2     | 2C         | 2*512KB | DDR3-1600 | HD 6370D | E2-3200                                                                                     |

### Family 12h Llano-based GPU-less desktop lines

|          | Platform segment | Series    | Core count | L2      | DDR3      | Llano (8/11)<br>228 mm<sup>2</sup>, 1450 mtrs<br>GPU disabled, no Turbo Core<br>65/100W, FM1 |
|----------|------------------|-----------|------------|---------|-----------|----------------------------------------------------------------------------------------------|
| Desktops | Mainstream       | Athlon II | 4C         | 4*1MB   | DDR3-1866 | X4 6xx                                                                                       |
|          |                  | Athlon II | 2C         | 2*512KB | DDR3-1600 | X2 2xx                                                                                       |
|          |                  | Sempron   | 2C         | 2*512KB | DDR3-1600 | X2 198                                                                                       |


### Family 12h (Llano)-based desktop lines on AMD's desktop roadmap [26]






### AMD desktop platform roadmap [33]



### AMD's Llano-based mainstream desktop platform [17]



# 3.6 AMD's Family 12h (Llano)-based Fusion mobile lines


### Positioning of AMD's high performance Family 12h mobile APU lines

| Base arch./<br>stepping                    |        | Intro   | High perf.<br>mobile<br>family | Series              | Techn. | Core<br>count<br>(up to) | L2<br>(up to)                  | L3 | <b>Memory</b><br>(up to) | HT/ dir.<br>(up to)   | Sock<br>et |
|--------------------------------------------|--------|---------|--------------------------------|---------------------|--------|--------------------------|--------------------------------|----|--------------------------|-----------------------|------------|
| K8 | C0, CG | 9/2003 | Clawhammer | Mobile<br>Athlon 64 | 130 nm | 1 | 512 KB | - | DDR-400 | HT 1.0:<br>3.2 GB/s | 754 |
| | E5 | 3/2005 | Lancaster | Turion 64 | 90 nm | 1 | 1 MB | - | DDR-400 | HT 1.0:<br>3.2 GB/s | 754 |
| | F2 | 5/2006 | Trinidad | Turion 64<br>X2 | 90 nm | 2 | 2\*512 KB | - | DDR2-667 | HT 1.0:<br>3.2 GB/s | S1 |
| K10.5 | DA-C2 | 9/2009 | Caspian | Turion II | 45 nm | 2 | 2\*512 KB/<br>2\*1 MB<sup>1</sup> | - | DDR2-800 | HT 3.0:<br>7.2 GB/s | S1g3 |
| | DA-C3 | 5/2010 | Champlain | Turion X4 | 45 nm | 4 | 4\*512 KB | - | DDR3-1066 | HT 3.0:<br>7.2 GB/s | S1g4 |
| Fam. 11h<br>(Griffin) | B1 | 6/2008 | Lion | Turion X2<br>Ultra | 65 nm | 2 | 2\*512 KB/<br>2\*1 MB<sup>2</sup> | - | DDR2-800 | HT 3.0:<br>10.4 GB/s | S1g2 |
| Fam. 12h<br>(Llano) | B0 | 6/2011 | Llano<br>APU | Fusion A8 M | 32 nm | 4 | 4x1 MB | - | DDR3-1600 | - | FS1 |
| Family 15h<br>Mod. 10h-1Fh<br>(Piledriver) | | 5/2012 | Trinity<br>APU | A10-A4 M | 32 nm | 2 CM<br>(4C) | 2 MB/CM | - | DDR3-1600 | PCIe 2.0 x16<br>UMI | FS1r2 |
| Family 15h<br>Mod. 20h-2Fh<br>(Piledriver) | | 2/2013 | Richland APU | A10-A4 M | 32 nm | 2 CM<br>(4C) | 2 MB/CM | - | DDR3-1866 | PCIe 2.0 x16<br>???UMI | FS1r2 |
| Family 16h<br>(Jaguar) | | H1/2013 | Kabini APU<br>(SOC) | A/E Series | 28 nm | 2 CM<br>(4C) | 2 MB<br>shared | - | DDR3-1866 | PCIe ??? | BGA |

<sup>1</sup>: 2\*512 KB for Turion II, 2\*1 MB for Turion II Ultra

<sup>2</sup>: 2\*512 KB for Turion X2, 2\*1 MB for Turion X2 Ultra

UMI: Unified Media Interface
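The HT/dir. figures in the table above can be reproduced with simple arithmetic: peak bandwidth per direction = link clock × 2 (double data rate) × link width in bytes. The sketch below is only an illustration. The 16-bit link width and the link clocks (800, 1800 and 2600 MHz) are assumptions that are not stated in the table; they are chosen because they yield the listed 3.2, 7.2 and 10.4 GB/s values. The function name is hypothetical.

```python
def ht_bandwidth_gb_s(link_clock_mhz: float, link_width_bits: int = 16) -> float:
    """Peak HyperTransport bandwidth per direction, in GB/s.

    Data is transferred on both clock edges (double data rate),
    so the transfer rate is twice the link clock.
    """
    transfers_per_s = 2 * link_clock_mhz * 1e6   # DDR: two transfers per clock
    bytes_per_transfer = link_width_bits / 8     # assumed 16-bit link = 2 bytes
    return transfers_per_s * bytes_per_transfer / 1e9


# Assumed link clocks (not given in the table), 16-bit links
assumed_links = [
    ("HT 1.0 (K8 lines)", 800),
    ("HT 3.0 (Caspian, Champlain)", 1800),
    ("HT 3.0 (Lion, up to)", 2600),
]

for label, clock_mhz in assumed_links:
    print(f"{label}: {ht_bandwidth_gb_s(clock_mhz):.1f} GB/s per direction")
```

Under these assumptions the three link configurations match the per-direction values listed for the K8, K10.5 and Family 11h lines.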

### Family 12h Llano-based mobile APU lines

|         | Platform segment |    | Core<br>count | L2      | GPU      | 6/11<br>Llano<br>228 mm<sup>2</sup>, 1480 mtrs<br>DDR3L-1333<br>Up to DDR3-1600<br>Integrated PCIe 2.0 contr.<br>Turbo Core<br>35/45 W |
|---------|------------------|----|---------------|---------|----------|------------------------------------------------------------------------------------------------------------------------------------------|
| Mobiles | Performance      | A8 | 4C            | 4*1MB   | HD 6620G | A8-35xxM |
|         |                  | A6 | 4C            | 4*1MB   | HD 6520G | A6-34xxM |
|         | Mainstream       | A4 | 2C            | 2*1MB   | HD 6480G | A4-33x0M |
|         |                  |    | 2C            | 2*512KB | HD 6480G | A4-3305M |
|         | Essential        | E2 | 2C            | 2*512KB | HD 6380G | E2-3000M |

#### Family 12h Llano-based mobile APU lines on AMD's mobile roadmap [27]

| Platform<br>Segment | 2011 | 2012 | 2013 |
|---------------------|------|------|------|
| Performance | "Llano" APU - 35W & 45W<br>• 4 "Husky" CPU cores with "BeaverCreek" DirectX® 11 graphics<br>• Up to DDR3-1600 at 45W<br>• Up to DDR3-1333 at 35W<br>• FS1 uPGA package | "Trinity" A8-Series APU<br>• "Piledriver" CPU cores<br>• "London" graphics<br>• DDR3<br>• FS1r2 uPGA package | "Kaveri" APU<br>• "Steamroller" CPU cores<br>• Fusion graphics<br>• DDR3<br>• FS2 uPGA package |
| Mainstream | "Llano" APU - 35W & 45W<br>• 4 "Husky" CPU cores with "BeaverCreek" DirectX® 11 graphics<br>• Up to DDR3-1600 at 45W<br>• Up to DDR3-1333 at 35W<br>• FS1 uPGA package<br><br>"Llano" APU - 35W & 45W<br>• 2 "Husky" CPU cores with "WinterPark" DirectX® 11 graphics<br>• Up to DDR3-1333 at 35W & 45W<br>• FS1 uPGA package | "Trinity" A6 Series APU<br>• "Piledriver" CPU cores<br>• "London" graphics<br>• DDR3<br>• FS1r2 uPGA package<br><br>"Trinity" A4-Series APU<br>• "Piledriver" CPU cores<br>• "London" graphics<br>• DDR3<br>• FS1r2 uPGA package | "Kabini" APU<br>• "Jaguar" CPU cores<br>• Fusion graphics<br>• Integrated FCH<br>• DDR3<br>• FT2 BGA package |
| Essential | "Llano" APU – 35W<br>• 2 CPU cores with "WinterPark"<br>• DDR3-1333, FS1 uPGA package<br><br>"Zacate" APU – 18W<br>• 2 and 1 "Bobcat" CPU core(s)<br>• "Loveland" DirectX® 11 graphics<br>• DDR3-1066, FT1 BGA package | "Trinity" E2-Series APU<br>• "Piledriver" CPU cores<br>• DDR3<br>• FS1r2 uPGA package<br><br>"Wichita" APU<br>• "Yuba" FCH<br>• DDR3<br>• BGA package | |
| <12" | "Ontario" APU – 9W<br>• 2 and 1 "Bobcat" CPU core(s)<br>• "Loveland" DirectX® 11 graphics<br>• Up to DDR3/DDR3L-1066<br>• FT1 BGA package | "Krishna" APU<br>• "Bobcat" CPU cores<br>• "London" graphics, "Yuba" FCH<br>• DDR3<br>• FT2 BGA package | |
| Ultra Low<br>Power | "Desna" APU – 5.9W<br>• 2 "Bobcat" CPU core(s)<br>• "Loveland" DirectX® 11 graphics<br>• Up to DDR3/DDR3L-1066<br>• FT1 BGA package | "Hondo" APU – 4.5W<br>• 2 "Bobcat" CPU core(s)<br>• "Loveland" DirectX® 11 graphics<br>• Up to DDR3/DDR3L-1066<br>• FT1 BGA package | "Samara" APU<br>• "Jaguar" CPU core(s)<br>• Fusion graphics<br>• DDR3/DDR3L<br>• BGA package |

Note: Processor features and schedule are preliminary and subject to change without notice.

#### Mobile Client Solutions July 2011

### The Llano-based Sabine mainstream notebook platform [34]



### Sabine mainstream notebook platform with optional HD 6xxxM discrete graphics [15]



# 4. References

- [1]: Wikipedia, List of AMD mobile microprocessors, http://en.wikipedia.org/wiki/List\_of\_AMD\_mobile\_microprocessors
- [2]: Goto H., AMD CPU Transition, 2011, http://pc.watch.impress.co.jp/video/pcw/docs/473/823/p7.pdf
- [3]: Owen J., Next-Generation Mobile Computing: Balancing Performance and Power Efficiency, Hot Chips 19, 2007, http://hotchips.org/uploads/hc19/3\_Tues/HC19.08/HC19.08.02.pdf
- [4]: Wikipedia, Turion, http://en.wikipedia.org/wiki/Griffin\_(processor)#Turion\_X2\_Ultra
- [5]: AMD's CPU Roadmap, 2008-2011, Firing Squad, http://www.firingsquad.com/hardware/amd\_cpu\_roadmap\_update\_2008/
- [6]: Sandhu T., AMD's Puma platform set to kick Centrino into touch?, Hexus, June 4 2008, http://hexus.net/tech/features/laptop/13529-amd039s-puma-platform-set-kick-centrinotouch/?page=3
- [7]: AMD Sempron 200U and 210U Processors for Embedded Applications, 2008, http://www.amd.com/us/Documents/45626B\_Sempron\_BGA\_brief.pdf
- [8]: Introducing AMD Turion Neo X2 and AMD Athlon Neo X2 Dual-Core ASB1 (BGA) Processors for Embedded Applications, 2009, http://www.amd.com/us/Documents/47413A\_Dual\_Core\_BGA\_brief\_PDF.pdf
- [9]: AMD A-Series APU, EMEA Press Call, June 7 2011, http://img.zwame.pt/nemesis11/Amd\_A\_series/AMD.pdf
- [10]: Wikipedia, AMD Vision, http://en.wikipedia.org/wiki/AMD\_Vision

- [11]: Wikipedia, List of AMD Fusion microprocessors, http://en.wikipedia.org/wiki/List\_of\_AMD\_Fusion\_microprocessors
- [12]: Foley D., AMD's "LLANO" Fusion APU, Hot Chips 23, Aug. 19 2011, http://www.hotchips.org/archives/hc23/HC23-papers/HC23.19.9-Desktop-CPUs/HC23.19.930-Llano-Fusion-Foley-AMD.pdf
- [13]: Shimpi A. L., The AMD A8-3850 Review: Llano on the Desktop, AnandTech, June 30 2011, http://www.anandtech.com/show/4476/amd-a83850-review
- [14]: Jotwani R., Sundaram S., Kosonocky S., Schaefer A., Andrade V. F., Novak A., Naffziger S., An x86-64 Core in 32 nm SOI CMOS, IEEE Journal of Solid-State Circuits, Vol. 46, No. 1, Jan. 2011, http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=05624589
- [15]: Altavilla D., AMD Fusion: A8-3500M A-Series Llano APU Review, Hot Hardware, June 14 2011, http://hothardware.com/Reviews/AMD-Fusion-A83500M-ASeries-Llano-APU-Review/?page=2
- [16]: A Nagy AMD Llano APU Megateszt, Pro Hardver, Aug. 1 2011, http://prohardver.hu/teszt/amd\_llano\_apu\_megateszt/hammertol\_huskyig.html
- [17]: Chiappetta M., AMD A8-3850 Llano APU and Lynx Platform Preview, Hot Hardware, June 30 2011, http://hothardware.com/Reviews/AMD-A83850-Llano-APU-and-Lynx-Platform-Preview/
- [18]: Walton J., Shimpi A. L., The AMD Llano Notebook Review: Competing in the Mobile Market, AnandTech, June 14 2011, http://www.anandtech.com/show/4444/amd-llanonotebook-review-a-series-fusion-apu-a8-3500m/4

- [19]: Naffziger S. D., Sampling chip activity for real time power estimation, Patent Genius Aug. 30 2011, http://www.patentgenius.com/patent/8010824.html
- [20]: Silcott G., AMD Talks-Up "Llano" x86 Innovation at ISSCC, Feb. 8 2010, http://blogs.amd.com/fusion/2010/02/08/amd-talks-llano-x86-innovation-isscc/
- [21]: Shimpi A. L., AMD Reveals More Llano Details at ISSCC: 32nm, Power Gating, 4-cores, Turbo?, AnandTech, Feb. 8 2010, http://www.anandtech.com/show/2933
- [22]: Kosonocky S., Practical Power Gating and Dynamic Voltage/Frequency Scaling, Aug. 17 2011, http://hotchips.org/uploads/hc23/HC23.17.1-tutorial1/HC23.17.111.Practical\_PGandDV-Kosonocky-AMD.pdf
- [23]: Branover A., Foley D., Steinman M., AMD Fusion APU: Llano, IEEE Micro, 2012, http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=06138843
- [24]: Kosonocky S., Patent Application Publication, No. US 2011/0186930 A1, Aug. 4 2011
- [25]: White S., High-Performance Power-Efficient X86-64 Server and Desktop Processors, Using the core codenamed "Bulldozer", Aug. 19 2011, http://hotchips.org/uploads/hc23/HC23.19.9-Desktop-CPUs/HC23.19.940-Bulldozer-White-AMD.pdf
- [26]: AMD to Introduce Three New Bulldozer-based APUs in 2012, Softpedia, http://news.softpedia.com/newsImage/AMD-to-Introduce-Three-New-Bulldozerbased-APUs-in-2012-6.jpg/
- [27]: Hugosson J., AMD's mobile roadmap up till 2013, early launch of Trinity confirmed, Nordic Hardware, Sept. 21 2011, http://www.nordichardware.com/news/69-cpu-chipset/44214-amds-mobile-roadmap-up-till-2013-early-launch-of-trinity-confirmed.html

- [28]: Bennett K., AMD's Griffin Processor & Puma Mobile Platform, Hardocp, May 18 2007, http://www.hardocp.com/article/2007/05/18/amds\_griffin\_processor\_puma\_mobile\_platform/
- [29]: Bemutatta következő mobil platformját az AMD, ProHardver, May 18 2007, http://prohardver.hu/hir/bemutatta\_kovetkezo\_mobil\_platformjat\_az\_amd.html
- [30]: Owen J., Steinman M., Northbridge Architecture of AMD's Griffin Microprocessor Family, IEEE Micro, March-April 2008, http://www.mzahran.com/masr/arch/papers/amd\_griffin.pdf
- [31]: Crothers B., AMD 'Yukon' looks beyond Netbooks, Cnet, Nov. 13 2008, http://news.cnet.com/8301-13924\_3-10096494-64.html
- [32]: Lenovo ThinkPad Edge 0221-RY6 schematic, Jan. 23 2013, http://notebookschematic.com/?p=22305
- [33]: Arun N., AMD Roadmap 2012 Corona, Virgo and Deccan Platforms, LensFire, http://lensfire.in/19812/news/amd-roadmap-2012-corona-virgo-and-deccan-platforms-21656
- [34]: Shimpi A. L., AMD's 2010-2011 Roadmaps: ~1B Transistor Llano APU, Bobcat and Bulldozer, AnandTech, Nov. 11 2009, http://www.anandtech.com/show/2871/4