Crash Analysis Series: An exploitable bug on Microsoft Teams ?! A Tale of One Bit

By ZecOps Research Team
| November 24, 2020

SHARE THIS ARTICLE

Follow zecops

This is a story about a Microsoft Teams crash that we investigated recently. At first glance, it looked like a possible arbitrary code execution vulnerability, but after diving deeper we realized that there’s another explanation for the crash.

TLDR;

ZecOps ingested and analyzed an event that seems exploitable on a Windows machine from Microsoft Teams
This machine has a lot of other anomalies
ZecOps verifies anomalies such as: blue screens, sudden crashes, mobile restarts without clicking on the power button; and determines if they are related to cyber attacks, software/hardware issues, or configuration problems.
Spoiler alert (text beneath the black highlight):

After further analyzing the crash, we realized that the faulty hardware was causing this exploitable event to appear, and not related to an intentional attack. We suspect that a bit flip was caused due to a bad hardware component.

Business impact: Hardware problems are more common than we think. Repeating faulty hardware-issues lead to continuous loss of productivity, context-switches, and IT/Cyber disruptions. Identifying faulty hardware can save a lot of time. We recommend using the freely available and agent-less tool ZOTOMATE to identify what is SW/HW problems. ZecOps is leveraging machine-learning and its mobile threat intelligence, mobile DFIR, as well as endpoints and servers crash analysis solution, and mobile apps crash-analysis to perform such analysis at scale.

The crash

Looking at the call stack, we saw that the process crashed due to a stack overflow:

 # Child-SP          RetAddr               Call Site
00 000000e5`84600f20 00007ffd`9048ebbc     ntdll!RtlDispatchException+0x3c
01 000000e5`84601150 00007ffd`9049a49a     ntdll!RtlRaiseStatus+0x5c
02 000000e5`846016f0 00007ffd`9048ebbc     ntdll!RtlDispatchException+0xa5cba
03 000000e5`84601e20 00007ffd`9049a49a     ntdll!RtlRaiseStatus+0x5c
04 000000e5`846023c0 00007ffd`9048ebbc     ntdll!RtlDispatchException+0xa5cba
05 000000e5`84602af0 00007ffd`9049a49a     ntdll!RtlRaiseStatus+0x5c
06 000000e5`84603090 00007ffd`9049350e     ntdll!RtlDispatchException+0xa5cba
07 000000e5`846037c0 00007ffd`9048eb73     ntdll!KiUserExceptionDispatch+0x2e
08 000000e5`84603f60 00007ffd`9049a49a     ntdll!RtlRaiseStatus+0x13
09 000000e5`84604500 00007ffd`9048ebbc     ntdll!RtlDispatchException+0xa5cba
0a 000000e5`84604c30 00007ffd`9049a49a     ntdll!RtlRaiseStatus+0x5c
0b 000000e5`846051d0 00007ffd`9048ebbc     ntdll!RtlDispatchException+0xa5cba
[307 more pairs of RtlRaiseStatus and RtlDispatchException]
272 000000e5`846fb670 00007ffd`9049a49a     ntdll!RtlRaiseStatus+0x5c
273 000000e5`846fbc10 00007ffd`9049350e     ntdll!RtlDispatchException+0xa5cba
274 000000e5`846fc340 00007ff7`8b93338a     ntdll!KiUserExceptionDispatch+0x2e
275 000000e5`846fcad0 00007ff7`8b922e4d     Teams!v8_inspector::V8StackTraceId::ToString+0x39152a
[More Teams frames...]
2c1 000000e5`846ffa40 00007ffd`9045a271     kernel32!BaseThreadInitThunk+0x14
2c2 000000e5`846ffa70 00000000`00000000     ntdll!RtlUserThreadStart+0x21

It can be seen from the call stack that the original exception occurred earlier, at address 00007ff7`8b93338a. Due to an incorrect exception handling, the RtlDispatchException function raised the STATUS_INVALID_DISPOSITION exception again and again in a loop, until no space was left in the stack and the process crashed. That’s an actual bug in Teams that Microsoft might want to fix, but it manifests itself only when the process is about to crash anyway, so that might not be a top priority.

The original exception

To extract the original exception that occurred on address 00007ff7`8b93338a, we did what Raymond Chen suggested in his blog post, Sucking the exception pointers out of a stack trace. Using the .cxr command with the context record structure passed to the KiUserExceptionDispatcher function, we got the following output:

0:000> .cxr 000000e5846fc340
rax=00005f5c70818010 rbx=00005f5c70808010 rcx=000074e525bf0e08
rdx=0000006225f01970 rsi=000016b544b495b0 rdi=0001000000000000
rip=00007ff78b93338a rsp=000000e5846fcad0 rbp=0000000000000009
 r8=000000e5846fcb68  r9=00000000ff000000 r10=0000000000ff0000
r11=000000cea999e331 r12=00005f5c70808010 r13=0000000000001776
r14=0000006225f01970 r15=000000e5846fcaf8
iopl=0         nv up ei pl nz na pe nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
Teams!v8_inspector::V8StackTraceId::ToString+0x39152a:
00007ff7`8b93338a 488b07          mov     rax,qword ptr [rdi] ds:00010000`00000000=????????????????

The original exception was triggered by accessing an invalid pointer of the value 00010000`00000000. Not only does the pointer look invalid, It’s actually a non-canonical address in today’s hardware implementations of x86-64, which means that it can’t ever be allocated or become valid. Next, we looked at the assembly commands below the crash:

0:000> u
Teams!v8_inspector::V8StackTraceId::ToString+0x39152a:
00007ff7`8b93338a 488b07          mov     rax,qword ptr [rdi]
00007ff7`8b93338d 4889f9          mov     rcx,rdi
00007ff7`8b933390 ff5008          call    qword ptr [rax+8]
[...]

Very interesting! If we can control the rdi register at this point of the execution, that’s a great start for arbitrary code execution. All we need to control the instruction pointer is to be able to build a fake virtual table, or to use an existing one, and the lack of support for Control Flow Guard (CFG) makes things even easier. As a side note, there’s an issue about adding CFG support which is being actively worked on.

At this point, we wanted to find answers to the following questions:

How can this bug be reproduced?
What source of input can trigger the bug? Specifically, can it be triggered remotely?
To what extent can the pointer be controlled?

The original exception stack trace

In order to try and reproduce the crash, we needed to gather more information about what was going on when the exception occurred. We checked the original exception stack trace and got the following:

0:000> k
 # Child-SP          RetAddr               Call Site
00 000000e5`846fcad0 00007ff7`8b922e4d     Teams!v8_inspector::V8StackTraceId::ToString+0x39152a
01 000000e5`846fcb40 00007ff7`8b92f29b     Teams!v8_inspector::V8StackTraceId::ToString+0x380fed
02 000000e5`846fcbc0 00007ff7`8b92f21f     Teams!v8_inspector::V8StackTraceId::ToString+0x38d43b
03 000000e5`846fcc00 00007ff7`8b9308c0     Teams!v8_inspector::V8StackTraceId::ToString+0x38d3bf
04 000000e5`846fcc80 00007ff7`8b064123     Teams!v8_inspector::V8StackTraceId::ToString+0x38ea60
05 000000e5`846fce10 00007ff7`8b08b411     Teams!v8::Unlocker::~Unlocker+0xf453
06 000000e5`846fce60 00007ff7`8b088f16     Teams!v8::Unlocker::~Unlocker+0x36741
07 000000e5`846fd030 00007ff7`8b087eff     Teams!v8::Unlocker::~Unlocker+0x34246
08 000000e5`846fd190 00007ff7`8b053b79     Teams!v8::Unlocker::~Unlocker+0x3322f
09 000000e5`846fd1c0 00007ff7`8b364e51     Teams!v8::Unwinder::PCIsInV8+0x22059
0a 000000e5`846fd2e0 00007ff7`8b871abd     Teams!v8::internal::TickSample::print+0x54071
0b 000000e5`846fd3d0 00007ff7`8b84e3b8     Teams!v8_inspector::V8StackTraceId::ToString+0x2cfc5d
0c 000000e5`846fd420 00005ebc`b2fdc6a9     Teams!v8_inspector::V8StackTraceId::ToString+0x2ac558
0d 000000e5`846fd468 00007ff7`8b800cb8     0x00005ebc`b2fdc6a9
[More Teams frames...]
4c 000000e5`846ffa40 00007ffd`9045a271     kernel32!BaseThreadInitThunk+0x14
4d 000000e5`846ffa70 00000000`00000000     ntdll!RtlUserThreadStart+0x21

It can be deduced from the large offsets that something is wrong with the symbols, as Raymond Chen also explains in his blog post, Signs that the symbols in your stack trace are wrong. In fact, Teams comes with no symbols, and there’s no public symbol server for it, so the symbols we see in the stack trace are some of the few functions exported by name. Fortunately, Teams is based on Electron which is open source, so we were able to match the Teams functions on the stack to the same functions in Electron. At first, we tried to do that with a binary diffing tool, but it didn’t work so well due to the executable/symbol files being so large (exe – 120 MB, pdb – 2 GB), so we ended up matching the functions manually.

Here’s what we got after matching the symbols:

 # Call Site
00 WTF::WeakProcessingHashTableHelper<...>::Process
01 blink::ThreadHeap::WeakProcessing
02 blink::ThreadState::MarkPhaseEpilogue
03 blink::ThreadState::AtomicPauseMarkEpilogue
04 blink::UnifiedHeapController::TraceEpilogue
05 v8::internal::GlobalHandles::InvokeFirstPassWeakCallbacks
06 v8::internal::Heap::CollectGarbage
07 v8::internal::Heap::CollectGarbage
08 v8::internal::Heap::HandleGCRequest
09 v8::internal::StackGuard::HandleInterrupts
0a v8::internal::Runtime_StackGuard
0b v8::internal::compiler::JSCallReducer::ReduceArrayIteratorPrototypeNext
0c Builtins_ObjectPrototypeHasOwnProperty
[...]

WTF was indeed our reaction when we saw where the exception occurred (which, of course, means Web Template Framework).

From what we can see, the hasOwnProperty object method was called, at which point the garbage collection was triggered, and the invalid pointer was accessed while processing one of its internal hash tables. Could it be that we found a memory bug in the V8 garbage collection? We believed it to be quite unlikely. And if so, how do we reproduce it?

Switching context

At this point we put the Teams crash on hold and went on to look at the other crashes which occurred on the same computer. Once we did that, it all became clear: it had several BSODs, all of the type MEMORY_CORRUPTION_ONE_BIT, indicating a faulty memory/storage hardware. And looks like that’s exactly what happened in the Teams crash: the faulty address was originally a NULL pointer, but because of a corrupted bit it became 00010000`00000000, causing the exception and the crash.

Conclusion

The conclusion is that the relevant computer needs to have its faulty hardware replaced, and of course there’s nothing wrong with V8’s garbage collection that has anything to do with the crash. That’s yet another reminder that hardware problems can cause various anomalies that are hard to explain, such as this Teams crash or crashing at the xor eax, eax instruction.

Hear the news first

Only essential content
New vulnerabilities & announcements
News from ZecOps Research Team

We won’t spam, pinky swear 🤞

ZecOps Mobile XDR is here, and its a game changer

Perform automated investigations in minutes to uncover cyber-espionage on smartphones and tablets.

LEARN MORE >

Partners, Resellers, Distributors and Innovative Security Teams

ZecOps provides the industry-first automated crash forensics platform across devices, operating systems and applications.

LEARN MORE >