TTP
Over the past few months, we encountered a handful of RTF files and Visual Basic scripts delivered as email attachments. These crafted emails targeted recipients in Asian and European countries, and they commonly used a shipment or shipping order as the theme or subject.
Analysis from Varist’s Hybrid Analyzer reports malicious indicators, which we can use as a reference while digging through the installed components.
Infection Chain Overview
VBS
For this campaign, we will focus only on the VB script file. The attachment is usually compressed with 7-Zip. Like the email itself, it follows the shipment or shipping order theme, so its filename relates to the email content. Looking at the contents of the VB script file, the first few lines indicate that it will execute an obfuscated command line.
Stage 1 Powershell
Retrieving the contents of the variable “Kupeens” shows that it is equal to “owershell”, which forms “Powershell” at the beginning of the command. With this, we now know that the other variables contain a Powershell script. Getting the contents of “Bindsel” reveals what seems to be an obfuscated script.
The original script was obfuscated by inserting additional strings, which are removed at runtime using the Substring function. To make the code easier to understand, we cleaned it up a bit to reveal its behavior.
The code below is the cleaned up version after applying the function named “Fradmmtes”. It will do the following:
- It will try to download a file from the following URLs:
- hxxp[:]//194[.]59[.]31[.]137/Wattest.pfb
- hxxp[:]//94[.]156[.]8[.]88/Wattest.pfb
- The file is saved as %appdata%\Tvangsarbejde.Afv
- The payload contains the obfuscated stage 2 Powershell script, the stage 1 shellcode, and the stage 2 shellcode
- The contents of the file are read, base64 decoded, and saved to $global:nyhavn
- Using Substring, it will get the next stage of the Powershell script and execute it via the “Dactyliographer” function
Stage 2 Powershell
The next stage of the Powershell script also relies on the downloaded file. In the stage 1 Powershell script, $Festmiddagenes contains the script, which is also encrypted upon retrieval, and stage 2 is then executed via the Dactyliographer function. This second-stage script also uses the “Fradmmtes” function for deobfuscation, and a lot of comments were added to it.
The code below makes sure that the script runs under the 32-bit copy of Powershell.exe. It checks $Sdsuppers, which was set to “True” in the stage 1 script while trying to download the payload, or checks the size of IntPtr to determine whether it is running as a 32-bit or 64-bit process. It then adjusts the path of the Powershell executable so that the 32-bit copy is used, restarts the stage 1 script, and exits.
Aside from the function “Fradmmtes”, it also deobfuscates its code with the function “Fodende”, which converts a string from hex to bytes and XORs each byte with the value 26. The value of $Outwoman was always set to 0, so the function always returns the converted string.
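As a rough illustration of the “Fodende” step described above, the following Python sketch converts a hex string to bytes and XORs each byte with 26. It is not the malware's own PowerShell code; the function names, the `encode_for_demo` helper, and the latin-1 decoding are assumptions made for this example.

```python
XOR_KEY = 26  # value reported in the analysis above


def fodende(hex_string: str, key: int = XOR_KEY) -> str:
    """Convert a hex string to bytes, XOR every byte with `key`, return text."""
    raw = bytes.fromhex(hex_string)
    decoded = bytes(b ^ key for b in raw)
    # latin-1 maps every byte to one character, so decoding never fails
    return decoded.decode("latin-1")


def encode_for_demo(text: str, key: int = XOR_KEY) -> str:
    """Hypothetical inverse helper, only used to build test input."""
    return bytes(b ^ key for b in text.encode("latin-1")).hex()


if __name__ == "__main__":
    sample = encode_for_demo("kernel32")
    print(sample, "->", fodende(sample))
```

Because XOR is its own inverse, the same routine can be used to re-encode a string when testing a deobfuscation script.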
Overall, the script is very similar to the Metasploit Powershell Reflection Payload. The final part of the script will do the following:
- Try to hide its active window
- Allocate memory for the stage 1 shellcode with a size of 0x98C2 and for the stage 2 shellcode with a size of 0x4CBE000.
- Copy the stage 1 shellcode, which starts at the beginning of the downloaded payload and has a size of 0x98C2, to the unmanaged memory pointer.
- Copy the stage 2 shellcode, which starts at offset 0x98C2 of the downloaded payload and has a size of 0x47B9F, to the unmanaged memory pointer.
- Use the stage 1 shellcode as the start of the callback function for CallWindowProcA, passing the start address of the stage 2 shellcode and the address of NtProtectVirtualMemory as parameters.
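For analysts who want to carve the two shellcode stages out of a downloaded sample, the offsets and sizes listed above can be applied to the base64-decoded payload. This is a minimal sketch: the function name is hypothetical, and treating whatever follows the stage 2 shellcode as the remaining content (which would include the stage 2 Powershell script) is an assumption, not something stated in the analysis.

```python
import base64

STAGE1_SIZE = 0x98C2   # stage 1 shellcode size, from the analysis
STAGE2_SIZE = 0x47B9F  # stage 2 shellcode size, from the analysis


def carve_payload(encoded: bytes):
    """Split the base64-decoded payload into stage 1, stage 2, and the rest."""
    blob = base64.b64decode(encoded)
    stage1 = blob[:STAGE1_SIZE]
    stage2 = blob[STAGE1_SIZE:STAGE1_SIZE + STAGE2_SIZE]
    # Assumption: everything after the two shellcode stages is other content
    remainder = blob[STAGE1_SIZE + STAGE2_SIZE:]
    return stage1, stage2, remainder


if __name__ == "__main__":
    # Synthetic payload standing in for the downloaded file
    fake = base64.b64encode(b"A" * STAGE1_SIZE + b"B" * STAGE2_SIZE + b"C" * 16)
    s1, s2, rest = carve_payload(fake)
    print(hex(len(s1)), hex(len(s2)), len(rest))
```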
Stage 1 Shellcode
The stage 1 shellcode is filled with garbage code and jump instructions. The shellcode copied into memory is 0x98C2 bytes in size, but the code actually used is less than approximately 0x200 bytes.
The cleaned up code is less than 100 lines of instructions. It will do the following:
- Fill 0x04B27024 random bytes starting at offset 0x16629C from the start of the stage 2 payload.
- Copy 0x47FA0 bytes from the base address of the stage 2 shellcode to base address+0x19602E4 of the stage 2 shellcode.
- Make an indirect syscall for NtProtectVirtualMemory. The system service number (SSN) of NtProtectVirtualMemory is computed by locating the function after it, NtQuerySection, reading its SSN, and subtracting 1. This relies on the sequential order of the SSNs and bypasses possible API hooking.
- Decrypt the bytes at +0x19602E4, which will be the EIP of the stage 2 shellcode, using XOR.
- Call to start the stage 2 shellcode.
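The neighbor-based SSN computation in the list above can be sketched as follows. The sketch assumes that an unhooked 32-bit syscall stub begins with `mov eax, imm32` (opcode 0xB8 followed by the SSN); the function names and the sample SSN values are illustrative only and are not taken from the sample.

```python
import struct

MOV_EAX_IMM32 = 0xB8  # opcode of "mov eax, imm32"


def read_ssn(stub: bytes) -> int:
    """Read the SSN from a stub that starts with mov eax, <SSN>."""
    if len(stub) < 5 or stub[0] != MOV_EAX_IMM32:
        raise ValueError("stub does not start with mov eax, imm32")
    return struct.unpack_from("<I", stub, 1)[0]


def ssn_from_next_function(next_stub: bytes) -> int:
    """Derive an SSN from the function that follows it, assuming
    SSNs are assigned sequentially (as the shellcode relies on)."""
    return read_ssn(next_stub) - 1


if __name__ == "__main__":
    # Illustrative stub bytes for the neighbor (NtQuerySection in the text)
    neighbor = bytes([MOV_EAX_IMM32]) + struct.pack("<I", 0x51)
    print(hex(ssn_from_next_function(neighbor)))
```

Reading the number from a neighboring stub means the target function's own first bytes never need to be trusted, which is why a hook placed on NtProtectVirtualMemory itself does not interfere.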
Stage 2 Shellcode
The shellcode contains multiple anti-analysis and anti-debugging techniques. One of them is Vectored Exception Handling (VEH), which is commonly used by GuLoader. This technique makes analyzing or reversing the shellcode more difficult because it breaks the control flow seen by analysis tools.
VEH
The VEH function is triggered frequently throughout the code. It is registered via the RtlAddVectoredExceptionHandler API function and is used to control the flow of the shellcode. Every time an exception is triggered, the handler checks whether it is one of the exceptions listed below.
- It monitors/triggers the following exceptions:
- STATUS_ACCESS_VIOLATION
- STATUS_ILLEGAL_INSTRUCTION
- STATUS_PRIVILEGED_INSTRUCTION
- STATUS_SINGLE_STEP
- STATUS_BREAKPOINT
It will check for hardware breakpoints; if there are any, it sets EAX to 0 and breaks the flow of the VEH function. If no breakpoints are found, it proceeds with its normal flow, which is adjusting the EIP of the thread. To do this, it retrieves the thread EIP from the Context Record, reads the byte at thread EIP+6, and XORs this byte with 0x34. The resulting value is added to the thread EIP, updating the Context Record. When the VEH handler is done and returns execution to the shellcode, execution continues at this updated EIP.
- eip=((ReadByte(eip+6)^0x34)+eip)
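When patching or emulating the shellcode, the EIP adjustment above can be reproduced with a small helper. This is a sketch under the assumption that the shellcode is available as a byte buffer loaded at a known base address; `next_eip` and its parameters are hypothetical names.

```python
VEH_XOR_KEY = 0x34  # constant reported in the analysis
SKIP_OFFSET = 6     # the handler reads the byte at EIP+6


def next_eip(memory: bytes, eip: int, base: int = 0) -> int:
    """Return the EIP the handler would resume at after an exception.

    `memory` is the shellcode buffer and `base` is its load address,
    so the byte at virtual address `a` is memory[a - base].
    """
    skip = memory[eip - base + SKIP_OFFSET] ^ VEH_XOR_KEY
    return eip + skip


if __name__ == "__main__":
    buf = bytearray(16)
    buf[6] = 0x34 ^ 0x0A  # encoded skip length of 0x0A
    print(hex(next_eip(bytes(buf), 0)))
```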
Dynamic API Resolving
To get the addresses of the APIs it uses, the shellcode computes a hash of each API name with a modified DJB2 hashing algorithm, which adds an XOR to the hash computation. Here is the equivalent Python code for computing the hash.
The function for getting the API addresses needs two parameters: a pointer to the name of the module that exports the API and the precomputed DJB2 hash of the API name.
But before we dig into how it resolves the APIs, we first need to discuss how it resolves LdrLoadDll, since it relies on this API to load modules and tries to bypass Endpoint Detection and Response (EDR) hooking on it. This is done the first time the function runs.
First, it needs the precomputed DJB2 hash of “NTDLL.DLL” (all caps) to get the base address of ntdll.dll by traversing the Process Environment Block (PEB), which is discussed below. Next, it uses the base address, which is saved in EAX, to search for the LdrLoadDll API by traversing the exported functions of the module.
To get the base address of ntdll.dll, it needs access to the Process Environment Block (PEB). This is done via the Thread Environment Block (TEB) using the fs segment register; fs:[0x30] contains the pointer to the PEB. Here’s the summary of the steps based on the code:
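The following is a reconstruction sketch of a DJB2 variant with an added XOR, not the code from the sample. The XOR constant and the exact point where it is applied are not given in the text, so `xor_key` is a hypothetical parameter that is applied once after the loop, and the seed 5381 is the standard DJB2 seed, assumed here.

```python
DJB2_SEED = 5381  # standard DJB2 seed (assumed; the sample may differ)


def djb2_xor(name: bytes, xor_key: int, seed: int = DJB2_SEED) -> int:
    """DJB2 over `name`, truncated to 32 bits, then XORed with `xor_key`.

    `xor_key` is a placeholder: the real constant must be taken from
    the sample being analyzed.
    """
    h = seed
    for c in name:
        h = ((h << 5) + h + c) & 0xFFFFFFFF  # h * 33 + c, 32-bit wrap
    return h ^ (xor_key & 0xFFFFFFFF)


if __name__ == "__main__":
    # 0x12345678 is an arbitrary demo key, not the malware's constant
    print(f"{djb2_xor(b'LdrLoadDll', 0x12345678):08X}")
```

With the real constant filled in, a helper like this can be run over the export names of common DLLs to build a lookup table that maps the hashes found in the shellcode back to API names.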
- Get the PEB address via TEB
- Get the pointer to PEB_LDR_DATA structure
- Get the pointer to InMemoryOrderModuleList
- Save the base address of the module
- Get the DLL name, convert it to uppercase and apply the DJB2 hash
- Compare the hash and, if it is correct, return the base address of ntdll.dll in EAX
Next step is resolving LdrLoadDll. This is done by traversing the PE header using the base address of ntdll.dll from the previous function. Here’s the summary of the function:
- Get the Export Directory Table Address via PE header
- Get the Export Address Table
- Save the Number of Name Pointers, Export Address Table RVA and Ordinal Table RVA for later use in resolving the API
- Get Name Pointer Table Address and get the pointer for the API export name
- Compute the DJB2 hash of the API export name
- Compare it to the searched API DJB2 hash
- If found, return the computed address of the searched API in ESI. If not equal, move to the next pointer for the API export name
- Save an adjusted address of LdrLoadDll, which is used for indirect calls to the API.
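The export lookup in the steps above follows the usual name pointer, ordinal, and address table relationship of the PE format. The sketch below models those three tables as Python lists instead of parsing a real image; the `ExportDirectory` class, the `plain_djb2` stand-in hash, and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ExportDirectory:
    names: List[bytes]        # strings referenced by the Name Pointer Table
    ordinals: List[int]       # Ordinal Table, parallel to `names`
    function_rvas: List[int]  # Export Address Table


def plain_djb2(name: bytes) -> int:
    """Stand-in hash for the demo; the sample uses a modified DJB2."""
    h = 5381
    for c in name:
        h = ((h << 5) + h + c) & 0xFFFFFFFF
    return h


def resolve_by_hash(base: int, exports: ExportDirectory, wanted: int,
                    hash_fn: Callable[[bytes], int]) -> Optional[int]:
    """Walk the export names, hash each one, and return the API address."""
    for i, name in enumerate(exports.names):
        if hash_fn(name) == wanted:
            # name index -> ordinal -> RVA -> absolute address
            return base + exports.function_rvas[exports.ordinals[i]]
    return None


if __name__ == "__main__":
    demo = ExportDirectory(
        names=[b"LdrGetProcedureAddress", b"LdrLoadDll"],
        ordinals=[1, 0],
        function_rvas=[0x2000, 0x1000],
    )
    addr = resolve_by_hash(0x77000000, demo, plain_djb2(b"LdrLoadDll"), plain_djb2)
    print(hex(addr))
```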
Now that the “first run” setup of resolving LdrLoadDll is done, the function for getting the API can be summarized as follows:
- Load the module using an indirect call to LdrLoadDll with the module name as a parameter. (The module name is decrypted at runtime; the string decryption is discussed in the Dynamic String Decryption section.)
- Traverse the exported functions of the loaded module one at a time, computing the DJB2 hash of each API name and comparing it to the API hash it needs. (This is the same function used to resolve the API address of LdrLoadDll.)
- Once found, it will return the computed API address in EAX.
API Calls and more
It also uses a wrapper function when calling some of its APIs. The wrapper uses the API resolved by the function discussed earlier, but before it makes the call, it encrypts its own code and checks whether the API has been hooked by analysis tools or EDR. Here is the summary:
- Get the function’s start address
- Get the start address of the shellcode and compute the address just before it. These will be the start and end addresses of the code it will encrypt.
- Encrypt the shellcode based on the start and end addresses, using the return address of the function as the XOR key.
- Check for API hooking by checking the byte/word at the start of the API function. If a hook is found, it clears the stack/frame pointer before executing the API. If there is none, it executes the API as usual and decrypts the shellcode back to its original form.
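The encrypt-then-check pattern summarized above can be approximated as follows. The text does not state how the return address is applied as a key or which prologue bytes are tested, so this sketch assumes a repeating 4-byte little-endian key and a small set of opcodes commonly seen in inline hooks; both are assumptions rather than findings from the sample.

```python
import struct

# Assumed hook indicators: jmp rel32, jmp rel8, int3
HOOK_OPCODES = {0xE9, 0xEB, 0xCC}
JMP_INDIRECT = b"\xFF\x25"  # jmp dword ptr [addr]


def xor_region(code: bytes, return_address: int) -> bytes:
    """XOR `code` with the return address used as a repeating 4-byte key."""
    key = struct.pack("<I", return_address & 0xFFFFFFFF)
    return bytes(b ^ key[i % 4] for i, b in enumerate(code))


def looks_hooked(prologue: bytes) -> bool:
    """Heuristic check of the first byte/word of an API function."""
    if not prologue:
        return False
    return prologue[0] in HOOK_OPCODES or prologue[:2] == JMP_INDIRECT


if __name__ == "__main__":
    code = bytes(range(32))
    hidden = xor_region(code, 0x00401234)
    print(xor_region(hidden, 0x00401234) == code)
    print(looks_hooked(b"\x8B\xFF\x55\x8B\xEC"), looks_hooked(b"\xE9\x10\x00\x00\x00"))
```

Since the key depends on the call site, a memory dump taken while an API call is in progress shows the region encrypted differently for each caller, so the return address in use is needed to restore it.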
Dynamic String Decryption
Each time it needs to use a string, it decrypts it at runtime. Decrypting the strings involves 3 functions:
- Allocate memory for the encrypted strings using NtAllocateVirtualMemory
- Copy the encrypted strings to the allocated space
- Decrypt the strings
When it allocates space for the encrypted string, it uses the same API-resolving function to resolve NtAllocateVirtualMemory, and passes the result as the parameter for the function that allocates the memory space. It will fill the first 0x7F4 bytes and return the address of the allocated space+0x7F4, which is the starting address where the encrypted bytes will be saved.
Below is the code snippet for copying the encrypted string to the allocated memory. It uses a combination of algorithmic and arithmetic instructions (a form of code obfuscation found frequently in its code). The first double word it copies is the length of the encrypted string. It then copies the encrypted string right after it in a similar fashion.
Here is a sample dump of an encrypted string in the allocated memory:
At the start of the decryption, it patches the length of the encrypted string after saving it to a variable. The main decryption function needs 4 parameters: the start of the encrypted string, the encrypted string length, the key length, and the address of the key.
Here is the dump of the decrypted strings with the length patched:
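The length-prefixed layout and the four-parameter decryption routine can be modeled as below. The text does not name the cipher used for strings, so a repeating-key XOR is assumed here; the function names and the demo key are hypothetical.

```python
import struct


def xor_decrypt(data: bytes, data_len: int, key_len: int, key: bytes) -> bytes:
    """Mirror of the 4-parameter routine: data, data length, key length, key.

    Assumption: the cipher is a repeating-key XOR.
    """
    return bytes(data[i] ^ key[i % key_len] for i in range(data_len))


def decrypt_buffer(buffer: bytes, key: bytes) -> bytes:
    """Parse a buffer laid out as <dword length><encrypted bytes>."""
    (length,) = struct.unpack_from("<I", buffer, 0)
    encrypted = buffer[4:4 + length]
    return xor_decrypt(encrypted, length, len(key), key)


if __name__ == "__main__":
    key = b"\x11\x22\x33"  # demo key, not from the sample
    plain = b"wininet.dll"
    enc = xor_decrypt(plain, len(plain), len(key), key)
    buf = struct.pack("<I", len(enc)) + enc
    print(decrypt_buffer(buf, key))
```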
More Anti-Analysis
Before it proceeds to inject the stage 3 shellcode, it performs several more anti-analysis checks.
- Scan the virtual memory pages using NtQueryVirtualMemory for virtual machine related strings or bytes. If a string or byte sequence is found, it computes the DJB2 hash and compares it to its precomputed list of hashes; the length of the string or byte sequence must also match. It will also display a message box before terminating the process.
- Patch APIs
- LdrLoadDll – restore the prologue, removing possible EDR hooks
- DbgBreakPoint – modify int3 to nop
- DbgUiRemoteBreakin – modified to execute ExitProcess
- Check the number of open windows. It will terminate the process if the count is less than 0xC (12).
- Call NtSetInformationThread with the ThreadInformationClass parameter set to ThreadHideFromDebugger (0x11), which detaches the debugger.
- Check whether it is being debugged by calling NtQueryInformationProcess with the ProcessInformationClass parameter set to ProcessDebugPort (0x07). A nonzero returned value indicates that it is under a debugger.
- After this anti-debug check, it copies the assembly instructions of NtAllocateVirtualMemory
- Enumerate the device drivers by using EnumDeviceDrivers and get the associated name using GetDeviceDriverBaseNameA. Then compute the DJB2 hash of the name and compare it to its precomputed list of hashes.
- 8F405F39
- CF9ABCE7
- vmmouse.sys = 8FB74F31
- vmusbmouse.sys = 935C0AAE
- 75F6DBF8
- vm3dmp.sys = C3E0D12B
- Enumerate installed products by using MsiEnumProductsA and getting the product name using MsiGetProductInfoA.
- 7A8759
- 8CCF1F16
- 32CE3716
- A3CDAB8B
- Enumerate services by using OpenSCManagerA and getting the service name using EnumServicesStatusA.
- VMware Tools = D4BB5966
- VMware Snapshot Provider = 7410B0FC
- ACCAA09
- D8A1A01B
- 45E5272C
- B75128CB
- FF6DC891
- Use RDTSC and CPUID to detect whether it is running in a virtual environment
Process Hollowing
After all the anti-analysis and preparation, it is now ready to inject the stage 3 shellcode.
- Create a suspended process using CreateProcessInternalW
- C:\Program Files (x86)\windows mail\wab.exe
- NtOpenFile to get a handle for C:\\WINDOWS\\syswow64\\mshtml.dll
- Use the handle to create an image section object using NtCreateSection with the SEC_IMAGE attribute.
- Map the section using NtMapViewOfSection to allocate the space for the stage 3 shellcode. It checks the following error status:
- STATUS_NO_MEMORY – it will try to execute NtMapViewOfSection again. If the error continues after a certain number of tries, it will close the handles and terminate the suspended process.
- STATUS_CONFLICTING_ADDRESSES / STATUS_MAPPED_ALIGNMENT – it will close the handles and terminate the suspended process.
- STATUS_IMAGE_NOT_AT_BASE – it will make a direct syscall of NtAllocateVirtualMemory. The original assembly instructions of NtAllocateVirtualMemory were copied from the already loaded ntdll.dll to its memory space.
- Now that the space is ready, it will inject the same stage 2 shellcode using NtWriteVirtualMemory
- To prepare the stage 3 shellcode, it retrieves the Context using NtGetContextThread. It will also get the address of RtlUserThreadStart.
- It will modify the EBX register of the Context to point to the entry point of the stage 3 shellcode. It will also modify the EIP to the address of RtlUserThreadStart+0x1B, so that execution starts at the instruction that saves the value of EBX to [ESP+4].
- After these modifications are done, it will apply the changes using NtSetContextThread.
- Then it will also set the value of [ESP+4] to the Entry Point of the stage 3 shellcode before triggering the start of the process using NtResumeThread.
Stage 3 Shellcode
Although the stage 2 and stage 3 shellcode are the same, the purpose of the shellcode changes: it will now try to download the final payload. It also executes the anti-analysis checks mentioned in the stage 2 shellcode section.
- It will decrypt the URLs from which it will try to download the final payload
- hxxp://194[.]59[.]31[.]137/QoNGqRlihlEHmyvHbhC131[.]bin
- hxxp://94[.]156[.]8[.]88/QoNGqRlihlEHmyvHbhC131[.]bin
- Resolve the following APIs
- InternetOpenA
- InternetSetOptionA
- InternetOpenUrlA
- InternetReadFile
- InternetCloseHandle
- The downloaded payload (QoNGqRlihlEHmyvHbhC131.bin) is still encrypted. Its decryption key is different from the one used to decrypt strings. Before it can decrypt the actual PE payload, it needs to decrypt this new key.
- A memory space is allocated and filled with the still encrypted key. The size of the key is 0x6F2. (This is similar to the first 2 steps of how it decrypts strings.)
- The start of the encrypted PE payload is +0x40 from the beginning of the downloaded payload.
- It will get the first 2 bytes of the encrypted PE and the first 2 bytes of the new key.
- It will loop, trying to decrypt these 2 bytes of the encrypted PE by XORing them with the 2 bytes of the new key and adding 1 to the key in each iteration until the result equals “MZ”, and then save the resulting index.
- This index, which is equal to 0x10E7, will be used to decrypt the new key.
- Now it can proceed in decrypting the PE payload
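The known-plaintext search for the key index can be sketched as below. It assumes that both 2-byte values are read as little-endian words and that "adding 1 to the key" means incrementing the key word with 16-bit wraparound; how the resulting index is then used to decrypt the full key is not described in the text and is not modeled here. The function name and demo values are hypothetical.

```python
MZ_WORD = 0x5A4D  # the bytes "MZ" read as a little-endian 16-bit word


def find_key_index(enc_pe_word: int, enc_key_word: int) -> int:
    """Increment the key word until XORing it with the first word of the
    encrypted PE yields "MZ", and return the number of increments."""
    for index in range(0x10000):
        candidate = (enc_key_word + index) & 0xFFFF
        if enc_pe_word ^ candidate == MZ_WORD:
            return index
    raise ValueError("no index found")  # defensive; a 16-bit search always hits


if __name__ == "__main__":
    # Demo values built so the search lands on the index from the analysis
    key_word = 0x1234
    pe_word = MZ_WORD ^ ((key_word + 0x10E7) & 0xFFFF)
    print(hex(find_key_index(pe_word, key_word)))
```

The search works because every PE file starts with the "MZ" signature, which gives the loader a known plaintext for the first two bytes.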
A quick analysis from Varist’s Hybrid Analyzer identifies it as a copy of Remcos RAT, which is a common payload for GuLoader.
Indicators of Compromise
Filename/URL | SHA256 | Description | Varist Detection |
korea_trade_product_order_specification_list_24_06_2024_0000000_pdf.vbs | 028a85e18dd99a848c0effc35a2dfca733965b21ee7f493774f2b942a1be1c72 | VBS file | VBS/Agent.BNK |
hxxp[:]//194[.]59[.]31[.]137/Wattest.pfb | | Initial Payload URL | |
hxxp[:]//94[.]156[.]8[.]88/Wattest.pfb | | Initial Payload URL | |
hxxp://194[.]59[.]31[.]137/QoNGqRlihlEHmyvHbhC131[.]bin | | Final Payload URL | |
hxxp://94[.]156[.]8[.]88/QoNGqRlihlEHmyvHbhC131[.]bin | | Final Payload URL | |
References
- https://www.mcafee.com/blogs/other-blogs/mcafee-labs/guloader-campaigns-a-deep-dive-analysis-of-a-highly-evasive-shellcode-based-loader/
- https://hidocohen.medium.com/guloaders-anti-analysis-techniques-e0d4b8437195
- https://soolidsnake.github.io/2021/09/05/GuLoader.html
- https://any.run/cybersecurity-blog/deobfuscating-guloader/
- https://www.elastic.co/security-labs/getting-gooey-with-guloader-downloader
- https://msdn.microsoft.com/en-us/library/system.intptr.size.aspx