How to conduct in-depth analysis of Vietnamese APT attack samples
APT has become a hot topic in the security field.
Innovación y laboratorio, a subsidiary of Eleven Paths, published the "Docless Vietnam APT" report in April:
The report states that a malicious email was detected which had been sent to a mailbox belonging to the Vietnamese government. The email is dated March 13, 2019. It contains suspicious elements and may have come from within the Vietnamese government, although it cannot be ruled out that someone forwarded the email to the security department.
The sample information for TKCT quy I nam 2019.doc.lnk.malw is as follows:
Picture 1: TKCT quy I nam 2019.doc.lnk.malw
1. Once the TKCT quy I nam 2019.doc.lnk.malw sample has been downloaded locally, it is disguised as a Word shortcut to trick the victim into running it or double-clicking it out of habit, as shown below:
Picture 2: Disguise doc shortcut
First, Word documents are not normally .lnk shortcuts, and a shortcut is usually about 1 KB in size, whereas this APT sample shortcut is 126 KB, so something else is clearly hidden inside it. Many malware samples disguise the file name as .dat, .docx, and so on while the real extension is .exe; enabling the display of file name extensions reveals this.
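The two warning signs above (an oversized shortcut and a document extension in front of .lnk) can be checked mechanically. The following is a minimal sketch, not a detection rule from the report: the 4 KB threshold, the extension list, and the function name `shortcut_warnings` are assumptions chosen for illustration.

```python
# Assumed threshold: the text says a normal shortcut is about 1 KB,
# so anything above a few KB is treated as suspicious here.
MAX_NORMAL_LNK_BYTES = 4 * 1024
# Assumed list of document extensions worth flagging in front of .lnk.
DOCUMENT_EXTENSIONS = {"doc", "docx", "pdf", "xls", "xlsx"}


def shortcut_warnings(name: str, size: int) -> list:
    """Return the reasons why a file name/size pair looks like a disguised shortcut."""
    extensions = name.lower().split(".")[1:]  # every dot-separated suffix
    warnings = []
    if "lnk" in extensions:
        if size > MAX_NORMAL_LNK_BYTES:
            warnings.append(f"shortcut is {size} bytes, far above the usual ~1 KB")
        if any(ext in DOCUMENT_EXTENSIONS for ext in extensions):
            warnings.append("document extension combined with .lnk")
    return warnings


if __name__ == "__main__":
    for reason in shortcut_warnings("TKCT quy I nam 2019.doc.lnk", 126 * 1024):
        print(reason)
```

A real triage script would read the name and size from the file system and parse the shortcut's target field as well; this sketch only covers the two heuristics the text mentions.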
2. Extract the malicious code stored in the Target field of the sample's properties. It turns out to be an obfuscated, encrypted cmd command that launches PowerShell, as shown below:
Picture 3: Obfuscated instructions
APT attacks favor scripting languages such as VBS, PowerShell, and JS, which are easy to obfuscate and encrypt and therefore convenient for evading anti-virus detection.
Earlier malware shares this trait. For example, Manlinghua, Xbash, and ransomware all like to run PowerShell as the first "payload", delivering it to the computer as a surprise.
3. After parsing the obfuscated command, we found that the TKCT quy I nam 2019.doc.lnk shortcut is redirected to the file s.ps1. The desktop shortcut is in fact a PowerShell script file. The obfuscated variables are sorted out as follows:
Finally, TKCT quy I nam 2019.doc.lnk is redirected into the temp folder under the name s.ps1 and executed with PowerShell, as shown below:
Picture 4: Deobfuscation
4. The iex obfuscation can also be removed manually. Open the file and delete the characters "iex" so that the decoded script is printed instead of executed. The PowerShell command format is: original file name >> s.ps1 (new file name). The redirected file is as follows:
Picture 5: PowerShell malicious code
5. As shown in Picture 5, two pieces of Base64-encoded data were found. The PowerShell script decodes and executes the malicious data and adds a scheduled task that runs every 9 minutes. It uses the InstallUtil utility to achieve self-starting and persistence. Interestingly, execution goes through Wscript.Shell; the less expected the method, the better the effect, as shown below:
Picture 6: s.ps1
6. Decoding the Base64 data yields two files: a malicious .NET executable and a .doc document. The execution method is shown in the figure below:
Picture 7: Common routine
7. Analyze tmp_pFWwjd.dat.exe by decompiling it with dnSpy. The code is clearly readable; there is some minor obfuscation (de4dot.exe can be used to deobfuscate it), but it does not affect analysis at the code level. Locate the key function Exec() and find the Base64-encoded data. Following the execution flow, .NET calls the function through a delegate. The sample allocates readable and writable memory with VirtualAlloc(), copies the Base64-decoded shellcode into it, obtains the CreateThread() pointer, and executes the malicious shellcode through the delegate callback, as shown below:
Picture 8: .net disassembly
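For step 6, a quick way to confirm what each Base64 blob contains is to decode it and inspect the leading magic bytes. The sketch below assumes the blobs have already been copied out of s.ps1 as strings; `identify_blob` is a hypothetical helper, and the two signatures used are the standard "MZ" header of PE files and the OLE2 compound-file header used by legacy .doc files.

```python
import base64

# Well-known magic numbers: "MZ" starts a PE file (including .NET assemblies),
# D0 CF 11 E0 A1 B1 1A E1 starts an OLE2 compound file such as a legacy .doc.
MAGIC = {
    b"MZ": "PE executable",
    bytes.fromhex("d0cf11e0a1b11ae1"): "OLE2 document (.doc)",
}


def identify_blob(encoded: str) -> str:
    """Decode a Base64 string and classify it by its first bytes."""
    data = base64.b64decode(encoded)
    for magic, label in MAGIC.items():
        if data.startswith(magic):
            return label
    return "unknown"


if __name__ == "__main__":
    # Stand-in data; the real blobs come from the deobfuscated script.
    fake_exe = base64.b64encode(b"MZ\x90\x00\x03").decode("ascii")
    print(identify_blob(fake_exe))
```

Classifying the blobs before writing them to disk makes it clear which one to open in dnSpy and which one is only the decoy document.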
How should shellcode or a payload be understood? Essentially it is a sequence of raw bytes, usually displayed as hexadecimal data, that the processor can interpret and execute as machine instructions.
Because the following sections involve binary data extraction and assembly analysis, a simple diagram introduces the basic concepts of shellcode and payload (for malicious code), as shown in the following figure:
Picture 9: payload
As shown in Picture 9, malware, especially a self-starting and persistent attack, is difficult to implement without a backdoor (except where a vulnerability persists and only a small amount of traffic is pulled each time for data theft).
For example, with ASLR (address space layout randomization) module base addresses change every time the system restarts, and when the code is injected into a new process space the function addresses it was built with are no longer valid. How can function addresses be obtained accurately? Running reliably requires some extra work. None of this is a real obstacle, and the techniques are relatively mature.
The explanation below assumes some knowledge of the Windows PE format, assembly, and the kernel, and does not cover the basics. It analyzes, at the PE-format and assembly level, how such shellcode works: how it dynamically obtains the base address of modules such as kernel32.dll, how it uses hash values to traverse the export table, and how it keeps sensitive strings and sensitive API function names away from anti-virus detection.
Take the malicious code above as an example. Because the .NET sample caused many problems when debugging the shellcode, we re-implemented tmp_pFWwjd.dat.exe in C/C++.
OD (OllyDbg) is again used for dynamic debugging; you can also dump the shellcode for analysis, according to personal preference. Break directly at the point where the shellcode is executed, as shown below:
Picture 10: Pointer
1. At the entry point, the shellcode first XOR-decrypts its data to restore the malicious code that is actually executed, as shown below:
Picture 11: XOR decryption
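The XOR decryption stage can be reproduced offline once the key has been read from the debugger. The following is a minimal sketch under that assumption; the article does not state the key or whether it is a single byte or a repeating sequence, so the key value in the example is purely hypothetical.

```python
def xor_decode(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with a repeating key (a one-byte key also works)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


if __name__ == "__main__":
    hypothetical_key = b"\x5a"  # assumption: replace with the key seen in the debugger
    encrypted = xor_decode(b"\xfc\xe8\x89\x00", hypothetical_key)
    # XOR is its own inverse, so applying it twice restores the input.
    print(xor_decode(encrypted, hypothetical_key).hex())
```

Because XOR is symmetric, the same function serves both to re-encrypt test data and to decrypt a dumped buffer.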
2. API name strings are replaced by hash values. This reduces the size of the shellcode and hides sensitive strings, making it harder for anti-virus software to intercept, as shown below:
Picture 12: Hash value acquisition function address
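To make the idea of hashed API names concrete, here is a sketch of one widely used scheme, the rotate-right-by-13 (ROR13) additive hash. This is an assumption for illustration: the article does not say which hash function the sample uses, so the real constants must be taken from the disassembly. The helper names `ror13_hash` and `resolve` are hypothetical.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """Classic ROR13 hash: rotate the accumulator, then add the next character."""
    h = 0
    for ch in name.encode("ascii"):
        h = ror32(h, 13)
        h = (h + ch) & 0xFFFFFFFF
    return h


def resolve(target_hash: int, export_names: list):
    """Return the exported name whose hash matches, as the shellcode's lookup does."""
    for name in export_names:
        if ror13_hash(name) == target_hash:
            return name
    return None


if __name__ == "__main__":
    names = ["LoadLibraryA", "VirtualAlloc", "CreateThread"]
    wanted = ror13_hash("VirtualAlloc")
    print(hex(wanted), resolve(wanted, names))
```

An analyst can run such a function over the export names of kernel32.dll or wininet.dll to map every constant seen in the shellcode back to an API name, provided the hash algorithm matches the one in the sample.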
3. Stepping into function 1E0A42, we find a series of seemingly ordinary assignment operations, as shown below:
Picture 13: fs:[0x30]
FS is a segment register. Its selector is 0x30 in kernel mode and 0x3B in user mode; it points to _KPCR in kernel mode and to _TEB in user mode.
The TEB (Thread Environment Block) exists once per thread and stores data shared between the system and the thread to make control easier. fs:[0x30], at offset 0x30 in the TEB, holds the pointer to the PEB (Process Environment Block).
4. The PEB describes the current process environment, and shellcode can easily read it. Offset 0xc in the PEB gives _PEB_LDR_DATA, a structure that contains information about the modules loaded by the process.
Offset 0x1c in that structure gives a doubly linked circular list. Each node in the list belongs to an LDR_DATA_TABLE_ENTRY structure. Let's look at the data this structure contains. The offsets above depend on the operating system, as shown below:
Picture 14: Obtain module matrix
Picture 15: Steps to obtain the module base address in the current environment
5. The process above successfully obtains ntdll.dll, as shown below:
Picture 16: Obtain module address
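The pointer chain from steps 3 to 5 can be followed in a small simulation. The sketch below does not touch real process memory: `FakeMemory` is a hypothetical stand-in, the addresses are made up, and the offsets are the 32-bit ones named in the text (0x30, 0xc, 0x1c) plus the 0x08 distance from the initialization-order link field to DllBase in the 32-bit LDR_DATA_TABLE_ENTRY layout.

```python
# 32-bit offsets; they differ on 64-bit Windows.
TEB_TO_PEB = 0x30          # fs:[0x30] -> PEB
PEB_TO_LDR = 0x0C          # PEB -> _PEB_LDR_DATA
LDR_TO_INIT_LIST = 0x1C    # _PEB_LDR_DATA -> InInitializationOrderModuleList.Flink
LINK_TO_DLLBASE = 0x08     # InInitializationOrderLinks (0x10) -> DllBase (0x18)


class FakeMemory:
    """Sparse map of 32-bit values, used only for this simulation."""

    def __init__(self):
        self.cells = {}

    def write32(self, address: int, value: int) -> None:
        self.cells[address] = value

    def read32(self, address: int) -> int:
        return self.cells[address]


def first_module_base(mem: FakeMemory, teb: int) -> int:
    """Follow TEB -> PEB -> Ldr -> first list entry -> DllBase."""
    peb = mem.read32(teb + TEB_TO_PEB)
    ldr = mem.read32(peb + PEB_TO_LDR)
    first_link = mem.read32(ldr + LDR_TO_INIT_LIST)
    return mem.read32(first_link + LINK_TO_DLLBASE)


# Made-up addresses that mimic the layout of a 32-bit process.
TEB, PEB, LDR, ENTRY = 0x7FFDE000, 0x7FFDF000, 0x00251EA0, 0x00251F28
memory = FakeMemory()
memory.write32(TEB + TEB_TO_PEB, PEB)
memory.write32(PEB + PEB_TO_LDR, LDR)
memory.write32(LDR + LDR_TO_INIT_LIST, ENTRY + 0x10)  # Flink points into the entry
memory.write32(ENTRY + 0x18, 0x77000000)              # DllBase of the first module

if __name__ == "__main__":
    print(hex(first_module_base(memory, TEB)))
```

In the sample the first entry reached this way is ntdll.dll; following the Flink pointer of each entry in turn reaches the other loaded modules.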
6. Continue with function 1E0B2A, which takes two parameters. According to the calling convention, parameter 1 is the kernel32 base address and parameter 2 is the hash value of the function name. The function is a self-implemented GetProcAddress(), as shown below:
Picture 17: GetProcAddress
This function checks whether the module complies with the standard PE format and then locates the NT header and the export table. The export table stores the addresses of three tables. Let's look at the export table structure first, as shown below:
The malicious code needs to locate the addresses of these three tables, traverse the function name table AddressOfNames, obtain each function name, and calculate its hash value. If the hash value equals parameter 2, it is the function being looked for.
It then takes the index reached in the traversal, uses it to look up the corresponding entry in the ordinal table AddressOfNameOrdinals, and uses the value stored there as an index into the function address table AddressOfFunctions to obtain the address. The relationship between the three tables is shown in the figure below:
Picture 18: Relationship between the three
As shown in the figure above, the ordinal table and the name table correspond one to one, and the value stored at an index of the ordinal table is in turn an index into the address table. These three tables are cleverly designed and borrow the idea of a relational database.
Note that the ordinals are not necessarily consecutive and there may be gaps. Some functions in the address table have no name, that is, the address table holds an address that cannot be associated with the name table. Such functions are called by ordinal: the value stored in the ordinal table plus the Base field of the export table is the real ordinal used for the call.
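The relationship between the three export tables can be shown with plain lists. This is a simplified model, not a PE parser: in a real image the tables hold RVAs and 16-bit ordinals, whereas here the name table holds strings directly. The class `ExportDirectory`, the helper names, and the sample values are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExportDirectory:
    base: int                       # Base: ordinal number of the first address entry
    address_of_functions: list      # function RVAs, indexed by (ordinal - base)
    address_of_names: list          # exported names
    address_of_name_ordinals: list  # parallel to names; each value indexes the functions


def lookup_by_name(exp: ExportDirectory, name: str):
    """Name index -> ordinal table value -> function address."""
    for i, candidate in enumerate(exp.address_of_names):
        if candidate == name:
            function_index = exp.address_of_name_ordinals[i]
            return exp.address_of_functions[function_index]
    return None


def lookup_by_ordinal(exp: ExportDirectory, ordinal: int):
    """Unnamed exports are reachable only through (ordinal - Base)."""
    index = ordinal - exp.base
    if 0 <= index < len(exp.address_of_functions):
        return exp.address_of_functions[index]
    return None


# Three functions, but only two of them have names: index 1 is export-by-ordinal only.
example = ExportDirectory(
    base=1,
    address_of_functions=[0x1000, 0x2000, 0x3000],
    address_of_names=["Alpha", "Gamma"],
    address_of_name_ordinals=[0, 2],
)

if __name__ == "__main__":
    print(hex(lookup_by_name(example, "Gamma")))  # goes through ordinal-table value 2
    print(hex(lookup_by_ordinal(example, 2)))     # ordinal 2 - Base 1 = index 1
```

The shellcode's own lookup differs only in that it compares a hash of each name against its second parameter instead of comparing the strings themselves.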
8. With this knowledge in mind, if you look at the malicious code in the sample, you will find that it is exactly the same as the above description, as shown below:
Picture 19: GetProcAddress()
9. The final verification result is successful, as shown below:
Picture 20: Verification
10. A new thread is created, and the thread callback creates a directory and a file, although in local verification the file creation failed, as shown below:
Picture 21: Create directory
The directory and file are created under the path c:\User\......\AppData\Roaming\
11. The server responds and the malicious code is downloaded, which begins the next stage, as shown below:
Picture 22: DownLoader
4. vkT2 module analysis
1. After stepping into the function, we found a large number of hash values used to obtain function addresses dynamically, consistent with the function calls described above. The function names are as follows:
The decryption performed by function 1E0AAA is as follows:
2. After this preparation, the function names alone suggest what will happen next, as shown in the following figure:
Picture 23: InternetOpenA
Picture 24: InternetConnectA
Picture 25: HttpOpenRequestA
3. During dynamic debugging the request is interrupted, so the code is analyzed statically. lstrcmpiA compares the fingerprint text field of the downloaded data to check whether it is plain; if so, InternetReadFile reads the downloaded data and it is executed. Otherwise the sample sleeps and repeats the request in an infinite loop.
Picture 26: Request status
We access the web page directly using the known IP and request format, as shown below:
Picture 27: vkT2
The page renders as garbled text. Download it locally and follow the execution flow of the original code: it is a piece of binary data. As before, write a program to debug this malicious code.
4. vkT2 first decrypts its data and then dynamically obtains function addresses, the same routine the earlier sample used.
It then loops over the section table and copies the data of each section to the address it occupies once loaded into memory, according to its VirtualAddress. The memory alignment granularity is 0x1000, and the DOS header signature has been erased. The result is a PE-format file expanded in memory, as shown below:
Picture 28: Memory expansion
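The memory expansion step amounts to copying each section from its file offset to its VirtualAddress inside a zero-filled image. The sketch below assumes the section table has already been parsed into tuples; `Section` and `expand_image` are hypothetical names, and the sample data is synthetic rather than taken from vkT2.

```python
from typing import NamedTuple

SECTION_ALIGNMENT = 0x1000  # memory alignment granularity named in the text


class Section(NamedTuple):
    virtual_address: int  # where the section lives once loaded
    raw_offset: int       # where its data starts in the file
    raw_size: int         # how many bytes to copy


def align_up(value: int, alignment: int = SECTION_ALIGNMENT) -> int:
    return (value + alignment - 1) // alignment * alignment


def expand_image(raw: bytes, header_size: int, sections: list) -> bytearray:
    """Lay out the headers and every section at their in-memory addresses."""
    end = max((s.virtual_address + s.raw_size for s in sections), default=header_size)
    image = bytearray(align_up(end))
    header = raw[:header_size]
    image[:len(header)] = header
    for s in sections:
        data = raw[s.raw_offset:s.raw_offset + s.raw_size]
        image[s.virtual_address:s.virtual_address + len(data)] = data
    return image


if __name__ == "__main__":
    # Synthetic file: 0x200 header bytes followed by two small sections.
    raw = b"H" * 0x200 + b"A" * 0x200 + b"B" * 0x100
    layout = [Section(0x1000, 0x200, 0x200), Section(0x2000, 0x400, 0x100)]
    print(hex(len(expand_image(raw, 0x200, layout))))
```

Dumping the expanded image from such a script (or from the debugger) and restoring the erased DOS signature by hand usually makes the payload loadable in a disassembler.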
5. Since the PE file has been expanded in memory, the next step is to repair the IAT and apply relocations. This involves many details of the PE format; see the "Windows Authoritative Guide". The repair is shown in the following figure:
Picture 29: Repair IAT
6. Next come the key points: the sample reads system variables and determines whether it is running on a 64-bit system, as shown below:
Picture 30: Determining the operating environment
8. System data, the host IP, the host name, and other information are collected, as follows:
Picture 31: Data collection
9. A new round of C&C communication begins, as shown in the following figure:
Picture 32: Establishing communication
On further analysis, the request is made with the HttpOpenRequest and HttpSendRequest functions. HttpOpenRequest creates a request handle and stores the parameters in the handle. HttpSendRequest sends the request to the HTTP server, as follows:
Picture 33: HttpOpenRequest
Picture 34: HttpSendRequestA
11. Unfortunately, HttpSendRequest no longer receives any response. Static analysis of the remaining code (simulated execution) shows that it reads the malicious code returned by the server and uses the thread safety context.
Intelligence analysis did not turn up more valuable data, but the request method is unusual and the constructed packet is distinctive. What makes it distinctive is discussed below.
Correlating the sample's behavior, the execution flow is summarized as follows:
Picture 35: TKCT quy I nam 2019.doc execution process
As shown in Picture 35, the servers the client communicates with are most likely all proxy servers. The real environment is far more complicated than this flow, which is what makes tracing the source difficult.
As shown in Picture 33, key data is extracted from the stack memory. It differs from the request data we usually see. The summary is as follows:
APT communication is becoming more and more cautious. Without detailed sample analysis, sandbox execution, memory forensics, and packet capture at the network level, the results may differ from the actual data. When this sample communicates, it uses domain fronting. What is domain fronting? Put simply, it lets teams using tools such as msf and cs (Cobalt Strike) route their control-server traffic through services offered by large providers, so that the traffic bypasses firewalls and detectors to a certain degree.
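One analyst-side consequence of domain fronting is that the name the client actually connects to and the Host header inside the HTTP request disagree. The sketch below checks a captured request for that mismatch. It is a rough heuristic under stated assumptions: the function names are hypothetical, `cdn.example.net` is a placeholder, and legitimate CDN traffic can also show mismatches, so a hit is a lead rather than proof.

```python
def host_header(raw_request: str):
    """Return the Host header value of a raw HTTP request, or None if absent."""
    for line in raw_request.split("\r\n")[1:]:  # skip the request line
        if not line:                            # blank line ends the headers
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip()
    return None


def normalize(host: str) -> str:
    """Lower-case a host name and drop any port or trailing dot."""
    return host.strip().lower().split(":")[0].rstrip(".")


def looks_fronted(connected_name: str, raw_request: str) -> bool:
    """True when the connected name and the Host header point at different hosts."""
    header = host_header(raw_request)
    if header is None:
        return False
    return normalize(connected_name) != normalize(header)


if __name__ == "__main__":
    request = "GET /s/ref HTTP/1.1\r\nHost: www.amazon.com\r\nAccept: */*\r\n\r\n"
    # "cdn.example.net" is a placeholder for the name seen in DNS or TLS SNI.
    print(looks_fronted("cdn.example.net", request))
```

In practice the connected name would come from DNS logs or the TLS SNI field, and the request from a decrypted capture or a memory dump such as the stack data shown in Picture 33.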
Take Cobalt Strike as an example. It integrates port forwarding, scanning, multi-mode port listeners, Windows exe generation, Windows dll generation, Java program generation, Office macro generation, and site cloning to collect browser-related information, among other features.
One of its more useful features is customizing the behavior of the Beacon payload: changing the default attribute values of the framework, changing the check-in frequency, and modifying Beacon's network traffic. These settings are configured in a Malleable C2 profile.
Malleable C2 profiles can construct traffic disguised as normal web traffic and ultimately conceal the communication. Take amazon.profile as an example, as shown below:
set useragent "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko";

http-get {
    set uri "/s/ref=nb_sb_noss_1/167-3294888-0262949/field-keywords=books";

    client {
        header "Accept" "*/*";
        header "Host" "www.amazon.com";

        metadata {
            base64;
            prepend "session-token=";
            prepend "skin=noskin;";
            append "csm-hit=s-24KU11BB82RZSYGJ3BDK|1419899012996";
            header "Cookie";
        }
    }
}

http-post {
    set uri "/N4215/adj/amzn.us.sr.aps";

    client {
        header "Accept" "*/*";
        header "Content-Type" "text/xml";
        header "X-Requested-With" "XMLHttpRequest";
        header "Host" "www.amazon.com";

        parameter "sz" "160x600";
        parameter "oe" "oe=ISO-8859-1;";

        id {
            parameter "sn";
        }

        parameter "s" "3717";
        parameter "dc_ref" "http%3A%2F%2Fwww.amazon.com";
    }
}
The profile above matches the communication characteristics of the vkT2 shellcode sample exactly. Loading such a profile changes the traffic characteristics between the target host and the server, hiding the traffic and ultimately concealing the communication.
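The metadata block of the http-get section explains why the Cookie header looks the way it does. The sketch below reproduces that transform and its inverse, based on my reading of the profile: the statements are applied in order, so the second prepend ends up in front of the first. The function names are hypothetical, and the real Beacon metadata is additionally encrypted, so decoding the cookie yields an opaque blob rather than readable host data.

```python
import base64

# Built from the metadata block: base64, prepend "session-token=",
# prepend "skin=noskin;", append the csm-hit string, store in Cookie.
PREFIX = "skin=noskin;session-token="
SUFFIX = "csm-hit=s-24KU11BB82RZSYGJ3BDK|1419899012996"


def encode_cookie(metadata: bytes) -> str:
    """Apply the profile's transform to raw metadata bytes."""
    return PREFIX + base64.b64encode(metadata).decode("ascii") + SUFFIX


def decode_cookie(cookie: str) -> bytes:
    """Reverse the transform on a captured Cookie header value."""
    if not (cookie.startswith(PREFIX) and cookie.endswith(SUFFIX)):
        raise ValueError("cookie does not match the amazon profile layout")
    return base64.b64decode(cookie[len(PREFIX):len(cookie) - len(SUFFIX)])


if __name__ == "__main__":
    cookie = encode_cookie(b"hi")
    print(cookie)
    print(decode_cookie(cookie))
```

Matching a captured Cookie value against this fixed prefix and suffix is a quick way to test whether traffic follows the amazon profile before attempting any deeper analysis.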