
By YELIA MAMDOUH EL GHALY
When we check the PDF files on our PC or laptop, we usually rely on an antivirus scanner. These days, however, scanners are often not good enough to detect malicious PDF files that contain shellcode, because an attacker will usually encrypt or obfuscate the content to bypass the scanner and will often target a zero-day vulnerability in Adobe Acrobat Reader, even in an updated version.
Before we start analyzing malicious PDFs, we will take a brief look at the PDF structure so we can understand how the shellcode works and where it is located.
PDF components
PDF Header
The first line of a PDF shows the PDF format version. It is the most important line because it gives you basic information about the file. For example, “%PDF-1.4” means that the file conforms to version 1.4 of the PDF format.
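The header check described above can be sketched in a few lines of Python. This is only an illustration: the function name pdf_header_version and the sample bytes are made up for this example, and searching the first 1024 bytes (rather than only the very first line) is an assumption based on the lenient behavior some readers are known to have.

```python
import re

def pdf_header_version(data: bytes):
    """Return the version from a '%PDF-x.y' header, or None if no header is found."""
    # Assumption: look in the first 1024 bytes, since lenient readers
    # tolerate junk before the header and attackers can abuse that.
    match = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    return match.group(1).decode("ascii") if match else None

# Hypothetical sample bytes, not taken from the file analyzed in this article.
sample = b"%PDF-1.4\n1 0 obj\n<< >>\nendobj\n"
print(pdf_header_version(sample))
```

A file whose header is missing, or is not at offset 0, is worth a closer look before it is opened in a real reader.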
PDF Body
The body of the PDF file consists of the objects that make up the contents of the document. These objects include fonts, images, annotations and text streams, and the author can also add invisible objects or elements. Objects can interact with PDF features such as animation and security features. The body supports two types of numbers: integers and real numbers.
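To make the idea of body objects concrete, here is a minimal Python sketch that lists the "N G obj ... endobj" blocks in a file. The helper name list_objects and the sample bytes are hypothetical, and the regular expression is a simplification: it does not see objects stored inside compressed object streams, which analysis tools handle properly.

```python
import re

def list_objects(data: bytes):
    """Return (object number, generation, body) for each 'N G obj ... endobj' block."""
    pattern = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
    return [(int(n), int(g), body.strip()) for n, g, body in pattern.findall(data)]

# Hypothetical two-object body used only for illustration.
sample = (b"%PDF-1.4\n"
          b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
          b"2 0 obj\n<< /Type /Pages /Count 1 >>\nendobj\n")

for number, generation, body in list_objects(sample):
    print(number, generation, body)
```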
The Cross-Reference Table (xref table)
The cross-reference table contains links (byte offsets) to all the objects and elements in the file, so a reader can jump directly to an object, for example to display another page, without reading the whole file. When a user updates the PDF, the cross-reference table is updated automatically.
The Trailer
The trailer contains a link to the cross-reference table and always ends with %%EOF to mark the end of the PDF file. A reader starts from the trailer to locate the cross-reference table and, through it, the objects it needs, such as the next page.
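The relationship between the trailer and the cross-reference table can be illustrated with a small Python sketch. It builds a tiny hand-made file in memory (hypothetical sample data, not the file analyzed later), reads the offset stored after the startxref keyword, and checks that the offset really points at the xref keyword. The function name read_startxref is an assumption made for this example.

```python
import re

def read_startxref(data: bytes):
    """Return the offset stored after the last 'startxref' keyword, or None."""
    # The trailer sits at the end of the file, so only the tail is searched.
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", data[-1024:])
    return int(matches[-1]) if matches else None

# Hypothetical minimal file: header, one object, xref table, trailer.
body = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
xref_offset = len(body)  # the xref table starts right after the body
sample = (body
          + b"xref\n0 2\n0000000000 65535 f \n0000000009 00000 n \n"
          + b"trailer\n<< /Size 2 /Root 1 0 R >>\nstartxref\n"
          + str(xref_offset).encode("ascii") + b"\n%%EOF\n")

offset = read_startxref(sample)
print(offset, sample[offset:offset + 4])
```

If the offset does not land on the xref keyword, the file is either damaged or deliberately malformed, and both cases deserve attention.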
Malicious PDF through Metasploit
Now that we have taken a tour of the PDF file format and what it contains, we will install an old version of Adobe Acrobat Reader (9.4.6, or 10 through 10.1.1) that is vulnerable to the Adobe U3D Memory Corruption Vulnerability. An exploit for it exists in the Metasploit Framework, so we are going to create the malicious PDF and analyze it on the Kali Linux distribution. Open a terminal and type msfconsole. We are going to set some Metasploit variables to make sure that everything works.
*After choosing the exploit, we choose the payload that will execute on the remote target during exploitation and open a Meterpreter session. The file is saved in /root/.msf4/local.
So we are going to move the file to the Desktop for easier access by typing the following in the terminal:
root@kali:~# cd /root/.msf4/local
root@kali:~/.msf4/local# mv msf.pdf /root/Desktop
PDFid
Now we are going to use PDFid to see what the PDF contains, such as elements, objects and JavaScript, and whether there is anything interesting to analyze. The PDF has only one page, which may be normal. There are several JavaScript objects inside, which is very strange. There is also an OpenAction object, which will execute this malicious JavaScript, so we are going to use peepdf.
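The kind of triage described above, counting names such as /JavaScript and /OpenAction, can be approximated with the following Python sketch. It is not PDFid itself: the name list, the function count_names and the sample bytes are assumptions for illustration, and the sketch does not handle hex-escaped names (for example /J#61vaScript) the way the real tool does.

```python
import re

# Names taken from the discussion above; the real tool reports more.
NAMES = [b"/Page", b"/JS", b"/JavaScript", b"/OpenAction", b"/AcroForm"]

def count_names(data: bytes):
    """Count occurrences of each name, without matching longer names like /Pages."""
    counts = {}
    for name in NAMES:
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, data))
    return counts

# Hypothetical sample, not the Metasploit-generated file.
sample = (b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R /OpenAction 5 0 R >>\nendobj\n"
          b"3 0 obj\n<< /Type /Page >>\nendobj\n"
          b"5 0 obj\n<< /S /JavaScript /JS (app.alert(1)) >>\nendobj\n")

for name, count in count_names(sample).items():
    print(name, count)
```

A single-page document that still carries /OpenAction together with /JS or /JavaScript is the pattern that makes the file in this article look suspicious.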
Peepdf
Peepdf is a very powerful Python tool for PDF analysis. It provides all the components a security researcher might need for PDF analysis, so there is no need to combine many tools. It supports encryption, object streams, shellcode emulation and JavaScript analysis. For malicious PDFs, it shows potential vulnerabilities and suspicious elements, and it offers a powerful interactive console, PDF obfuscation (for bypassing AVs), decoding (hexadecimal, ASCII) and hex search.
Analysis
To start the analysis, go to the directory that contains the PDF file and run /usr/bin/peepdf -f msf.pdf.
*Choose the LHOST, which is our IP address; we can view it by typing ifconfig in a new terminal.
*Finally, we type exploit to create the PDF file with the configuration we set before.
We use the -f option to force the tool to ignore errors and continue. This is the default output, but we can already see some interesting things. The first is the highlighted line: object 15 contains JavaScript code. Object 4 contains two executing elements (/AcroForm and /OpenAction), and finally /U3D points to a known vulnerability. We will now explore these objects in an interactive console by typing /usr/bin/peepdf -i msf.pdf.
The tree command shows the logical structure of the file. We start by exploring object 4 (/AcroForm).
When we type object 4, it points us to another object to explore. So far we have not seen any important information or anything suspicious, except object 2 (the XFA array), which gives us the element <fjdklsaj fodpsaj fopjdsio>, and that does not seem to contain anything special. Let’s move on to the other object (/OpenAction).
Now we can see the JavaScript code that will be executed when the PDF file is opened. Part of the code is lightly obfuscated, for example by writing some variables in hex, and in this code we can see heap spraying with shellcode plus some padding bytes. Attackers typically encode their shellcode as Unicode escapes and then use the unescape function to translate that representation into binary content. Now we are sure that this is definitely a malicious PDF.
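For an analyst, the unescape step can be reproduced outside the reader. The Python sketch below converts %uXXXX and %XX tokens back into raw bytes so they can be passed to a disassembler or emulator. The function name unescape_to_bytes and the sample string are hypothetical, and the little-endian byte order is an assumption that matches how 16-bit characters are laid out in memory on x86. Literal characters between the tokens are ignored in this simplified version.

```python
import re

def unescape_to_bytes(escaped: str) -> bytes:
    """Turn unescape()-style %uXXXX and %XX tokens into raw bytes."""
    out = bytearray()
    for u16, u8 in re.findall(r"%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})", escaped):
        if u16:
            # Assumption: each 16-bit unit is stored low byte first (x86).
            out += int(u16, 16).to_bytes(2, "little")
        else:
            out.append(int(u8, 16))
    return bytes(out)

# Hypothetical fragment: NOP-style padding followed by two marker bytes.
fragment = "%u9090%u9090%u4142"
print(unescape_to_bytes(fragment).hex())
```

Note how %u4142 becomes the bytes 42 41: the order is swapped, which is why shellcode extracted from such strings looks byte-reversed in pairs.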
Defend
We defend our network against this type of malicious file with strong e-mail and web filters, an IPS and application control: disable JavaScript, disable PDF rendering in browsers, and block PDF readers from accessing the file system and network resources. Security awareness is also important.
