BOLO: Reverse Engineering — Part 1 (Basic Programming Concepts) by Daniel A. Bloom
Throughout the reverse engineering learning process I have found myself wanting a straightforward guide for what to look for when browsing through assembly code. While I’m a big believer in reading source code and manuals for information, I fully understand the desire to have concise, easy-to-comprehend information all in one place. This “BOLO: Reverse Engineering” series is exactly that! Throughout this article series I will be showing you things to Be On the Look Out for when reverse engineering code. Ideally, this article series will make it easier for beginner reverse engineers to get a grasp on many different concepts!
Preface
Throughout this article you will see screenshots of C++ code and assembly code along with some explanation as to what you’re seeing and why things look the way they look. Furthermore, this article series will not cover the basics of assembly; it will only present patterns and disassembled code so that you can get a general understanding of what to look for and how to interpret assembly code.
Please note: This tutorial was made with Visual C++ in Microsoft Visual Studio 2015 (I know, an outdated version). Some of the assembly code (e.g. user input with cin) will reflect that. Furthermore, I am using IDA Pro as my disassembler.
Variable Initiation
Variables are extremely important when programming. Here we can see a few important variable types:
a string
an int
a boolean
a char
a double
a float
a char array
Basic Variables
Please note: In C++, ‘string’ is not a primitive type, but I thought it important to show you anyway.
Now, let’s take a look at the assembly:
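As a rough sketch, C++ along the following lines would declare one variable of each type in the list above. Apart from intvar, which is mentioned later in this article, every name and every value here is an assumption made for illustration, as is the basic_variables() wrapper function.

```cpp
#include <cstdio>
#include <string>

// Hypothetical sketch: one local variable per type from the list above.
// All names except intvar, and all values, are illustrative assumptions.
int basic_variables() {
    std::string stringvar = "Hello World";  // class type: built by a constructor call
    int intvar = 2;                          // 4 bytes with 32-bit Visual C++
    bool boolvar = false;                    // 1 byte with Visual C++
    char charvar = 'B';                      // 1 byte
    double doublevar = 3.14;                 // 8 bytes
    float floatvar = 3.14f;                  // 4 bytes
    char chararrvar[] = "hello";             // 6 bytes, including the terminating '\0'

    // Print every variable so that each one is actually used.
    std::printf("%s %i %d %c %f %f %s\n", stringvar.c_str(), intvar, boolvar,
                charvar, doublevar, floatvar, chararrvar);

    // Return the array size to show that the terminator is counted.
    return static_cast<int>(sizeof(chararrvar));
}
```

The sizes in the comments are what decides how much stack space the compiler reserves for each local.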
Initiating Variables
Here we can see how IDA represents space allocation for variables. As you can see, we’re allocating space for each variable before we actually initialize them.
Initializing Variables
Once space is allocated, we move the values that we want to set each variable to into the space we allocated for said variable. Although the majority of the variables are initialized here, below you will see the C++ string initiation.
C++ String Initiation
As you can see, initiating a string requires a call to a built-in function (a basic_string function) for initiation.
Basic Output
Preface info: Throughout this section I will be talking about items pushed onto the stack and used as parameters for the printf function. The concept of function parameters will be explained in better detail later in this article.
Although this tutorial was built in visual C++, I opted to use printf rather than cout for output.
Basic Output
Now, let’s take a look at the assembly:
First, the string literal:
String Literal Output
As you can see, the string literal is pushed onto the stack to be passed as a parameter to the printf function.
Now, let’s take a look at one of the variable outputs:
Variable Output
As you can see, first the intvar variable is moved into the EAX register, which is then pushed onto the stack along with the “%i” string literal used to indicate integer output. These variables are then taken from the stack and used as parameters when calling the printf function.
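Here is a minimal sketch of output code that would lead to the two patterns just described. The literal text, the value of intvar, and the basic_output() wrapper are assumptions. The comments assume the cdecl calling convention, the default for 32-bit Visual C++, in which arguments are pushed from right to left.

```cpp
#include <cstdio>

// Hypothetical sketch of the two printf patterns discussed above.
int basic_output() {
    int intvar = 2;  // assumed value

    // One argument: only the address of the string literal is pushed.
    std::printf("Hello, I am a string literal\n");

    // Two arguments: intvar is pushed first, then the "%i" format string.
    int written = std::printf("%i", intvar);
    std::printf("\n");

    return written;  // printf returns the number of characters it wrote
}
```

Because the format string is pushed last, it ends up on top of the stack, which is where printf expects its first parameter.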
Mathematical Functions
In this section, we’ll be going over the following mathematical functions:
Addition
Subtraction
Multiplication
Division
Bitwise AND
Bitwise OR
Bitwise XOR
Bitwise NOT
Bitwise Right-Shift
Bitwise Left-Shift
Mathematical Functions Code
Let’s break each function down into assembly:
First, we set A to hex 0A, which represents decimal 10, and B to hex 0F, which represents decimal 15.
Variable Setting
We add by using the ‘add’ opcode:
Addition
We subtract using the ‘sub’ opcode:
Subtraction
We multiply using the ‘imul’ opcode:
Multiplication
We divide using the ‘idiv’ opcode. In this case, we also use ‘cdq’, which sign-extends EAX into EDX so that the dividend occupies the 64-bit register pair EDX:EAX that ‘idiv’ expects.
Division
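To make the cdq and idiv pair more concrete, here is a small model of what the two instructions do, written in C++. The function name cdq_idiv and its parameters are hypothetical. The sketch ignores the remainder (which idiv leaves in EDX) and the fault cases (division by zero, or a quotient that does not fit in 32 bits).

```cpp
#include <cstdint>

// Hypothetical model of `cdq` followed by `idiv` on 32-bit x86.
std::int32_t cdq_idiv(std::int32_t eax, std::int32_t divisor) {
    // cdq: fill EDX with copies of EAX's sign bit.
    std::uint32_t edx = (eax < 0) ? 0xFFFFFFFFu : 0u;

    // idiv treats EDX:EAX as a single signed 64-bit dividend.
    std::uint64_t bits = (static_cast<std::uint64_t>(edx) << 32) |
                         static_cast<std::uint32_t>(eax);
    std::int64_t dividend = static_cast<std::int64_t>(bits);  // two's complement assumed

    // The quotient ends up in EAX, truncated toward zero.
    return static_cast<std::int32_t>(dividend / divisor);
}
```

With the values used in this section, cdq_idiv(10, 15) gives 0, which is the same as 10 / 15 in C++ integer arithmetic.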
We perform the Bitwise AND using the ‘and’ opcode:
Bitwise AND
We perform the Bitwise OR using the ‘or’ opcode:
Bitwise OR
We perform the Bitwise XOR using the ‘xor’ opcode:
Bitwise XOR
We perform the Bitwise NOT using the ‘not’ opcode:
Bitwise NOT
We perform the Bitwise Right-Shift using the ‘sar’ opcode (an arithmetic shift, which preserves the sign bit of a signed integer):
Bitwise Right-Shift
We perform the Bitwise Left-Shift using the ‘shl’ opcode:
Bitwise Left-Shift
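Putting the whole section together, a sketch like the following would produce one instruction pattern per operation. A = 10 and B = 15 come from the text above. The struct, the function name, and the shift count of 1 are assumptions. Each comment names the instruction discussed above and the value the expression evaluates to.

```cpp
// Hypothetical container so that every result can be inspected.
struct MathResults {
    int sum, diff, prod, quot, band, bor, bxor, bnot, shr, shl;
};

MathResults math_functions() {
    int A = 10;  // 0Ah
    int B = 15;  // 0Fh
    return MathResults{
        A + B,   // add        -> 25
        A - B,   // sub        -> -5
        A * B,   // imul       -> 150
        A / B,   // cdq + idiv -> 0 (integer division)
        A & B,   // and        -> 10
        A | B,   // or         -> 15
        A ^ B,   // xor        -> 5
        ~A,      // not        -> -11
        A >> 1,  // sar        -> 5  (shift count is an assumption)
        A << 1   // shl        -> 20 (shift count is an assumption)
    };
}
```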
Function Calls
In this section, we’ll be looking at 3 different types of functions:
a basic void function
a function that returns an integer
a function that takes in parameters
Calling Functions
First, let’s take a look at calling newfunc() and newfuncret() because neither of those actually take in any parameters.
Calling Functions Without Parameters
If we follow the call to the newfunc() function, we can see that all it really does is print out “Hello! I’m a new function!”:
The newfunc() Function Code
The newfunc() Function
As you can see, this function does use the retn opcode, but only to return to the caller so that the program can continue after the function completes. Now, let’s take a look at the newfuncret() function, which generates a random integer using the C++ rand() function and then returns said integer.
The newfuncret() Function Code
The newfuncret() function
First, space is allocated for the A variable. Then, the rand() function is called, which returns a value into the EAX register. Next, the value in EAX is moved into the A variable space, effectively setting A to the result of rand(). Finally, the A variable is moved into EAX so that the function can use it as a return value.
Now that we have an understanding of how to call a function and what it looks like when a function returns something, let’s talk about calling functions with parameters:
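For reference, the two functions described so far could be sketched as follows. The function names, the printed text, and the A variable come from the descriptions above. The exact bodies are assumptions.

```cpp
#include <cstdio>
#include <cstdlib>

// A basic void function: prints a message and returns to the caller.
void newfunc() {
    std::printf("Hello! I'm a new function!\n");
}

// A function that returns an integer.
int newfuncret() {
    int A = std::rand();  // rand() leaves its result in EAX, which is stored in A
    return A;             // A is loaded back into EAX as the return value
}
```

On 32-bit x86, an integer return value travels back to the caller in EAX, which is why the last thing newfuncret() does is load A into that register.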
First, let’s take another look at the call statement:
Calling a Function with Parameters in C++
Calling a Function with Parameters
Although strings in C++ require a call to a basic_string function, the concept of calling a function with parameters is the same regardless of data type. First, you move each variable into a register, then you push the registers onto the stack, and then you call the function.
Let’s take a look at the function’s code:
The funcparams() Function Code
The funcparams() Function
All this function does is take in a string, an integer, and a character and print them out using printf. As you can see, first the 3 variables are allocated at the top of the function, then these variables are pushed onto the stack as parameters for the printf function. Easy Peasy.
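A minimal sketch of such a function is shown below. The parameter names and the format string are assumptions. The int return value is also my addition so that the result can be checked, and the function described above may well return void.

```cpp
#include <cstdio>
#include <string>

// Hypothetical sketch: takes a string, an integer, and a character,
// then prints all three with a single printf call.
int funcparams(std::string stringvar, int intvar, char charvar) {
    // printf cannot take a std::string directly, so c_str() supplies a char pointer.
    return std::printf("%s %i %c\n", stringvar.c_str(), intvar, charvar);
}
```

A call such as funcparams("abc", 7, 'x') prints the three values on one line.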
Loops
Now that we have function calling, output, variables, and math down, let’s move on to flow control. First, we’ll start with a for loop:
For Loop Code
A graphical Overview of the For Loop
Before we break down the assembly code into smaller sections, let’s take a look at the general layout. As you can see, when the for loop starts, it has 2 options: it can either go to the box on the right (green arrow) and return, or it can go to the box on the left (red arrow) and loop back to the start of the for loop.
Detailed For Loop
First, we check if we’ve hit the maximum value by comparing the i variable to the max variable. If the i variable is not greater than or equal to the max variable, we continue down to the left, print out the i variable, add 1 to i, and continue back to the start of the loop. If the i variable is, in fact, greater than or equal to max, we simply exit the for loop and return.
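The comparison described above corresponds to a loop like the one in this sketch. The i and max names come from the text. The function name, the starting value of 0, and the iteration counter (added only so the result can be checked) are assumptions.

```cpp
#include <cstdio>

// Hypothetical sketch of the for loop discussed above.
int forloop(int max) {
    int iterations = 0;              // added for illustration only
    for (int i = 0; i < max; i++) {  // exit as soon as i >= max
        std::printf("%i\n", i);      // left-hand box: print i
        iterations++;
    }                                // i++ runs here, then back to the comparison
    return iterations;               // right-hand box: leave the loop and return
}
```

Notice that the source says "keep looping while i < max" while the assembly is described the other way around, as "leave when i >= max". Compilers commonly emit the inverted condition as the jump out of the loop.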
Now, let’s take a look at a while loop:
While Loop Code
While Loop
In this loop, all we’re doing is generating a random number between 0 and 20. If the number is 10 or greater, we exit the loop and print “I’m out!”; otherwise, we continue to loop.
In the assembly, the A variable is created and initially set to 0, then we initialize the loop by comparing A to the hex number 0A, which represents decimal 10. If A is not greater than or equal to 10, we generate a new random number, which is then set to A, and we continue back to the comparison. If A is greater than or equal to 10, we break out of the loop, print out “I’m out!” and then return.
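A sketch that matches the comparison against 0A described above could look like this. The function name, the rand() % 21 expression used to get a number between 0 and 20, and the return value are assumptions.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical sketch of the while loop discussed above.
int whileloop() {
    int A = 0;                 // A starts at 0
    while (A < 10) {           // compare A with 0Ah; leave when A >= 10
        A = std::rand() % 21;  // new random number in the range 0..20
    }
    std::printf("I'm out!\n");
    return A;                  // returned only so the final value can be inspected
}
```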
If Statements
Next, we’ll be talking about if statements. First, let’s take a look at the code:
IF Statement Code
This function generates a random number between 0 and 20 and stores said number in the variable A. If A is greater than 15, the program will print out “greater than 15”. If A is less than 15 but greater than 10, the program will print out “less than 15, greater than 10”. This pattern will continue until A is less than 5, in which case the program will print out “less than 5”.
Now, let’s take a look at the assembly graph:
IF Statement Assembly Graph
As you can see, the assembly is structured similarly to the actual code. This is because IF statements are simply “If X Then Y Else Z”. If we look at the first set of arrows coming out of the top section, we can see a comparison between the A variable and hex 0F, which represents decimal 15. If A is greater than or equal to 15, the program will print out “greater than 15” and then return. Otherwise, the program will compare A to hex 0A, which represents decimal 10. This pattern will continue until the program prints and returns.
Switch Statements
Switch statements are a lot like IF statements except in a Switch statement one variable or statement is compared to a number of ‘cases’ (or possible equivalences). Let’s take a look at our code:
Switch Statement Code
In this function, we set the variable A to equal a random number between 0 and 10. Then, we compare A to a number of cases using a Switch statement. If A is equal to any of the possible cases, the case number will be printed, and then the program will break out of the Switch statement and the function will return.
Now, let’s take a look at the assembly graph:
Switch Case Assembly Graph
Unlike IF statements, switch statements do not follow the “If X Then Y Else Z” rule. Instead, the program simply compares the conditional statement to the cases and only executes a case if said case is the conditional statement’s equivalent. Let’s first take a look at the initial 2 boxes:
The First 2 Graph Sections
First, the program generates a random number and sets it to A. Then, the program initializes the switch statement by first setting a temporary variable (var_D0) to equal A, then ensuring that var_D0 meets at least one of the possible cases. If var_D0 needs to default, the program follows the green arrow down to the final return section (see below). Otherwise, the program initiates a switch jump to the equivalent case’s section:
In the case that var_D0 (A) is equal to 5, the code will jump to the above case section, print out “5” and then jump to the return section.
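Here is a sketch of a switch statement with this shape. Only case 5 is confirmed by the text above. The other case values, the function name, the A parameter (used here in place of the random number), and the return value are assumptions.

```cpp
#include <cstdio>

// Hypothetical sketch of the switch statement discussed above.
// Returns the matched case, or -1 when no case matched.
int switchcase(int A) {
    int matched = -1;
    switch (A) {  // a compiler may copy A to a temporary and range-check it first
    case 1: std::printf("1\n"); matched = 1; break;
    case 2: std::printf("2\n"); matched = 2; break;
    case 3: std::printf("3\n"); matched = 3; break;
    case 4: std::printf("4\n"); matched = 4; break;
    case 5: std::printf("5\n"); matched = 5; break;
    default: break;  // no case matched: fall through to the return section
    }
    return matched;
}
```

When the case values are small and close together like this, a compiler can replace the chain of comparisons with a single range check followed by an indirect jump through a table of case addresses, which is the switch jump described above.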
User Input
In this section, we’ll cover user input using C++’s cin (the standard input stream, which is read with the >> operator). First, let’s look at the code:
User Input Code
In this function, we simply read a string into the variable sentence using cin, and then we print out sentence through a printf statement.
Let’s break this down into assembly. First, the C++ cin part:
C++ cin
This code simply initializes the string sentence, then calls the cin input function and stores the input in the sentence variable. Let’s take a look at the cin call a bit closer:
The C++ cin Function Upclose
First, the program moves the sentence variable into EAX, then pushes EAX onto the stack to be used as a parameter for the cin call. After the call, its result is moved into ECX, which is then pushed onto the stack for the printf statement:
User Input printf Statement
Thanks!
Hopefully, this article gave you a decent understanding of how basic programming concepts are represented in assembly. Keep an eye out for the next part of this series, BOLO: Reverse Engineering — Part 2 (Advanced Programming Concepts)!