C and C++ Secure Code Practices - Part 1 | Lucideus Research

Software vulnerabilities have been a major cause of computer security incidents since the advent of multiuser and networked computing, and they remain one today. This blog series focuses on implementation errors in C and C++, as well as the countermeasures that have been proposed and developed to deal with these vulnerabilities.

Implementation Vulnerabilities and exploitation techniques

a) Format Functions and Strings
A format string exploit occurs when data submitted as an input string is evaluated as a command by a format function. An attacker can use this to read the stack, crash the program, or overwrite memory and redirect execution to a sequence of their choosing.

To understand the attacks, we first need a few basic terms:
1) Format function: an ANSI C conversion function, such as printf or fprintf, that converts primitive variables of the programming language into a human-readable string representation.
2) Format string: the argument of the format function that contains the text and the format string parameters.
3) Format string parameter: a sequence such as %x or %s that defines the type of conversion the format function performs.

printf ("The magic number is: %d\n", 1911);

printf - format function.
"The magic number is: %d\n" - format string.
"%d" - format string parameter.
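
To make the three terms concrete, here is a minimal sketch that builds the same message with snprintf, another member of the printf family, instead of printing it. The helper name format_magic is an assumption made for illustration.

```c
#include <stdio.h>

/* format_magic is a hypothetical helper, not a library function.
 * snprintf is the format function, "The magic number is: %d" is the
 * format string, and %d is the format string parameter that converts
 * the int argument to decimal text.
 * Returns the number of characters the full message needs. */
int format_magic(char *out, size_t out_size, int number)
{
    return snprintf(out, out_size, "The magic number is: %d", number);
}
```

Calling format_magic(buf, sizeof buf, 1911) leaves the text "The magic number is: 1911" in buf.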

b) Stack and its Role
printf ("a has value %d, b has value %d, c is at address: %08x\n", a, b, &c);

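Format functions fetch their arguments from the stack. This holds for calling conventions that pass arguments on the stack, such as 32-bit x86; other conventions pass the first few arguments in registers.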

Will printf() detect that something is wrong?
The function printf() fetches its arguments from the stack. If the format string needs 3 arguments, it fetches 3 data items from the stack. Unless the stack were marked with a boundary, printf() cannot know that it has run out of the arguments that were provided to it.

There is no such marking, so printf() simply continues fetching data from the stack. In a mismatch case, it fetches data that does not belong to this function call.
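
The mechanism can be sketched with a tiny variadic function. What follows is a heavily simplified stand-in for a format function, not how any real C library implements printf(); the name mini_format and its restriction to %d and %s are assumptions made for illustration.

```c
#include <stdarg.h>
#include <stdio.h>

/* mini_format is a hypothetical format function that understands only
 * %d and %s. The number of va_arg fetches is driven by the format string
 * alone: nothing tells the function how many arguments were really passed.
 * Returns the number of arguments it fetched. */
int mini_format(char *out, size_t out_size, const char *fmt, ...)
{
    va_list args;
    size_t used = 0;
    int fetched = 0;

    if (out_size == 0)
        return 0;
    out[0] = '\0';

    va_start(args, fmt);
    for (const char *p = fmt; *p != '\0' && used + 1 < out_size; p++) {
        int written;

        if (p[0] == '%' && p[1] == 'd') {
            written = snprintf(out + used, out_size - used, "%d",
                               va_arg(args, int));
            fetched++;
            p++;                        /* skip the conversion character */
        } else if (p[0] == '%' && p[1] == 's') {
            written = snprintf(out + used, out_size - used, "%s",
                               va_arg(args, const char *));
            fetched++;
            p++;
        } else {
            written = snprintf(out + used, out_size - used, "%c", *p);
        }
        if (written < 0)
            break;
        used += (size_t)written;
        if (used >= out_size)           /* output was truncated */
            used = out_size - 1;
    }
    va_end(args);
    return fetched;
}
```

A call such as mini_format(buf, sizeof buf, "%d %d", 1) compiles, yet the second va_arg reads an argument that was never passed. That is undefined behavior, and it is the same position printf() is in when the format string asks for more than the caller supplied.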



For each %s, printf() fetches a number from the stack, treats this number as an address, and prints the memory contents at that address as a string, up to the first null character.

Since the number fetched by printf() might not be an address, the memory pointed by this number might not exist (i.e. no physical memory has been assigned to such an address), and the program will crash.

It is also possible that the number happens to be a good address, but the address space is protected (e.g. it is reserved for kernel memory). In this case, the program will also crash.

One format specifier is particularly interesting to attackers: %n. This specifier writes the number of characters that have been formatted so far to the location pointed to by the corresponding argument of the format function.

Thus if attackers are able to specify the format string, they can use format specifiers like %x (print the hex value of an integer) to pop words off the stack, until they reach a pointer to a value they wish to overwrite. This value can then be overwritten by crafting a special format string with %n specifiers.

Example 1:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main (int argc, char **argv)
{
    char buf[100];
    int x = 1;

    if (argc < 2)
        return 1;
    snprintf (buf, sizeof buf, argv[1]); /* vulnerable: user input is used as the format string */
    buf[sizeof buf - 1] = 0;
    printf ("Buffer size is: (%zu) \nData input: %s \n", strlen (buf), buf);
    printf ("X equals: %d/ in hex: %#x\nMemory address for x: (%p) \n", x, x, (void *) &x);
    return 0;
}

Here x stands in for sensitive data: it could just as well be the location where a user's password begins.
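
The usual repair for code like Example 1 is to keep the format string constant and pass the user's text as an argument. A minimal sketch, with the helper name store_input chosen only for illustration:

```c
#include <stdio.h>

/* store_input is a hypothetical helper. The user-controlled text is an
 * argument to the constant format string "%s", so any % sequences inside
 * it are copied literally instead of being interpreted.
 * Returns the length the full text needs, as reported by snprintf. */
int store_input(char *buf, size_t buf_size, const char *input)
{
    return snprintf(buf, buf_size, "%s", input);
}
```

With this change an input such as %x%x%n ends up in the buffer as six ordinary characters.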

Denial of Service
When a program requests an invalid memory address, it is normally terminated. An attacker can provoke this deliberately to make the program unavailable.

printf (userName);

The attacker can supply a sequence of format specifiers, each of which makes the program treat another value from the stack as an address and read the data stored there. The more specifiers the attacker supplies, the more likely it is that one of them reads an illegal address, crashing the program and making it unavailable.

printf (%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s);

Integer Exploits
An attacker can write an integer to nearly any location in memory.
%n - The number of characters written so far is stored into the integer indicated by the corresponding argument.

For example:

#include  <stdio.h>
#include  <string.h>
#include  <stdlib.h>

int main (int argc, char **argv)
{
    int x = 0;
    printf ("12345%n", &x); // %n stores 5, the number of characters written so far, in x
    printf ("\nx = %d\n", x);
    return 0;
}
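
For contrast, here is a sketch of %n used the way it was intended, with a constant format string, so that the write goes only to an int the programmer chose. The helper name label_value is an assumption, and some C libraries restrict or disable %n for safety, so this assumes one that supports it.

```c
#include <stdio.h>

/* label_value is a hypothetical helper. It formats "label: value" and
 * uses %n to record how many characters precede the value.
 * Returns that offset, or -1 if %n was never reached. */
int label_value(char *out, size_t out_size, const char *label, int value)
{
    int value_offset = -1;

    snprintf(out, out_size, "%s: %n%d", label, &value_offset, value);
    return value_offset;
}
```

The danger described in this section arises only when the attacker, rather than the programmer, decides where %n appears and which pointer it consumes.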

If the code has been written carelessly, an attacker can go as far as viewing the contents of the program's memory and changing them at will.

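Consider the example below:
#include  <stdio.h>
#include  <string.h>
#include  <stdlib.h>

int main (int argc, char **argv)
{
    char user_input[100];
    /* Assumed completion of the example: read a line and hand it to printf() unchanged */
    if (fgets (user_input, sizeof user_input, stdin) == NULL)
        return 1;
    printf (user_input); /* vulnerable: user input is used as the format string */
    return 0;
}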

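Now suppose user_input holds the raw bytes of "\x10\x0F\x40\x48 %x %x %x %x %s". The four leading bytes place the address 0x100F4048 on the stack (in the byte order shown; on a little-endian machine such as x86 the bytes would have to be reversed). The %x specifiers advance printf() through the stack until it reaches that address, and %s then prints the memory at that location. How many %x specifiers are needed depends on the stack layout. If we replace %s with %n, we can change the value stored at that address.

Integer Sign Caution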
Integer signedness errors, on the other hand, are more subtle: when the programmer defines an integer, it is assumed to be a signed integer unless explicitly declared unsigned. When the programmer later passes this integer as an argument to a function expecting an unsigned value, an implicit conversion occurs.

This can lead to a situation where a negative argument passes a maximum size test but is used as a large unsigned value afterwards, possibly causing a buffer or heap overflow if used in conjunction with a copy operation (e.g. memcpy expects an unsigned integer as its size argument, so a negative signed integer passed to it is treated as a large unsigned value).
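
The two halves of the problem can be shown side by side. Both helpers below are hypothetical and exist only to make the implicit conversion visible.

```c
#include <stddef.h>
#include <stdint.h>

/* A signed bounds check: every negative length counts as "small enough". */
int passes_max_check(int len, int max_len)
{
    return len <= max_len;
}

/* What a function taking size_t actually receives for that length. */
size_t as_size(int len)
{
    return (size_t)len;   /* negative values wrap modulo SIZE_MAX + 1 */
}
```

For a length of -1, passes_max_check(-1, 100) succeeds, while as_size(-1) equals SIZE_MAX, the largest value size_t can hold.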

Suppose an attacker wants to obtain restricted data from an operating system:

A memcpy function is declared as such:
void * memcpy ( void * destination, const void * source, size_t num );

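where size_t is an unsigned type. In that case, consider a function like:
void copyRAWdata(void* userData, int maxLen) {
    int len = KSIZE < maxLen ? KSIZE : maxLen;
    memcpy(userData, kbuf, len); /* kbuf (assumed name): the restricted source buffer */
}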

where KSIZE is the maximum number of bytes we want to allow the user to copy. If the caller passes a positive value for maxLen, the function works as expected. But if the caller passes a negative value, the comparison KSIZE < maxLen is false, so len takes that negative value and becomes memcpy's third argument. Once converted to unsigned, the number of bytes copied is huge, so the caller may obtain restricted data.
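
One way to close this hole is to reject negative lengths before the value is ever used as a size. The sketch below is a hypothetical hardened variant, not the original function: the value of KSIZE, the buffer kbuf and its contents, and the name copy_raw_data_checked are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define KSIZE 16                                /* assumed limit */
static const char kbuf[KSIZE] = "restricted";   /* assumed source buffer */

/* Copies at most KSIZE bytes from kbuf to userData.
 * Returns the number of bytes copied, or -1 if the request is invalid. */
int copy_raw_data_checked(void *userData, int maxLen)
{
    size_t len;

    if (userData == NULL || maxLen < 0)
        return -1;                    /* refuse before converting to size_t */
    len = (size_t)maxLen < (size_t)KSIZE ? (size_t)maxLen : (size_t)KSIZE;
    memcpy(userData, kbuf, len);
    return (int)len;
}
```

Declaring the length parameter as size_t in the first place is another option, since the caller's value and memcpy's argument then share one type.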

Countermeasures
Address randomization: just like the countermeasures used to protect against buffer-overflow attacks, address randomization makes it difficult for attackers to find out which address they want to read or write.

Some safe languages perform explicit type checking for format functions: they determine the type of each argument and compare it with the type that the corresponding format specifier expects.
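
C does not do this at the language level, but GCC and Clang offer an opt-in approximation through the format function attribute. The sketch below assumes one of those compilers; the macro PRINTF_LIKE and the wrapper format_message are names made up for illustration.

```c
#include <stdarg.h>
#include <stdio.h>

/* GCC and Clang understand the format attribute; elsewhere it expands to nothing. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_index, first_arg) \
    __attribute__((format(printf, fmt_index, first_arg)))
#else
#define PRINTF_LIKE(fmt_index, first_arg)
#endif

/* format_message is a hypothetical wrapper around vsnprintf. The attribute
 * says that parameter 3 is a printf-style format string and that the
 * arguments to check start at parameter 4. */
int format_message(char *out, size_t out_size, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

int format_message(char *out, size_t out_size, const char *fmt, ...)
{
    va_list args;
    int written;

    va_start(args, fmt);
    written = vsnprintf(out, out_size, fmt, args);
    va_end(args);
    return written;
}
```

With -Wformat enabled, a call such as format_message(buf, sizeof buf, "%s", 42) draws a warning at compile time, and -Wformat-security additionally flags calls whose format string is not a literal and that pass no further arguments.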

Taint analysis is also helpful. It marks all user input as tainted and reports an error when a variable that is expected to be untainted, such as a format string, is derived from a tainted value.

Library wrappers around the format functions can also be used to prevent malicious use of format strings.
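
As a rough idea of what such a wrapper might check, here is a sketch that refuses to format unless the format string asks for exactly as many arguments as the caller declares, and that rejects %n outright. Both function names are hypothetical, and the parser covers only the common specifier syntax, so treat it as an illustration rather than a complete defense.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Counts the arguments a printf-style format string would consume.
 * Returns -1 if it contains %n or ends in an incomplete specifier. */
int count_format_args(const char *fmt)
{
    int count = 0;

    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;                              /* step past '%' */
        if (*p == '%')
            continue;                     /* "%%" is a literal percent sign */
        /* skip flags, width, precision and length modifiers */
        while (*p != '\0' && strchr("-+ #0123456789.*hlLjzt", *p) != NULL) {
            if (*p == '*')
                count++;                  /* '*' reads its value from an int argument */
            p++;
        }
        if (*p == '\0' || *p == 'n')
            return -1;
        count++;
    }
    return count;
}

/* Formats only when the declared and the requested argument counts agree.
 * Returns the result of vsnprintf, or -1 if the call was refused. */
int checked_snprintf(char *out, size_t out_size, int nargs,
                     const char *fmt, ...)
{
    va_list args;
    int written;

    if (nargs < 0 || count_format_args(fmt) != nargs)
        return -1;
    va_start(args, fmt);
    written = vsnprintf(out, out_size, fmt, args);
    va_end(args);
    return written;
}
```

The check relies on the caller stating the argument count honestly and does not compare argument types, so it complements a constant format string rather than replacing one.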
