For programmers who write C or C++, pointers remain a relevant topic and the learning never stops. We keep discovering new things about pointers as we try to become more proficient programmers. Here I will write down some basic points that we should all know while using pointers.
What is a pointer? Any data that needs to be stored needs memory space. Usually, this space is provided to us on the stack or the heap of the program. Hence, there must be a location in memory where this data is stored. This location is called the address. For convenience, the data is identified in the program by a variable name. If the variable is declared inside a function, it is called a local variable and is stored in the stack frame of that function. Say the address is 1000, the variable is ‘i’ and its value is the integer 10. The address then marks the start of the memory that holds the data stored in the variable.
Now, can another variable store the address of some other variable? The answer is yes. A variable that holds the address of another variable is called a pointer variable, or simply a pointer. Say we have a variable ‘ptr’ which holds the address of ‘i’. We then say that ‘ptr’ points to ‘i’. The content of ‘ptr’ is 1000, which is the address of ‘i’ as provided by the ‘&’ operator. Of course, ‘ptr’ is itself a variable and needs a place to reside. Let’s say that it has the address 2000.
int i = 10;
int *ptr = &i;
The above two statements show how to declare a pointer variable and assign it an address. The important thing to note here is that ‘ptr’ is not an integer, but a pointer variable which can hold the address of an integer variable. Now, since ‘ptr’ points to ‘i’, it should be able to retrieve the value of ‘i’, right? This is achieved using the ‘*’ operator, which de-references the pointer. The value returned by ‘*’ has the type that the pointer points to, in this case an integer.
int j = *ptr;
Now we know that the above variables are of built-in types and are all stored on the stack. Since the stack size is limited, larger data accessed through pointers is usually stored on the heap. This means that we ask the memory allocator, at run time, to reserve some amount of space on the heap and return its starting address, which we store in a pointer variable. How we traverse the allocated memory depends on the type of the pointer.
int *elem = (int*) malloc(sizeof(int) * 10);
In the above statement, we are allocating memory on the heap and the starting address is stored in elem. malloc( ) returns a void pointer. In C++ we have to cast it to the type of the pointer variable; in C the conversion is implicit and the cast is optional. Here, space is allocated to store ten integers. Do note that the variable elem itself is in the stack frame of the function; it only points to a location on the heap which is capable of holding ten integers. malloc( ) returns NULL if the allocation fails, so the result should be checked before it is used.
When we are done using the allocated memory, we have to release it back to the heap. If the function returns without releasing the heap memory, its stack frame is unwound and elem, which is a local variable of the function, is destroyed. We knew the starting address of the heap-allocated memory only because it was stored in ‘elem’. Once elem no longer exists, and no copy of the address was kept elsewhere, we do not know where the heap memory was allocated and will not be able to free it. This situation is called a memory leak.
free(elem);
Here, free( ) takes the pointer variable holding the starting address of the heap-allocated memory. There is one question, however. How does free( ) know how much memory should be released from the heap? In many implementations, when malloc( ) allocates memory and returns the address, some metadata about the allocation is stored a few bytes before the returned address. The size of the allocation happens to be part of it, hence free( ) is able to perform the proper clean up. The exact layout is an implementation detail of the allocator, so a program should never read or modify those bytes directly.
Now, since the pointer variable itself stores an address, when we apply the sizeof( ) operator to the variable, we get the size occupied by the address. Say in the above scenario ‘i’ is an integer and occupies 4 bytes. Its value is 10, and the value 10 is stored at location 1000. Since ‘i’ occupies 4 bytes, the next integer can take up the location 1004. The third integer can take up the location 1008, and so on. When ‘ptr’ is pointing to ‘i’, ‘ptr’ has the value 1000. On a 32-bit system this address is represented using 32 bits (4 bytes) and is usually written in hexadecimal; on a 64-bit system an address typically takes 8 bytes. ‘ptr’ itself needs space, which is at address 2000.
In the second scenario, when we allocate ten integer elements, say address 3000 is where the first integer will reside. From the above explanation we can infer that the second integer is at address 3004, the third at 3008, and so on. The important point is that elem holds the address of the first integer and has no idea how many integers have been allocated. Because we allocated ten elements, we know there are ten elements, hence using pointer arithmetic we can iterate through the other nine elements as well. Adding 1 to an integer pointer advances it by sizeof(int) bytes, not by one byte. So when we use the sizeof( ) operator on elem, we still get 4 bytes on our 32-bit system, because that is the size of an address, not the size of the ten integers.
Now suppose we are in a situation where we do not know how many elements have been allocated, or a single structure has been allocated, and we need to know the space it occupies on the heap. The ‘*’ and sizeof( ) operators would not help in this case. As we discussed earlier, sizeof( ) provides the size of the pointer variable, which is 4 bytes in our example. ‘*’ would de-reference the variable. Hence, if the variable points to a structure, you would be able to access the data members of the structure and, say, print them out. If the variable is an integer pointer like ‘ptr’, you would get the value of ‘i’. If the variable is elem, you would get the value of the first of the ten integers. The workaround here is to use _msize( ) on Windows or malloc_usable_size( ) on GNU/Linux to retrieve the size of the heap block pointed to by a pointer. Both are non-standard, and the value returned may be larger than the size originally requested, because the allocator is free to round a request up. Now would this work? We know that free( ) does a good job without being told the number of elements, so this should too, right? Why don’t you give it a shot, because it has surely worked for me.