
23 agosto 2008

sizeof( struct ) fails? No but...

Many of you asked me why the interesting C/C++/ROOT lessons suddenly stopped. The answer is that I had to focus on so many other duties that I couldn't find a single minute to spare (sorry, to invest) in spreading a bit of knowledge about my favorite programming language.

But here I am, back again to tell you a funny thing I've just discovered about how compilers lay out the memory for unions, structs and classes. Let me introduce my real-life example:

I have to read a binary file organized in big blocks, each of which is divided (as often happens) into three pieces: a header, a data block and a final trailer. The header contains several pieces of information, the most important being the size of the data block, which is variable. Reading the trailer correctly confirms that the whole block was read the right way. It seemed so simple that I decided to implement it right away in the most straightforward way. So I prepared a C struct containing all the fields of the header block and started looping over the input file, but I immediately ran into a problem reading the first trailer. Why?

To better understand the problem, I chopped the program into several pieces and, surprisingly enough, I discovered that the issue was related to a seemingly strange behavior of the sizeof operator when applied to my header struct. Here is the example:

struct header_struct {

   int   first;
   int   second;
   short third;

};

Intuitively I would have said that sizeof(header_struct) should be twice sizeof(int) plus sizeof(short), which on my 32-bit computer is 2 * 4 + 2 = 10. But, if you don't trust me, try compiling the following code to see what happens:

#include <iostream>
#include <cassert>

using namespace std;

struct header_struct {

   int first;
   int second;
   short third;

};

int main() {

   // print out some sizeof
   cout << "Size of int " << sizeof( int ) << endl
        << "Size of short " << sizeof( short ) << endl
        << "Size of header_struct " << sizeof( header_struct ) << endl;

   // assert the expected size
   assert( sizeof( header_struct ) == 2 * sizeof( int ) + sizeof( short ));

   return 0;
}




First of all, you will notice that the size of header_struct is 12 bytes and not 10 as expected, and moreover the assertion fails. But why? Is sizeof failing?

The reason is quite simple. The members of our structure add up to 10 bytes, which unfortunately is not an integer multiple of 4 bytes ( == 32 bits), the alignment an int requires on a standard 32-bit computer. Since some platforms don't allow memory accesses at misaligned addresses (and others allow them but perform them more slowly), the compiler pads the user-defined structure with empty holes so that every member sits at an address that is a multiple of its own alignment, and so that the total size is a multiple of the strictest alignment among the members. Here this means 2 unused bytes after third, which keeps first and second aligned in every element of an array of header_struct.

To keep the memory waste to a minimum, the user should try to order the structure data members from the largest to the smallest, so that no holes are needed between members, although an empty hole at the end may be required anyway. This is particularly important when the compiler has to allocate the memory for a consecutive set of structures (e.g. an array of header_struct), because the padding is paid once per element.

While this padding makes the platform's life easier, it can make the user's life far more complicated, mainly because it is not always possible to give a struct a properly aligned size and, in some cases, the layout is defined by somebody else who didn't think or know about padding. In this case the user can ask the compiler not to pad the structure and accept unaligned accesses to the memory. To do so, one has to add the following single line of code before the struct definition:

#pragma pack(1)

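This directive is not part of the language standard but a compiler extension, supported by the common compilers (GCC, Clang and MSVC among them). It tells the compiler to lay out the members on 1-byte boundaries, that is with no holes at all, as if the platform could access the memory at any byte address, and it stays in effect for all the structures defined after it. It is then the user's responsibility to verify that his own platform / operating system behaves well with unaligned accesses!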

