
PERFORMANCE Code Killer – Unaligned Memory - C# Structs

Posted by Logu Krishnan | Posted in C#, Performance | Posted on 24-02-2009


Recently I was profiling a .NET application that defined a lot of structs, and ran into a strange reality: the unaligned memory problem!
The profiler showed that the memory allocated for a few of the structs was larger than it should have been (based on my own math of counting the bytes). When I probed further, there was an interesting discovery. Read on…

Alright, here is a little head spinner… What is the difference between the following structures?

struct BadStructure
{
    char c1;
    int i;
    char c2;
}

struct GoodStructure
{
    int i;
    char c1;
    char c2;
}

Nothing much, except the jumbled order of the field declarations… Huh?

Fine, now let’s look at the sizes of these structures.

The size of BadStructure in:
.NET Framework 3.5: managed sizeof = 12 bytes, Marshal.SizeOf = 12 bytes

The size of GoodStructure in:
.NET Framework 3.5: managed sizeof = 8 bytes, Marshal.SizeOf = 8 bytes

[Note: in managed code, the size of int is 4 bytes and the size of char is 2 bytes]
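
To check these numbers yourself, here is a minimal sketch that measures both structures with Marshal.SizeOf. The StructSizeDemo class and its method names are made up for illustration, and the fields are public only to avoid unused-field warnings. The managed sizeof operator is left out because it needs an unsafe context for user-defined structs.

```csharp
using System;
using System.Runtime.InteropServices;

// Same field order as BadStructure in the text.
public struct BadStructure
{
    public char c1;
    public int i;
    public char c2;
}

// Same field order as GoodStructure in the text.
public struct GoodStructure
{
    public int i;
    public char c1;
    public char c2;
}

// Hypothetical helper class, not part of any library.
public static class StructSizeDemo
{
    // Marshaled (unmanaged) size of BadStructure, in bytes.
    public static int BadSize()
    {
        return Marshal.SizeOf(typeof(BadStructure));
    }

    // Marshaled (unmanaged) size of GoodStructure, in bytes.
    public static int GoodSize()
    {
        return Marshal.SizeOf(typeof(GoodStructure));
    }

    // Writes both sizes to the console.
    public static void Print()
    {
        Console.WriteLine("BadStructure:  " + BadSize() + " bytes");
        Console.WriteLine("GoodStructure: " + GoodSize() + " bytes");
    }
}
```

Calling StructSizeDemo.Print() from your own Main should report 12 bytes and 8 bytes, matching the figures above.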

The reason behind these differences is byte alignment. As with the default packing in unmanaged C++, integers are laid out on four-byte boundaries. So while the first character uses two bytes (a char in managed code is a Unicode character, thus occupying two bytes), the integer moves up to the next 4-byte boundary, and the second character uses the subsequent 2 bytes. That adds up to 10 bytes, and the total is then rounded up to a multiple of four so that the integer stays aligned when the structures sit back to back in an array. The resulting structure is 12 bytes, which is also what Marshal.SizeOf reports.
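
The offsets described above can be inspected with Marshal.OffsetOf. This is only a sketch: the BadLayout and OffsetDemo names are hypothetical, and CharSet.Unicode is an assumption I add so that the marshaled char is two bytes wide, like the managed one (with the default Ansi setting it marshals as a single byte).

```csharp
using System;
using System.Runtime.InteropServices;

// Assumption: CharSet.Unicode makes the marshaled char two bytes wide,
// so the unmanaged layout mirrors the managed one described in the text.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public struct BadLayout
{
    public char c1;
    public int i;
    public char c2;
}

// Hypothetical helper class, not part of any library.
public static class OffsetDemo
{
    // Byte offset of the named field inside BadLayout.
    public static int OffsetOf(string fieldName)
    {
        return Marshal.OffsetOf(typeof(BadLayout), fieldName).ToInt32();
    }

    // Total marshaled size of BadLayout, including pad bytes.
    public static int TotalSize()
    {
        return Marshal.SizeOf(typeof(BadLayout));
    }
}
```

Under these assumptions c1 lands at offset 0, i at offset 4 and c2 at offset 8, for a total of 12 bytes.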

32-bit microprocessors typically organize memory as shown below.

         Byte0  Byte1  Byte2  Byte3
0x1000
0x1004   A0     A1     A2     A3
0x1008
0x100C          B0     B1     B2
0x1010   B3

Many processor architectures cannot read multi-byte data from odd addresses at all, and most are inefficient at reading data that starts at an address not divisible by four.
Memory is accessed by performing 32-bit bus cycles, and a 32-bit bus cycle can only be performed at an address that is divisible by 4. So for efficiency purposes, compilers add so-called pad bytes. The reasons for not permitting misaligned long word reads and writes are not difficult to see. For example, an aligned long word A would be written as A0, A1, A2 and A3 at address 0x1004.

Here is the IL for a value type of this kind. Note the sequential and ansi keywords on the class declaration.

.class nested private sequential ansi sealed beforefieldinit BadValueType
    extends [mscorlib]System.ValueType
{
    .field public char c1
    .field public char c2
    .field public int32 i
}
In the .NET Framework 3.5, the JIT does enforce a Sequential layout (if specified) for the managed layout of value types.
We can use the StructLayoutAttribute class in the System.Runtime.InteropServices namespace to control the physical layout of the data fields in the Microsoft .NET Framework 3.5.
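
As one illustration of that control, here is a sketch that places every field by hand with LayoutKind.Explicit and FieldOffset. The ExplicitStructure and ExplicitLayoutDemo names are hypothetical, the offsets are ones I picked to mimic GoodStructure, and CharSet.Unicode is again an assumption so that each marshaled char takes two bytes.

```csharp
using System;
using System.Runtime.InteropServices;

// Each field is pinned to a byte offset chosen by hand.
// Assumption: CharSet.Unicode, so a marshaled char occupies two bytes.
[StructLayout(LayoutKind.Explicit, CharSet = CharSet.Unicode)]
public struct ExplicitStructure
{
    [FieldOffset(0)] public int i;   // bytes 0..3, four-byte aligned
    [FieldOffset(4)] public char c1; // bytes 4..5, two-byte aligned
    [FieldOffset(6)] public char c2; // bytes 6..7, two-byte aligned
}

// Hypothetical helper class, not part of any library.
public static class ExplicitLayoutDemo
{
    // Byte offset of the named field inside ExplicitStructure.
    public static int OffsetOf(string fieldName)
    {
        return Marshal.OffsetOf(typeof(ExplicitStructure), fieldName).ToInt32();
    }

    // Total marshaled size of ExplicitStructure.
    public static int Size()
    {
        return Marshal.SizeOf(typeof(ExplicitStructure));
    }
}
```

With explicit offsets the alignment is entirely your responsibility, so it is worth keeping each field on a boundary that matches its size, as done here.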

Because the long word A at address 0x1004 is aligned, the microprocessor can read the complete long word in a single bus cycle. If the same microprocessor now attempts to access a long word at address 0x100D, it will have to read bytes B0, B1, B2 and B3. Notice that this read cannot be performed in a single 32-bit bus cycle. The microprocessor will have to issue two different reads, at addresses 0x100C and 0x1010, to read the complete long word. Thus it takes twice the time to read a misaligned long word.
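
The arithmetic behind that can be sketched in a few lines. This is a simplified model, not a description of any particular processor: it assumes one bus cycle per aligned 32-bit word touched, and the BusCycleDemo name is made up for illustration.

```csharp
// Hypothetical helper class modelling a 32-bit bus.
public static class BusCycleDemo
{
    // Number of aligned 4-byte words touched by a read of `length` bytes
    // starting at `address`. Assumes length is at least 1 and that each
    // touched word costs exactly one bus cycle.
    public static int BusCycles(uint address, uint length)
    {
        uint firstWord = address / 4;               // word holding the first byte
        uint lastWord = (address + length - 1) / 4; // word holding the last byte
        return (int)(lastWord - firstWord + 1);
    }
}
```

In this model BusCycles(0x1004, 4) gives 1 for the aligned long word A, while BusCycles(0x100D, 4) gives 2 for the misaligned long word B.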

The following byte padding rules will generally work with most 32-bit processors.
a. Single-byte numbers can be aligned at any address
b. Two-byte numbers should be aligned to a two-byte boundary
c. Four-byte numbers should be aligned to a four-byte boundary
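
These rules are enough to predict the size of a structure by hand. Below is a rough sketch of that calculation; PaddingDemo and its methods are hypothetical names, and the final rounding of the total to the largest field alignment is an assumption I add on top of the three rules (it is what keeps array elements aligned).

```csharp
using System;

// Hypothetical helper class, not part of any library.
public static class PaddingDemo
{
    // Rounds offset up to the next multiple of alignment.
    private static int AlignUp(int offset, int alignment)
    {
        return (offset + alignment - 1) / alignment * alignment;
    }

    // Padded size of a structure whose fields have the given sizes
    // (1, 2 or 4 bytes), laid out in declaration order with each field
    // aligned to its own size.
    public static int PaddedSize(int[] fieldSizes)
    {
        int offset = 0;
        int maxAlignment = 1;
        foreach (int size in fieldSizes)
        {
            offset = AlignUp(offset, size); // insert pad bytes before the field
            offset += size;                 // then place the field itself
            maxAlignment = Math.Max(maxAlignment, size);
        }
        // Assumption: trailing pad bytes round the total up to the
        // largest field alignment.
        return AlignUp(offset, maxAlignment);
    }
}
```

For example, PaddedSize(new int[] { 2, 4, 2 }) models BadStructure and gives 12, while PaddedSize(new int[] { 4, 2, 2 }) models GoodStructure and gives 8.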

This is the cause of the difference.
Fine… How do we fix this?

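The .NET compilers all apply a StructLayoutAttribute to structures, specifying a Sequential layout. This means that the fields are laid out in the type according to their order in the source file.
Fix = specify [StructLayout(LayoutKind.Sequential, Pack = 1)] for the struct. Keep in mind that Pack = 1 removes the pad bytes but leaves the int on an unaligned address, so where you control the declaration, ordering the fields as in GoodStructure gives the smaller size without the misaligned reads.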
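
Here is a minimal sketch of the Pack = 1 fix applied to the same field order as BadStructure. The PackedBadStructure and PackDemo names are hypothetical, and CharSet.Unicode is an assumption so that the marshaled char is two bytes, as in managed code.

```csharp
using System;
using System.Runtime.InteropServices;

// Pack = 1 tells the runtime not to insert pad bytes between fields.
// Assumption: CharSet.Unicode, so a marshaled char occupies two bytes.
[StructLayout(LayoutKind.Sequential, Pack = 1, CharSet = CharSet.Unicode)]
public struct PackedBadStructure
{
    public char c1;
    public int i;
    public char c2;
}

// Hypothetical helper class, not part of any library.
public static class PackDemo
{
    // Total marshaled size of the packed structure.
    public static int Size()
    {
        return Marshal.SizeOf(typeof(PackedBadStructure));
    }

    // Byte offset of the int field, which is no longer four-byte aligned.
    public static int OffsetOfInt()
    {
        return Marshal.OffsetOf(typeof(PackedBadStructure), "i").ToInt32();
    }
}
```

Under these assumptions the size drops from 12 to 8 bytes, but the int now starts at offset 2 rather than on a four-byte boundary.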

Watch out for structures the next time you create them, and think about what happens when you play around with ‘m’ structures of size ‘n’… m x n = !!!
