One Memory, Multiple Views: A Complete Guide to Unions in C

While a structure gives each of its members its own memory location, a union takes a different approach: all of its members occupy the same memory location. Think of a union as a single block of memory that can be interpreted in multiple ways. This saves memory when only one of several values is needed at a time, which makes unions particularly useful in systems programming, embedded systems, and protocol implementation.

What Is a Union?

A union is a special data type whose members all share the same memory space. Unlike a structure, where each member has its own storage, writing to one union member overwrites the others. A union is at least as large as its largest member; the compiler may add trailing padding to satisfy alignment.

union Data {
int i;
float f;
char str[20];
};

In this union:

  • i, f, and str all start at the same address
  • The size is typically 20 bytes (the size of str[20], the largest member); sizeof(union Data) gives the exact value on your platform
  • Only the most recently written member holds a meaningful value

Declaring and Using Unions

Basic Union Example:

#include <stdio.h>
#include <string.h>

union Data {
    int i;
    float f;
    char str[20];
};

int main(void) {
    union Data data;
    printf("Size of union: %zu bytes\n", sizeof(data));  // %zu is the format for size_t

    // Using integer member
    data.i = 42;
    printf("data.i = %d\n", data.i);

    // Using float member (overwrites integer)
    data.f = 3.14159f;
    printf("data.f = %f\n", data.f);
    printf("data.i (now corrupted) = %d\n", data.i);  // The float's bit pattern read as an int

    // Using string member (overwrites float)
    strcpy(data.str, "Hello");
    printf("data.str = %s\n", data.str);
    printf("data.f (now corrupted) = %f\n", data.f);  // The first bytes of "Hello" read as a float
    return 0;
}

Output (values may vary):

Size of union: 20 bytes
data.i = 42
data.f = 3.141590
data.i (now corrupted) = 1078530000
data.str = Hello
data.f (now corrupted) = 1143139122437582500000000000000.000000

Unions vs Structures

The key difference is memory allocation:

#include <stdio.h>
#include <stddef.h>  // offsetof

struct StructData {
    int i;
    float f;
    char str[20];
};

union UnionData {
    int i;
    float f;
    char str[20];
};

int main(void) {
    struct StructData s;
    union UnionData u;

    // Each structure member has its own offset
    printf("Structure size: %zu bytes\n", sizeof(s));
    printf("  - int offset: %zu\n", offsetof(struct StructData, i));
    printf("  - float offset: %zu\n", offsetof(struct StructData, f));
    printf("  - char[20] offset: %zu\n", offsetof(struct StructData, str));

    // Every union member starts at the same address
    printf("\nUnion size: %zu bytes\n", sizeof(u));
    printf("All members share the same address\n");
    printf("Address of u.i: %p\n", (void*)&u.i);
    printf("Address of u.f: %p\n", (void*)&u.f);
    printf("Address of u.str: %p\n", (void*)&u.str);
    return 0;
}

Output (addresses will differ between runs):

Structure size: 28 bytes
  - int offset: 0
  - float offset: 4
  - char[20] offset: 8

Union size: 20 bytes
All members share the same address
Address of u.i: 0x7ffc12345670
Address of u.f: 0x7ffc12345670
Address of u.str: 0x7ffc12345670

Type Field Pattern (Tagged Unions)

Since a union does not record which member was written last, it is usually paired with a separate field (a tag) that tracks the active type:

#include <stdio.h>
#include <string.h>
typedef enum {
TYPE_INT,
TYPE_FLOAT,
TYPE_STRING
} DataType;
typedef struct {
DataType type;
union {
int i;
float f;
char str[20];
} data;
} TaggedUnion;
void printValue(TaggedUnion *value) {
switch (value->type) {
case TYPE_INT:
printf("Integer: %d\n", value->data.i);
break;
case TYPE_FLOAT:
printf("Float: %f\n", value->data.f);
break;
case TYPE_STRING:
printf("String: %s\n", value->data.str);
break;
}
}
int main() {
TaggedUnion values[3];
values[0].type = TYPE_INT;
values[0].data.i = 42;
values[1].type = TYPE_FLOAT;
values[1].data.f = 3.14159;
values[2].type = TYPE_STRING;
strcpy(values[2].data.str, "Hello, Unions!");
for (int i = 0; i < 3; i++) {
printValue(&values[i]);
}
return 0;
}

Output:

Integer: 42
Float: 3.141590
String: Hello, Unions!

Practical Applications

Example 1: IP Address Representation

#include <stdio.h>
#include <stdint.h>     // uint32_t
#include <arpa/inet.h>  // byte-order helpers (POSIX)
typedef union {
uint32_t ipv4_address;      // As a 32-bit integer
unsigned char octets[4];     // As 4 separate octets
} IPv4Address;
int main() {
IPv4Address addr;
// Set IP address 192.168.1.100
addr.octets[0] = 192;
addr.octets[1] = 168;
addr.octets[2] = 1;
addr.octets[3] = 100;
printf("IP Address (octets): %d.%d.%d.%d\n",
addr.octets[0], addr.octets[1], 
addr.octets[2], addr.octets[3]);
printf("IP Address (as integer): %u (0x%08X)\n", 
addr.ipv4_address, addr.ipv4_address);
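// The octets above were stored in network (big-endian) order, so the
// integer view holds a network-order value. ntohl converts it to host
// byte order (a no-op on big-endian machines).
uint32_t host_order = ntohl(addr.ipv4_address);
printf("Host byte order: 0x%08X\n", host_order);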
return 0;
}

Example 2: Variant Data Type

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_VALUES 10
typedef enum {
INT_TYPE,
DOUBLE_TYPE,
CHAR_TYPE,
STRING_TYPE
} ValueType;
typedef struct {
ValueType type;
union {
int int_val;
double double_val;
char char_val;
char *string_val;
} data;
} Variant;
Variant create_int(int val) {
Variant v;
v.type = INT_TYPE;
v.data.int_val = val;
return v;
}
Variant create_double(double val) {
Variant v;
v.type = DOUBLE_TYPE;
v.data.double_val = val;
return v;
}
Variant create_char(char val) {
Variant v;
v.type = CHAR_TYPE;
v.data.char_val = val;
return v;
}
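Variant create_string(const char *val) {
Variant v;
v.type = STRING_TYPE;
// Copy the string so the Variant owns its storage (released by free_variant)
v.data.string_val = malloc(strlen(val) + 1);
if (v.data.string_val != NULL) {
strcpy(v.data.string_val, val);
}
return v;  // string_val is NULL if the allocation failed
}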
void print_variant(Variant v) {
switch (v.type) {
case INT_TYPE:
printf("%d", v.data.int_val);
break;
case DOUBLE_TYPE:
printf("%f", v.data.double_val);
break;
case CHAR_TYPE:
printf("'%c'", v.data.char_val);
break;
case STRING_TYPE:
// Guard against a NULL pointer, e.g. after a failed allocation
printf("\"%s\"", v.data.string_val ? v.data.string_val : "(null)");
break;
}
}
void free_variant(Variant v) {
if (v.type == STRING_TYPE) {
free(v.data.string_val);
}
}
int main() {
Variant values[MAX_VALUES];
int count = 0;
values[count++] = create_int(42);
values[count++] = create_double(3.14159);
values[count++] = create_char('A');
values[count++] = create_string("Hello");
values[count++] = create_int(100);
printf("Variant array:\n");
for (int i = 0; i < count; i++) {
printf("values[%d] = ", i);
print_variant(values[i]);
printf("\n");
}
// Clean up
for (int i = 0; i < count; i++) {
free_variant(values[i]);
}
return 0;
}

Example 3: Packet Parsing

#include <stdio.h>
#include <stdint.h>
#include <string.h>
typedef struct {
uint8_t version;      // IP version (4 or 6)
union {
struct {
uint8_t header_length;
uint8_t tos;
uint16_t total_length;
uint16_t identification;
uint16_t flags_fragment;
uint8_t ttl;
uint8_t protocol;
uint16_t checksum;
uint32_t source_ip;
uint32_t dest_ip;
} ipv4;
struct {
uint32_t flow_label;
uint16_t payload_length;
uint8_t next_header;
uint8_t hop_limit;
uint8_t source_ip[16];
uint8_t dest_ip[16];
} ipv6;
} header;
} IPPacket;
void print_packet(IPPacket *pkt) {
if (pkt->version == 4) {
printf("IPv4 Packet:\n");
// The shifted values are unsigned, so print them with %u
printf("  Source: %u.%u.%u.%u\n",
(pkt->header.ipv4.source_ip >> 24) & 0xFF,
(pkt->header.ipv4.source_ip >> 16) & 0xFF,
(pkt->header.ipv4.source_ip >> 8) & 0xFF,
pkt->header.ipv4.source_ip & 0xFF);
printf("  Dest:   %u.%u.%u.%u\n",
(pkt->header.ipv4.dest_ip >> 24) & 0xFF,
(pkt->header.ipv4.dest_ip >> 16) & 0xFF,
(pkt->header.ipv4.dest_ip >> 8) & 0xFF,
pkt->header.ipv4.dest_ip & 0xFF);
printf("  Protocol: %d\n", pkt->header.ipv4.protocol);
} else if (pkt->version == 6) {
printf("IPv6 Packet:\n");
printf("  Source: ");
for (int i = 0; i < 16; i += 2) {
printf("%02x%02x", 
pkt->header.ipv6.source_ip[i],
pkt->header.ipv6.source_ip[i+1]);
if (i < 14) printf(":");
}
printf("\n");
printf("  Next Header: %d\n", pkt->header.ipv6.next_header);
}
}
int main() {
IPPacket packet;
// Create an IPv4 packet
packet.version = 4;
packet.header.ipv4.source_ip = 0xC0A80164;  // 192.168.1.100
packet.header.ipv4.dest_ip = 0xC0A80101;    // 192.168.1.1
packet.header.ipv4.protocol = 6;             // TCP
packet.header.ipv4.ttl = 64;
print_packet(&packet);
// Reuse same memory for IPv6 packet
packet.version = 6;
// Set IPv6 source address (2001:db8::1)
memset(packet.header.ipv6.source_ip, 0, 16);
packet.header.ipv6.source_ip[0] = 0x20;
packet.header.ipv6.source_ip[1] = 0x01;
packet.header.ipv6.source_ip[2] = 0x0D;
packet.header.ipv6.source_ip[3] = 0xB8;
packet.header.ipv6.source_ip[15] = 0x01;
packet.header.ipv6.next_header = 6;  // TCP
printf("\n");
print_packet(&packet);
return 0;
}

Example 4: Register Bit Fields

#include <stdio.h>
#include <stdint.h>
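// Hardware register representation
// Note: the order of bit-fields within a byte, and the use of uint8_t as a
// bit-field type, are implementation-defined. The bit numbers in the comments
// below assume allocation starts at the least significant bit.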
typedef union {
struct {
uint8_t enable   : 1;   // Bit 0
uint8_t mode     : 2;   // Bits 1-2
uint8_t status   : 2;   // Bits 3-4
uint8_t reserved : 3;   // Bits 5-7
} bits;
uint8_t value;               // Full 8-bit register
} ControlRegister;
void print_register(ControlRegister reg) {
printf("Register value: 0x%02X\n", reg.value);
printf("  Enable: %d\n", reg.bits.enable);
printf("  Mode: %d\n", reg.bits.mode);
printf("  Status: %d\n", reg.bits.status);
printf("  Reserved: %d\n", reg.bits.reserved);
}
int main() {
ControlRegister reg = {0};
// Set individual bits
reg.bits.enable = 1;
reg.bits.mode = 2;  // Binary 10
reg.bits.status = 1; // Binary 01
printf("After setting bits:\n");
print_register(reg);
// Access as full register
reg.value = 0x3A;  // Binary 00111010
printf("\nAfter setting full value:\n");
print_register(reg);
return 0;
}

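Output (assuming bit-fields are allocated starting from the least significant bit, as GCC and Clang do on little-endian targets):

After setting bits:
Register value: 0x0D
  Enable: 1
  Mode: 2
  Status: 1
  Reserved: 0

After setting full value:
Register value: 0x3A
  Enable: 0
  Mode: 1
  Status: 3
  Reserved: 1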

Example 5: Endianness Detection

#include <stdio.h>
#include <stdint.h>
typedef union {
uint32_t word;
uint8_t bytes[4];
} EndianTest;
int main() {
EndianTest test;
test.word = 0x01020304;
printf("Memory layout of 0x01020304:\n");
printf("Byte 0 (lowest address): 0x%02X\n", test.bytes[0]);
printf("Byte 1: 0x%02X\n", test.bytes[1]);
printf("Byte 2: 0x%02X\n", test.bytes[2]);
printf("Byte 3 (highest address): 0x%02X\n", test.bytes[3]);
if (test.bytes[0] == 0x01) {
printf("\nSystem is big-endian\n");
} else if (test.bytes[0] == 0x04) {
printf("\nSystem is little-endian\n");
} else {
printf("\nSystem is mixed-endian (unlikely)\n");
}
return 0;
}

Output on little-endian system:

Memory layout of 0x01020304:
Byte 0 (lowest address): 0x04
Byte 1: 0x03
Byte 2: 0x02
Byte 3 (highest address): 0x01
System is little-endian

Unions and Byte Order

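Unions are useful for inspecting how a multi-byte value is laid out in memory. Note that the member names low and high in the example below match the actual layout only on little-endian systems; on a big-endian machine the two would be swapped: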

#include <stdio.h>
#include <stdint.h>
typedef union {
uint16_t word;
struct {
uint8_t low;
uint8_t high;
} bytes;
} Word16;
typedef union {
uint32_t dword;
struct {
uint16_t low;
uint16_t high;
} words;
uint8_t bytes[4];
} DWord32;
int main() {
Word16 w;
w.word = 0x1234;
printf("16-bit word: 0x%04X\n", w.word);
printf("Low byte: 0x%02X, High byte: 0x%02X\n", 
w.bytes.low, w.bytes.high);
DWord32 dw;
dw.dword = 0x12345678;
printf("\n32-bit dword: 0x%08X\n", dw.dword);
printf("Low word: 0x%04X, High word: 0x%04X\n", 
dw.words.low, dw.words.high);
printf("Bytes: 0x%02X 0x%02X 0x%02X 0x%02X\n",
dw.bytes[0], dw.bytes[1], dw.bytes[2], dw.bytes[3]);
return 0;
}

Anonymous Unions (C11)

C11 introduced anonymous unions, which can be used within structures without naming the union:

#include <stdio.h>
typedef struct {
int type;
union {  // Anonymous union
int int_value;
double double_value;
char *string_value;
};  // No union name
} Variant;
int main() {
Variant v;
v.type = 1;
v.int_value = 42;  // Direct access, no union name
printf("Value: %d\n", v.int_value);
printf("Size: %zu\n", sizeof(v));
return 0;
}

Unions with Complex Types

#include <stdio.h>
#include <complex.h>
typedef union {
struct {
double real;
double imag;
} cartesian;
struct {
double magnitude;
double phase;
} polar;
double complex complex_num;
} ComplexNumber;
int main() {
ComplexNumber z;
// Set as Cartesian coordinates
z.cartesian.real = 3.0;
z.cartesian.imag = 4.0;
printf("Cartesian: %.1f + %.1fi\n", 
z.cartesian.real, z.cartesian.imag);
// Access as complex type
double complex c = z.complex_num;
printf("Complex: %.1f + %.1fi\n", creal(c), cimag(c));
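// Store magnitude and phase instead. This overwrites the Cartesian values,
// since both views share memory; the union does not convert between them.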
z.polar.magnitude = 5.0;
z.polar.phase = 0.9273;  // arctan(4/3)
printf("\nPolar: mag=%.1f, phase=%.4f rad\n",
z.polar.magnitude, z.polar.phase);
return 0;
}

Memory Alignment and Padding

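A union takes the alignment of its most strictly aligned member, and its size is rounded up to a multiple of that alignment. Unlike a structure, it needs no padding between members because every member starts at offset 0: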

#include <stdio.h>
#include <stddef.h>    // offsetof
#include <stdalign.h>  // alignof

typedef struct {
    char c;
    int i;
    double d;
} MixedStruct;

typedef union {
    char c;
    int i;
    double d;
} MixedUnion;

int main(void) {
    printf("Structure alignment:\n");
    printf("  Size: %zu\n", sizeof(MixedStruct));
    printf("  Alignment: %zu\n", alignof(MixedStruct));
    printf("  Offset of c: %zu\n", offsetof(MixedStruct, c));
    printf("  Offset of i: %zu\n", offsetof(MixedStruct, i));
    printf("  Offset of d: %zu\n", offsetof(MixedStruct, d));

    printf("\nUnion alignment:\n");
    printf("  Size: %zu\n", sizeof(MixedUnion));
    printf("  Alignment: %zu\n", alignof(MixedUnion));
    printf("  All members share offset 0\n");
    return 0;
}

Practical Use Cases

1. Serialization/Deserialization

#include <stdio.h>
#include <stdint.h>
#include <string.h>
typedef union {
uint8_t bytes[8];
uint64_t value;
struct {
uint32_t low;
uint32_t high;
} parts;
} SerializedData;
void serialize_to_bytes(uint64_t value, uint8_t buffer[8]) {
SerializedData data;
data.value = value;
memcpy(buffer, data.bytes, 8);
}
uint64_t deserialize_from_bytes(uint8_t buffer[8]) {
SerializedData data;
memcpy(data.bytes, buffer, 8);
return data.value;
}
int main() {
uint64_t original = 0x123456789ABCDEF0;
uint8_t buffer[8];
serialize_to_bytes(original, buffer);
printf("Serialized bytes: ");
for (int i = 0; i < 8; i++) {
printf("%02X ", buffer[i]);
}
printf("\n");
uint64_t recovered = deserialize_from_bytes(buffer);
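// uint64_t is not always unsigned long, so cast to a type with a known format
printf("Recovered: 0x%016llX\n", (unsigned long long)recovered);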
return 0;
}

2. Command Processing System

#include <stdio.h>
#include <string.h>
typedef enum {
CMD_NOP,
CMD_MOVE,
CMD_DRAW,
CMD_COLOR
} CommandType;
typedef struct {
CommandType type;
union {
struct {
int x;
int y;
} move;
struct {
int x1, y1;
int x2, y2;
} draw;
struct {
int red;
int green;
int blue;
} color;
};
} Command;
void execute_command(Command cmd) {
switch (cmd.type) {
case CMD_NOP:
printf("No operation\n");
break;
case CMD_MOVE:
printf("Move to (%d, %d)\n", cmd.move.x, cmd.move.y);
break;
case CMD_DRAW:
printf("Draw line from (%d,%d) to (%d,%d)\n",
cmd.draw.x1, cmd.draw.y1,
cmd.draw.x2, cmd.draw.y2);
break;
case CMD_COLOR:
printf("Set color to RGB(%d,%d,%d)\n",
cmd.color.red, cmd.color.green, cmd.color.blue);
break;
}
}
int main() {
Command commands[4];
commands[0].type = CMD_MOVE;
commands[0].move.x = 100;
commands[0].move.y = 200;
commands[1].type = CMD_COLOR;
commands[1].color.red = 255;
commands[1].color.green = 128;
commands[1].color.blue = 0;
commands[2].type = CMD_DRAW;
commands[2].draw.x1 = 100;
commands[2].draw.y1 = 200;
commands[2].draw.x2 = 300;
commands[2].draw.y2 = 400;
commands[3].type = CMD_NOP;
for (int i = 0; i < 4; i++) {
printf("Command %d: ", i + 1);
execute_command(commands[i]);
}
return 0;
}

Common Pitfalls

1. Reading the Wrong Member

#include <stdio.h>
union BadExample {
int i;
float f;
};
int main() {
union BadExample u;
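u.f = 3.14f;
// CAUTION: reading a member other than the one last written.
// In C this reinterprets the stored bytes; in C++ it is undefined behavior.
printf("As float: %f\n", u.f);      // OK - last written
printf("As int: %d\n", u.i);        // Not 3: the float's bytes reinterpreted as an int
printf("As int (hex): 0x%08X\n", (unsigned)u.i); // Shows the IEEE 754 representation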
return 0;
}

2. Forgetting About Type Tracking

#include <stdio.h>
union Data {
int i;
float f;
char str[20];
};
// WRONG - no type tracking
void printData(union Data *d) {
// How do we know what type to print?
// printf("%d\n", d->i);  // Might be wrong!
}
// CORRECT - use tagged union pattern
typedef struct {
int type;  // 0=int, 1=float, 2=string
union Data data;
} SafeData;

3. Platform Dependencies

#include <stdio.h>
#include <stdint.h>
union PlatformDependent {
uint32_t u32;
struct {
uint16_t low;
uint16_t high;
} u16;
};
int main() {
union PlatformDependent pd;
pd.u32 = 0x12345678;
// This depends on endianness!
printf("On this platform: low=0x%04X, high=0x%04X\n",
pd.u16.low, pd.u16.high);
return 0;
}

Best Practices

  1. Use tagged unions - Always track what type is currently stored
  2. Initialize before use - Set at least one member before reading
  3. Document the active member - Make it clear which member is valid
  4. Consider alignment - Unions align to the strictest member
  5. Be careful with strings - Ensure strings fit in the union
  6. Use for memory saving - Only when members are mutually exclusive
  7. Consider portability - Endianness and padding can affect behavior

Union vs Structure Decision Guide

Use Union When                   | Use Structure When
Members are mutually exclusive   | Members can be used simultaneously
Memory is critical               | Memory is not a constraint
Implementing variant types       | Grouping related data
Hardware registers               | Database records
Protocol packets                 | Complex objects
Type punning (carefully)         | Normal data aggregation

Common Mistakes Checklist

  • [ ] Reading a member that wasn't last written
  • [ ] Forgetting to track the active type
  • [ ] Assuming size without checking
  • [ ] Platform-dependent behavior (endianness)
  • [ ] String overflow in union members
  • [ ] Alignment assumptions
  • [ ] Type punning through pointer casts instead of union members (strict aliasing violation); punning through a union is valid in C but undefined in C++

Conclusion

Unions are a powerful feature in C that provide a way to store different data types in the same memory location. They're essential for:

  • Memory-constrained systems (embedded, IoT)
  • Variant data types (flexible data structures)
  • Protocol implementations (network packets)
  • Hardware interfaces (register mapping)
  • Type punning (reinterpreting binary data)

Key principles to remember:

  • All members share the same memory
  • Size is at least that of the largest member, plus any padding needed for alignment
  • Only one member can be active at a time
  • Always track the active member (tagged union pattern)
  • Be aware of platform dependencies

While unions can be tricky to use correctly, they're invaluable in systems programming where memory efficiency and flexible data representation are critical. Mastery of unions, combined with careful type tracking, allows you to write more efficient and elegant C code.
