Unions
A union in Cyrus is a low-level composite type in which all member fields share the same base memory address. Unlike a struct, which places each field at its own offset, a union lets you interpret a single block of raw memory as several different types.
The size of a union is determined by the size of its largest member. Writing to any field overwrites the bytes it shares with every other field, so the fields alias one another in memory.
Unions Are Not Memory-Safe
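Unions in Cyrus are not memory-safe. As in C, the compiler does not track which field was written last. Reading any other field is unchecked and should be treated as undefined behavior unless you are deliberately reinterpreting the stored bytes (see Practical Use Cases). Use unions only when you explicitly need memory-efficient, low-level data representations.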
Safe Alternative
If you want a memory-safe way to store different types of values, use an enum, which enforces at compile time which variant is active.
Defining a Union
union DataUnion {
    a: int;
    b: float64;
}
fn main() {
    var raw: DataUnion;
    raw.b = 3.14;
}
Using a Union
You can create a union instance and assign fields directly:
fn main() {
    var raw: DataUnion;
    raw.a = 42;   // set the integer field
    raw.b = 3.14; // overwrites the same memory with a float
}
After raw.b = 3.14, the value of raw.a is no longer valid.
Union Initialization
Unions can be initialized using a Union Initializer, specifying which field to set at creation:
var un: DataUnion = DataUnion { a: 10 };
Rules:
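- Exactly one field may be initialized; naming more than one is a compile-time error.
- The initializer stores that field's value in the union's shared memory, so that field is the one that is valid to read afterward.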
Practical Use Cases
Unions are low-level tools, mainly used in systems programming:
- Type punning: reinterpret the same memory as different types.
- Interfacing with C libraries: many C APIs expose unions in their structs.
- Memory efficiency: when you know only one of several large fields will be used at once.
Example: Interpreting the same 32-bit data as either an integer or raw bytes.
import std::libc{printf};
union IntBytes {
    value: int;
    bytes: uint8[4];
}
fn main() {
    var data = IntBytes { value: 0x12345678 };
    printf("%x %x %x %x\n", data.bytes[0], data.bytes[1], data.bytes[2], data.bytes[3]);
}
Output (on little-endian systems):
78 56 34 12
Unnamed Union Initialization
As with structs, you can write an unnamed union literal (union { ... }) to initialize a value whose named union type is already known from its declaration. This is particularly useful for temporary low-level buffers.
import std::libc{printf};
union Payload {
    i: int64;
    s: char*;
}
pub fn main() {
    // The declared type Payload gives the unnamed literal its layout
    const layout: Payload = union { s: "Cyrus!" };
    printf("%s\n", layout.s);
}
Only one field can be initialized in a union value. Providing multiple fields will result in a compile-time error.
Union Pointer Aliasing
Since every field in a union shares the same base address, taking a reference to a specific field provides a typed pointer to the union's shared memory block. This allows for pointer aliasing, where you can manipulate the union's raw data through pointers of different types.
This is a powerful feature for systems programming, enabling direct memory manipulation without explicit casting at every step.
import std::libc{printf};
union DataStore {
    p: char*;
    i: int64;
}
pub fn main() {
    // Initialize the union via the pointer field
    var inst = DataStore { p: null };
    // Obtain a pointer to the integer field
    // Both &inst.p and &inst.i point to the same memory address
    var iptr: int64* = &inst.i;
    // Indirectly modify the union memory through the aliased pointer
    *iptr = 2500;
    // The shared memory now holds the bit pattern of the integer 2500
    // %lld matches the 64-bit width of int64
    printf("%lld\n", inst.i);
}

