In Computer Organization and Assembly Language, a heterogeneous data structure refers to a data structure that can store elements of different data types. Unlike arrays, where all elements are of the same type (homogeneous), heterogeneous data structures can hold items of varying types and sizes, allowing greater flexibility for organizing complex data.
Common examples of heterogeneous data structures include records (often called structures in C/C++) and unions. These structures are often used to model real-world objects and handle data of mixed types, such as integers, floating-point numbers, strings, and pointers, within a single entity.
A record (or structure) is a collection of elements, called fields. Each field has a unique name and may have a different data type from the others.
Consider a simple record that represents a person, containing an integer for age, a floating-point number for height, and a string for the name.
section .data
person:
db 0 ; Age (1 byte)
dd 0.0 ; Height (4 bytes for float)
times 20 db 0 ; Name (20 bytes reserved for the string)
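Here, the person record contains three fields: age, height, and name. db (define byte) is used for age and the name string, while dd (define double word) is used for height as a floating-point number (assuming a 32-bit float).
In this case, age occupies 1 byte, height occupies 4 bytes, and name is stored one character per byte (using db to store each character).
To access or modify the fields of a record, you need to calculate the offset of each field from the base address of the record. The offset of a field is the total size of the fields that precede it: age is at offset 0, height at offset 1, and name at offset 5.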
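The offset arithmetic described above can also be checked in C, which the text mentions as a language with structures. The following is a minimal sketch, not part of the original material: the struct name `person`, the helper functions, and the 20-byte name buffer are illustrative assumptions, and the `packed` attribute is a GCC/Clang extension used here so that the C layout has no padding, like the byte-by-byte assembly definition.

```c
#include <assert.h>  /* static_assert */
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>  /* uint8_t */

/* Hypothetical C mirror of the "person" record. Without the packed
   attribute (a GCC/Clang extension), a C compiler would normally insert
   padding after age so that height is aligned. */
struct __attribute__((packed)) person {
    uint8_t age;      /* 1 byte */
    float   height;   /* 4 bytes, assuming a 32-bit float */
    char    name[20]; /* 20 bytes reserved for the string (assumed size) */
};

static_assert(sizeof(float) == 4, "this sketch assumes a 32-bit float");

/* Offset of each field from the base address of the record. */
size_t person_age_offset(void)    { return offsetof(struct person, age); }
size_t person_height_offset(void) { return offsetof(struct person, height); }
size_t person_name_offset(void)   { return offsetof(struct person, name); }

/* Total size of the packed record: the sum of the field sizes. */
size_t person_size(void) { return sizeof(struct person); }
```

Under these assumptions the helpers report offsets 0, 1, and 5 and a total size of 25 bytes, which is the same "size of everything before the field" rule that the assembly code applies by hand.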
section .data
person:
db 25 ; Age (1 byte)
dd 5.9 ; Height (4 bytes for float)
db 'John Doe', 0 ; Name (string terminated with 0)
section .text
global _start
_start:
; Address of the "person" record is already available
; Modify the "age" field to 30
MOV BYTE [person], 30 ; Set age to 30
; Access the "height" field
MOV EAX, [person + 1] ; Load height into EAX (skip the first byte for age)
; Exit program
MOV EAX, 1 ; SYS_exit system call
MOV EBX, 0 ; Exit status
INT 0x80 ; Exit the program
In this example, we access the person record and modify its age field by writing to the base address, since age is at offset 0. The height field is read by adding a 1 byte offset and using MOV EAX, [person + 1]. EAX then holds the raw 32-bit pattern of the float, not an integer value.
A union is a special data structure that allows different data types to occupy the same memory space. Unlike a record, which allocates space for each field separately, a union allows multiple fields to share the same memory location, so the size of the union is determined by the largest field.
For example, a union could store either an integer, a float, or a string, but never all three at the same time.
section .data
myUnion:
times 6 db 0 ; Shared storage, sized for the largest field (6 bytes)
; The same bytes can be interpreted as one of:
; an integer (4 bytes), a float (4 bytes),
; or the string 'Hello' followed by 0 (6 bytes)
In this case, the myUnion structure has three fields, but its total size is determined by the largest field. Here the string field is the largest: 'Hello' plus its terminating 0 takes 6 bytes (db stores one byte per character), so the union occupies 6 bytes rather than the sum of all three field sizes.
Since all fields in a union share the same memory location, you can access them in the same way, but you must be careful to use the correct data type for the operation.
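The shared-storage behaviour described above can be sketched in C, where unions are a built-in type. This is an illustrative example rather than part of the original material: the type name `my_union`, its member names, and the helper functions are assumptions made for the sketch.

```c
#include <assert.h>  /* static_assert */
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>  /* int32_t */
#include <string.h>  /* memcpy, memcmp */

/* Hypothetical C mirror of myUnion: every member starts at offset 0. */
union my_union {
    int32_t as_int;     /* 4 bytes */
    float   as_float;   /* 4 bytes, assuming a 32-bit float */
    char    as_text[6]; /* 'Hello' plus the terminating 0 */
};

static_assert(sizeof(float) == 4, "this sketch assumes a 32-bit float");

/* Size of the union: at least the size of its largest member. */
size_t union_size(void) { return sizeof(union my_union); }

/* Returns 1 if all members begin at the same address (offset 0). */
int members_share_offset_zero(void)
{
    return offsetof(union my_union, as_int) == 0
        && offsetof(union my_union, as_float) == 0
        && offsetof(union my_union, as_text) == 0;
}

/* Store a string, then store an integer into the same bytes.
   Returns 1 if the integer store overwrote the start of the string. */
int int_store_overwrites_text(void)
{
    union my_union u;
    memcpy(u.as_text, "Hello", 6); /* the union currently holds a string */
    u.as_int = 100;                /* the first 4 bytes now encode 100 */
    return memcmp(u.as_text, "Hell", 4) != 0;
}
```

One difference from the assembly version is worth noting: a C compiler may round the union's size up to satisfy alignment, so `union_size()` is at least 6 here but is often 8, whereas the hand-written assembly layout reserves exactly the bytes you ask for.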
section .data
myUnion:
dd 42 ; Integer field initialized to 42
section .text
global _start
_start:
; Access the integer field of the union
MOV EAX, [myUnion] ; Load the integer value (42) into EAX
; Modify the integer field
MOV DWORD [myUnion], 100 ; Change the integer value to 100
; Reinterpret the same bytes as the string field (after modifying the integer)
; The memory now holds the bytes of the integer 100, not 'Hello'
MOV EBX, myUnion ; Load the address of the union into EBX
MOV ECX, [EBX] ; Load the first 4 bytes at that address (the same bytes as the integer)
; Exit program
MOV EAX, 1 ; SYS_exit system call
MOV EBX, 0 ; Exit status
INT 0x80 ; Exit the program
In this example, accessing the myUnion value retrieves the integer value (or any other field, depending on the program's context). Modifying myUnion also changes the shared memory location, which affects every other field in the union.
Pointers are essential for working with heterogeneous data structures because they allow indirect access to fields of records and unions, and they can point to dynamic memory locations.
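Indirect access through a pointer can be sketched in C by treating the record as raw bytes and adding each field's offset to the base pointer, much as assembly code adds an offset to a register. This is a hedged illustration only: the offset constants and helper names below are assumptions for a record laid out as a 1-byte age, a 4-byte float height, and a 6-byte name.

```c
#include <assert.h>  /* static_assert */
#include <stdint.h>  /* uint8_t */
#include <string.h>  /* memcpy */

/* Assumed layout: age (1 byte), height (4 bytes), name (6 bytes). */
enum { AGE_OFFSET = 0, HEIGHT_OFFSET = 1, NAME_OFFSET = 5, PERSON_SIZE = 11 };

static_assert(sizeof(float) == 4, "this sketch assumes a 32-bit float");

/* Read the age through the pointer, like loading a byte from [base]. */
uint8_t read_age(const unsigned char *record)
{
    return record[AGE_OFFSET];
}

/* Write the height through the pointer, like storing 4 bytes at [base + 1].
   memcpy is used because the field is not aligned for a float. */
void write_height(unsigned char *record, float height)
{
    memcpy(record + HEIGHT_OFFSET, &height, sizeof height);
}

/* Read the height back from the same offset. */
float read_height(const unsigned char *record)
{
    float height;
    memcpy(&height, record + HEIGHT_OFFSET, sizeof height);
    return height;
}
```

A caller would pass the address of an 11-byte buffer, for example `unsigned char rec[PERSON_SIZE] = {30};` followed by `write_height(rec, 6.25f);`. Copying with `memcpy` rather than casting the pointer keeps the unaligned access well defined in C.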
section .data
person:
db 30 ; Age (1 byte)
dd 5.9 ; Height (4 bytes)
db 'Alice', 0 ; Name (string)
pointer_to_person dd person ; Pointer to "person"
section .text
global _start
_start:
; Access the "age" field via the pointer
MOV EBX, [pointer_to_person] ; Load the pointer to the record into EBX
MOV AL, [EBX] ; Load the age into AL (1 byte)
; Modify the "height" field via the pointer
MOV EBX, [pointer_to_person] ; Reload the pointer to the record
MOV DWORD [EBX + 1], __float32__(6.2) ; Modify the height (1 byte past age; __float32__ encodes 6.2 as a 32-bit value)
; Exit program
MOV EAX, 1 ; SYS_exit system call
MOV EBX, 0 ; Exit status
INT 0x80 ; Exit the program
In this example, pointer_to_person holds the memory address of the person record. Loading that address into EBX lets the program reach each field by adding the field's offset to it.
Heterogeneous data structures (like records/structures and unions) provide a way to store and manage data of different types within a single data structure.