ScholarQuill University Notes
    Computer Organization and Assembly Language
    COMP2118
    Topic 24 of 35

    Heterogeneous Data Structures

    Heterogeneous Data Structures in Computer Organization and Assembly Language

    In Computer Organization and Assembly Language, a heterogeneous data structure refers to a data structure that can store elements of different data types. Unlike arrays, where all elements are of the same type (homogeneous), heterogeneous data structures can hold items of varying types and sizes, allowing greater flexibility for organizing complex data.

    Common examples of heterogeneous data structures include records (often called structures in C/C++) and unions. These structures are often used to model real-world objects and handle data of mixed types, such as integers, floating-point numbers, strings, and pointers, within a single entity.

    1. Records (Structures)

    A record (or structure) is a collection of named elements, called fields, each of which may have a different data type. The fields are stored one after another in memory, so each field sits at a fixed offset from the start of the record.

    Example of a Record (Structure) in Assembly:

    Consider a simple record that represents a person, containing an integer for age, a floating-point number for height, and a string for the name.

    section .data
        person:
            db 0              ; Age (1 byte)
            dd 0.0            ; Height (4 bytes for float)
            times 20 db 0     ; Name (20 bytes reserved for the string)
    

    Here:

    • The person record contains three fields: age, height, and name.
    • The db (define byte) is used for age and the name string, while dd (define double word) is used for height as a floating-point number (assuming 32-bit float).

    In this case:

    • Age is represented by a single byte.
    • Height is represented as a floating-point number, typically 4 bytes.
    • Name is represented by a string (using db to store each character).
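
    The same layout can be sketched in C as a raw byte buffer with named offsets. This is only an illustration: the PERSON_* constants and the two helper functions are hypothetical names, and the sketch assumes a 4-byte float and a 20-byte name field.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical offsets mirroring the assembly layout: no padding between fields. */
enum {
    PERSON_AGE_OFF    = 0,   /* 1 byte   */
    PERSON_HEIGHT_OFF = 1,   /* 4 bytes  */
    PERSON_NAME_OFF   = 5,   /* 20 bytes */
    PERSON_SIZE       = 25   /* 1 + 4 + 20 */
};

/* Store a height in the raw record; memcpy avoids an unaligned float access. */
void person_set_height(uint8_t *rec, float h) {
    assert(rec != NULL);
    memcpy(rec + PERSON_HEIGHT_OFF, &h, sizeof h);
}

/* Read the height back from the raw record. */
float person_get_height(const uint8_t *rec) {
    float h;
    assert(rec != NULL);
    memcpy(&h, rec + PERSON_HEIGHT_OFF, sizeof h);
    return h;
}
```

    Each field's offset is the sum of the sizes of the fields before it, which is exactly the arithmetic an assembly programmer does by hand.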

    Accessing and Modifying Record Fields

    To access or modify the fields of a record, you need to calculate the offset of each field from the base address of the record.

    Example (Accessing and Modifying the "Age" field)
    section .data
        person:
            db 25             ; Age (1 byte)
            dd 5.9            ; Height (4 bytes for float)
            db 'John Doe', 0  ; Name (string terminated with 0)
    
    section .text
        global _start
    
    _start:
        ; Address of the "person" record is already available
        ; Modify the "age" field to 30
        MOV BYTE [person], 30   ; Set age to 30
    
        ; Access the "height" field
        MOV EAX, [person + 1]   ; Load height into EAX (skip the first byte for age)
        
        ; Exit program
        MOV EAX, 1              ; SYS_exit system call
        MOV EBX, 0              ; Exit status
        INT 0x80                ; Exit the program
    

    In this example:

    • We access the person record and modify its age field.
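    • The height field starts at offset 1, immediately after the 1-byte age field, so MOV EAX, [person + 1] loads its 4 bytes into EAX. This packed layout leaves the float unaligned, which x86 tolerates; a C compiler would normally insert padding so that the field starts at a multiple of 4.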
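
    For comparison, here is a minimal C sketch of the same record as a struct, where the compiler chooses the offsets. The struct and function names are hypothetical. On common x86 ABIs the height field typically lands at offset 4 rather than 1, but the exact values are implementation-defined.

```c
#include <stddef.h>

struct person_c {
    unsigned char age;        /* 1 byte */
    float         height;     /* typically 4 bytes, usually aligned to 4 */
    char          name[20];   /* 20 bytes */
};

/* Offset the compiler picked for the height field. */
size_t person_c_height_offset(void) {
    return offsetof(struct person_c, height);
}

/* Total size of the struct, including any padding. */
size_t person_c_size(void) {
    return sizeof(struct person_c);
}
```

    Any gap between the end of age and the start of height is padding, so the struct may be larger than the 25 bytes of the hand-packed assembly record.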

    2. Unions

    A union is a special data structure that allows different data types to occupy the same memory space. Unlike a record, which allocates space for each field separately, a union allows multiple fields to share the same memory location, so the size of the union is determined by the largest field.

    For example, a union could store either an integer, a float, or a string, but never all three at the same time.

    Example of a Union in Assembly:

    section .data
        myUnion:
            times 6 db 0      ; Shared storage, sized for the largest field (6 bytes)
        ; Every field starts at offset 0 of myUnion:
        ;   integer field: 4 bytes
        ;   float field:   4 bytes
        ;   string field:  6 bytes ('Hello' plus the terminating 0)
    

    In this case:

    • The myUnion structure has three fields that all start at the same address, so its size is that of the largest field. Here the largest is the string: 'Hello' takes 5 bytes plus 1 byte for the terminating 0, giving 6 bytes. The integer and float fields need only 4 bytes each.
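
    A C union makes the same sizing rule easy to check. This is a small sketch with hypothetical names; note that a C compiler may round the size up for alignment, so sizeof commonly reports 8 here rather than 6.

```c
#include <stddef.h>
#include <stdint.h>

union value {
    int32_t i;      /* 4 bytes */
    float   f;      /* typically 4 bytes */
    char    s[6];   /* "Hello" plus the terminating 0 */
};

/* Size of the union: at least as large as its largest member. */
size_t value_size(void) {
    return sizeof(union value);
}

/* Every member of a union starts at offset 0. */
size_t value_string_offset(void) {
    return offsetof(union value, s);
}
```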

    Accessing and Modifying Union Fields

    Since all fields in a union share the same memory location, you can access them in the same way, but you must be careful to use the correct data type for the operation.
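
    One common way to keep track of the correct data type is to store a tag next to the union. The sketch below is illustrative only; the type and function names are hypothetical and are not part of the original notes.

```c
#include <stdint.h>
#include <string.h>

enum value_kind { KIND_INT, KIND_FLOAT, KIND_STRING };

struct tagged_value {
    enum value_kind kind;   /* records which union member is currently valid */
    union {
        int32_t i;
        float   f;
        char    s[6];
    } as;
};

/* Build a value that holds an integer. */
struct tagged_value make_int(int32_t v) {
    struct tagged_value t;
    memset(&t, 0, sizeof t);
    t.kind = KIND_INT;
    t.as.i = v;
    return t;
}

/* Build a value that holds a short string (at most 5 characters are kept). */
struct tagged_value make_string(const char *text) {
    struct tagged_value t;
    memset(&t, 0, sizeof t);
    t.kind = KIND_STRING;
    strncpy(t.as.s, text, sizeof t.as.s - 1);   /* last byte stays 0 */
    return t;
}

/* Returns 1 and writes *out only when the value really holds an integer. */
int get_int(const struct tagged_value *t, int32_t *out) {
    if (t->kind != KIND_INT) {
        return 0;
    }
    *out = t->as.i;
    return 1;
}
```

    In assembly the same idea amounts to reserving one extra byte in front of the shared storage and checking it before choosing the operand size.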

    section .data
        myUnion:
            dd 42             ; Integer field initialized to 42
    
    section .text
        global _start
    
    _start:
        ; Access the integer field of the union
        MOV EAX, [myUnion]    ; Load the integer value (42) into EAX
        
        ; Modify the integer field
        MOV DWORD [myUnion], 100  ; Change the integer value to 100
        
        ; Read the same bytes through a pointer
        ; (no string was stored here; the union still holds the integer 100)
        MOV EBX, myUnion      ; Load the address of the union into EBX
        MOV ECX, [EBX]        ; Load the first 4 bytes at that address (100)
    
        ; Exit program
        MOV EAX, 1            ; SYS_exit system call
        MOV EBX, 0            ; Exit status
        INT 0x80              ; Exit the program
    

    In this example:

    • All fields start at the address myUnion, so MOV EAX, [myUnion] reads the first 4 bytes there. Whether those bytes mean an integer, a float, or characters depends entirely on how the program chooses to interpret them.
    • Writing the integer field overwrites those shared bytes, so any value previously stored through another field is lost.
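
    The overwriting effect can be seen in a short C sketch. The union and function names are hypothetical. Which of the first four bytes ends up holding 100 depends on the machine's byte order, so the sketch makes no assumption about it.

```c
#include <stdint.h>
#include <string.h>

union overlay {
    int32_t i;      /* occupies bytes 0..3 */
    char    s[6];   /* occupies bytes 0..5 */
};

/* Store "Hello", then overwrite the first 4 bytes by storing an integer. */
void overlay_demo(union overlay *u) {
    memset(u, 0, sizeof *u);
    memcpy(u->s, "Hello", 6);   /* bytes: 'H' 'e' 'l' 'l' 'o' 0 */
    u->i = 100;                 /* replaces bytes 0..3 only */
}
```

    After the call, bytes 4 and 5 still hold 'o' and the terminating 0, but the first four characters of the string are gone.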

    3. Pointers to Heterogeneous Data Structures

    Pointers are essential for working with heterogeneous data structures because they allow indirect access to fields of records and unions, and they can point to dynamic memory locations.

    Example (Using a Pointer to Access a Record)

    section .data
        person:
            db 30             ; Age (1 byte)
            dd 5.9            ; Height (4 bytes)
            db 'Alice', 0     ; Name (string)
    
        pointer_to_person dd person  ; Pointer to "person"
    
    section .text
        global _start
    
    _start:
        ; Access the "age" field via the pointer
        MOV EBX, [pointer_to_person]    ; Load the pointer to the record into EBX
        MOV AL, [EBX]                   ; Load the age into AL (1 byte)
    
        ; Modify the "height" field via the pointer
        MOV EBX, [pointer_to_person]    ; Reload the pointer to the record
        MOV DWORD [EBX + 1], 0x40C66666 ; Set height to 6.2 (IEEE 754 single-precision bits; height is at offset 1)
    
        ; Exit program
        MOV EAX, 1                      ; SYS_exit system call
        MOV EBX, 0                      ; Exit status
        INT 0x80                        ; Exit the program
    

    In this example:

    • The pointer pointer_to_person holds the memory address of the person record.
    • We can access or modify the fields in the record using the pointer, with appropriate offsets based on the structure layout.
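
    In C, the same pointer-plus-offset access is what the -> operator does behind the scenes. The following sketch uses hypothetical names and a compiler-laid-out struct, so the height offset comes from offsetof rather than the hand-computed 1 used in the assembly listing.

```c
#include <stddef.h>

struct person_rec {
    unsigned char age;
    float         height;
    char          name[20];
};

/* Field access through a pointer: the compiler adds the field offset for us. */
void person_rec_set_height(struct person_rec *p, float h) {
    p->height = h;
}

/* The same access written as explicit base-plus-offset arithmetic. */
float person_rec_height_by_offset(const struct person_rec *p) {
    const unsigned char *base = (const unsigned char *)p;
    return *(const float *)(base + offsetof(struct person_rec, height));
}
```

    Both functions touch the same bytes; the second simply spells out the address calculation that [EBX + offset] performs in assembly.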

    4. Conclusion

    Heterogeneous data structures (like records/structures and unions) provide a way to store and manage data of different types within a single data structure.

    • Records allow for different types of fields and are typically used when you want to group related but different types of data.
    • Unions allow multiple fields to share the same memory location and are useful when you need to store one of several data types, but never all at once.
    • Both structures are widely used in assembly and low-level programming, especially when dealing with complex data models that require varied data types.