
    Machine-Level Representations of Floating-Point Programs

    Floating-point representation encodes real numbers in binary using a fixed number of bits. Programs use it for calculations involving non-integer values, such as decimals and fractions. At the machine level, floating-point operations are executed by dedicated hardware, traditionally called the Floating-Point Unit (FPU). On x86-64 this means SSE and AVX instructions that operate on the XMM and YMM registers.


    1. Floating-Point Number Representation

    A floating-point number is generally represented using three components:

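    1. Sign (S): Determines if the number is positive (0) or negative (1).
    2. Exponent (E): Stored with a fixed offset (the bias); the scaling factor is 2 raised to the power (E − Bias).
    3. Mantissa (M) (or Fraction): Represents the significant digits of the number. For normalized values, M = 1 + fraction, where the leading 1 is implied rather than stored.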

    The general formula is:

    $$\text{Value} = (-1)^{S} \times M \times 2^{(E - \text{Bias})}$$

    IEEE 754 Standard

    The IEEE 754 standard defines how floating-point numbers are represented and processed. Two common formats are:

    • Single Precision (32-bit):
      • 1 bit for the sign
      • 8 bits for the exponent (bias 127)
      • 23 bits for the mantissa (fraction field)
    • Double Precision (64-bit):
      • 1 bit for the sign
      • 11 bits for the exponent (bias 1023)
      • 52 bits for the mantissa (fraction field)
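
The single-precision layout above can be checked directly by pulling the three fields out of a float's bit pattern. The following is a minimal sketch, assuming `float` is a 32-bit IEEE 754 value on the target platform; the names `float_fields`, `decode_float`, and `rebuild_normalized` are hypothetical helpers introduced only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Bit fields of a single-precision value (hypothetical helper type). */
typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t fraction;  /* 23 bits, the stored part of the mantissa */
} float_fields;

/* Split a float into sign, exponent, and fraction fields.
   Assumes float is 32 bits wide and uses the IEEE 754 encoding. */
static float_fields decode_float(float value) {
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);  /* copy the raw bit pattern */

    float_fields f;
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}

/* Rebuild a normalized value as (-1)^S * M * 2^(E - 127),
   with M = 1 + fraction / 2^23. Zero, subnormals, infinity,
   and NaN are not handled in this sketch. */
static double rebuild_normalized(float_fields f) {
    double m = 1.0 + (double)f.fraction / 8388608.0;  /* 8388608 = 2^23 */
    int e = (int)f.exponent - 127;
    int steps = (e < 0) ? -e : e;

    double scale = 1.0;
    for (int i = 0; i < steps; i++) {
        scale *= 2.0;                                 /* scale = 2^|e| */
    }

    double magnitude = (e < 0) ? m / scale : m * scale;
    return f.sign ? -magnitude : magnitude;
}
```

Calling `decode_float(-6.5f)` yields sign 1, exponent 129, and fraction 0x500000, because −6.5 = −1.625 × 2^2 and 2 + 127 = 129. Passing those fields to `rebuild_normalized` returns −6.5 again.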

    2. Floating-Point Operations

    Floating-point programs involve various operations such as addition, subtraction, multiplication, division, and square roots. At the machine level:

    • Floating-point operations are implemented using dedicated FPU instructions.
    • These instructions follow specific rules to handle rounding, overflow, underflow, and precision.

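    Example Instructions (x86-64 SSE/SSE2, scalar; the ss suffix means scalar single precision and sd means scalar double precision):

    • Addition: addss (single precision), addsd (double precision)
    • Subtraction: subss, subsd
    • Multiplication: mulss, mulsd
    • Division: divss, divsd
    • Square Root: sqrtss, sqrtsd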
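
To see where these instructions come from, it helps to write one tiny C function per operation and inspect what the compiler emits. The sketch below assumes GCC or Clang targeting x86-64. The function names are illustrative, and the instruction named in each comment is what such a compiler typically generates, not a guarantee.

```c
/* Single precision: the compiler typically emits the "ss" forms. */
float add_f(float a, float b) { return a + b; }  /* addss */
float sub_f(float a, float b) { return a - b; }  /* subss */
float mul_f(float a, float b) { return a * b; }  /* mulss */
float div_f(float a, float b) { return a / b; }  /* divss */

/* Double precision: the same operations use the "sd" forms. */
double add_d(double a, double b) { return a + b; }  /* addsd */
double sub_d(double a, double b) { return a - b; }  /* subsd */
double mul_d(double a, double b) { return a * b; }  /* mulsd */
double div_d(double a, double b) { return a / b; }  /* divsd */
```

With GCC, compiling a file containing these functions using `gcc -O1 -S -masm=intel` writes the generated assembly to a `.s` file, where each function body should reduce to a single arithmetic instruction followed by `ret`.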

    3. Floating-Point Arithmetic Challenges

    1. Precision Errors:

      • Floating-point numbers have limited precision because only a finite number of bits are available for the mantissa.
      • Example: In double precision, 0.1 + 0.2 does not exactly equal 0.3, because 0.1 and 0.2 have no exact binary representation and are rounded when stored.
    2. Rounding Modes:

      • IEEE 754 supports rounding modes like round-to-nearest (the default, with ties going to the even value), round-toward-zero, round-up (toward +∞), and round-down (toward −∞).
    3. Overflow and Underflow:

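      • Overflow: When a result's magnitude is too large to represent; under the default rounding mode it becomes ±Infinity.
      • Underflow: When a nonzero result is too close to zero to represent at full precision; it becomes a subnormal number or zero.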
    4. Special Values:

      • NaN (Not a Number): Result of undefined operations like $\frac{0}{0}$.
      • Infinity: Represented by maximum exponent and zero mantissa.
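
The challenges listed above can each be reproduced in a few lines. This is a minimal sketch, assuming IEEE 754 arithmetic with the default round-to-nearest mode and no fast-math compiler flags. All function names are hypothetical, and `volatile` is used only to keep the compiler from folding the arithmetic away at compile time.

```c
#include <float.h>

/* Precision error: returns 1 if 0.1 + 0.2 compares equal to 0.3. */
int sum_equals_point_three(void) {
    volatile double a = 0.1, b = 0.2;
    volatile double sum = a + b;
    return sum == 0.3;
}

/* Overflow: doubling the largest finite float gives +Infinity. */
float overflow_example(void) {
    volatile float big = FLT_MAX;
    volatile float result = big * 2.0f;
    return result;
}

/* Underflow: squaring the smallest normalized float gives a result
   far below the smallest subnormal, so it rounds to zero. */
float underflow_example(void) {
    volatile float tiny = FLT_MIN;
    volatile float result = tiny * tiny;
    return result;
}

/* Special value: 0/0 is undefined and produces NaN. */
float nan_example(void) {
    volatile float zero = 0.0f;
    volatile float result = zero / zero;
    return result;
}

/* NaN is the only value that compares unequal to itself. */
int is_nan(float x) { return x != x; }

/* Anything beyond the largest finite magnitude is an infinity. */
int is_infinite(float x) { return x > FLT_MAX || x < -FLT_MAX; }
```

Here `sum_equals_point_three()` returns 0, `is_infinite(overflow_example())` and `is_nan(nan_example())` return 1, and `underflow_example()` returns exactly 0.0f.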

    4. Floating-Point Representation in Programs

    When writing programs in high-level languages, the compiler translates floating-point calculations into machine-level instructions. For example:

    float x = 1.5f, y = 2.5f, z;   /* the f suffix makes the constants float rather than double */
    z = x + y;                     /* z is exactly 4.0f */
    

    The compiler generates machine code that performs the addition using floating-point instructions. On x86-64 these are scalar SSE instructions operating on XMM registers.

    Assembly Representation:

    movss xmm0, DWORD PTR [x]    ; Load x into register
    movss xmm1, DWORD PTR [y]    ; Load y into register
    addss xmm0, xmm1             ; Add x and y, store result in xmm0
    movss DWORD PTR [z], xmm0    ; Store result into z
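
The same addition can be written with the instruction made explicit. The following is a sketch, assuming GCC or Clang extended inline assembly on x86-64. The function name `add_floats` is illustrative, and on any other target the code falls back to plain C so it still compiles. Inline assembly uses AT&T operand order by default (source first, destination second), which is the reverse of the Intel-style listing above.

```c
/* Add two floats with an explicit addss where available. */
float add_floats(float x, float y) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    __asm__("addss %1, %0"  /* x = x + y, both held in XMM registers */
            : "+x"(x)       /* %0: read-write operand in an SSE register */
            : "x"(y));      /* %1: input operand in an SSE register */
    return x;
#else
    return x + y;           /* portable fallback for other targets */
#endif
}
```

Calling `add_floats(1.5f, 2.5f)` returns 4.0f, matching the C statement `z = x + y`. In ordinary code the plain C expression is preferable, since the compiler selects and schedules the instruction itself.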
    

    5. Performance and Optimization

    • Hardware Support: Modern CPUs have highly optimized FPUs for efficient floating-point operations.
    • Vectorization: SIMD (Single Instruction, Multiple Data) instructions like AVX enable parallel floating-point operations.
    • Compiler Optimizations: Compilers rearrange and optimize floating-point instructions to improve performance, but this can sometimes alter precision.
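
Reordering can alter precision because floating-point addition is not associative. The sketch below shows two groupings of the same three operands giving different results, followed by an element-wise loop of the shape that compilers commonly auto-vectorize with SIMD instructions. Whether vectorization actually happens depends on the compiler and flags, and all names here are illustrative.

```c
#include <stddef.h>

/* Computes (a + b) + c. */
float sum_left_to_right(float a, float b, float c) {
    volatile float partial = a + b;  /* volatile pins the evaluation order */
    return partial + c;
}

/* Computes a + (b + c): same operands, different grouping. */
float sum_right_to_left(float a, float b, float c) {
    volatile float partial = b + c;
    return a + partial;
}

/* Element-wise addition: each iteration is independent of the others,
   so several elements can be processed by one SIMD instruction. */
void add_arrays(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

With a = 1e20f, b = -1e20f, and c = 1.0f, the left-to-right sum is 1.0f while the right-to-left sum is 0.0f, because 1.0f is too small to change -1e20f when the two are added first. This is why compilers keep the source order of floating-point operations unless flags such as GCC's `-ffast-math` permit reassociation.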

    6. Summary Table: Machine-Level Floating-Point Programs

    Concept                   | Details
    Floating-Point Components | Sign, Exponent, Mantissa (IEEE 754 standard).
    Precision Formats         | Single Precision (32-bit), Double Precision (64-bit).
    FPU Instructions          | addss, addsd, mulss, mulsd, sqrtss, sqrtsd.
    Challenges                | Precision errors, rounding modes, overflow, underflow, special values (NaN, Infinity).
    High-Level to Machine     | Compilers translate floating-point operations into FPU instructions.
    Optimization              | SIMD, AVX instructions, and compiler techniques for better performance.