    Probability and Statistics (MS-251), Topic 34 of 36

    Multiple Linear Regression and Certain Nonlinear Regression Models


    1. Introduction to Multiple Linear Regression

    Multiple Linear Regression (MLR) is an extension of simple linear regression that allows us to model the relationship between a dependent variable (also called the response variable) and two or more independent variables (predictors or explanatory variables). It assumes that the relationship between the dependent variable and each independent variable is linear, but there can be multiple predictors involved.

    The primary goal of multiple linear regression is to find the best-fitting linear relationship between the dependent variable and the independent variables. It is widely used in many fields, such as economics, engineering, and social sciences.

    Multiple Linear Regression Model

    The general form of the multiple linear regression model is:

    $$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \epsilon$$

    Where:

    • $y$ is the dependent variable (response variable),
    • $x_1, x_2, \dots, x_k$ are the independent variables (predictors),
    • $\beta_0$ is the intercept (constant term),
    • $\beta_1, \beta_2, \dots, \beta_k$ are the regression coefficients for each predictor,
    • $\epsilon$ is the random error term, which accounts for the variation in $y$ that cannot be explained by the predictors. Its observed counterpart after fitting, $y_i - \hat{y}_i$, is called the residual.

    Key Assumptions of Multiple Linear Regression:

    For the multiple linear regression model to provide reliable estimates, several assumptions must be met:

    1. Linearity: The mean of the dependent variable is a linear function of the regression coefficients and the independent variables.
    2. Independence: The errors are independent of each other.
    3. Homoscedasticity: The variance of the errors is constant across all levels of the independent variables.
    4. Normality of Residuals: The errors are normally distributed (important for hypothesis tests and confidence intervals).
    5. No Multicollinearity: The independent variables should not be highly correlated with each other.

    Fitting a Multiple Linear Regression Model

    The coefficients $\beta_0, \beta_1, \dots, \beta_k$ are estimated using the method of Ordinary Least Squares (OLS), which minimizes the sum of squared residuals. Mathematically, the goal is to find the coefficients that minimize:

    $$\text{Sum of squared residuals} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

    Where $\hat{y}_i$ represents the predicted value of $y_i$.
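    The least-squares fit described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a full regression routine: the helper names `fit_ols` and `sum_squared_residuals` and the small data set are made up for this example, and the data are noise-free so that the recovered coefficients can be checked by eye.

    ```python
    import numpy as np

    def fit_ols(X, y):
        """Return OLS estimates [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk."""
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # Prepend a column of ones so the first coefficient is the intercept.
        design = np.column_stack([np.ones(len(y)), X])
        # lstsq minimizes ||design @ beta - y||^2, i.e. the sum of squared residuals.
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        return beta

    def sum_squared_residuals(X, y, beta):
        """Sum of (y_i - y_hat_i)^2 for the fitted coefficients beta."""
        design = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
        residuals = np.asarray(y, dtype=float) - design @ beta
        return float(residuals @ residuals)

    # Illustrative data generated exactly from y = 1 + 2*x1 - 3*x2 (no noise).
    X_demo = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 5.0], [4.0, 2.0]])
    y_demo = 1.0 + 2.0 * X_demo[:, 0] - 3.0 * X_demo[:, 1]

    beta_demo = fit_ols(X_demo, y_demo)
    ssr_demo = sum_squared_residuals(X_demo, y_demo, beta_demo)
    print(beta_demo, ssr_demo)
    ```

    Because the data contain no noise, the estimates should match the generating coefficients (1, 2, -3) up to rounding, and the sum of squared residuals should be essentially zero. With real data the residuals would not vanish.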

    Interpretation of Coefficients

    • Intercept ($\beta_0$): This is the expected value of $y$ when all independent variables are equal to zero.
    • Regression Coefficients ($\beta_1, \beta_2, \dots, \beta_k$): Each coefficient represents the change in the dependent variable $y$ for a one-unit change in the corresponding independent variable, while holding the other variables constant. For example, if $\beta_1 = 2$, it means that for each unit increase in $x_1$, $y$ increases by 2 units, assuming all other predictors are held constant.
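    The "holding the other variables constant" reading can be checked with a short calculation. The coefficient values below are hypothetical and chosen only so the arithmetic is easy to follow; `predict` is an illustrative helper, not part of any library.

    ```python
    def predict(beta, x):
        """Evaluate b0 + b1*x1 + ... + bk*xk for beta = [b0, ..., bk] and x = [x1, ..., xk]."""
        return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

    # Hypothetical fitted model: y_hat = 5 + 2*x1 - 1*x2
    beta_example = [5.0, 2.0, -1.0]

    base = predict(beta_example, [3.0, 4.0])    # 5 + 2*3 - 4 = 7
    bumped = predict(beta_example, [4.0, 4.0])  # x1 raised by one unit, x2 unchanged: 5 + 2*4 - 4 = 9
    effect_of_x1 = bumped - base                # equals beta_1 = 2

    print(base, bumped, effect_of_x1)
    ```

    The difference between the two predictions equals the coefficient on $x_1$, whatever value $x_2$ is fixed at.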

    2. Assumptions and Diagnostics in Multiple Linear Regression

    Once the model is fit, it is important to check the assumptions to validate the model's findings:

    1. Linearity: You can plot the residuals versus the fitted values. If the linearity assumption holds, the residuals scatter randomly around zero with no systematic pattern. A curved pattern (for example, a U shape) suggests that the relationship is not linear.

    2. Independence: If residuals are correlated, the assumption of independence is violated. This is often checked using the Durbin-Watson statistic, which ranges from 0 to 4; values near 2 suggest little first-order autocorrelation.

    3. Homoscedasticity: Plotting residuals versus fitted values should show no clear pattern. If the spread of residuals increases or decreases as fitted values increase, this indicates heteroscedasticity.

    4. Normality of Residuals: This can be assessed using a Q-Q plot or histogram of residuals. If the residuals are normally distributed, the points will lie along a straight line in the Q-Q plot.

    5. Multicollinearity: This occurs when two or more independent variables are highly correlated with each other, making it difficult to isolate the individual effect of each variable on the dependent variable. You can check for multicollinearity using the Variance Inflation Factor (VIF).
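    Two of the checks above reduce to short formulas, sketched here with NumPy. The function names and the small arrays are illustrative assumptions. The Durbin-Watson statistic is the sum of squared successive residual differences divided by the sum of squared residuals, and the VIF for predictor $j$ is $1 / (1 - R_j^2)$, where $R_j^2$ comes from regressing that predictor on the others. A common rule of thumb, not a strict cutoff, treats VIF values above roughly 5 to 10 as a warning sign.

    ```python
    import numpy as np

    def durbin_watson(residuals):
        """Durbin-Watson statistic: near 2 suggests little first-order autocorrelation."""
        e = np.asarray(residuals, dtype=float)
        return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

    def variance_inflation_factors(X):
        """VIF for each column of X (predictors only, no intercept column).

        Assumes no column is an exact linear combination of the others,
        otherwise 1 - R^2 is zero and the division fails.
        """
        X = np.asarray(X, dtype=float)
        n, k = X.shape
        vifs = []
        for j in range(k):
            target = X[:, j]
            # Regress predictor j on an intercept plus all remaining predictors.
            others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
            coef, *_ = np.linalg.lstsq(others, target, rcond=None)
            ss_res = np.sum((target - others @ coef) ** 2)
            ss_tot = np.sum((target - target.mean()) ** 2)
            r_squared = 1.0 - ss_res / ss_tot
            vifs.append(float(1.0 / (1.0 - r_squared)))
        return vifs

    # Residuals that flip sign every step (strong negative autocorrelation).
    dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])

    # Uncorrelated predictors versus two nearly identical predictors (made-up values).
    X_uncorrelated = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
    X_collinear = np.array([[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1]])

    vif_uncorrelated = variance_inflation_factors(X_uncorrelated)
    vif_collinear = variance_inflation_factors(X_collinear)
    print(dw_alternating, vif_uncorrelated, vif_collinear)
    ```

    The uncorrelated design gives VIFs of about 1, the smallest possible value, while the nearly collinear design gives values far above 10. In practice these statistics are usually taken from a statistics package rather than computed by hand.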


    3. Nonlinear Regression Models

    While multiple linear regression assumes that the response is linear in the predictors, many relationships follow a curve instead, such as an exponential, logarithmic, power, or polynomial function. In such cases, nonlinear regression models are used.

    An important distinction is whether a model is nonlinear only in the predictors or also in the coefficients. Logarithmic and polynomial models are curved in $x$ but still linear in the coefficients, so they can be fitted by ordinary least squares after transforming the predictors. Exponential and power models are nonlinear in the coefficients as written, although taking logarithms of both sides makes them linear.

    Types of Nonlinear Regression Models

    1. Exponential Regression Model: The dependent variable $y$ changes exponentially with respect to the independent variable $x$:

      $$y = \beta_0 e^{\beta_1 x}$$

      Here, $e$ is the base of the natural logarithm.

    2. Logarithmic Regression Model: The relationship between $y$ and $x$ follows a logarithmic function:

      $$y = \beta_0 + \beta_1 \ln(x)$$

      This is useful when $y$ changes quickly at small values of $x$ and more slowly at large values: each proportional increase in $x$ changes $y$ by the same amount. The model requires $x > 0$.

    3. Power Law Model: The dependent variable $y$ is proportional to a power of the independent variable $x$:

      $$y = \beta_0 x^{\beta_1}$$

      This form is common in certain physical laws, where the response scales as a power of the independent variable.

    4. Polynomial Regression: A flexible model in which the relationship between $y$ and $x$ is a polynomial of degree $n$. It is nonlinear in $x$ but linear in the coefficients:

      $$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \dots + \beta_n x^n$$

      This allows for modeling curvature in the relationship between the variables, but care must be taken to avoid overfitting, especially with higher-degree polynomials.

    5. Logistic Regression (for binary outcomes): When the dependent variable is binary (e.g., success/failure, yes/no), a logistic regression model is used, which models the probability of success as a nonlinear function of the predictors. It is defined as:

      $$P(y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k)}}$$

      This model is often used for classification problems. Its coefficients are usually estimated by maximum likelihood rather than by least squares.
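    Some of the models in this list can be fitted with ordinary least squares after a transformation. The sketch below shows two such cases under stated assumptions: the exponential model is fitted through $\ln y = \ln \beta_0 + \beta_1 x$ (which requires $y > 0$), and the polynomial model is fitted by treating $x, x^2, \dots$ as separate predictors. The function names and the noise-free data are illustrative only.

    ```python
    import numpy as np

    def fit_exponential_by_log_transform(x, y):
        """Fit y = b0 * exp(b1 * x) via OLS on ln(y) = ln(b0) + b1 * x. Requires y > 0."""
        x = np.asarray(x, dtype=float)
        log_y = np.log(np.asarray(y, dtype=float))
        design = np.column_stack([np.ones(len(x)), x])
        (log_b0, b1), *_ = np.linalg.lstsq(design, log_y, rcond=None)
        # Undo the log on the intercept to recover b0.
        return float(np.exp(log_b0)), float(b1)

    def fit_polynomial(x, y, degree):
        """Fit y = b0 + b1*x + ... + b_degree*x^degree by OLS; returns [b0, ..., b_degree]."""
        x = np.asarray(x, dtype=float)
        # Columns are 1, x, x^2, ..., x^degree.
        design = np.vander(x, degree + 1, increasing=True)
        beta, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
        return beta

    x_demo = np.array([0.0, 0.5, 1.0, 1.5, 2.0])

    # Noise-free data from y = 3 * exp(0.7 * x).
    b0_hat, b1_hat = fit_exponential_by_log_transform(x_demo, 3.0 * np.exp(0.7 * x_demo))

    # Noise-free data from y = 1 - 2x + 0.5x^2.
    poly_hat = fit_polynomial(x_demo, 1.0 - 2.0 * x_demo + 0.5 * x_demo ** 2, degree=2)
    print(b0_hat, b1_hat, poly_hat)
    ```

    One design caveat: the log-transformed fit minimizes squared error on the $\ln y$ scale, so with noisy data its estimates generally differ from those of a direct nonlinear least-squares fit on the original scale.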

    Fitting Nonlinear Regression Models

    Fitting nonlinear regression models typically involves nonlinear optimization techniques, because a model that is nonlinear in its coefficients generally has no closed-form least-squares solution. These techniques include methods like Gauss-Newton, Levenberg-Marquardt, and gradient descent. Starting from initial guesses, these methods iteratively adjust the parameters ($\beta_0, \beta_1, \dots$) to minimize the sum of squared residuals.

    Unlike linear regression, where the coefficients can be directly computed using matrix algebra (Ordinary Least Squares), nonlinear regression often requires iterative numerical methods to estimate the coefficients, and the result can depend on the starting values.
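    As a rough illustration of the iterative approach, here is a bare-bones Gauss-Newton loop for the exponential model $y = \beta_0 e^{\beta_1 x}$. It is a sketch under simplifying assumptions: there is no step-size control or damping (which Levenberg-Marquardt would add), the function name and starting values are made up, and the data are noise-free so convergence to the generating values can be verified.

    ```python
    import numpy as np

    def gauss_newton_exponential(x, y, b0, b1, max_iter=50, tol=1e-10):
        """Minimize sum((y - b0*exp(b1*x))^2) with plain Gauss-Newton steps."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        beta = np.array([b0, b1], dtype=float)
        for _ in range(max_iter):
            exp_term = np.exp(beta[1] * x)
            residuals = y - beta[0] * exp_term
            # Jacobian of the model with respect to (b0, b1).
            jacobian = np.column_stack([exp_term, beta[0] * x * exp_term])
            # Gauss-Newton step: least-squares solution of jacobian @ step = residuals.
            step, *_ = np.linalg.lstsq(jacobian, residuals, rcond=None)
            beta = beta + step
            if np.linalg.norm(step) < tol:
                break
        return beta

    # Noise-free data from y = 3 * exp(0.7 * x); starting guess deliberately off.
    x_gn = np.linspace(0.0, 2.0, 9)
    y_gn = 3.0 * np.exp(0.7 * x_gn)
    beta_gn = gauss_newton_exponential(x_gn, y_gn, b0=2.5, b1=0.5)
    print(beta_gn)
    ```

    Each iteration replaces the nonlinear model with its linear approximation around the current estimate and solves an ordinary least-squares problem for the correction. A poor starting guess can make an undamped loop like this diverge, which is one reason library routines add safeguards.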


    4. Comparing Linear and Nonlinear Regression Models

    | Aspect | Linear Regression | Nonlinear Regression |
    |---|---|---|
    | Relationship | Linear between $y$ and $x$ | Nonlinear between $y$ and $x$ |
    | Model Form | $y = \beta_0 + \beta_1 x$ | $y = \beta_0 e^{\beta_1 x}$, $y = \beta_0 x^{\beta_1}$, etc. |
    | Fitting Method | Ordinary Least Squares (OLS) | Nonlinear optimization (e.g., Gauss-Newton) |
    | Assumptions | Linearity, homoscedasticity, independence, etc. | More flexible; assumptions depend on the specific model |
    | Interpretation | Coefficients represent the change in $y$ per unit change in $x$ | Coefficients depend on the form of the nonlinear function |

    5. Summary

    • Multiple Linear Regression models the relationship between a dependent variable and multiple independent variables using a linear equation. It is widely used for prediction and inference, provided that the assumptions of linearity, independence, homoscedasticity, and normality of residuals hold.
    • Nonlinear Regression Models are used when the relationship between the dependent and independent variables is not linear. These models can take various forms (e.g., exponential, logarithmic, polynomial), and fitting them often requires nonlinear optimization techniques unless the model is linear in its coefficients or can be made so by a transformation.

    Both linear and nonlinear regression are powerful tools for modeling relationships in data, and choosing between them depends on the nature of the data and the relationship between the variables.
