Simple Linear Regression


Reference Model
  • A baseline model with the simplest reasonable performance, used as a reference point for judging the prediction model
  • Types:
    - Classification = Mode of the target
    - Regression = Mean of the target
    - Time-Series Regression = The value at the previous time stamp
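
The three baselines above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names and the toy data are hypothetical and do not come from any library.

```python
from collections import Counter

def mode_baseline(labels):
    """Classification baseline: always predict the most frequent class."""
    return Counter(labels).most_common(1)[0][0]

def mean_baseline(values):
    """Regression baseline: always predict the mean of the target."""
    return sum(values) / len(values)

def previous_value_baseline(series):
    """Time-series baseline: predict each value as the one just before it.

    Returns predictions for series[1:], since the first point has no predecessor.
    """
    return list(series[:-1])

# Hypothetical toy data
labels = ["spam", "ham", "ham", "ham", "spam"]
prices = [100.0, 150.0, 200.0, 250.0]
temps = [20.1, 20.5, 19.8, 21.0]

class_pred = mode_baseline(labels)         # most frequent label
reg_pred = mean_baseline(prices)           # mean of the prices
ts_pred = previous_value_baseline(temps)   # predictions for temps[1:]
```

A prediction model is only worth keeping if it does better than the matching baseline on the same data.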

Regression Line/Shape

  • Residual = The difference between the observed value and the value predicted by the fitted (sample) regression line

  • Error = The difference between the observed value and the value given by the true (population) regression line; it cannot be observed directly

  • Line of Regression = The line that minimizes the Residual Sum of Squares (RSS), also called the Sum of Squared Errors (SSE)

  • Least Squares Method = Used to find the slope and intercept of the regression line by minimizing RSS
  • Variables:
    - x = independent variable/feature
    - y = dependent variable/target
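
For a single feature, the least squares method has a closed-form solution: slope = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2), and intercept = mean(y) - slope * mean(x). The sketch below applies these formulas with NumPy and computes the residuals and RSS. The toy data is hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical toy data: x is the feature, y is the target
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

x_mean, y_mean = x.mean(), y.mean()

# Closed-form least squares estimates for one feature
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# Residual = observed value minus value predicted by the fitted line
y_hat = intercept + slope * x
residuals = y - y_hat

# Residual Sum of Squares of the fitted line
rss = np.sum(residuals ** 2)

# RSS of the reference model that always predicts the mean of y
rss_baseline = np.sum((y - y_mean) ** 2)

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"RSS (fitted line)={rss:.2f}, RSS (mean baseline)={rss_baseline:.2f}")
```

With this data the fitted line has a slope of about 0.6 and an intercept of about 2.2, and its RSS (about 2.4) is lower than the RSS of the mean baseline (6.0).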

Linear Regression Model using Scikit-learn
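    import pandas as pd
    from sklearn.linear_model import LinearRegression
    
    model = LinearRegression()
    
    # df is an existing pandas DataFrame; 'x' and 'y' are placeholder column names
    feature = ['x']  # list of column names, so X_train is 2-D
    target = 'y'     # single column name, so y_train is 1-D
    X_train = df[feature]
    y_train = df[target]
    
    model.fit(X_train, y_train)
    
    # test is a placeholder for a new feature value; X_test must be 2-D like X_train
    X_test = pd.DataFrame({'x': [test]})  # example
    y_pred = model.predict(X_test)
    
    y_pred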
    
  • Coefficients:
    - Slope: model.coef_ (array with one value per feature)
    - Intercept: model.intercept_
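
As a quick check of what the two attributes mean, the sketch below fits a model on a small, hypothetical DataFrame whose points lie exactly on y = 3x + 1, then rebuilds the predictions by hand as intercept + slope * x. The column names and data are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data lying exactly on the line y = 3x + 1
df = pd.DataFrame({"x": [0.0, 1.0, 2.0, 3.0], "y": [1.0, 4.0, 7.0, 10.0]})

model = LinearRegression()
model.fit(df[["x"]], df["y"])

slope = model.coef_[0]        # coef_ holds one value per feature
intercept = model.intercept_  # a single number when the target is 1-D

# New feature values, in the same 2-D layout used for training
X_new = pd.DataFrame({"x": [4.0, 5.0]})

# Predictions rebuilt by hand from the fitted line
manual = intercept + slope * X_new["x"].to_numpy()
predicted = model.predict(X_new)

# True when the manual line and model.predict agree
same = np.allclose(manual, predicted)
print(slope, intercept, same)
```

The fitted slope and intercept should come out very close to 3 and 1, and the manual predictions should match model.predict.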