import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.metrics import r2_score

# load the data
veriler = pd.read_csv(r'C:\Users\k\Desktop\maaslar_yeni.csv')
# x is the independent variable, y is the dependent variable
x = veriler.iloc[:, 2:3]
y = veriler.iloc[:, 5:]
X = x.values.reshape(-1, 1)
Y = y.values.reshape(-1, 1)


#linear regression
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(X,Y)

import statsmodels.api as sm
model=sm.OLS(lin_reg.predict(X),X)
print(model.fit().summary())

In a tutorial, the instructor just changed x and y by creating new X and Y, then used LinearRegression, as you can see in the picture, to find the R value. But when I try this, I get the above error. How can I solve this?

Peter

1 Answer

Your code has a potential error:

x = veriler.iloc[:,2:3]
y = veriler.iloc[:,5:]
X=x.values.reshape(-1,1)
Y=y.values.reshape(-1,1)

As you can see, x has exactly ONE column (2:3), but y may have more than one (5: selects everything from the sixth column to the last). If it does, then after the reshape len(Y) will be two (or more) times len(X).
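To make the mismatch concrete, here is a minimal sketch on a made-up DataFrame. The 4x7 table and its column names are assumptions standing in for maaslar_yeni.csv, whose real layout is not shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for maaslar_yeni.csv: 4 rows, 7 columns (names invented).
veriler = pd.DataFrame(np.arange(28).reshape(4, 7), columns=list("abcdefg"))

x = veriler.iloc[:, 2:3]   # exactly one column
y = veriler.iloc[:, 5:]    # columns 5 and 6, so two columns here

# reshape(-1, 1) flattens everything into a single column,
# so the two y columns become twice as many rows.
X = x.values.reshape(-1, 1)
Y = y.values.reshape(-1, 1)

print(X.shape)  # (4, 1)
print(Y.shape)  # (8, 1)
```

With shapes like these, LinearRegression.fit(X, Y) would be given a different number of samples for X and Y, which is the kind of inconsistency it rejects.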

Please post a few records of your data so we can help further. In general, X can have multiple columns while Y has a single column. If you want to regress a single-column Y on a single-column X, a possible fix is to remove the : from your y slice:

y = veriler.iloc[:,5]
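A rough end-to-end sketch of that fix, again on invented data (the column names and the y = 2x + 1 relationship are assumptions, chosen only so the result is easy to check):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Hypothetical data: column 2 is the predictor, column 5 the target (y = 2*x + 1).
veriler = pd.DataFrame(np.zeros((5, 7)), columns=list("abcdefg"))
veriler["c"] = [1.0, 2.0, 3.0, 4.0, 5.0]
veriler["f"] = 2 * veriler["c"] + 1

X = veriler.iloc[:, 2:3].values   # 2-D array, shape (5, 1)
Y = veriler.iloc[:, 5].values     # 1-D array, shape (5,)

lin_reg = LinearRegression()
lin_reg.fit(X, Y)                 # both now have 5 samples

r2 = r2_score(Y, lin_reg.predict(X))
print(X.shape, Y.shape)
```

Slicing with 2:3 keeps X two-dimensional, so no reshape is needed there, while the plain index 5 gives a single target column whose length matches X.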
Mehdi