博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
Logistic Regression
阅读量:4933 次
发布时间:2019-06-11

本文共 6158 字,大约阅读时间需要 20 分钟。

可以从线性回归推到logistic regression.

首先介绍线性回归。

线性回归是非常基础的模型,它的几个假设有:

(1)假设对x的测量数据没有误差
(2)假设y的期望是系数和x的线性组合
(3)假设误差独立同分布,分布为\(N(0,\sigma^2)\)
(4)假设变量x之间没有多重共线性(即设计矩阵列满秩。设计矩阵的列数为x变量数,行数为样本数)(如果变量x之间有多重共线性,那么,利用最小二乘法求解问题,将导致解不稳定。针对这一问题,可以利用脊回归解决)

对于样本\({(x_1,y_1),(x_2,y_2),......,(x_p,y_p)}\),线性回归的学习模型是\[f(x_i) = wx_i+b\]损失函数使用均方误差\[\sum_{i = 1}^p(f(x_i)-y_i)^2\]然后基于损失函数最小化求解参数w和b。

接下来先讲一下 logistic function

\[y = \frac{1}{1+e^{-x}}\]

1228226-20170902111959249-490618954.png
由于logistic function非常陡峭,如果将\(f(x_i)\)通过这个函数映射之后,结果要么接近于0,要么接近于1,所以可以据此进行二分类。

Logistic Regression

\(z = wx+b\)\(y = logistic(z)\)复合之后得到\[y = \frac{1}{1+e^{-(wx+b)}}\]变换一下得到\[ln\frac{y}{1-y} = wx+b\]

现在想求参数w和b。不妨记\(\beta = (w;b), \hat{x} = (x;1)\),那么\[ln\frac{y}{1-y} = \beta^T\hat{x}\]
如果将y看作x分成正例的可能性,1-y看作x被分到负例的概率,则\(\frac{y}{1-y}\)反映了x作为正例的相对可能性,称为“几率”。
\[p(y = 1|x) = \frac{e^{\beta^T\hat{x}}}{1+e^{\beta^T\hat{x}}}\]\[p(y = 0|x) = \frac{1}{1+e^{\beta^T \hat{x}}}\]
接下来就可以利用“极大似然法”估计\(\beta\)了。
\[p(y_i|x_i;w,b) = y_ip_1(\hat{x_i};\beta) + (1-y_i)p_0(\hat{x_i};\beta)\]
对数似然函数为\[l(w,b) = \sum_{i = 1}^plnp(y_i|x_i;w,b)\]
接下来\[\begin{eqnarray}l(w,b) &=& \sum_{i = 1}^pln[y_ip_1(\hat{x};\beta)+(1-y_i)p_0(\hat{x};\beta)]\\ &=& \sum_{i = 1}^pln[y_i\frac{e^{\beta^T\hat{x_i}}}{1+e^{\beta^T\hat{x_i}}}+(1-y_i)\frac{1}{1+e^{\beta^T \hat{x_i}}}]\\ &=& \sum_{i = 1}^pln[\frac{e^{y_i\beta^T\hat{x_i}}}{1+e^{\beta\hat{x_i}}}]\\ &=& \sum_{i = 1}^p[y_i\beta^T\hat{x_i}-ln(1+e^{\beta\hat{x_i}})] \end{eqnarray}\]
最大化\(l(w,b)\) = 最小化\(-l(w,b)\),这就将求参数问题转换成了最优化问题求解。求解这种无约束的凸优化问题可以利用梯度下降法、牛顿法等。
\[\beta^* = {arg min}_{\beta}(-y_i\beta^T\hat{x_i}+ln(1+e^{\beta^T\hat{x_i}}))\]

Logistic Regression的优点

(1)直接对分类可能性进行建模,无需实现假设数据分布,避免因假设分布不准确所带来的问题。

(2)还可以得到近似概率预测,这对许多利用概率辅助决策的任务很重要。
(3)对率函数是任意阶可导的凸函数,有很好的数学性质,现有的许多数值优化算法都可以直接用于求取最优解。

例子

使用西瓜书中的西瓜数据,实现一下西瓜书中的Logistic Regression练手,但是感觉代码不优雅。。。贴出来接受指正。

`
python
import numpy as np

#使用的数据集X = np.array([[0.697,0.460],              [0.774,0.376],              [0.634,0.264],              [0.608,0.318],              [0.556,0.215],              [0.403,0.237],              [0.481,0.149],              [0.437,0.221],              [0.666,0.091],              [0.243,0.267],              [0.245,0.057],              [0.343,0.099],              [0.639,0.161],              [0.657,0.198],              [0.360,0.370],              [0.593,0.042],              [0.719,0.103]])Y = np.array([1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0])class LogisticRegression():    def __init__(self,max_iter= 300,B = None,lable = None):        self.max_iter = max_iter        self.B = B        self.lable = lable            def fit(self,X,Y):        """        X:训练数据,矩阵维度为:n_samples * n_features        Y:n_samples个类标        """        self.lable = list(set(Y))        changeY = np.zeros(Y.shape[0])        k = 0        for i in self.lable:            changeY[np.where(Y == i)[0]] = k            k+=1        Y = changeY        X = np.asarray(X)        Y = np.asarray(Y)        if X.shape[1] == 0:            print('没有特征变量')            return 0        elif X.shape[0] == 0:            print('没有样本')        elif X.shape[0] != Y.shape[0]:            print('X与Y的样本量不同')        else:            B = self.main_function(X,Y)        self.B = B        return self                def main_function(self,X,Y):        """        利用梯度下降求目标函数最优解        """        #合并偏执项b        X = np.concatenate((X,np.ones((X.shape[0],1))),axis = 1)        #初始化系数变量        B = np.random.uniform(0,1,X.shape[1])        #优化的目标函数        X = np.mat(X)        BX = B*X.T        L = -(BX*np.mat(Y).T)+ np.sum(np.log(1+np.exp(BX)))        L_sub = L        aiter = 0        while  aiter < self.max_iter:            aiter += 1            #判断            if L_sub<1e-10:                break            #更新B            B1 = np.zeros((X.shape[1]))            B2 = np.mat(np.zeros((X.shape[1],X.shape[1])))            for i in range(X.shape[0]):                p1 = np.exp(B*X[i,:].T)/(1+np.exp(B*X[i,:].T))                p1 = p1[0,0]                                B1 = B1+(-X[i,:]*(Y[i]-p1))                B2 = B2+(X[i,:].T*X[i,:]*p1*(1-p1))                            #更新B            B = B - B1*np.linalg.inv(B2)            BX = B*X.T            L_1 = L            L = -(BX*np.mat(Y).T)+ np.sum(np.log(1+np.exp(BX)))            L_sub = L_1-L        return B            def predict(self,X):        """        给定数据集进行预测        """          X = np.concatenate((X,np.ones((X.shape[0],1))),axis = 1)        X = np.mat(X)        labels = np.zeros((X.shape[0]))        ratio = []        for i in range(X.shape[0]):            BX = self.B*X[i].T            p1 = BX/(1+BX)            p2 = 1/(1+BX)            ratio.append(np.array(p1/p2).flatten()[0])            #分类阈值            if p1 > p2:                labels[i] = int(self.lable[1])            else:                labels[i] = int(self.lable[0])        return np.array(labels),ratio            def ROC(ratio,Y):        import operator        """根据样本作为正例的可能性p1.和作为负例的可能性p0的比值ratio,        计算假正例率和正正例率"""        ratio_dict = {}        k = 0        for i in ratio:            ratio_dict[k] = i            k += 1                    #排序        ratio_s = sorted(ratio_dict.items(),key = operator.itemgetter(1),                         reverse = True)        Y_s = []        for i in ratio_s:            Y_s.append(Y[i[0]])                 num_P = Y_s.count(1) #实际正例数 = (TP+FN)        num_F = Y_s.count(0) #实际反例数 = (TN+FP)        x_FPR = []  #假正例率 FPR = FP/(TN+FP)        y_TPR = []  #真正例率 TPR = TP/(TP+FN)        #计算真正例和假正例个数        count_FP = 0        count_TP = 0                for i in range(len(ratio)):            if Y_s[i] == 1: #真正例                count_TP += 1            else:           #假正例                count_FP += 1            x_FPR.append(count_FP/num_F)            y_TPR.append(count_TP/num_P)                    #AUC        area = 0        for i in range(len(ratio)-1):            area += (x_FPR[i+1] - x_FPR[i])*(y_TPR[i+1]+y_TPR[i])        AUC = area/2        #画图        import matplotlib.pyplot as plt        plt.figure()        plt.plot(x_FPR,y_TPR)        plt.title('ROC(AUC = %.2f)'%AUC)        plt.xlabel('FPR')        plt.ylabel('TPR')        `

1228226-20170902231803718-1980473475.png

转载于:https://www.cnblogs.com/miranda-zxh/p/7465914.html

你可能感兴趣的文章