凸函数及其性质
1. 定义
1.1 定义一
- 如果对任意\(x_1\)、\(x_2\)总有\(f[\alpha x_1 + (1 – \alpha )x_2] \ge \alpha f(x_1) + (1 – \alpha )f(x_2)\),其中\(\displaystyle 0 \le \alpha \le 1\),则称\(f(x)\)为上凸函数
- 如果对任意\(x_1\)、\(x_2\)且\(x_1 \ne x_2\),总有\(f[\alpha x_1 + (1 – \alpha )x_2] \gt \alpha f(x_1) + (1 – \alpha )f(x_2)\),其中\(0 \lt\alpha \lt 1\),则称\(f(x)\)为严格上凸函数
1.2 定义二
- 如果对任意\(x_1\)、\(x_2\)总有\(f[\alpha x_1 + (1 – \alpha )x_2]\le \alpha f(x_1) + (1 – \alpha )f(x_2)\),其中\(\displaystyle 0 \le \alpha \le 1\),则称\(f(x)\)为下凸函数
- 如果对任意\(x_1\)、\(x_2\)且\(x_1 \ne x_2\),总有\(f[\alpha x_1 + (1 – \alpha )x_2] \lt \alpha f(x_1) + (1 – \alpha )f(x_2)\),其中\(0 \lt\alpha \lt 1\),则称\(f(x)\)为严格下凸函数
2. 琴生(Jenson)不等式
- 对于上凸函数,\(f(E[X]) \ge E[f(x)]\)或\(\displaystyle \sum_{k=1}^q \lambda_k f(x_k)\le f(\sum_{k=1}^q \lambda_k x_k)\),其中\(\lambda_1,\lambda_2,\cdots,\lambda_q\)为正实数(或非负实数,后者去除无影响的\(\lambda_i = 0\)的项即为前者,故二者等价)且\(\displaystyle \sum_{k=1}^q \lambda_k = 1\);对于严格上凸函数,上述等号成立当且仅当\(x_1 = x_2 = \cdots = x_q\)。
- 对于下凸函数,\(f(E[X])\le E[f(x)]\)或\(\displaystyle \sum_{k=1}^q \lambda_k f(x_k) \ge f(\sum_{k=1}^q \lambda_k x_k)\),其中\(\lambda_1,\lambda_2,\cdots,\lambda_q\)为正实数(或非负实数,后者去除无影响的\(\lambda_i = 0\)的项即为前者,故二者等价)且\(\displaystyle \sum_{k=1}^q \lambda_k = 1\);对于严格下凸函数,上述等号成立当且仅当\(x_1 = x_2 = \cdots = x_q\)。
\(\downarrow\)证明过程如下\(\downarrow\)
2.1 上凸函数
证明:因为\(\lambda_i\)均为正实数,故有
\(\displaystyle f( \sum_{k=1}^q \lambda_k x_k) = f(\lambda_1 x_1 + \sum_{k=2}^q \lambda_k {\sum_{k=2}^q \lambda_k x_k \over \sum_{k=2}^q \lambda_k}) \ge \lambda_1 f(x_1) + \sum_{k=2}^q \lambda_k \cdot f({\sum_{k=2}^q \lambda_k x_k \over \sum_{k=2}^q \lambda_k})\)
\(\displaystyle = \lambda_1 f(x_1) + \sum_{k=2}^q \lambda_k \cdot f({\lambda_2 \over \sum_{k=2}^q \lambda_k} x_2 + {\sum_{k=3}^q \lambda_k \over \sum_{k=2}^q \lambda_k} \cdot {\sum_{k=3}^q \lambda_k x_k \over \sum_{k=3}^q \lambda_k})\)
\(\displaystyle \ge \lambda_1 f(x_1) + \lambda_2 f(x_2) + \sum_{k=3}^q \lambda_k \cdot f({\sum_{k=3}^q \lambda_k x_k \over \sum_{k=3}^q \lambda_k})\)
\(\displaystyle \ge \cdots \ge \sum_{k=1}^q \lambda_k f(x_k)\)
2.2 严格上凸函数
证明:由定义可知,对于严格上凸函数,\(f[\alpha x_1 + (1 – \alpha )x_2] \ge \alpha f(x_1) + (1 – \alpha )f(x_2)\)等号成立时当且仅当\(x_1 = x_2\) 。而根据上文对于上凸函数对于\(\displaystyle \sum_{k=1}^q \lambda_k f(x_k)\le f(\sum_{k=1}^q \lambda_k x_k)\)不等式推导过程可知,若上凸函数为严格上凸函数,则第一个\(\ge\)处等号成立当且仅当:\(x_1 = {\sum_{k=2}^q \lambda_k x_k \over \sum_{k=2}^q \lambda_k}\);第二个\(\ge\)处等号成立当且仅当:\(x_2 = {\sum_{k=3}^q \lambda_k x_k \over \sum_{k=3}^q \lambda_k}\);\(\cdots\);第\(q-1\)个\(\ge\)处等号成立当且仅当:\(x_{q-1} = {\sum_{k=q}^q \lambda_k x_k \over \sum_{k=q}^q \lambda_q} = x_q\)。所有等号都成立则以上条件都需满足,对以上条件反向推导可得:\(x_q = x_{q-1}\);\(x_{q-2} = {\sum_{k=q-1}^q \lambda_k x_k \over \sum_{k=q-1}^q \lambda_k} = {\lambda_{q-1} x_{q-1} + \lambda_{q} x_q \over \lambda_{q-1} + \lambda_{q}} = x_{q-1}\);\(\cdots\);\(x_1 = x_2\)。
即\(\displaystyle \sum_{k=1}^q \lambda_k f(x_k)\le f(\sum_{k=1}^q \lambda_k x_k)\)等号成立当且仅当\(x_1 = x_2 = \cdots = x_q\)
2.3 下凸函数
证明:因为\(\lambda_i\)均为正实数,故有
\(\displaystyle f( \sum_{k=1}^q \lambda_k x_k) = f(\lambda_1 x_1 + \sum_{k=2}^q \lambda_k {\sum_{k=2}^q \lambda_k x_k \over \sum_{k=2}^q \lambda_k}) \le \lambda_1 f(x_1) + \sum_{k=2}^q \lambda_k \cdot f({\sum_{k=2}^q \lambda_k x_k \over \sum_{k=2}^q \lambda_k})\)
\(\displaystyle = \lambda_1 f(x_1) + \sum_{k=2}^q \lambda_k \cdot f({\lambda_2 \over \sum_{k=2}^q \lambda_k} x_2 + {\sum_{k=3}^q \lambda_k \over \sum_{k=2}^q \lambda_k} \cdot {\sum_{k=3}^q \lambda_k x_k \over \sum_{k=3}^q \lambda_k})\)
\(\displaystyle \le \lambda_1 f(x_1) + \lambda_2 f(x_2) + \sum_{k=3}^q \lambda_k \cdot f({\sum_{k=3}^q \lambda_k x_k \over \sum_{k=3}^q \lambda_k})\)
\(\displaystyle \le \cdots \le \sum_{k=1}^q \lambda_k f(x_k)\)
2.4 严格下凸函数
证明:由定义可知,对于严格下凸函数,\(f[\alpha x_1 + (1 – \alpha )x_2] \le \alpha f(x_1) + (1 – \alpha )f(x_2)\)等号成立时当且仅当\(x_1 = x_2\) 。而根据上文对于下凸函数对于\(\displaystyle \sum_{k=1}^q \lambda_k f(x_k)\le f(\sum_{k=1}^q \lambda_k x_k)\)不等式推导过程可知,若下凸函数为严格下凸函数,则第一个\(\le\)处等号成立当且仅当:\(x_1 = {\sum_{k=2}^q \lambda_k x_k \over \sum_{k=2}^q \lambda_k}\);第二个\(\le\)处等号成立当且仅当:\(x_2 = {\sum_{k=3}^q \lambda_k x_k \over \sum_{k=3}^q \lambda_k}\);\(\cdots\);第\(q-1\)个\(\le\)处等号成立当且仅当:\(x_{q-1} = {\sum_{k=q}^q \lambda_k x_k \over \sum_{k=q}^q \lambda_q} = x_q\)。所有等号都成立则以上条件都需满足,对以上条件反向推导可得:\(x_q = x_{q-1}\);\(x_{q-2} = {\sum_{k=q-1}^q \lambda_k x_k \over \sum_{k=q-1}^q \lambda_k} = {\lambda_{q-1} x_{q-1} + \lambda_{q} x_q \over \lambda_{q-1} + \lambda_{q}} = x_{q-1}\);\(\cdots\);\(x_1 = x_2\)。
即\(\displaystyle \sum_{k=1}^q \lambda_k f(x_k) \ge f(\sum_{k=1}^q \lambda_k x_k)\)等号成立当且仅当\(x_1 = x_2 = \cdots = x_q\)