数学物理学报, 2025, 45(5): 1698-1710

一个基于 BFGS 更新的惯性三项共轭梯度投影法

刘鹏杰,, 邵虎,*, 李佳宜,, 董端端,

中国矿业大学数学学院, 江苏省应用数学中心 江苏徐州 221116

An Inertial Three-Term Conjugate Gradient Projection Method Based on BFGS Update

Liu Pengjie,, Shao Hu,*, Li Jiayi,, Dong Duanduan,

School of Mathematics, Jiangsu Center for Applied Mathematics, China University of Mining and Technology, Jiangsu Xuzhou 221116

通讯作者: * 邵虎, E-mail:shaohu@cumt.edu.cn

收稿日期: 2024-04-12   修回日期: 2024-12-20  

基金资助: 中央高校基本科研业务费(2025QN1147)

Received: 2024-04-12   Revised: 2024-12-20  

Fund supported: Fundamental Research Funds for the Central Universities(2025QN1147)

作者简介 About authors

刘鹏杰,E-mail:liupengjie2019@163.com;

李佳宜,E-mail:10213705@cumt.edu.cn;

董端端,E-mail:03201428@cumt.edu.cn

摘要

该文首先基于 Broyden-Fletcher-Goldfarb-Shanno 更新, 给出一类改进的三项搜索方向. 结合惯性策略和超平面投影技术, 提出一个惯性的三项共轭梯度投影法用于求解无约束非线性单调方程组. 所提惯性算法具有如下特征: (i) 其搜索方向独立于任何线搜索, 有充分下降性和信赖域性质; (ii) 惯性技术采用三步迭代信息以产生惯性迭代点; (iii) 底层映射无需满足 Lipschitz 连续性, 亦可获得算法的全局收敛性结果. 通过初步数值试验结果验证了算法的有效性.

关键词: 非线性单调方程组; 三项共轭梯度法; 惯性策略; BFGS 更新; 全局收敛性

Abstract

Based on the BFGS update, in this paper, we present a class of improved three-term search directions. Then, by combining the inertial strategy and hyperplane projection approach, an inertial three-term conjugate gradient projection method is proposed for solving unconstrained nonlinear monotone equations. The proposed inertial algorithm exhibits the following features: (i) its search direction always possesses sufficient descent and trust region properties, independent of any line search; (ii) the inertial strategy generates iterative points by utilizing information from three iterations; (iii) the method achieves global convergence results without assuming the underlying mapping to satisfy Lipschitz continuity. Preliminary numerical experiments verify the effectiveness of the proposed method.

Keywords: nonlinear monotone equations; three-term conjugate gradient method; inertial strategy; BFGS Update; global convergence

PDF (968KB) 元数据 多维度评价 相关文章 导出 EndNote| Ris| Bibtex  收藏本文

本文引用格式

刘鹏杰, 邵虎, 李佳宜, 董端端. 一个基于 BFGS 更新的惯性三项共轭梯度投影法[J]. 数学物理学报, 2025, 45(5): 1698-1710

Liu Pengjie, Shao Hu, Li Jiayi, Dong Duanduan. An Inertial Three-Term Conjugate Gradient Projection Method Based on BFGS Update[J]. Acta Mathematica Scientia, 2025, 45(5): 1698-1710

1 引言

该文研究如下无约束非线性方程组的迭代求解方法

$\begin{equation}\label{eq1} F(x)=0, \end{equation}$

其中底层映射 $F: \mathbb{R}^{n} \rightarrow \mathbb{R}^{n}$ 连续且单调.

所谓单调性是指映射 $F$ 满足

$\begin{matrix}\label{1.2} \left(F(x)-F(y)\right)^{\top}(x-y) \geq 0,~\forall~x,~y\in \mathbb{R}^{n}. \end{matrix}$

非线性方程组 (1.1) 在工程计算中具有广泛应用, 如压缩感知, 经济均衡系统, 化学平衡系统等. 由于其自身非线性特性, 获得解析解将具有一定挑战性. 求解 (1.1) 的数值方法主要有两类, 一是基于导数方法, 如牛顿法, 拟牛顿法及其改进版本等, 二是无导数方法, 如谱梯度投影法, 共轭梯度投影法等.

为使算法求解结构简单和内存需求低, Solodov 和 Svaiter[26] 提出仅使用函数值信息的无导数投影法, 每次迭代无需存储矩阵. 在此基础上, Zhang 和 Zhou[34] 结合谱梯度法建立求解无约束非线性单调方程组的谱梯度型投影法. Cheng[3] 将经典 Polak-Ribi\`{e}re-Polyak 共轭梯度法拓展至求解无约束非线性单调方程组, 在底层映射满足单调性和 Lipschitz 连续性时, 证明其全局收敛性. 刘金魁等 [15] 基于 Hestenes-Stiefel (HS) 共轭参数结构, 提出一种无导数投影法用于求解大规模凸约束伪单调方程组. 无需 Lipschitz 连续性, 得到算法的全局收敛性和 R-线性收敛率. 张宁和刘金魁 [35]使用 Liu-Storey (LS) 共轭参数, 借助充分下降条件获得谱参数, 进而得到一种谱 LS 型无导数投影法. 在 Lipschitz 连续假设下, 建立算法的全局收敛性. Yin 等 [31]提出一个混合三项共轭梯度投影法, 其搜索方向类似无记忆 Broyden-Fletcher-Goldfarb-Shanno (BFGS) 结构, 无需任何线搜索具有充分下降性和信赖域性质. 基于 BFGS 结构的更多无导数投影法见文献 [17,18,29] 及其参考文献.

惯性作为一种有效加速策略, 其源于 Polyak[22] 提出的重球法. 传统迭代算法中, 每一步仅考虑当前迭代点信息产生新的迭代点, 而惯性加速策略则使用至少两步迭代点信息. 最近一些学者将惯性策略应用于无导数投影法以解决非线性方程组问题. Ibrahim 等[9]将惯性外推法扩展到 Fletcher-Reeves 型无导数投影法中, 用于求解单调且 Lipschitz 连续的非线性方程组. Jian 等[11]基于惯性加速技术提出一类惯性无导数投影法以求解凸约束非线性伪单调方程组, 无需 Lipschitz 连续性, 建立算法的全局收敛性. Zhang 等 [36] 通过将惯性外推步嵌入到搜索方向中, 提出一种改进的混合谱共轭梯度投影法, 其独立于任何线搜索, 即可获得搜索方向的充分下降性. 有关解决非线性方程组的更多惯性方法, 见文献 [10,16,27,32] 等.

基于上述研究, 该文首先拓展 BFGS 搜索方向提出一个改进的三项搜索方向框架. 同时给出一种新的惯性步策略, 即使用三步迭代信息产生惯性迭代点, 这与现有多数求解非线性方程组惯性算法 (根据两步迭代信息) 不同. 进而结合投影技术, 提出一种有效的惯性三项共轭梯度投影法用于求解无约束非线性单调方程组 (1.1). 所提惯性算法具有如下理论特征: (i) 独立于任何线搜索, 其搜索方向具有充分下降性和信赖域性质; (ii) 无需 Lipschitz 连续性假设, 建立所提算法的全局收敛性.

该文余下结构安排为: 第二部分基于 BFGS 更新, 给出一类新的三项搜索方向, 进而结合惯性策略和超平面投影技术提出惯性三项共轭梯度投影法; 第三部分在较弱假设下, 论证所提算法的全局收敛性; 第四部分对新算法进行数值测试和报告, 并与其它同类方法进行比较.

2 惯性三项共轭梯度投影法

为发挥拟牛顿法的性能优势, 研究人员利用其潜在理论特点构造近似于拟牛顿方向的算法求解无约束优化问题 $\min\limits_{x\in \mathbb{R}^n} f(x)$, 如文献[12,14,28]. 记 $F_k := F(x_k)= \nabla f(x_k)$. 对于 Nocedal 和 Shanno 给出的无记忆 BFGS 方法 [21,25], 其搜索方向为

$\begin{eqnarray*} d_{k}^{\rm LBFGS}&=&-F_{k}+\left(\frac{F_k^{\top}y_{k-1}}{d_{k-1}^{\top}y_{k-1}}-\frac{\|y_{k-1}\|^2F_k^{\top}d_{k-1}}{(d_{k-1}^{\top}y_{k-1})^2}\right)d_{k-1}+\frac{F_k^{\top}d_{k-1}}{d_{k-1}^{\top}y_{k-1}}(y_{k-1}-s_{k-1}) \nonumber \\ &=&-F_{k}+\left(\beta_{k}^{\rm HS}-\frac{\|y_{k-1}\|^2F_k^{\top}d_{k-1}}{(d_{k-1}^{\top}y_{k-1})^2}\right)d_{k-1}+\frac{F_k^{\top}d_{k-1}}{d_{k-1}^{\top}y_{k-1}}(y_{k-1}-s_{k-1}), \end{eqnarray*}$

其中 $y_{k-1}=F_{k} - F_{k-1}$, $s_{k-1}=x_{k}-x_{k-1}$. 为发展 BFGS 方法, Li[14] 基于 HS 方法, 提出如下近似 BFGS 拟牛顿方法的三项搜索方向

$\begin{eqnarray*} d_{k}^{\rm THS}=-F_{k}+\left(\beta_{k}^{\rm HS}-\frac{\|y_{k-1}\|^2F_k^{\top}d_{k-1}}{(d_{k-1}^{\top}y_{k-1})^2}\right)d_{k-1}+\chi_k\frac{F_k^{\top}d_{k-1}}{d_{k-1}^{\top}y_{k-1}}y_{k-1}, \end{eqnarray*}$

其中 $0\leq \chi_k \leq \bar{\chi} <1$, 数值试验中取

$\begin{eqnarray*} \chi_k = {\rm min} \left\{\bar{\chi}, {\rm max}\left\{0, 1-\frac{y_{k-1}^{\top}s_{k-1}}{\|y_{k-1}\|^2}\right\} \right\}. \end{eqnarray*}$

上式中 $1-\frac{y_{k-1}^{\top}s_{k-1}}{\|y_{k-1}\|^2}$ 是极小化问题 $\min\limits_{\chi \in \mathbb{R}} \|(y_{k-1}-s_{k-1})-\chi y_{k-1}\|^2$ 的解. 另一方面, Narushima 等[20]给出三项搜索方向的一般定义, 即

$d_{k}^{\mathrm{N}}=\left\{\begin{array}{ll} -F_{k}, & \text { 若 } k=0 \text { 或 } F_{k}^{\top} p_{k}=0, \\ -F_{k}+\beta_{k} d_{k-1}-\beta_{k} \frac{F_{k}^{\top} d_{k-1}}{F_{k}^{\top} p_{k}} p_{k}, & \text { 否则, } \end{array}\right.$

其中 $p_{k} \in \mathbb{R}^{n} $ 是任意向量. 独立于任何线搜索, 以及 $\beta_k$$p_k$ 的选择, $d^{\rm N}_{k}$ 总满足充分下降条件. 考虑到 $\beta_k$$p_k$ 的灵活选择, 这为改进数值性能提供更多可能.

借鉴文献 [14] 中 $d_{k}^{\rm THS}$ 的迭代形式及文献 [20] 中 $d^{\rm N}_{k}$ 的第三项参数选取, 考虑采用任意非零向量 $p_k$ 代替 $d_{k}^{\rm THS}$ 中部分 $y_{k-1}$, 从而给出一个改进的三项搜索方向框架, 即

$ d_k^{\rm TCG} = -F_{k}+\left(\frac{F_k^{\top}p_{k}}{d_{k-1}^{\top}y_{k-1}}-\frac{\|p_{k}\|^2F_k^{\top}d_{k-1}}{(d_{k-1}^{\top}y_{k-1})^2}\right) {d_{k - 1}}+ \chi_k\frac{F_k^{\top}d_{k-1}}{d_{k-1}^{\top}y_{k-1}} p_{k}, $

其中 $0\leq \chi_k \leq \bar{\chi} <1$. 搜索方向 $d_k^{\rm TCG}$$p_{k}$ 具有很大的选取灵活性. 易见, 当 $p_{k}$$y_{k-1}$ 时, $d_k^{\rm TCG}$ 退化为 $d_{k}^{\rm THS}$; 当 $p_{k}$$F_{k}$ 时, $d_k^{\rm TCG}$ 可视为 Dai-Yuan (DY) 方法[6]的一种三项推广. 为使 $d_k^{\rm TCG}$ 某种程度上逼近于 BFGS 方法, 参考文献 [14] 令

$\begin{matrix}\label{A2.2} \chi_k = {\rm min} \left\{ \bar{\chi}, {\rm max}\left\{0, \ \frac{p_{k}^{\top}(y_{k-1}-s_{k-1})}{\|p_{k}\|^2}\right\} \right\}, \end{matrix}$

其中 $\frac{p_{k}^{\top}(y_{k-1}-s_{k-1})}{\|p_{k}\|^2}$ 是极小化问题 $\min\limits_{\chi \in \mathbb{R}} \|(y_{k-1}-s_{k-1})- \chi p_{k}\|^2$ 的解. 虽然 $d_k^{\rm TCG}$ 具有较大灵活选择性, 但对于下文所设计的梯度投影算法理论分析 (如充分下降性和收敛性分析) 仍存在一定局限. 为此, 对其稍作调整得

$\begin{matrix}\label{AQ2.1} d_k = -F_{k}+{\beta^{\rm ITCG}_k}{d_{k - 1}}+\theta_{k}^{\rm ITCG} p_{k}, \ k\geq 1, \ d_0 = -F_0, \end{matrix}$

其中

$\begin{matrix}\label{AQ2.2} \beta^{\rm ITCG}_k = \frac{F_{k}^{\top} p_{k}}{\varpi_{k}}-\frac{\|p_{k}\|^{2}F_{k}^{\top}d_{k-1}}{{\varpi_{k}}^{2}}, \ \ \theta_{k}^{\rm ITCG} = {\chi_{k}}\frac{{F^{\top}_{k}}d_{k-1}}{{\varpi_k}} \ (0 \leq \chi_{k} \leq \bar{\chi} < 1), \end{matrix} $2.3)

$\begin{matrix}\label{AQ2.2b} \ {\varpi_{k}} = \max\{\tau({\|d_{k-1}\|}^{2}+\|p_{k}\|^{2}), \ {d^{\top}_{k-1}}{y_{k-1}}\} \ (\tau > 0). \end{matrix}$

数值试验中 $\chi_{k}$ 仍选取 (2.1) 式.

基于搜索方向 (2.2)-(2.4) 式, 结合惯性策略和超平面投影技术, 给出惯性三项共轭梯度投影法 (算法 iITCGP) 求解问题 (1.1). 下面给出具体迭代步骤.

${\bf\large 算法 iITCGP}$

${\bf 步骤 0}$ 给定初始迭代点 $x_{-2}, \ x_{-1}, \ x_{0} \in \mathbb{R}^n$. 参数 $\epsilon \geq 0,$$\varepsilon>0$, $\phi \geq 0$, $\psi \geq 0$, $\tau > 0$, $0 \leq \chi_{k} \leq \bar{\chi} < 1$, $\varsigma > 0$, $\sigma > 0$, $\rho \in (0,1),$$0<\mu_1 \leq \mu_2,$$0 < \underline{\gamma} \leq \gamma_k \leq \bar{\gamma} < 2$. 选取任意非零向量 $p_{k}$. 惯性步控制序列 $\{\epsilon_k\}$ ($\epsilon_k \in [0, \epsilon)$)$\{\hat{\epsilon}_k\}$ ($\hat{\epsilon}_k \in [0, \epsilon)$) 分别满足 $\sum\limits_{k=0}^{+\infty} \epsilon_k < +\infty$$\sum\limits_{k=0}^{+\infty} \hat{\epsilon}_k < +\infty.$$k:=0$.

${\bf 步骤 1}$ 计算 $F(x_k)$.$\|F(x_k)\|\leq \varepsilon$, 则停止.

${\bf 步骤 2}$ 计算惯性步长

$\begin{equation}\label{C1} \hspace{-.7cm}\phi_{k}=\begin{cases} \min\left\{\phi, \ \frac{\epsilon_k}{\|x_k-x_{k-1}\|}\right\},\:& \text{若} \ x_k\neq x_{k-1}, \\ \phi,\: & \text{否则}, \end{cases} \end{equation}$
$\begin{equation}\label{C1a} \psi_{k}=\begin{cases} \min\left\{\psi, \ \frac{\hat{\epsilon}_k}{\|x_{k-1} -x_{k-2}\|}\right\},\:& \text{若} \ x_{k-1} \neq x_{k-2}, \\ \psi,\: & \text{否则}, \end{cases} \end{equation}$

产生惯性迭代点为 $v_k=x_k+\phi_k(x_k-x_{k-1}) + \psi_k(x_{k-1} -x_{k-2})$, 再计算 $F(v_k)$.$\|F(v_k)\|\leq\varepsilon$, 则停止.

${\bf 步骤 3}$ 计算搜索方向 $d_{k}$,

$d_{k}=\left\{\begin{array}{ll} -F\left(v_{k}\right), & \text { 若 } k=0, \\ -F\left(v_{k}\right)+\beta_{k}^{\mathrm{ITCG}} d_{k-1}+\theta_{k}^{\mathrm{ITCG}} p_{k}, & \text { 若 } k \geq 1, \end{array}\right.$

其中

$\begin{matrix}\label{d2} \beta^{\rm ITCG}_k = \frac{F(v_k)^{\top} p_{k}}{\varpi_{k}}-\frac{\|p_{k}\|^{2}F(v_k)^{\top}d_{k-1}}{{\varpi_{k}}^{2}}, \ \ \theta_{k}^{\rm ITCG} = {\chi_{k}}\frac{F(v_k)^{\top} d_{k-1}}{{\varpi_k}}, \end{matrix}$
$\begin{matrix}\label{d2b} \ {\varpi_{k}} = \max\{\tau({\|d_{k-1}\|}^{2}+\|p_{k}\|^{2}), \ {d^{\top}_{k-1}}{\bar{y}_{k-1}}\}, \ \bar{y}_{k-1} = F(v_{k})-F(v_{k-1}). \end{matrix}$

$d_{k} = 0$, 则停止.

${\bf 步骤 4}$$z_k=v_k+t_k d_k$, 步长 $t_{k}= \varsigma {\rho}^{i_{k}}$$i_{k}$ 是满足下式的最小非负整数 $i$:

$\begin{matrix}\label{eq10} -F(v_k+t_k d_k)^{\top}d_{k} \geq \sigma t_{k} \mathcal{P}_{[\mu_1,\mu_2]}[\|F(v_k+t_k d_k)\|] \|d_{k}\|^{2}. \end{matrix}$

$\|F(z_{k})\|\leq \varepsilon$, 则停止.

${\bf 步骤 5}$ 计算 $x_{k+1} = v_{k}- \gamma_k \xi_k F(z_k),$ 其中

$\begin{eqnarray*}\label{eq12} \xi_k=\frac{F(z_{k})^{\top}(v_{k}-z_{k})}{\|F(z_{k})\|^{2}}. \end{eqnarray*}$

$k:=k+1$, 转步骤 1.

${\bf注 2.1}$ 记投影算子 $\mathcal{P}_{\mathcal{D}}[x]=\arg\min\{\|y-x\|~|~y\|F(z_{k})\|\in \mathcal{D}\}$. 对于自适应线搜索 (2.10) 式 [30,31], 使用投影技术 $\mu_1 \leq \mathcal{P}_{[\mu_1,\mu_2]}[\cdot] \leq \mu_2$ 可避免 (2.10) 式右侧值过小或过大, 以此有效降低线搜索计算成本. 文献 [引理 3] 给出线搜索 (2.10) 式的适定性.

${\bf注 2.2}$ 不同于现有大部分求解 (1.1) 的无导数投影算法研究, 该文使用三步迭代信息产生惯性点. 由算法 iITCGP 步骤 2 知, 对任意 $k$, 有 $\phi_k \|x_{k}-x_{k-1}\| \leq \epsilon_k$, $\psi_k \|x_{k-1}-x_{k-2}\| \leq \hat{\epsilon}_k$, 且

$\begin{eqnarray*}\label{2.11} \sum\limits_{k=0}^{+\infty} \phi_k \|x_{k}-x_{k-1}\| \leq \sum\limits_{k=0}^{+\infty} \epsilon_k < +\infty, \ \ \sum\limits_{k=0}^{+\infty} \psi_k \|x_{k-1}-x_{k-2}\| \leq \sum\limits_{k=0}^{+\infty} \hat{\epsilon}_k < +\infty, \end{eqnarray*}$

进而

$\begin{eqnarray*}\label{2.12} \lim\limits_{k\rightarrow +\infty} \phi_k \|x_{k}-x_{k-1}\| =0, \ \ \lim\limits_{k\rightarrow +\infty} \psi_k \|x_{k-1}-x_{k-2}\| =0. \end{eqnarray*}$

下面引理 2.1 表明独立于任意线搜索, 搜索方向 (2.7)-(2.9) 式具有充分下降性和信赖域性质.

${\bf引理2.1}$ 对任意 $k \geq 0$, 搜索方向 (2.7)-(2.9) 式满足充分下降性

$\begin{matrix}\label{3.2} \ F (v_{k})^{\top}d_{k} \leq -\left(1-\frac{(1+\bar{\chi})^{2}}{4}\right){\|F(v_{k})\|^{2}} \end{matrix}$

和信赖域性质

$\left(1-\frac{(1+\bar{\chi})^{2}}{4}\right)\left\|F\left(v_{k}\right)\right\| \leq\left\|d_{k}\right\| \leq\left(1+\frac{1+\bar{\chi}}{2 \tau}+\frac{1}{4 \tau^{2}}\right)\left\|F\left(v_{k}\right)\right\|.$

${\bf证}$$k=0$ 时, $d_{0} = -F(v_{0})$, 有 $F(v_{0})^{\top}d_{0} = -{\|F(v_{0})\|^{2}}$ 满足 (2.11) 式. 根据 $d_k$ 的定义, 当 $k \geq 1$ 时, 得

$\begin{eqnarray*} F(v_{k})^{\top}d_{k}&=& -\|F(v_{k})\|^{2}+\beta_{k}^{\rm ITCG}F(v_{k})^{\top}d_{k-1}+ \theta_{k}^{\rm ITCG} F(v_{k})^{\top}p_{k} \\ &=& -\|F(v_{k})\|^{2}+\left(\frac{F(v_{k})^{\top}p_{k}}{\varpi_{k}}-\frac{\|p_{k}\|^{2}F(v_{k})^{\top}d_{k-1}}{{\varpi_{k}}^{2}}\right)F(v_{k})^{\top}d_{k-1}+{\chi_{k}}\frac{F(v_{k})^{\top}d_{k-1}}{{\varpi_k}}F(v_{k})^{\top}p_{k} \\ &=& -\|F(v_{k})\|^{2}+(1+\chi_{k})\frac{F(v_{k})^{\top}p_{k} \cdot F(v_{k})^{\top}d_{k-1}}{\varpi_{k}}-\frac{\|p_{k}\|^{2}(F(v_{k})^{\top}d_{k-1})^{2}}{{\varpi_{k}}^{2}} \\ &=& -\|F(v_{k})\|^{2}+2\left(\frac{1+\chi_{k}}{2}F(v_{k})^{\top}\right)\frac{p_{k} \cdot F(v_{k})^{\top}d_{k-1}}{\varpi_{k}}-\frac{\|p_{k}\|^{2}(F(v_{k})^{\top}d_{k-1})^{2}}{{\varpi_{k}}^{2}} \\ &\leq& -\|F(v_{k})\|^{2}+\frac{(1+\chi_{k})^{2}}{4}\|F(v_{k})\|^{2}+\frac{\|p_{k}\|^{2}(F(v_{k})^{\top}d_{k-1})^{2}}{{\varpi_{k}}^{2}}-\frac{\|p_{k}\|^{2}(F(v_{k})^{\top}d_{k-1})^{2}}{{\varpi_{k}}^{2}} \\ &=& -\|F(v_{k})\|^{2}+\frac{(1+\chi_{k})^{2}}{4}\|F(v_{k})\|^{2} \\ &\leq& -\left(1-\frac{(1+\bar{\chi})^{2}}{4}\right){\|F(v_{k})\|^{2}}, \end{eqnarray*}$

其中第一个不等式利用 $a^2 + b^2 \geq 2ab$, 从而 (2.11) 式成立.

$d_{0} = -F(v_{0})$$\|d_0\| = \|F(v_{0})\|$, 进而 $k = 0$ 时 (2.12) 式成立. 下证当 $k \geq 1$ 时 (2.12) 式成立. 由算法 iITCGP 的步骤 3 知对任意 $k$, $d_{k} \neq 0$. 利用 (2.11) 式和柯西-施瓦茨不等式, 知 (2.12) 式的第一个不等式成立. 接下来, 根据 $\varpi_{k}$ 的定义并使用不等式 $a^2 + b^2 \geq 2ab$, 知

$\begin{matrix}\label{0611A1} \ \varpi_{k} = \max\{\tau({\|d_{k-1}\|}^{2}+\|p_{k}\|^{2}), \ {d^{\top}_{k-1}}{y_{k-1}}\} \geq \tau({\|d_{k-1}\|}^{2}+\|p_{k}\|^{2}) \geq 2 \tau \|d_{k-1}\| \|p_{k}\|. \ \ \ \ \end{matrix}$

根据 $\beta^{\rm ITCG}_k$ 的定义, 得

$\begin{eqnarray*} \ |\beta_{k}^{\rm ITCG}| &=& \left|\frac{F(v_{k})^{\top} p_{k}}{\varpi_{k}}-\frac{\|p_{k}\|^{2}F(v_{k})^{\top}d_{k-1}}{{\varpi_{k}}^{2}}\right| \\ &\leq& \frac{\|F(v_{k})\|\|p_{k}\|}{\varpi_{k}}+\frac{\|p_{k}\|^{2}\|F(v_{k})\|\|d_{k-1}\|}{{\varpi_{k}}^{2}} \\ &\leq& \frac{\|F(v_{k})\|\|p_{k}\|}{2\tau\|d_{k-1}\|\|p_{k}\|}+\frac{\|p_{k}\|^{2}\|F(v_{k})\|\|d_{k-1}\|}{(2\tau \|d_{k-1}\|\|p_{k}\|)^{2}} \\ &=& \left(\frac{1}{2\tau}+\frac{1}{4{\tau}^{2}}\right)\frac{\|F(v_{k})\|}{\|d_{k-1}\|}, \end{eqnarray*}$

其中第一个不等式使用柯西-施瓦茨不等式, 第二个不等式使用 (2.13) 式. 同理, 由 $\theta_{k}^{\rm ITCG}$ 的定义知

$\begin{eqnarray*} \ |\theta_{k}^{\rm ITCG}| = {\chi_{k}} \left|\frac{{F^{\top}_{k}}d_{k-1}}{{\varpi_k}}\right| \leq \bar{\chi}\frac{\|F(v_{k})\|\|d_{k-1}\|}{{\varpi_k}} \leq \frac{\bar{\chi}\|F(v_{k})\|\|d_{k-1}\|}{2\tau\|d_{k-1}\|\|p_{k}\|} = \frac{\bar{\chi}\|F(v_{k})\|}{2\tau\|p_{k}\|}. \end{eqnarray*}$

上述两关系式结合 $d_{k}$ 的定义, 有

$\begin{eqnarray*} \ \|d_{k}\| = \|-F(v_{k})+{\beta^{\rm ITCG}_k}{d_{k - 1}}+\theta_{k}^{\rm ITCG} p_{k}\| &\leq& \|F(v_{k})\|+|\beta^{\rm ITCG}_k|\|d_{k - 1}\|+|\theta_{k}^{\rm ITCG}|\|p_{k}\| \\ &\leq& \left(1+\frac{1+\bar{\chi}}{2\tau}+\frac{1}{4{\tau}^{2}}\right) \|F(v_{k})\|. \end{eqnarray*}$

3 全局收敛性

该节分析算法 iITCGP 的全局收敛性. 首先给出如下假设.

${\bf 假设 1}$ 问题 (1.1) 的解集 $Sol_{F}$ 非空, 映射 $F$$\mathbb{R}^{n}$ 上连续且单调 ($F$ 满足 (1.2) 式).

${\bf引理3.1}$[23] 对于非负实序列 $\{p_{k}\}$$\{q_{k}\}$, 满足 $p_{k+1}\leq p_k+ q_k$.$\sum\limits_{k=0}^{+\infty} q_k < +\infty$, 则序列 $\{p_{k}\}$ 收敛.

${\bf引理3.2}$$\{x_{k}\},\ \{v_{k}\}$$\{z_{k}\}$ 由算法 iITCGP 产生, $x^{\ast}\in Sol_{F}$. 若假设 1 成立, 则序列 $\{\|x_k-x^{\ast}\|\}$ 收敛.

${\bf证}$ 根据 $z_k=v_k+t_k d_k$ 及线搜索 (2.10) 式知

$\begin{eqnarray*} F(z_{k})^{\top}(v_{k}-z_{k}) = - t_{k} F(z_k)^{\top} d_{k} \geq \sigma \mathcal{P}_{[\mu_1,\mu_2]}[\|F(z_k)\|] \cdot t_{k}^2 \|d_{k}\|^{2} \geq \sigma \mu_1 \cdot \|v_{k}-z_{k}\|^{2} > 0. \end{eqnarray*}$

此结合假设 1 及 $x^{\ast}\in Sol_{F}$, 有

$\begin{matrix}\label{3.12} F(z_{k})^{\top}(v_{k}-x^{\ast}) &=& F(z_{k})^{\top}(v_{k}-z_{k})+F(z_{k})^{\top}(z_{k}-x^{\ast}) \nonumber \\ &=& F(z_{k})^{\top}(v_{k}-z_{k})+ [F(z_{k})- F(x^{\ast})]^{\top}(z_{k}-x^{\ast}) \nonumber \\ &\geq& F(z_{k})^{\top}(v_{k}-z_{k}) \nonumber \\ &\geq& \sigma \mu_1 \|v_{k}-z_{k}\|^{2}>0, \end{matrix}$

其中第一不等式由 $F$ 的单调性所得. 进一步利用算法步骤 5, 得

$\begin{matrix}\label{3.13} \nonumber \|x_{k+1}-x^{\ast}\|^{2}&=& \|v_{k}-\gamma_{k} \xi_{k}F(z_{k})- x^{\ast}\|^{2} \nonumber \\ &=& \|v_{k}-x^{\ast}\|^{2}-2\gamma_{k} \xi_{k}F(z_{k})^{\top}(v_{k}-x^{\ast})+\gamma_{k}^{2}\xi_{k}^{2}\|F(z_{k})\|^{2} \nonumber \\ &\leq& \|v_{k}-x^{\ast}\|^{2}-2\gamma_{k} \xi_{k}F(z_{k})^{\top }(v_{k}-z_{k})+\gamma_{k}^{2}\xi_{k}^{2}\|F(z_{k})\|^{2} \nonumber \\ &=& \|v_{k}-x^{\ast}\|^{2}-\gamma_{k}(2-\gamma_{k})\frac{[F(z_{k})^{\top}(v_{k}-z_{k})]^{2}}{\|F(z_{k})\|^{2}} \nonumber \\ &\leq& \|v_{k}-x^{\ast}\|^{2}-\gamma_{k}(2-\gamma_{k})\frac{{\sigma}^2 \mu^{2}_1\|v_{k}-z_{k}\|^{4}}{\|F(z_{k})\|^{2}} \nonumber \\ &\leq& \|v_{k}-x^{\ast}\|^{2}-\underline{\gamma} (2-\overline{\gamma} )\frac{{\sigma}^2 \mu^{2}_1\|v_{k}-z_{k}\|^{4}}{\|F(z_{k})\|^{2}}, \end{matrix}$

其中第一个不等式由 (3.1) 式放缩得到. 再根据 $0 < \underline{\gamma} \leq \overline{\gamma} <2$ 和注 2.2, 有

$\begin{eqnarray*} \|x_{k+1}-x^{\ast}\| &\leq& \|v_{k}-x^{\ast}\| = \| x_k+\phi_k(x_k-x_{k-1}) + \psi_k(x_{k-1} -x_{k-2}) -x^{\ast}\| \\ &\leq & \|x_{k}-x^{\ast}\| + \phi_{k}\|x_k-x_{k-1}\| + \psi_{k}\|x_{k-1} -x_{k-2}\| \leq \|x_{k}-x^{\ast}\|+ \epsilon_k+ \hat{\epsilon}_k. \end{eqnarray*}$

结合 $\sum\limits_{k=0}^{+\infty} \epsilon_k < +\infty$, $\sum\limits_{k=0}^{+\infty} \hat{\epsilon}_k < +\infty$, 以及引理 3.1 知 $\{\|x_{k}-x^{\ast}\|\}$ 收敛.

${\bf引理3.3}$$\{x_{k}\},\ \{v_{k}\}$$\{z_{k}\}$ 由算法 iITCGP 产生. 若假设 1 成立, 则

(i) 序列 $\{x_{k}\}$, $\{v_{k}\}$$\{z_{k}\}$ 有界;

(ii) $\lim\limits_{k\rightarrow +\infty}\|v_{k}-z_{k}\|= \lim\limits_{k\rightarrow +\infty}t_{k}\|d_{k}\|=0$ 成立.

${\bf证}$ (i) 由引理 3.2 知, $\{\|x_{k}-x^{\ast}\|\}$ 收敛, 从而 $\{\|x_{k}-x^{\ast}\|\}$ 有界. 又因 $x^{\ast}\in Sol_{F}$, 则 $\{x_k\}$ 有界. 因此, 存在正数 $N_1$$N_2$ 使得

$\begin{eqnarray*} \|x_k -x^{\ast}\| \leq N_1, \| x_k \| \leq N_2,~\forall \ k\geq0. \end{eqnarray*}$

此结合 $\phi_{k}$$\psi_{k}$ 的定义, 有

$\begin{eqnarray*} \|v_k\| &=& \| x_k+\phi_{k}(x_k-x_{k-1}) +\psi_{k}(x_{k-1}-x_{k-2}) \| \nonumber \\ &\leq& \| x_k\|+ \phi_{k}\|x_k-x_{k-1}\|+ \psi_{k}\|x_{k-1}-x_{k-2}\| \nonumber \\ &\leq& N_2+ \epsilon_k+ \hat{\epsilon}_k < N_2+2 \epsilon, \end{eqnarray*}$

$\{v_k\}$ 有界. 根据 $F$ 的连续性, $\{F(v_k)\}$ 有界. 此结合信赖域性质 (2.12), 得 $\{d_k\}$ 有界. 再利用 $z_k= v_k+t_k d_k$$t_k\in (0, \varsigma]$ 知, $\{z_k\}$ 亦有界;

(ii) 考虑到 $x^{\ast}\in Sol_{F}$, $v_k$ 的定义, 柯西-施瓦茨不等式和注 2.2, 得

$\begin{matrix}\label{3.14} \nonumber \|v_k-x^*\|^2 &=& \|x_k+\phi_{k}(x_k-x_{k-1}) +\psi_{k}(x_{k-1}-x_{k-2}) -x^* \|^2 \nonumber \\ &=& \|x_k-x^*\|^2+2 (x_k-x^*)^{\top} [\phi_k (x_{k}-x_{k-1} )+\psi_k (x_{k-1}-x_{k-2} ) ] \nonumber \\ && + \|\phi_k (x_k-x_{k-1} )+\psi_k (x_{k}-x_{k-2} )\|^2 \nonumber \\ & \leq& \|x_k-x^*\|^2+2 \|x_k-x^*\| [ \phi_k \|x_{k-1}-x_{k-1}\| +\psi_k \|x_{k-1}-x_{k-2}\| ] \nonumber \\ && + 2(\phi_k \|x_k-x_{k-1}\|)^2 + 2 (\psi_k \|x_{k-1}-x_{k-2}\|)^2 \nonumber \\ & \leq& \|x_k-x^*\|^2+ 2 N_1 (\epsilon_k + \hat{\epsilon}_k) + 2 \epsilon_k^2 + 2 \hat{\epsilon}_k^2. \end{matrix}$

将上述 (3.3) 式代入 (3.2) 式, 有

$\begin{matrix}\label{3.15} \|x_{k+1}-x^{\ast}\|^{2} \leq \|x_{k}-x^{\ast}\|^{2}+ \left[2 N_1 (\epsilon_k + \hat{\epsilon}_k) + 2 \epsilon_k^2 + 2 \hat{\epsilon}_k^2\right] -\underline{\gamma} (2-\overline{\gamma} ) \frac{{\sigma}^2{\mu^{2}_1}\|v_{k}-z_{k}\|^{4}}{\|F(z_{k})\|^{2}}. \end{matrix}$

另一方面, 根据 $F$ 的连续性和 $\{z_k\}$ 的有界性, 知 $\{F(z_k)\}$ 有界, 即对任意 $k\geq 0$, 存在正数 $N_3$ 使得 $\|F(z_{k})\|\leq N_3$.

再利用 $\sum\limits_{k=0}^{+\infty} \epsilon_k < +\infty$$\sum\limits_{k=0}^{+\infty} \hat{\epsilon}_k < +\infty$, 则由 (3.4) 式得

$\begin{eqnarray*} && \underline{\gamma} (2-\overline{\gamma} ) \frac{\sigma^{2}{\mu^{2}_1}}{N_3^{2}} \sum\limits_{k=0}^{+\infty}\|v_{k}-z_{k}\|^{4} \\ & \leq& \sum\limits_{k=0}^{+\infty} \left[\|x_{k}-x^{\ast}\|^{2}+ (2 N_1 (\epsilon_k + \hat{\epsilon}_k) + 2 \epsilon_k^2 + 2 \hat{\epsilon}_k^2) -\|x_{k+1}-x^{\ast}\|^{2} \right]\\ & =& \|x_0-x^\ast\|^2-\lim\limits_{k\rightarrow+\infty}\|x_{k+1}-x^{\ast}\|^{2} + \sum\limits_{k=0}^{+\infty} (2 N_1 (\epsilon_k + \hat{\epsilon}_k) + 2 \epsilon_k^2 + 2 \hat{\epsilon}_k^2) < + \infty, \end{eqnarray*}$

于是有 $\lim\limits_{k\rightarrow +\infty}\|v_{k}-z_{k}\| = \lim\limits_{k\rightarrow +\infty}t_k\|d_{k}\|=0$.

${\bf定理3.1}$$\{x_{k}\},\ \{v_{k}\}$$\{z_{k}\}$ 由算法 iITCGP 产生. 若假设 1 成立, 则

(i) $\liminf\limits_{k\rightarrow +\infty}\|F(v_{k})\|=0;$

(ii) $\{x_{k}\},\ \{v_{k}\}$$\{z_{k}\}$ 均收敛到问题 (1.1) 的解.

${\bf证}$ (i) 反证法. 若结论 (i) 不成立, 则存在 $\Lambda >0$ 使得

$\begin{eqnarray*} \|F(v_{k})\| \geq \Lambda,\ \forall \ k\geq 0. \end{eqnarray*}$

再利用 (2.12) 式知, 对任意 $k\geq 0$, 有 $\|d_{k}\| \geq \left(1-\frac{(1+\bar{\chi})^{2}}{4}\right) \|F(v_{k})\| \geq \left(1-\frac{(1+\bar{\chi})^{2}}{4}\right) \Lambda > 0$. 结合引理 3.3 (ii), 有 $\lim\limits_{k\rightarrow +\infty}t_k=0.$ 根据序列 $\{d_{k}\}$$\{v_{k}\}$ 的有界性 (见引理 3.3 (i) 及其证明), 存在两个收敛的子序列 $\{d_{k_{j}}\}$$\{v_{k_{j}}\}$ 使得 $\lim\limits_{j\rightarrow +\infty,j\in \mathcal{K}}d_{k_{j}} = \tilde{d}, \ \lim\limits_{j\rightarrow +\infty,j\in \mathcal{K}}v_{k_{j}} = \tilde{v},$ 其中 $\mathcal{K}$ 为无限指标集. 由 (2.11) 式得

$-{F(v_{k_{j}})}^{\top}d_{k_{j}} \geq \left(1-\frac{(1+\bar{\chi})^{2}}{4}\right) {\|F(v_{k_{j}})\|^{2}},~\forall~j \in \mathcal{K}.$

上式中令 $j \rightarrow + \infty$, 结合 $F$ 的连续性, 有

$\begin{matrix}\label{3.18} -{F(\tilde{v})}^{\top}\tilde{d} \geq \left(1-\frac{(1+\bar{\chi})^{2}}{4}\right) {\|F(\tilde{v})\|^{2}} \geq \left(1-\frac{(1+\bar{\chi})^{2}}{4}\right) \Lambda^{2} > 0. \end{matrix}$

另一方面, 根据线搜索 (2.10) 式知

$\begin{eqnarray*} -F(v_{k_{j}}+{\rho}^{-1}{t}_{k_j}d_{k_{j}})^{\top}d_{k_{j}} < \sigma {\rho}^{-1} {t}_{k_j} \mathcal{P}_{[\mu_1,\mu_2]}[\|F(v_{k_{j}}+{\rho}^{-1}{t}_{k_j}d_{k_{j}})\|] \|d_{k_{j}}\|^{2}. \end{eqnarray*}$

类似的, 上式中令 $j \rightarrow + \infty$, 根据 $F$ 的连续性, 得 $-{F(\tilde{v})}^{\top}\tilde{d} \leq 0$. 此与不等式(3.5) 产生矛盾. 因此, $\liminf\limits_{k\rightarrow +\infty}\|F(v_{k})\|=0$;

(ii) 考虑到 $\{v_k = x_k+\phi_k(x_k-x_{k-1}) + \psi_k(x_{k-1} -x_{k-2})\}$, $\phi_{k}$$\psi_{k}$ 的定义, 以及注 2.2, 知

$\begin{eqnarray*} \|x_{k}- v_{k}\| &=& \|x_{k}-(x_k+\phi_k(x_k-x_{k-1}) + \psi_k(x_{k-1} -x_{k-2})) \| \nonumber \\ &=& \phi_{k}\|x_k-x_{k-1}\| + \psi_{k}\|x_{k-1} -x_{k-2}\| \leq \epsilon_k + \hat{\epsilon}_k. \end{eqnarray*}$

上式两边同时取极限, 结合 $\lim\limits_{k\rightarrow +\infty} \epsilon_k =0$$\lim\limits_{k\rightarrow +\infty} \hat{\epsilon}_k =0$$\lim\limits_{k\rightarrow +\infty} \|x_{k}-v_{k}\|=0$.

$ \lim\limits_{i\rightarrow +\infty} v_{k_i}= v^{*}, \ \lim\limits_{i\rightarrow +\infty} x_{k_i}= v^{*},\ \lim\limits_{i\rightarrow +\infty}\|F(v_{k_i})\|=\|F(v^{*})\|=0,$

从而 $\lim\limits_{i\rightarrow +\infty}\|F(x_{k_i})\|=\|F(v^{*})\|=0$.$x^{*} := v^{*} \in Sol_{F}$, 利用序列 $\{\|x_k-x^{\ast}\|\}$ 收敛, 得

$ \lim\limits_{k\rightarrow +\infty} \|x_{k}-x^{*}\| =\lim\limits_{i\rightarrow +\infty} \|x_{k_i}-x^{*}\| =\lim\limits_{i\rightarrow +\infty} \|x_{k_i}- v^{*}\| =0. $

因此, $\{x_{k}\}$ 收敛至 $x^{*} \in Sol_{F}$, 且 $\{v_{k}\}$ 亦收敛至 $x^{*}$. 此结合引理 3.3 (ii) 知 $\lim\limits_{k\rightarrow + \infty} \|z_{k}- v_{k}\| = 0$, 进而 $\{z_{k}\}$ 同样收敛至 $x^{*}$.

4 数值试验

为验证算法 iITCGP 的数值有效性, 该节将其应用于求解 10 个无约束非线性方程组问题. 数值试验分为两部分, 即试验 I 和试验 II, 所有程序均使用 Matlab R2022a 实现, 在 Lenovo 笔记本电脑上执行, 其配置为 Intel(R) Core(TM) i5-1135G7 CPU (2.40 GHz), 16 GB 内存.

$\textbf{试验 I}$ 对于算法 iITCGP 中的 $p_k$, 赋予其四种选取方式, 即 $\bar{y}_{k-1} = F(v_{k})-F(v_{k-1})$, $F(v_{k})$, $F(v_{k-1})$ 以及 $d_{k-1}$, 并将所对应算法分别称为 iITCGP1, iITCGP2, iITCGP3 和 iITCGP4. 四种算法均使用相同参数取值, 具体为: $\sigma=0.001$, $\varsigma=0.45$, $\rho=0.43$, $\gamma_k \equiv 1.99$, $\phi=0.01$, $\psi=0.01$, $\mu_1=0.001$, $\mu_2=0.8$, $\tau=0.99$, $\bar{\chi}=0.5$, $\epsilon_k=\hat{\epsilon}_k=\frac{1}{k^2}$ (若 $k=0$, $\epsilon_k=\hat{\epsilon}_k= 1$).

$\textbf{试验 II}$ 将试验 I 中表现较好的算法 (后续结果显示为算法 iITCGP2) 与现有其它三种惯性算法做比较, 所比较方法分别为 IM3TFR2[9], MITTCGP [18]和 ISTCP [10], 其搜索方向和线搜索参数的取值均与原文献保持一致.

$F(x)=\left(f_{1}(x), f_{2}(x), \dots, f_{n}(x)\right)^{\top}$, 所测试的 10 个问题具体如下

$\textbf{问题 1}$[问题 5]

$\begin{eqnarray*} &&f_{1}(x)= x_1-{\rm e}^{\cos(\frac{x_1+x_2}{n+1})}, \\ &&f_{i}(x)= x_i-{\rm e}^{\cos(\frac{x_{i-1}+x_i+x_{i+1}}{n+1})},\ i=2,3,\dots,n-1,\\ &&f_{n}(x)= x_n-{\rm e}^{\cos(\frac{x_{n-1}+x_n}{n+1})}. \end{eqnarray*}$

$\textbf{问题 2}$[问题 10]

$\begin{eqnarray*} &&f_{1}(x)= x_1-{\rm e}^{\cos(\frac{x_1+x_2}{2})}, \\ &&f_{i}(x)= x_i-{\rm e}^{\cos(\frac{x_{i-1}+x_i+x_{i+1}}{i})},\ i=2,3,\dots,n-1,\\ &&f_{n}(x)= x_n-{\rm e}^{\cos(\frac{x_{n-1}+x_n}{n})}. \end{eqnarray*}$

$\textbf{问题 3}$[问题 7]

$\begin{eqnarray*} F(x)= \left(\begin{array}{ccccc} 5/2 & 1 & 0 & \ldots & 0 \\ 1 & 5/2 & 1 & \ldots & 0 \\ 0 & 1 & 5/2 & \ldots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 1 & 5/2 \\ \end{array}\right)x +(-1,\ldots,-1)^{\top}. \end{eqnarray*}$

$\textbf{问题 4}$ 改自文献 [问题 7]. 令

$\begin{eqnarray*} F(x)= \left(\begin{array}{ccccc} 2 & -1 & 0 & \ldots & 0 \\ 0 & 2 & -1 & \ldots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & \ldots & 0 & 2 & -1 \\ 0 & 0 & 0 & 0 & 2 \\ \end{array}\right)x +(\sin x_1-1,\ldots,\sin x_n-1)^{\top}. \end{eqnarray*}$

$\textbf{问题 5}$[问题 8]

$\begin{align*} f_{1}(x)&=x_1(x_1^2+x_2^2)-1, \\ f_{i}(x)&=x_i(x_{i-1}^2+2x_i^2+x_{i+1}^2)-1,\ i=2,3,\dots,n-1,\\ f_{n}(x)&=x_n(x_{n-1}^2+x_n^2). \end{align*}$

$\textbf{问题 6}$[问题 28]

$\begin{align*} f_{1}(x)&= 2x_1+0.5h^2(x_1+h)^3-x_2, \\ f_{i}(x)&= 2x_i+0.5h^2(x_i+ih)^3-x_{i-1}+x_{i+1}, \ i=2,\dots,n-1,\\ f_{n}(x)&= 2x_n+0.5h^2(x_n+nh)^3-x_{n-1}, \end{align*}$

其中 $h=\frac{1}{n+1}$.

$\textbf{问题 7}$[问题 3]

$\begin{align*} f_{1}(x)&=2x_{1}-x_{2}+{\rm e}^{x_1}-1, \\ f_{i}(x)&=-x_{i-1}+2x_i-x_{i+1}+{\rm e}^{x_i}-1,\ i=2,3,\dots,n-1,\\ f_{n}(x)&=-x_{n-1}+2x_n+{\rm e}^{x_n}-1. \end{align*}$

$\textbf{问题 8}$ 改自文献 [问题 5]. 令 $f_{i}(x)=({\rm e}^{x_i})^2+3\sin x_i\cos x_i-1,\ i=1,2,3,\dots,n.$

$\textbf{问题 9}$ 改自文献 [问题 8]. 令

$\begin{align*} f_{1}(x)&={\rm e}^{x_1}-1, \\ f_{i}(x)&={\rm e}^{x_i}+x_i-1,\ i=2,3,\dots,n. \end{align*}$

$\textbf{问题 10}$[问题 16]$f_{i}(x)= \frac{i}{n}{\rm e}^{x_i}-1,\ i=1,2,3,\dots,n.$

对上述 10 个测试问题分别取 5 个不同维度 ($n=1000, \ 5000, \ 10000, \ 50000$$100000$) 进行求解, 表 1 列出算法所需不同初始点的选取情况. 所有算法满足如下一种终止准则时, 迭代停止. (i) $\|d_k\| \leq 10^{-7}$; (ii) $\|F_k\| \leq 10^{-6}$, 此处 $F_k$$F(x_k)$, $F(v_k)$$F(z_k)$.

表 1   7 种所需不同初始点的选取情况

新窗口打开| 下载CSV


试验 I 和试验 II 的具体测试结果见 https://www.cnblogs.com/kekoukelele/p/18470451. 其中, "Case" 表示 7 种不同初始点选择情形, "n" 表示测试问题维度除以 $10^3$ 后的取值, "Itr" 表示迭代次数, "NF" 表示函数 $F$ 的计算次数, "Time" 表示计算时间 (单位为秒), "$\|F^{*}\|$" 表示程序终止时 $\|F_k\|$ 的最终值. 若该算法未能解决此问题, 则将 "Itr/NF/Time/$\|F^{*}\|$" 记为 "NaN/NaN/NaN/NaN". 为较直观比较各算法的数值性能, 采用 Dolan 和 Moré[5] 提出的性能曲线图以展示每种算法关于三个指标的计算结果 (即计算时间, 函数 $F$ 的计算次数, 迭代次数). 一般而言, 曲线越居于上方, 其对应算法的数值表现越好.

通过以上试验结果, 可得出如下结论.

(i) 对于试验 I, 从图 1-3中可观察到, $p_{k}$$F(v_{k})$ 时对应的算法 iITCGP2 在性能图展示的三个指标方面数值表现都最好, $p_{k}$$\bar{y}_{k-1}$ 时对应的算法 iITCGP1 数值表现优于剩余的另外两种算法;

图1

图1   试验 I 中计算时间性能比较


图2

图2   试验 I 中函数 $F$ 计算次数性能比较


图3

图3   试验 I 中迭代次数性能比较


(ii) 对于测试 II, 结果见图 4-6. 多数情况下, 算法 iITCGP2 在迭代次数, 函数 $F$ 的计算次数以及算法计算时间上总体优于其它三种算法.

图4

图4   试验 II 中计算时间性能比较


图5

图5   试验 II 中函数 $F$ 计算次数性能比较


图6

图6   试验 II 中迭代次数性能比较


参考文献

Awwal A M, Kumam P, Sitthithakerngkiet K, et al.

Derivative-free method based on DFP updating formula for solving convex constrained nonlinear monotone equations and application

AIMS Math, 2021, 6(8): 8792-8814

Bing Y, Lin G.

An efficient implementation of Merrill's method for sparse or partially separable systems of nonlinear equations

SIAM J Optimiz, 1991, 1(2): 206-221

Cheng W Y.

A PRP type method for systems of monotone equations

Math Comput Modell, 2009, 50: 15-20

[本文引用: 1]

Cruz W L.

A spectral algorithm for large-scale systems of nonlinear monotone equations

Numer Algorithms, 2017, 76: 1109-1130

Dolan E D, Moré J J.

Benchmarking optimization software with performance profiles

Math Program, 2002, 91(2): 201-213

[本文引用: 1]

Dai Y H, Yuan Y X.

A nonlinear conjugate gradient method with a strong global convergence property

SIAM J Optim, 1999, 10(1): 177-182

[本文引用: 1]

Gao P T, He C J.

An efficient three-term conjugate gradient method for nonlinear monotone equations with convex constraints

Calcolo, 2018, 55(4): Article 53

Hu Y P, Wei Z X.

A modified Liu-Storey conjugate gradient projection algorithm for nonlinear monotone equations

Int J Math, 2014, 9: 1767-1777

Ibrahim A H, Kumama P, Rapajić S, et al.

Approximation methods with inertial term for large-scale nonlinear monotone equations

Appl Numer Math, 2022, 181: 417-435

[本文引用: 2]

Ibrahim A H, Kumam P, Sun M, et al.

Projection method with inertial step for nonlinear equations: Application to signal recovery

J Ind Manag Optim, 2023, 19(1): 30-55

[本文引用: 2]

Jian J B, Yin J H, Tang C M, et al.

A family of inertial derivative-free projection methods for constrained nonlinear pseudo-monotone equations with applications

Comput Appl Math, 2022, 41(7): Article 309

[本文引用: 1]

Kou C X, Dai Y H.

A modified self-scaling memoryless Broyden-Fletcher-Goldfarb-Shanno method for unconstrained optimization

J Optim Theory Appl, 2015, 165: 209-224

[本文引用: 1]

La Cruz W, Martínez J M, Raydan M.

Spectral residual method without gradient information for solving large-scale nonlinear systems of equations

Math Comput, 2006, 75(255): 1429-1448

Li M.

A modified Hestense-Stiefel conjugate gradient method close to the memoryless BFGS quasi-Newton method

Optim Methods Softw, 2018, 33(2): 336-353

[本文引用: 4]

刘金魁, 孙悦, 赵永祥.

凸约束伪单调方程组的无导数投影算法

计算数学, 2021, 43(3): 388-400

DOI:10.12286/jssx.j2020-0659      [本文引用: 1]

Based on the structure of the HS conjugate gradient method, we propose an iterative projection algorithm for solving nonlinear pseudo-monotone equations with convex constraints under one weak assumption. Since the proposed method does not need any gradient or Jacobian matrix information of equations, it is suitable to solve large-scale problems. The proposed algorithm generates a sufficient descent direction in per-iteration, which is independent of any line search. Moreover, the global convergence and R-linear convergence rate of the proposed method are proved without the assumption that nonlinear equations satisfies Lipschitz condition.The numerical results show that the proposed method is stable and effective for the given large-scale nonlinear equations with convex constraints.

Liu J K, Sun Y, Zhao Y X.

A derivative-free projection algorithm for solving pseudo-monotone equations with convex constraints

Math Numer Sin, 2021, 43(3): 388-400

DOI:10.12286/jssx.j2020-0659      [本文引用: 1]

Based on the structure of the HS conjugate gradient method, we propose an iterative projection algorithm for solving nonlinear pseudo-monotone equations with convex constraints under one weak assumption. Since the proposed method does not need any gradient or Jacobian matrix information of equations, it is suitable to solve large-scale problems. The proposed algorithm generates a sufficient descent direction in per-iteration, which is independent of any line search. Moreover, the global convergence and R-linear convergence rate of the proposed method are proved without the assumption that nonlinear equations satisfies Lipschitz condition.The numerical results show that the proposed method is stable and effective for the given large-scale nonlinear equations with convex constraints.

Liu P J, Shao H, Yuan Z H, et al.

A family of three-term conjugate gradient projection methods with a restart procedure and their relaxed-inertial extensions for the constrained nonlinear pseudo-monotone equations with applications

Numer Algorithms, 2023, 94: 1055-1083

[本文引用: 1]

Liu P J, Wu X Y, Shao H, et al.

Three adaptive hybrid derivative-free projection methods for constrained monotone nonlinear equations and their applications

Numer Linear Algebra Appl, 2023, 30(2): e2471

[本文引用: 1]

Ma G D, Jin J C, Jian J B, et al.

A modified inertial three-term conjugate gradient projection method for constrained nonlinear equations with applications in compressed sensing

Numer Algorithms, 2023, 92(3): 1621-1653

[本文引用: 2]

Moré J J, Garbow B S, Hillstrom K E.

Testing unconstrained optimization software

ACM Trans Math Softw, 1981, 7(1): 17-41

Narushima Y, Yabe H, Ford J A.

A three-term conjugate gradient method with sufficient descent property for unconstrained optimization

SIAM J Optim, 2011, 21(1): 212-230

[本文引用: 2]

Nocedal J.

Updating quasi-Newton matrices with limited storage

Math Comput, 1980, 35(151): 773-782

[本文引用: 1]

Polyak B T.

Some methods of speeding up the convergence of iteration methods

USSR Comput Math Math Phys, 1964, 4(5): 1-17

[本文引用: 1]

Polyak B T. Introduction to Optimization. New York: Inc Publications Division, 1987

[本文引用: 1]

Rao J Y, Huang N.

A derivative-free scaling memoryless DFP method for solving large scale nonlinear monotone equations

J Glob Optim, 2023, 87(2): 641-677

Shanno D F. Conjugate gradient methods with inexact searches, Math Oper Res, 1978, 3(3): 244-256

[本文引用: 1]

Solodov M V, Svaiter B F.

A globally convergent inexact Newton method for systems of monotone equations//Fukushima M, Qi L. Reformulation: Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods

Dordrecht: Kluwer, 1998. 355-369

[本文引用: 1]

Song T Y, Liu Z X.

An efficient inertial subspace minimization CG algorithm with convergence rate analysis for constrained nonlinear monotone equations

J Comput Appl Math, 2024, 446: 115873

[本文引用: 1]

Stanimirović P S, Ivanov B, Djordjević S, et al.

New hybrid conjugate gradient and Broyden-Fletcher-Goldfarb-Shanno conjugate gradient methods

J Optim Theory Appl, 2018, 178: 860-884

[本文引用: 1]

Wu X Y, Shao H, Liu P J, et al.

An inertial spectral CG projection method based on the memoryless BFGS update

J Optim Theory Appl, 2023, 198: 1130-1155

[本文引用: 1]

尹江华, 简金宝, 江羡珍.

凸约束非光滑方程组基于自适应线搜索的谱梯度投影算法

计算数学, 2020, 42(4): 457-471

DOI:10.12286/jssx.2020.4.457      [本文引用: 1]

Based on three classic line search techniques for finding separating hyperplane, this paper proposes an adaptive line search method. Combining this with the spectral gradient projection method, a spectral gradient projection algorithm for nonsmooth monotone equations with convex constraints is proposed. The proposed method does not calculate and store any matrix, so it is suitable for solving large-scale nonsmooth monotone nonlinear equations. Under mild conditions, the global convergence of the proposed method is proved, and its rate of convergence is analyzed. Numerical experiments show that the proposed algorithm is efficient and robust.

Yin J H, Jian J B, Jiang X J.

A spectral gradient projection algorithm for convex constrained nonsmooth equations based on an adaptive line search

Math Numer Sin, 2020, 42(4): 457-471

DOI:10.12286/jssx.2020.4.457      [本文引用: 1]

Based on three classic line search techniques for finding separating hyperplane, this paper proposes an adaptive line search method. Combining this with the spectral gradient projection method, a spectral gradient projection algorithm for nonsmooth monotone equations with convex constraints is proposed. The proposed method does not calculate and store any matrix, so it is suitable for solving large-scale nonsmooth monotone nonlinear equations. Under mild conditions, the global convergence of the proposed method is proved, and its rate of convergence is analyzed. Numerical experiments show that the proposed algorithm is efficient and robust.

Yin J H, Jian J B, Jiang X Z, et al.

A hybrid three-term conjugate gradient projection method for constrained nonlinear monotone equations with applications

Numer Algorithms, 2021, 88: 389-418

[本文引用: 2]

Yin J H, Jian J B, Jiang X Z, et al.

A family of inertial-relaxed DFPM-based algorithms for solving large-scale monotone nonlinear equations with application to sparse signal restoration

J Comput Appl Math, 2023, 419: Art 114674

[本文引用: 1]

Yu G H, Niu S Z, Ma J H, et al.

An adaptive prediction-correction method for solving large-scale nonlinear systems of monotone equations with applications

Abstr Appl Anal, 2013, 2013: 1-13

Zhang L, Zhou W J.

Spectral gradient projection method for solving nonlinear monotone equations

J Comput Appl Math, 2006, 196: 478-484

[本文引用: 1]

张宁, 刘金魁.

非线性伪单调方程组的谱 LS 型投影算法

数学物理学报, 2022, 42A(6): 1886-1897

[本文引用: 1]

Zhang N, Liu J K.

Spectral LS-type projection algorithm for solving nonlinear pseudo-monotone equations

Acta Math Sci, 2022, 42A(6): 1886-1897

[本文引用: 1]

Zhang N, Liu J K, Zhang L Q, et al.

A fast inertial self-adaptive projection based algorithm for solving large-scale nonlinear monotone equations

J Comput Appl Math, 2023, 426: Art 115087

[本文引用: 1]

Zhou W J, Shen D M.

Convergence properties of an iterative method for solving symmetric non-linear equations

J Optim Theory Appl, 2015, 164: 277-289

/