
Bilingual recommendations: approximate dynamic programming

To address the "curse of dimensionality" in dynamic programming, a policy iteration-approximate dynamic programming (PI-ADP) method is proposed for the unit commitment (UC) problem of large-scale power systems. Policy iteration is used to approximate the value functions in the dynamic programming procedure, replacing the exact computation of value functions from feasible states and thereby avoiding the curse of dimensionality. While approximating the value functions, the operating constraints of practical systems are used to compress the state space effectively and to reduce the number of candidate on/off actions, which further lowers the computational burden. Results for systems of 10 to 1,000 units over 96 time periods show that the proposed method obtains high-quality solutions in relatively little time, providing a reference for solving large-scale power system UC problems.
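As a rough illustration of the policy iteration loop the abstract builds on, the sketch below runs exact (tabular) policy iteration on a hypothetical single-unit on/off model. It is not the PI-ADP method itself: the function names, the two-state model, and all cost figures (running cost 1, unmet-demand penalty 5, start-up cost 2) are assumptions chosen only to keep the example small.

```python
def policy_iteration(n_states, n_actions, step, gamma=0.9, tol=1e-10):
    """Tabular policy iteration for a deterministic model.

    step(s, a) must return (next_state, cost). Costs are minimized.
    Returns (policy, values).
    """
    policy = [0] * n_states
    values = [0.0] * n_states
    while True:
        # Policy evaluation: sweep the Bellman equation for the fixed policy.
        while True:
            delta = 0.0
            for s in range(n_states):
                nxt, cost = step(s, policy[s])
                new_v = cost + gamma * values[nxt]
                delta = max(delta, abs(new_v - values[s]))
                values[s] = new_v
            if delta < tol:
                break
        # Policy improvement: act greedily with respect to the evaluated values.
        stable = True
        for s in range(n_states):
            def q(a):
                nxt, cost = step(s, a)
                return cost + gamma * values[nxt]
            best = min(range(n_actions), key=q)
            if q(best) < q(policy[s]) - 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return policy, values


def unit_step(state, action):
    """Hypothetical single unit: state 0 = off, 1 = on; the action is the next state."""
    running_or_penalty = 1.0 if action == 1 else 5.0  # assumed costs
    startup = 2.0 if (state == 0 and action == 1) else 0.0  # assumed start-up cost
    return action, running_or_penalty + startup


policy, values = policy_iteration(2, 2, unit_step)
```

With these assumed costs the loop settles on keeping the unit on in both states. A real UC instance has far too many states for such a table, which is the gap that value function approximation is meant to close.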


The vehicle scheduling problem with backhauls in iron and steel logistics transportation is analyzed using approximate dynamic programming (ADP). State and action vector spaces are designed for the vehicles and loads; a state transition function and an objective function are defined with transportation cost and capacity constraints taken into account; and the ADP algorithm is improved. Building on the ADP algorithm based on the post-decision state, a Boltzmann exploration strategy is used to traverse the entire state space, avoiding local optima and inefficiency. Comparative experiments with the Q-learning algorithm, the standard post-decision-state ADP algorithm, and the ADP algorithm with Boltzmann exploration show that the last of these converges faster and runs more efficiently.
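Boltzmann exploration picks actions at random with probabilities given by a softmax over their estimated values, so weaker actions are still tried occasionally. The sketch below shows the general technique only; the function names and the temperature parameter are illustrative and are not taken from the paper. When the estimates are costs rather than rewards, negate them before calling.

```python
import math
import random


def boltzmann_probabilities(values, temperature):
    """Softmax over estimated action values; a higher value gets a higher probability."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    m = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp((v - m) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_choice(values, temperature, rng=random):
    """Sample an action index according to the Boltzmann probabilities."""
    probs = boltzmann_probabilities(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]
```

A high temperature gives nearly uniform exploration, while a temperature close to zero makes the choice almost greedy. Lowering the temperature over time is a common schedule.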


To solve the path planning problem for inland maritime unmanned surface vessels, a path traversal algorithm based on the inland electronic chart is proposed. Built on a layered inland electronic chart, the algorithm uses global and local path planning to find approximate navigable paths. The grid method is used to select navigable areas in the complex and changeable inland environment, and a Voronoi diagram is used to build a set of navigation paths around dynamic targets or obstacles that can be treated as point masses. The region shared by the navigable area (or slightly obstructed area) and the navigation path set is taken as the navigable path, which is then optimized using Bezier curves and quadratic programming. Matlab simulations show that approximate navigable paths can be generated for different obstacle positions or different destination positions, and that each such path can be optimized into an optimal safe navigable path, so the environment model and the path planning algorithm are effective and feasible.
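The Bezier smoothing step can be pictured with de Casteljau's algorithm, which evaluates a Bezier curve by repeated linear interpolation between control points. This is a generic sketch of curve evaluation only. The function names and waypoints are made up for illustration, and the quadratic programming step from the abstract is not shown.

```python
def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] with de Casteljau's algorithm."""
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        # Replace each adjacent pair of points by their interpolation at t.
        pts = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts, pts[1:])
        ]
    return pts[0]


def sample_curve(control_points, n_samples):
    """Return n_samples points spaced evenly in t along the curve (n_samples >= 2)."""
    return [bezier_point(control_points, i / (n_samples - 1)) for i in range(n_samples)]


# Hypothetical waypoints: start, one intermediate corner, goal.
waypoints = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
smooth_path = sample_curve(waypoints, 5)
```

The curve passes through the first and last control points and is only pulled toward the intermediate ones, which is why it rounds off sharp corners in a piecewise-linear route.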


An optimal energy control scheme based on adaptive (approximate) dynamic programming is proposed for a class of discrete chaotic systems. Tailored to the characteristics of chaotic systems, an adaptive iteration scheme for optimal energy control is given that overcomes the curse of dimensionality arising in traditional dynamic programming, and a complete theoretical proof is provided. The proof shows that when the number of iteration steps is sufficiently large, the iterative algorithm converges to the optimal performance index function and yields the optimal control policy. In the simulation, the optimal control problem for the Henon map is studied, and the results verify the effectiveness and feasibility of the proposed iterative control scheme.
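For readers unfamiliar with the test system, the sketch below iterates the uncontrolled Henon map with the classical parameters a = 1.4 and b = 0.3. The abstract does not state the parameter values or how the control input enters, so both the parameters and the function names here are assumptions, and no controller is included.

```python
import math


def henon_step(x, y, a=1.4, b=0.3):
    """One iteration of the Henon map: x' = 1 - a*x^2 + y, y' = b*x."""
    return 1.0 - a * x * x + y, b * x


def henon_trajectory(x0, y0, n_steps, a=1.4, b=0.3):
    """Return the list of states visited, starting from (x0, y0)."""
    states = [(x0, y0)]
    for _ in range(n_steps):
        states.append(henon_step(*states[-1], a=a, b=b))
    return states


def henon_fixed_point(a=1.4, b=0.3):
    """Positive-x fixed point, from solving a*x^2 + (1 - b)*x - 1 = 0."""
    x = (-(1.0 - b) + math.sqrt((1.0 - b) ** 2 + 4.0 * a)) / (2.0 * a)
    return x, b * x
```

The fixed point is included only as a sanity check on the map. The abstract does not say which target state the optimal controller regulates to.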


For the fixed-point tracking control problem common in industrial practice, a mathematical transformation converts the optimal tracking control problem of the original system into an optimal regulation problem of a new system whose state is the tracking error. The ε-adaptive dynamic programming (ε-ADP) algorithm is used to solve the HJB equation, with two BP neural networks approximating the performance index function and the optimal control, respectively, which yields ε-optimal tracking control. Simulation results show that the designed controller drives the state to the target value in finite time and makes the performance index function approximately optimal.


A near-optimal control scheme based on single-network approximate dynamic programming (ADP) is proposed for a class of unknown continuous-time nonlinear systems. By designing a novel recurrent neural network (RNN) identifier, the scheme relaxes the requirement that the system model be known or partly known, and by using one neural network (NN) to approximate the performance index function it eliminates the action network of conventional ADP methods. Lyapunov analysis rigorously proves that all signals in the closed-loop system are uniformly ultimately bounded, and that the obtained performance index function and control input converge to small neighborhoods of the optimal performance index function and the optimal control input, respectively. Simulation results demonstrate the effectiveness of the proposed scheme.

Proton exchange membrane fuel cell (PEMFC) systems are markedly nonlinear and time-varying, so the modeling and optimal control of PEMFCs is a key research topic. An approximate linear dynamic model of a single PEMFC is established, and on this basis a neural network optimal controller based on dual heuristic dynamic programming (DHP) is designed. Simulation results show that the approximate linear model effectively simplifies the nonlinear and time-varying characteristics, and that the neural network controller designed on it achieves better control performance and accuracy.

A greedy iterative DHP (dual heuristic programming) algorithm is proposed to solve the near-optimal stabilization problem for a class of nonlinear systems with control constraints. To handle the control constraints, a nonquadratic functional is first introduced to transform the constrained problem into an unconstrained one. Then, based on the costate function, the greedy iterative DHP algorithm is proposed to solve the Hamilton-Jacobi-Bellman (HJB) equation of the system. At each iteration step, a neural network approximates the costate function, from which the optimal control policy is computed directly, eliminating the action network of conventional approximate dynamic programming (ADP) methods. Finally, two simulation examples demonstrate the validity and feasibility of the proposed optimal control scheme.

In complex environments with large obstacles and relatively narrow free space, the artificial potential field (APF) method for mobile robots is prone to repeated oscillation, long path planning times, and difficulty avoiding collisions near large obstacles. To overcome these defects, an adaptive dynamic step-size adjustment algorithm is added to APF path planning combined with edge detection, achieving smooth path planning for mobile robots in complex environments. The method keeps the path approximately optimal while improving the convergence speed of the APF algorithm and the obstacle avoidance performance of path planning. Experimental results verify the effectiveness of the method.


This paper considers a scheduling model in which jobs are first processed on a single machine and the finished jobs are then delivered in batches to their respective customers by a single vehicle of limited capacity; the goal is to minimize the makespan. For the case of identical job sizes, a polynomial-time dynamic programming algorithm is given when the number of customers is a constant. For the case of non-identical job sizes, a special case with three customers is considered and a 2-approximation algorithm is developed.
