Self-learning algorithm

Following the practice used for neural networks, we define a suitable error function so that the control parameters can be adjusted on-line by the back-propagation (BP) algorithm. An AUV has its own motion will, which is very important for self-learning and is discussed in detail in the next section, so there is also an expected motion state. Consequently, there is an expected control output for the S surface controller, and the error function is given by

$$E_p = \frac{1}{2}\,(u_d - u)^2$$

where $u_d$ is the expected control output and $u$ is the most recent control output, obtained from equation (34).

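We use the gradient descent optimization method, i.e. the gradient of $E_p$ is used to adjust $k_1$ and $k_2$:

$$\Delta k_i = -\eta\,\frac{\partial E_p}{\partial k_i}, \qquad i = 1, 2$$

where $\eta$ is the learning rate ($0 < \eta < 1$).

Therefore, $k_1$ and $k_2$ can be optimized by the following equation.

$$k_i(t+1) = k_i(t) + \Delta k_i = k_i(t) + \eta\,(u_d - u)\,\frac{\partial u}{\partial k_i} \tag{38}$$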
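The update in equation (38) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: equation (34) is not reproduced in this section, so the sketch assumes the commonly used S surface law $u = 2/(1 + \exp(-k_1 e - k_2 \dot{e})) - 1$, written here in the equivalent and numerically safer form $\tanh((k_1 e + k_2 \dot{e})/2)$. The function names and the default value of `eta` are hypothetical.

```python
import math


def s_surface(k1, k2, e, de):
    """Assumed S surface law: 2 / (1 + exp(-s)) - 1 with s = k1*e + k2*de.

    The expression equals tanh(s / 2), which avoids overflow for large |s|.
    """
    return math.tanh(0.5 * (k1 * e + k2 * de))


def update_gains(k1, k2, e, de, u_d, eta=0.5):
    """One gradient-descent step on E_p = 0.5 * (u_d - u)**2, as in (38)."""
    u = s_surface(k1, k2, e, de)
    # d u / d s for u = tanh(s / 2); by the chain rule d u / d k1 = du_ds * e
    # and d u / d k2 = du_ds * de.
    du_ds = 0.5 * (1.0 - u * u)
    k1_new = k1 + eta * (u_d - u) * du_ds * e
    k2_new = k2 + eta * (u_d - u) * du_ds * de
    return k1_new, k2_new


if __name__ == "__main__":
    k1, k2 = update_gains(1.0, 1.0, e=0.5, de=0.1, u_d=0.8)
    print(k1, k2, s_surface(k1, k2, 0.5, 0.1))
```

Under the assumed control law, one step with a positive error and an expected output above the current output raises both gains, which moves $u$ toward $u_d$.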

We can get the expected speed by expected state programming. The expected control output can be obtained by the following principles.

If the speed $v$ is lower than the expected speed $v_d$, then $u$ is lower than $u_d$ and needs to be increased. Conversely, if $v$ is higher than $v_d$, $u$ needs to be reduced. The expected control output is given by

$$u_d = u + c \cdot (v_d - v) \tag{39}$$

where $c$ is a suitable positive constant. The S surface controller therefore has the ability of self-learning.
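To show how equation (39) and the gain update fit together, the following minimal sketch runs the self-learning loop against a toy plant. Several parts are assumptions rather than the method described here: the S surface law is taken as $\tanh((k_1 e + k_2 \dot{e})/2)$, the error is taken as $e = v_d - v$ without normalization, and the first-order speed model, the parameter values and all function names are hypothetical.

```python
import math


def s_surface(k1, k2, e, de):
    # Assumed S surface law, equal to 2 / (1 + exp(-k1*e - k2*de)) - 1.
    return math.tanh(0.5 * (k1 * e + k2 * de))


def expected_output(u, v, v_d, c):
    # Equation (39): raise u when v < v_d, lower it when v > v_d.
    return u + c * (v_d - v)


def run(steps=200, dt=0.1, v_d=1.0, eta=0.5, c=0.5):
    k1, k2 = 1.0, 1.0
    v = 0.0
    e_prev = v_d - v
    for _ in range(steps):
        e = v_d - v                   # speed error (assumed, unnormalized)
        de = (e - e_prev) / dt        # finite-difference error rate
        u = s_surface(k1, k2, e, de)
        u_d = expected_output(u, v, v_d, c)
        du_ds = 0.5 * (1.0 - u * u)   # derivative of tanh(s / 2) w.r.t. s
        k1 += eta * (u_d - u) * du_ds * e
        k2 += eta * (u_d - u) * du_ds * de
        # Hypothetical plant: thrust 2*u against linear drag, Euler step.
        v += dt * (2.0 * u - v)
        e_prev = e
    return k1, k2, v


if __name__ == "__main__":
    print(run())
```

With $e = v_d - v$, the increment of $k_1$ equals $\eta\, c\, e^2\, \partial u / \partial s$, which is never negative, so in this sketch $k_1$ can only grow, while $k_2$ may move in either direction depending on the sign of $e\,\dot{e}$.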