So, the wiki article and many examples of gradient descent suggest that the way to find the function that gets to the local minimum fastest is to approach it iteratively, in the hope that it will converge after *enough* tries. This feels icky, as it also means that there will always be some error, and it looks like a rather inefficient way to solve the problem.

I am convinced that there is always either a single best function or several equally good ones that reach the local minimum, so there must be a way to find them without any error (within reasonable limits on how precise our computations can be).

I suspect there must be some direct calculation to find such a function, rather than repeatedly trying different hypotheses.

To provide more data: suppose you have a list of conses, say

`(defvar *data* '((x_0 . y_0) (x_1 . y_1) ... (x_i . y_i)))`

The quantity to minimize is the sum
`(reduce #'+ (mapcar #'f *data*))`

where f is defined as
`(defun f (x) (expt (- (car x) (+ b (* (cdr x) k))) 2))`

In other words, you need to find b and k for function f such that the sum of the results of f applied to every element of the testing set is the smallest possible.
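To make the goal concrete, here is a minimal sketch of that sum written as a function of b and k, so that different candidate values can be compared. The names `squared-error`, `total-error` and `*sample-data*`, as well as the sample numbers, are made up for illustration and are not part of the setup above; only the formula is taken from f. Note that it keeps the same orientation as f: the car of each cons is the value being matched, and the cdr is the value multiplied by k.

```lisp
;; Hypothetical sample data, invented for illustration only.
;; As in F above, the car is the value to be matched and the cdr
;; is the input that gets multiplied by K.
(defparameter *sample-data* '((1 . 0) (3 . 1) (5 . 2)))

(defun squared-error (point b k)
  "Squared difference between (car POINT) and B + K * (cdr POINT)."
  (expt (- (car point) (+ b (* (cdr point) k))) 2))

(defun total-error (data b k)
  "Sum of SQUARED-ERROR over every cons in DATA."
  (reduce #'+ data :key (lambda (point) (squared-error point b k))))

(total-error *sample-data* 1 2) ; => 0, these points lie exactly on 1 + 2 * cdr
(total-error *sample-data* 0 0) ; => 35, a worse guess
```

Passing b and k as arguments, instead of leaving them as free variables inside f, is just one possible design choice; it makes the thing to be minimized an ordinary function of two numbers.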