
The solver orchestrates model optimization by coordinating the network’s forward inference and backward gradients to form parameter updates that attempt to improve the loss. The responsibilities of learning are divided between the Solver, which oversees the optimization and generates parameter updates, and the Net, which yields loss and gradients.

The solver types named here include Stochastic Gradient Descent (type: "SGD") and Nesterov’s Accelerated Gradient (type: "Nesterov").

The solver scaffolds the optimization bookkeeping and creates the training network for learning and test network(s) for evaluation. It then iteratively optimizes by calling forward / backward and updating parameters, taking the weights all the way from initialization to learned model.
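
The division of labor between Solver and Net can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the library's implementation: `ToyNet`, `ToySolver`, the one-weight quadratic loss, and the default learning rate and momentum are all hypothetical choices made for the example. The net only yields the loss and gradient; the solver keeps the bookkeeping and turns gradients into updates, using either a momentum SGD rule or a commonly used reformulation of Nesterov's rule.

```python
# Toy sketch only: ToyNet and ToySolver are hypothetical names, not library classes.


class ToyNet:
    """Holds one weight and yields loss and gradient for L(w) = (w - 3)^2."""

    def __init__(self, w=0.0):
        self.w = w
        self.grad = 0.0

    def forward(self):
        # Forward pass: compute the loss at the current weight.
        return (self.w - 3.0) ** 2

    def backward(self):
        # Backward pass: compute dL/dw at the current weight.
        self.grad = 2.0 * (self.w - 3.0)


class ToySolver:
    """Oversees optimization: turns the net's gradient into parameter updates."""

    def __init__(self, net, solver_type="SGD", lr=0.1, momentum=0.9):
        if solver_type not in ("SGD", "Nesterov"):
            raise ValueError("unsupported solver type: " + solver_type)
        self.net = net
        self.solver_type = solver_type
        self.lr = lr              # assumed learning rate for this toy problem
        self.momentum = momentum  # assumed momentum coefficient
        self.velocity = 0.0       # solver-side bookkeeping (update history)
        self.iteration = 0

    def step(self, n_iters=1):
        """Run n_iters iterations; return the loss seen in the last forward pass."""
        loss = None
        for _ in range(n_iters):
            loss = self.net.forward()   # the net yields the loss ...
            self.net.backward()         # ... and the gradient
            v_prev = self.velocity
            self.velocity = self.momentum * v_prev - self.lr * self.net.grad
            if self.solver_type == "SGD":
                # SGD with momentum: apply the new velocity directly.
                update = self.velocity
            else:
                # Nesterov, in a common reformulation that adds a look-ahead
                # correction built from the old and new velocity.
                update = -self.momentum * v_prev + (1.0 + self.momentum) * self.velocity
            self.net.w += update        # the solver applies the update
            self.iteration += 1
        return loss


# Usage: take the weight from its initial value toward the minimum at w = 3.
net = ToyNet()
solver = ToySolver(net, solver_type="Nesterov")
final_loss = solver.step(500)
print(net.w, final_loss)
```

Keeping the velocity and iteration counter on the solver rather than the net mirrors the split described above: the net stays a pure source of loss and gradients, so the update rule can be swapped without touching it.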
