


Title: Distributed and Stochastic Machine Learning on Big Data
Speaker: James Kwok
Abstract: On big data sets, it is often challenging to learn
the parameters of a machine learning model. A popular technique is
stochastic gradient descent, which computes the gradient at a
single sample instead of over the whole data set. An
alternative is distributed processing, which is particularly natural
when a single computer cannot store or process the whole data set.
In this talk, some recent extensions of both techniques will be presented. For
stochastic gradient, instead of using the information from only one
sample, we incrementally approximate the full gradient by also using
old gradient values from the other samples. The resulting algorithm enjoys the same
computational simplicity as existing stochastic algorithms, but has
faster convergence. Existing distributed machine learning
algorithms are often synchronized, so the system can move
forward only at the pace of the slowest worker. I will present an
asynchronous algorithm that requires only partial synchronization,
so that updates from the faster workers can be incorporated more often
by the master.
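
To make the stochastic-gradient idea concrete, here is a minimal Python sketch of one way to approximate the full gradient incrementally from stored per-sample gradients. It is an illustration in the spirit of the abstract, not the speaker's algorithm: the function name incremental_average_gradient, the least-squares model, the toy data, and the step size are all assumptions chosen for the example.

```python
import random


def incremental_average_gradient(xs, ys, step=0.005, iters=20000, seed=0):
    """Fit y ~ w*x + b by least squares using a memory of per-sample gradients.

    Hypothetical illustration: each iteration evaluates the gradient at ONE
    sample, but the update direction is the average of the most recently
    stored gradients of ALL samples.
    """
    rng = random.Random(seed)
    n = len(xs)
    w, b = 0.0, 0.0
    # memory[i] holds the last gradient computed at sample i (zero at start).
    memory = [(0.0, 0.0)] * n
    sum_w, sum_b = 0.0, 0.0  # running sum of all stored gradients

    for _ in range(iters):
        i = rng.randrange(n)
        residual = w * xs[i] + b - ys[i]
        g_w, g_b = residual * xs[i], residual  # gradient of 0.5 * residual**2

        # Swap the old gradient of sample i for the fresh one in the sum.
        old_w, old_b = memory[i]
        sum_w += g_w - old_w
        sum_b += g_b - old_b
        memory[i] = (g_w, g_b)

        # Step along the approximate full gradient (average of stored values).
        w -= step * sum_w / n
        b -= step * sum_b / n
    return w, b


# Toy data generated from y = 2x + 1 (assumed for illustration only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = incremental_average_gradient(xs, ys)
print(w, b)
```

Each iteration costs one gradient evaluation, as in plain stochastic gradient descent; the extra price is memory for one stored gradient per sample.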
