
Machine Learning Notes/Pointers

Goodfellow, Bengio, and Courville: an excellent, current book from MIT Press covering machine learning fundamentals. Free to read online.

This site links two libraries for deep learning computations, TensorFlow and Theano. Both are Python libraries; both build on NumPy.

Common features:

matrix operations are coded with Python operations (+, *, etc.) and the result is a thing that the library can be commanded to run. In TensorFlow, coding the thing is called building a computational graph, and commanding it to be run is called running the computational graph.

Basic use of TensorFlow:
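
A minimal sketch of the build-then-run pattern, assuming the TensorFlow 1.x API (placeholders and sessions); the variable names are illustrative:

    import tensorflow as tf  # TensorFlow 1.x API

    # Build the computational graph; no arithmetic happens yet.
    a = tf.placeholder(tf.float32)
    b = tf.placeholder(tf.float32)
    c = a * b + a  # overloaded Python operators create graph nodes

    # Run the computational graph in a session, feeding placeholder values.
    with tf.Session() as sess:
        print(sess.run(c, feed_dict={a: 3.0, b: 4.0}))  # prints 15.0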

Basic use of Theano:
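
The analogous sketch in Theano: symbolic variables build an expression graph, which theano.function compiles into a callable (names again illustrative):

    import numpy as np
    import theano
    import theano.tensor as T

    # Declare symbolic variables; expressions build a graph, not values.
    x = T.dvector('x')
    y = T.dvector('y')
    z = x * y + x

    # Compile the graph into a callable function, then run it.
    f = theano.function([x, y], z)
    print(f(np.array([1.0, 2.0]), np.array([3.0, 4.0])))  # [  4.  10.]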

calculations can be run on a GPU

graphical or network display of a neural network architecture.

deep learning demo code.

TensorFlow (Python library) tutorial on deep convolutional neural networks. Provides a small implementation with a GPU version.

TensorFlow quick start including a complete gradient descent example.
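
For reference, a self-contained gradient descent sketch in the spirit of that quick start, assuming the TensorFlow 1.x API; it fits a line y = Wx + b to toy data:

    import numpy as np
    import tensorflow as tf  # TensorFlow 1.x API

    # Toy data drawn from y = 3x + 1.
    x_data = np.random.rand(100).astype(np.float32)
    y_data = 3.0 * x_data + 1.0

    # Model parameters to be learned.
    W = tf.Variable(0.0)
    b = tf.Variable(0.0)
    y = W * x_data + b

    # Mean squared error loss and a gradient descent training step.
    loss = tf.reduce_mean(tf.square(y - y_data))
    train = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(200):
            sess.run(train)
        print(sess.run([W, b]))  # approaches [3.0, 1.0]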

TensorFlow’s CIFAR10 tutorial (an image classification demo small enough to be easily experimented with)

CIFAR10, Alex Krizhevsky’s page

Krizhevsky’s tech report

TensorFlow on GitHub: actively maintained (March 2017), just released v1.0

Installation

NVIDIA’s instructions: http://www.nvidia.com/object/gpu-accelerated-applications-tensorflow-installation.html

Use pip, the Python installer alternative to easy_install (https://pip.pypa.io/en/stable/). The Ubuntu python-pip package brought in:

python-all (2.7.11-1) …
Setting up python-all-dev (2.7.11-1) …
Setting up python-pip-whl (8.1.1-2ubuntu0.4) …
Setting up python-pip (8.1.1-2ubuntu0.4) …
Setting up python-setuptools (20.7.0-1) …
Setting up python-wheel

Theano tutorial page

Theano on GitHub: actively maintained (March 2017)

Machine Learning Frameworks

  1. TensorFlow https://www.tensorflow.org/
  2. Theano
  3. Caffe
  4. Caffe2: https://caffe2.ai/; NVIDIA link with online labs
  5. Microsoft CNTK https://www.microsoft.com/en-us/research/product/cognitive-toolkit/. CNTK setup: https://github.com/Microsoft/CNTK/wiki/Setup-CNTK-on-your-machine
  6. Torch: http://torch.ch/
  7. MXNet: https://github.com/dmlc/mxnet
  8. Chainer: http://chainer.org/; NVIDIA says “define by run” rather than “define and run”

Links

  1. NVIDIA’s page of frameworks supporting cuDNN: https://developer.nvidia.com/deep-learning-frameworks (see bottom for more links). “There are several other deep learning frameworks that leverage the Deep Learning SDK, including BidMach, Brainstorm, Kaldi, MatConvNet, MaxDNN, Deeplearning4j, Keras, Lasagne (Theano), Leaf, and more.”
  2. http://www.kdnuggets.com/2016/04/top-15-frameworks-machine-learning-experts.html
  3. http://opensourceforu.com/2017/01/best-open-source-machine-learning-frameworks/
  4. https://github.com/josephmisiti/awesome-machine-learning

Machine Learning toolkit(s) higher level than Theano and TensorFlow

Keras.io

Neural network API on top of Theano and TensorFlow
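
A minimal sketch of what that looks like, assuming Keras’s Sequential API; the layer sizes and the MNIST-like 784-dimensional input are illustrative:

    from keras.models import Sequential
    from keras.layers import Dense

    # A small fully connected classifier for 784-dimensional inputs.
    model = Sequential()
    model.add(Dense(128, activation='relu', input_dim=784))
    model.add(Dense(10, activation='softmax'))

    # Keras dispatches to the TensorFlow or Theano backend under the hood.
    model.compile(optimizer='sgd',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    # model.fit(x_train, y_train, epochs=5, batch_size=32)  # hypothetical data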

Wikipedia article Convolutional Neural Network

Early (2005) paper on benefits of GPUs for machine learning

Wikipedia on ReLU (rectified linear units), their cousins, and motivations. f(x) = max(0, x) is found to be a helpful kind of activation function. Related to the softplus f(x) = ln(1 + e^x), whose derivative is the logistic function 1/(1 + e^-x).
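
A quick NumPy check of those relationships (the softplus’s derivative is the logistic function); the function names here are mine:

    import numpy as np

    def relu(x):     return np.maximum(0.0, x)
    def softplus(x): return np.log1p(np.exp(x))       # smooth relative of ReLU
    def logistic(x): return 1.0 / (1.0 + np.exp(-x))  # derivative of softplus

    # Numerically confirm d/dx softplus(x) = logistic(x).
    x, h = 1.5, 1e-6
    print((softplus(x + h) - softplus(x - h)) / (2 * h), logistic(x))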

Wikipedia on Cross Entropy: H(p, q) = -\sum_x p(x) log q(x). Minimizing over q for fixed p is easily related to the Kullback-Leibler divergence, since H(p, q) = H(p) + D_KL(p || q) and H(p) does not depend on q.

Deep Learning in Neural Networks: An Overview by Jürgen Schmidhuber

Terry Tao on Shannon Entropy and its analogies to set functions

Notes on reading MacKay’s book, Information Theory, Inference, and Learning Algorithms

Neural network models (neural networks) are inspired by, but are not faithful models of, brains. They are interesting because brains are interesting, neural networks can do learning and pattern recognition, and neural networks are complex and adaptive.

How biological memories differ from computer memories: the former are associative (content addressable), error-tolerant, robust against local failure, and distributed.

Artificial neural networks: “parallel distributed computational systems consisting of many interacting simple elements.”

Terms: Architecture. Weights of connections. Neurons have activities. Activity rules, usually dependent on weights, affect short-term dynamics. The learning rule is for changing the weights; it can depend on activities and/or on input of target activity values from a teacher, and usually operates over a longer time scale than the dynamics that change the activities. Activity and learning rules may be invented, or derived from objective functions. Supervised vs. unsupervised.

Chapter 39 covers in precise terms a range of deterministic and stochastic activation functions for the single-neuron classifier, training, back-propagation, gradient descent, batch vs. on-line learning, and regularization; I suggest studying this. Going on: Ch. 40 is on single-neuron capacity. Ch. 41 is on learning as inference, which is in terms of the Bayesian view of probability. It may clear up mysteries about the learning of probability distributions and a probability distribution as a classifier or predictor. It explains why log probability is a good error function to minimize.
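
As a concrete companion to Ch. 39, here is a sketch (my own, not MacKay’s code) of a single-neuron classifier trained by batch gradient descent on the cross-entropy (log probability) error:

    import numpy as np

    # Toy linearly separable data: target is 1 when x1 + x2 > 0.
    rng = np.random.RandomState(0)
    X = rng.randn(200, 2)
    t = (X[:, 0] + X[:, 1] > 0).astype(float)

    w = np.zeros(2)  # weights
    b = 0.0          # bias
    eta = 0.5        # learning rate

    for _ in range(100):
        y = 1.0 / (1.0 + np.exp(-(X.dot(w) + b)))  # neuron activity (logistic)
        g = y - t        # gradient of cross-entropy w.r.t. the activation input
        w -= eta * X.T.dot(g) / len(t)             # batch gradient descent step
        b -= eta * g.mean()

    print(((y > 0.5) == t).mean())  # training accuracy, close to 1.0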

Bishop’s book Pattern Recognition and Machine Learning begins, after an intro to least-squares-error linear regression, with how it can be seen as choosing a normal distribution that maximizes the likelihood that the data given to be interpolated had been drawn from that distribution. This accounts for the minimization of the sum of squares of errors and is further explained in later chapters.
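
A one-line sketch of that connection, in Bishop’s notation (targets t_n, model y(x_n, w), noise precision beta): the log-likelihood of the data under a normal noise model is

    \ln p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta)
        = -\frac{\beta}{2} \sum_{n=1}^{N} \{\, y(x_n, \mathbf{w}) - t_n \,\}^2
          + \frac{N}{2} \ln \beta - \frac{N}{2} \ln(2\pi)

so maximizing the likelihood over w is the same as minimizing the sum of squared errors.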

Deep Learning Codes

cuda-convnet2, Alex Krizhevsky’s Git repository: github.com/akrizhevsky/cuda-convnet2

sdc’s fork https://github.com/chaikens/cuda-convnet2 with updates to build locally

Alex Krizhevsky’s Toronto web site

https://www.cs.toronto.edu/~kriz/imagenet_classification_with_deep_convolutional.pdf

Deep Learning Links

https://adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html

GPU Enhancement

GPU APIs

CUDA: NVIDIA’s API for their own GPUs; a C/C++ API.

cuDNN: NVIDIA’s API on top of CUDA for tensor data types and machine learning computations. NVIDIA’s larger sample code is for the test phase of LeNet-5, the handwritten digit classifier reported in Yann LeCun et al., “Gradient-Based Learning Applied to Document Recognition,” Proc. IEEE, Nov. 1998. The network used was implemented and trained using Caffe.

From the sample code: the “Training LeNet on MNIST with Caffe” tutorial, located at http://caffe.berkeleyvision.org/

OpenCL: an open standard for general GPUs and for many platforms of multithreaded CPUs, FPGAs, and DSPs

CUDA and OpenCL provide C-like languages for coding kernels (the code that is run in parallel).
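
To make that concrete from Python, here is a sketch using PyCUDA (an assumption on my part; it is not mentioned above) with the kernel written in CUDA’s C-like language:

    import numpy as np
    import pycuda.autoinit            # initializes a CUDA context
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    # The kernel: C-like code compiled for the GPU and run in parallel.
    mod = SourceModule("""
    __global__ void scale(float *x, float a, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= a;
    }
    """)
    scale = mod.get_function("scale")

    x = np.arange(8, dtype=np.float32)
    scale(drv.InOut(x), np.float32(2.0), np.int32(len(x)),
          block=(8, 1, 1), grid=(1, 1))
    print(x)  # each element doubled on the GPU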

Keras is a Python API built on TensorFlow or Theano.

TensorFlow is a Python API that can use CUDA. When it uses CUDA, it does so through NVIDIA’s cuDNN.

Theano is a Python API that can use CUDA (through NVIDIA’s cuDNN, as TensorFlow does) and, with “minimal support,” OpenCL.