Milk wraps libsvm in a Pythonic way: the learned models have weight arrays that are accessible directly from Python, the models can be pickled, and you can pass any Python function as a kernel.
Milk focuses on supervised classification and on enabling medium-scale learning (defined here as data that barely fits in main memory).
Milk also supports k-means clustering with an implementation that is careful not to use too much memory (if your dataset fits into memory, milk can cluster it).
Milk does not have its own file format or in-memory format, which I consider a feature: it works on numpy arrays directly (or on anything convertible to a numpy array) without forcing you to copy memory around. For SVMs, you can even use any datatype if you supply your own kernel function.
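To make the point about custom kernels concrete, here is a minimal sketch of a kernel that works on strings rather than on numeric arrays. It uses plain Python only and does not call milk; the names shared_ngram_kernel and gram_matrix are hypothetical and are not part of milk's API. How such a function is handed to an SVM learner depends on the library's interface, which is not shown here.

```python
def shared_ngram_kernel(a, b, n=2):
    """Hypothetical string kernel: number of distinct character n-grams shared by a and b."""
    grams_a = {a[i:i + n] for i in range(len(a) - n + 1)}
    grams_b = {b[i:i + n] for i in range(len(b) - n + 1)}
    return float(len(grams_a & grams_b))


def gram_matrix(samples, kernel):
    """Evaluate the kernel on every pair of samples (the matrix an SVM solver works with)."""
    return [[kernel(x, y) for y in samples] for x in samples]


# The samples are strings, not numeric vectors: only the kernel needs to understand them.
samples = ["milk", "silk", "numpy"]
K = gram_matrix(samples, shared_ngram_kernel)
```

Because the kernel is the only code that inspects the samples, the same idea applies to any datatype for which a sensible similarity function can be written.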
Here are some key features of "Milk":
· Pythonic interface to libSVM.
· Stepwise Discriminant Analysis for feature selection.
· K-means clustering: a simple implementation that works well with very large datasets.
· Models can be pickled and unpickled.
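As an illustration of what a memory-careful k-means can look like, the following is a minimal sketch of Lloyd's algorithm that visits one point at a time and keeps only per-cluster running sums and counts. It is not milk's implementation; the function names, the fixed iteration count, and the handling of empty clusters are assumptions made for the example.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((x - y) ** 2 for x, y in zip(point, c))
    return min(range(len(centroids)), key=lambda j: sq_dist(centroids[j]))


def kmeans(points, initial_centroids, n_iter=10):
    """Lloyd's algorithm with O(k * d) extra memory (no n-by-k distance matrix)."""
    centroids = [list(c) for c in initial_centroids]
    k, d = len(centroids), len(centroids[0])
    for _ in range(n_iter):
        sums = [[0.0] * d for _ in range(k)]
        counts = [0] * k
        for p in points:
            j = nearest(p, centroids)
            counts[j] += 1
            for t in range(d):
                sums[j][t] += p[t]
        for j in range(k):
            if counts[j]:  # an empty cluster keeps its previous centroid
                centroids[j] = [s / counts[j] for s in sums[j]]
    assignments = [nearest(p, centroids) for p in points]
    return assignments, centroids


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
assignments, centroids = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

The dataset itself is the only large object in memory; everything the algorithm allocates scales with the number of clusters, not with the number of points.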
Requirements:
· Python
What's New in This Release:
· no scipy.weave dependency
· flatter namespace
· faster kmeans
· affinity propagation (borrowed from scikits-learn & slightly improved)
· pdist()
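For readers unfamiliar with pdist-style functions, here is a minimal sketch of a pairwise-distance computation in plain Python. It does not call milk: the name pairwise_sq_dists, its signature, and the choice of squared Euclidean distance are assumptions for illustration, so check milk's own documentation for the actual interface and defaults of pdist().

```python
def pairwise_sq_dists(X, Y=None):
    """Squared Euclidean distance between every row of X and every row of Y.

    If Y is omitted, distances are computed among the rows of X.
    """
    if Y is None:
        Y = X
    return [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in Y] for x in X]


X = [(0.0, 0.0), (3.0, 4.0)]
D = pairwise_sq_dists(X)  # D[i][j] is the squared distance between X[i] and X[j]
```

A result of this kind is a common input to clustering methods that work from pairwise distances or similarities.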