lambda-ml.decision-tree
Decision tree learning using the Classification and Regression Trees (CART) algorithm.
Example usage:
(def data [[0 0 0] [0 1 1] [1 0 1] [1 1 0]])
(def fit
  (let [min-split 2
        min-leaf 1
        max-features 2]
    (-> (make-classification-tree gini-impurity min-split min-leaf max-features)
        (decision-tree-fit data))))
(decision-tree-predict fit (map butlast data))
;;=> (0 1 1 0)
best-splitter
(best-splitter model x y)
Returns the splitter for the given data that minimizes a weighted cost function, or nil if no splitter exists.
categorical-partitions
(categorical-partitions vals)
Given a seq of k distinct values, returns the 2^(k-1) - 1 possible binary partitions of the values into sets.
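A minimal sketch (not the library's implementation) of how the 2^(k-1) - 1 binary partitions can be enumerated: fixing the first value in the left set ensures each unordered partition is generated exactly once. The name binary-partitions is illustrative only.

```clojure
(require '[clojure.set :as set])

;; Enumerate binary partitions of distinct values. Pinning the first
;; value to the left set avoids double-counting {L, R} vs {R, L}.
(defn binary-partitions [vals]
  (let [vs      (vec (distinct vals))
        head    (first vs)
        more    (rest vs)
        subsets (fn subsets [xs]
                  (if (empty? xs)
                    [#{}]
                    (let [ss (subsets (rest xs))]
                      (concat ss (map #(conj % (first xs)) ss)))))]
    (for [s (subsets more)
          :let [left  (conj s head)
                right (set/difference (set vs) left)]
          :when (seq right)]
      [left right])))

;; For three values there are 2^2 - 1 = 3 partitions:
;; {:a}|{:b :c}, {:a :c}|{:b}, {:a :b}|{:c}
(binary-partitions [:a :b :c])
```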
decision-tree-fit
(decision-tree-fit model data)
(decision-tree-fit model x y)
Fits a decision tree to the given training data.
decision-tree-predict
(decision-tree-predict model x)
Predicts the values of example data using a decision tree.
make-classification-tree
(make-classification-tree cost min-split min-leaf max-features)
Returns a classification decision tree model using the given cost function and stopping parameters (minimum examples required to split a node, minimum examples per leaf, and maximum number of features considered per split).
make-regression-tree
(make-regression-tree cost min-split min-leaf max-features)
Returns a regression decision tree model using the given cost function and stopping parameters (minimum examples required to split a node, minimum examples per leaf, and maximum number of features considered per split).
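A sketch of regression usage, assuming make-regression-tree follows the same calling convention as the classification example above and that mean-squared-error can serve as its cost function (the data and predicted output here are illustrative, not verified):

```clojure
;; Each example is [feature-0 feature-1 target], mirroring the
;; classification example at the top of this page.
(def reg-data [[0.0 0.0 1.0] [0.0 1.0 2.0] [1.0 0.0 2.0] [1.0 1.0 3.0]])
(def reg-fit
  (let [min-split 2
        min-leaf 1
        max-features 2]
    (-> (make-regression-tree mean-squared-error min-split min-leaf max-features)
        (decision-tree-fit reg-data))))
(decision-tree-predict reg-fit (map butlast reg-data))
```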
mean-squared-error
(mean-squared-error labels predictions)
Returns the mean squared error for a seq of predictions against the corresponding labels.
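A sketch of the standard computation the docstring describes (not necessarily the library's implementation): the mean of the squared differences between labels and predictions. The name mse is illustrative.

```clojure
;; MSE = (1/n) * sum over i of (label_i - prediction_i)^2
(defn mse [labels predictions]
  (/ (reduce + (map (fn [y yhat]
                      (let [d (- y yhat)]
                        (* d d)))
                    labels predictions))
     (count labels)))

(mse [1.0 2.0 3.0] [1.0 2.5 2.0])
;; => 0.4166666666666667  (i.e., (0 + 0.25 + 1) / 3)
```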
numeric-partitions
(numeric-partitions vals)
Given a seq of k distinct numeric values, returns the k-1 possible binary partitions of the values, obtained by taking the average of consecutive elements in the sorted seq of values.
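A minimal sketch of the behavior described above (not the library's implementation): candidate split points are the midpoints between consecutive sorted values. The name midpoints is illustrative.

```clojure
;; For k distinct sorted values there are k-1 consecutive pairs,
;; and each pair's average is one candidate split point.
(defn midpoints [vals]
  (let [sorted (sort (distinct vals))]
    (map (fn [[a b]] (/ (+ a b) 2.0))
         (partition 2 1 sorted))))

(midpoints [4 1 3])
;; => (2.0 3.5)  (sorted values 1 3 4 yield midpoints (1+3)/2 and (3+4)/2)
```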
print-decision-tree
(print-decision-tree model)
Prints information about a given decision tree.
splitters
(splitters x i)
Returns a seq of all possible splitters for feature i. A splitter is a predicate function that evaluates to true if an example belongs in the left subtree, and false if it belongs in the right subtree, according to the splitting criterion.
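A sketch of what one such splitter might look like for a numeric feature: a threshold t on feature i yields a predicate that routes examples to the left subtree when the feature value is at most t. The names numeric-splitter and left? are illustrative, not the library's internals.

```clojure
;; Build a splitter predicate for numeric feature i with threshold t.
(defn numeric-splitter [i t]
  (fn [example] (<= (nth example i) t)))

(def left? (numeric-splitter 0 2.0))
(left? [1.5 7.0])  ;; => true  (feature 0 is <= 2.0, left subtree)
(left? [3.0 7.0])  ;; => false (feature 0 is > 2.0, right subtree)
```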