lambda-ml.decision-tree

Decision tree learning using the Classification and Regression Trees (CART) algorithm.

Example usage:

(def data [[0 0 0] [0 1 1] [1 0 1] [1 1 0]])
(def fit
  (let [min-split 2
        min-leaf 1
        max-features 2]
    (-> (make-classification-tree gini-impurity min-split min-leaf max-features)
        (decision-tree-fit data))))
(decision-tree-predict fit (map butlast data))
;;=> (0 1 1 0)

best-splitter

(best-splitter model x y)

Returns the splitter for the given data that minimizes a weighted cost function, or nil if no splitter exists.
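
For example, a sketch of searching for the best split of the toy dataset above; treating x as the feature rows and y as the labels follows the layout of that example and is an assumption here:

(def model (make-classification-tree gini-impurity 2 1 2))
(best-splitter model (map butlast data) (map last data))
;; returns a splitter for the split that minimizes the weighted cost,
;; or nil if no splitter exists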

categorical-partitions

(categorical-partitions vals)

Given a seq of k distinct values, returns the 2^(k-1) - 1 possible binary partitions of the values into two sets.
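
For example, three distinct values admit 2^(3-1) - 1 = 3 binary partitions. The exact representation of each partition (shown below as a pair of sets) is an assumption:

(categorical-partitions [:a :b :c])
;;=> e.g. ([#{:a} #{:b :c}] [#{:b} #{:a :c}] [#{:c} #{:a :b}])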

classification-weighted-cost

(classification-weighted-cost y1 y2 f g)

decision-tree-fit

(decision-tree-fit model data)
(decision-tree-fit model x y)

Fits a decision tree to the given training data.
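
Given the model and data defined in the earlier examples, and assuming the single-collection arity expects each row to carry its label in the last position (as in the example at the top), the two arities are presumably equivalent:

;; labels in the last column of data, as in the example at the top
(decision-tree-fit model data)
(decision-tree-fit model (map butlast data) (map last data))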

decision-tree-predict

(decision-tree-predict model x)

Predicts the values of example data using a decision tree.

gini-impurity

(gini-impurity labels)

Returns the Gini impurity of a seq of labels.
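
Assuming the standard definition, Gini impurity is 1 - sum(p_i^2), where p_i is the proportion of labels in class i:

(gini-impurity [0 0 1 1])
;;=> 0.5    ; 1 - (0.5^2 + 0.5^2), assuming the standard formula
(gini-impurity [1 1 1 1])
;;=> 0.0    ; a pure node has zero impurity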

make-classification-tree

(make-classification-tree cost min-split min-leaf max-features)

Returns a classification decision tree model using the given cost function.

make-regression-tree

(make-regression-tree cost min-split min-leaf max-features)

Returns a regression decision tree model using the given cost function.
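
A sketch of fitting a regression tree, mirroring the classification example at the top; passing mean-squared-error as the cost function is an assumption, and the exact predictions depend on the fitted splits:

(def reg-data [[0 0 1.0] [0 1 2.0] [1 0 2.0] [1 1 3.0]])
(def reg-fit
  (let [min-split 2
        min-leaf 1
        max-features 2]
    (-> (make-regression-tree mean-squared-error min-split min-leaf max-features)
        (decision-tree-fit reg-data))))
(decision-tree-predict reg-fit (map butlast reg-data))
;;=> e.g. (1.0 2.0 2.0 3.0)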

mean-squared-error

(mean-squared-error labels predictions)

Returns the mean squared error of a seq of predictions with respect to the corresponding labels.
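
Assuming the usual definition, the result is the mean of the squared differences between each prediction and its corresponding label:

(mean-squared-error [1 2 3] [1 2 4])
;;=> 1/3    ; ((1-1)^2 + (2-2)^2 + (3-4)^2) / 3, as a ratio or its decimal equivalent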

numeric-partitions

(numeric-partitions vals)

Given a seq of k distinct numeric values, returns k-1 possible binary partitions of the values by taking the average of consecutive elements in the sorted seq of values.
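
For example, three distinct values yield two candidate split points. Returning the midpoints themselves, as shown below, is an assumption about the return shape:

(numeric-partitions [3 1 2])
;;=> e.g. (1.5 2.5)    ; midpoints of the sorted seq (1 2 3)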

print-decision-tree

(print-decision-tree model)

Prints information about a given decision tree.

regression-weighted-cost

(regression-weighted-cost y1 y2 f g)

splitters

(splitters x i)

Returns a seq of all possible splitters for feature i. A splitter is a predicate function that evaluates to true if an example belongs in the left subtree and false if it belongs in the right subtree, based on the splitting criterion.
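
A sketch of inspecting the splitters for a single numeric feature; that x is the seq of examples and that a splitter can be applied directly to an example are assumptions based on the description above:

(def xs [[2.0] [5.0] [8.0]])
(def ss (splitters xs 0))   ; candidate splitters for feature 0
((first ss) [3.0])          ; true if [3.0] falls in the left subtree, else false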