Genairic (Ruby)

A pure-Ruby module that provides purpose-independent access to some artificial intelligence (AI) and machine-learning algorithms. The code for this project is hosted on GitHub.

Classes

Naïve-Bayes Classifier

This class instantiates a classifier based on programmer-defined features and classes as well as a function for discretizing objects. The purpose of the discretizer is to produce a hash containing a true or false value for each feature when given a single set item as an argument.

Features
Features are generally a simple array of labels that will describe different true or false aspects of a set item.

features = [
    :feature_one,
    :feature_two,
    :feature_three
]

Classes
Classes are generally a simple array of labels that the classifier may assign to a set item after training.

classes = [
    :class_one,
    :class_two,
    :class_three
]

The Discretizer
The discretizer is a Proc instance that takes a set item as an argument and produces a hash with a true or false value for every feature.

discretizer = Proc.new { |item|

    # tests and so on

    ...

    {
        :feature_one    => feature_one_bool,
        :feature_two    => feature_two_bool,
        :feature_three  => feature_three_bool
    }
}
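As a concrete illustration, here is a minimal sketch of a discretizer for short text messages. The feature names (:has_link, :all_caps, :mentions_money) and the regular expressions are assumptions invented for this example; they are not part of Genairic.

```ruby
# Hypothetical features for short text messages (illustration only).
features = [:has_link, :all_caps, :mentions_money]

# The discretizer maps one item (here, a String) to a hash holding
# a true or false value for every feature.
discretizer = Proc.new { |item|
    {
        :has_link       => item.match?(%r{https?://}),
        :all_caps       => (item == item.upcase) && item.match?(/[A-Z]/),
        :mentions_money => item.match?(/\$\d|free|prize/i)
    }
}

result = discretizer.call("WIN A FREE PRIZE at http://example.com")
```

Every key in the returned hash corresponds to one entry in the features array, and every value is strictly true or false, which is what the description above asks of a discretizer.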

Instantiation

classifier = Genairic::NaiveBayesClassifier.new(
    :features       => features,
    :classes        => classes,
    :discretizer    => discretizer
)

Training the Classifier
Training is done one item at a time, by passing an item together with its known class.

classifier.train(item, :class_one)

The discretizer provided earlier will be run on the item, and the results will be tallied.
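What "tallied" might look like can be sketched as follows. This is an assumption about one reasonable bookkeeping scheme, not Genairic's actual internals: the names class_counts, feature_counts and tally are hypothetical, as are the toy discretizer and class labels.

```ruby
# Hypothetical tally structures; Genairic's internals may differ.
class_counts   = Hash.new(0)                              # items seen per class
feature_counts = Hash.new { |h, k| h[k] = Hash.new(0) }   # true-counts per class and feature

# Toy discretizer over integers, used only for this illustration.
discretizer = Proc.new { |n| { :even => n.even?, :big => n > 10 } }

# Run the discretizer on one item and count every feature that came out true.
tally = lambda { |item, klass|
    class_counts[klass] += 1
    discretizer.call(item).each { |feature, value|
        feature_counts[klass][feature] += 1 if value
    }
}

tally.call(4,  :small_even)
tally.call(12, :large_even)
tally.call(2,  :small_even)
```

Keeping raw counts rather than probabilities makes each training call cheap, since the probabilities only need to be derived when a classification is requested.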

Classification
The classifying method will return a label for the most likely class.

classifier.classify(item)  # => :class_one, etc.

Probabilities are rebuilt automatically as part of this call when necessary (i.e. if a training item has been run since the last classification).
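The rebuild-on-demand behaviour can be illustrated with a small stand-alone sketch. SketchClassifier is a hypothetical class written for this example, and its add-one (Laplace) smoothing, log-probability scoring and rebuilds counter are assumptions; Genairic's own implementation may work differently.

```ruby
# Hypothetical sketch of lazy probability rebuilding; not Genairic's code.
class SketchClassifier
    attr_reader :rebuilds   # exposed only to make the lazy rebuild observable

    def initialize(features, classes, discretizer)
        @features, @classes, @discretizer = features, classes, discretizer
        @class_counts = Hash.new(0)
        @true_counts  = Hash.new { |h, k| h[k] = Hash.new(0) }
        @stale    = true   # probabilities have not been built yet
        @rebuilds = 0
    end

    def train(item, klass)
        @class_counts[klass] += 1
        @discretizer.call(item).each { |f, v| @true_counts[klass][f] += 1 if v }
        @stale = true      # any new training item invalidates the probabilities
    end

    def classify(item)
        rebuild if @stale
        values = @discretizer.call(item)
        # Pick the class with the highest log prior plus summed log likelihoods.
        @classes.max_by { |klass|
            @features.inject(Math.log(@priors[klass])) { |sum, f|
                p_true = @likelihoods[klass][f]
                sum + Math.log(values[f] ? p_true : 1.0 - p_true)
            }
        }
    end

    private

    # Add-one smoothing keeps every probability strictly between 0 and 1.
    def rebuild
        total = @class_counts.values.inject(0, :+)
        @priors, @likelihoods = {}, {}
        @classes.each { |klass|
            n = @class_counts[klass]
            @priors[klass] = (n + 1.0) / (total + @classes.size)
            @likelihoods[klass] = {}
            @features.each { |f|
                @likelihoods[klass][f] = (@true_counts[klass][f] + 1.0) / (n + 2.0)
            }
        }
        @stale = false
        @rebuilds += 1
    end
end

discretizer = Proc.new { |n| { :even => n.even?, :big => n > 10 } }
clf = SketchClassifier.new([:even, :big], [:small_odd, :large_even], discretizer)
clf.train(3,  :small_odd)
clf.train(5,  :small_odd)
clf.train(12, :large_even)
clf.train(20, :large_even)

first  = clf.classify(7)    # triggers a rebuild
second = clf.classify(14)   # reuses the probabilities built above
```

In this sketch two consecutive classifications share one rebuild, and only a further call to train marks the probabilities as stale again.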

Running the Tests

All you need to do to start the test suite is run the command:

ruby test/run.rb
