Here is some code I wrote to calculate the probability of labels given some observed features, using a Naive Bayes classifier. It is intended to compute the Naive Bayes formula without smoothing, and to produce the actual probabilities, so it does include the usually omitted denominator. The problem is that for the example below, the probability of the "good" label comes out > 1 (1.30612245). Can anyone help me understand what that's about? Is it a byproduct of the naive assumption?
package NaiveBayes;
use Moose;

has class_counts         => (is => 'ro', isa => 'HashRef[Int]', default => sub { {} });
has class_feature_counts => (is => 'ro', isa => 'HashRef[HashRef[HashRef[Num]]]', default => sub { {} });
has feature_counts       => (is => 'ro', isa => 'HashRef[HashRef[Num]]', default => sub { {} });
has total_observations   => (is => 'rw', isa => 'Num', default => 0);

# Record one observation: a class label plus a hashref of feature => value pairs.
sub insert {
    my ($self, $class, $data) = @_;
    $self->class_counts->{$class}++;
    $self->total_observations($self->total_observations + 1);
    for (keys %$data) {
        $self->feature_counts->{$_}{$data->{$_}}++;
        $self->class_feature_counts->{$_}{$class}{$data->{$_}}++;
    }
    return $self;
}

# Return a hash of class => p(C) p(F_1|C)...p(F_n|C) / (p(F_1)...p(F_n)).
sub classify {
    my ($self, $data) = @_;
    my %probabilities;
    for my $class (keys %{ $self->class_counts }) {
        my $class_count       = $self->class_counts->{$class};
        my $class_probability = $class_count / $self->total_observations;
        my ($feature_probability, $conditional_probability) = (1) x 2;
        for (keys %$data) {
            my $feature_count       = $self->feature_counts->{$_}{$data->{$_}};
            my $class_feature_count = $self->class_feature_counts->{$_}{$class}{$data->{$_}} || 0;
            next unless $feature_count;
            $feature_probability     *= $feature_count / $self->total_observations;
            $conditional_probability *= $class_feature_count / $class_count;
        }
        $probabilities{$class} = $class_probability * $conditional_probability / $feature_probability;
    }
    return %probabilities;
}

__PACKAGE__->meta->make_immutable;
1;
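For clarity, here is a plain-hash sketch (my illustration, not part of the module) of what a single insert records in the three count structures. The nesting is feature name, then class (for the class/feature counts), then feature value:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Mimic one NaiveBayes::insert call with plain hashes to show the layout.
my (%class_counts, %feature_counts, %class_feature_counts);
my ($class, $data) = ('good', { browser => 'chrome', host => 'yahoo', country => 'us' });

$class_counts{$class}++;
for (keys %$data) {
    $feature_counts{$_}{ $data->{$_} }++;
    $class_feature_counts{$_}{$class}{ $data->{$_} }++;
}

# After this one insert:
#   %class_counts         = (good => 1)
#   %feature_counts       = (browser => {chrome => 1}, host => {yahoo => 1}, country => {us => 1})
#   %class_feature_counts = (browser => {good => {chrome => 1}}, ...)
print $class_feature_counts{browser}{good}{chrome}, "\n";   # prints 1
```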
Example:
#!/usr/bin/env perl
use strict;
use warnings;
use NaiveBayes;

my $nb = NaiveBayes->new;
$nb->insert('good', {browser => 'chrome',   host => 'yahoo',    country => 'us'});
$nb->insert('bad',  {browser => 'chrome',   host => 'slashdot', country => 'us'});
$nb->insert('good', {browser => 'chrome',   host => 'slashdot', country => 'uk'});
$nb->insert('good', {browser => 'explorer', host => 'google',   country => 'us'});
$nb->insert('good', {browser => 'explorer', host => 'slashdot', country => 'ca'});
$nb->insert('good', {browser => 'opera',    host => 'google',   country => 'ca'});
$nb->insert('good', {browser => 'firefox',  host => '4chan',    country => 'us'});
$nb->insert('good', {browser => 'opera',    host => '4chan',    country => 'ca'});

my %classes = $nb->classify({browser => 'opera', host => '4chan', country => 'uk'});
for (sort { $classes{$a} <=> $classes{$b} } keys %classes) {
    printf("%-20s : %.8f\n", $_, $classes{$_});
}
Prints:
bad : 0.00000000
good : 1.30612245
I'm less worried about the 0 probability than about the "probability" of good being > 1. I believe this implements the classic Naive Bayes definition:
p(C | F_1, ..., F_n) = ( p(C) p(F_1|C) ... p(F_n|C) ) / ( p(F_1) ... p(F_n) )
How can this be > 1?
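For what it's worth, working the formula by hand from the eight training rows reproduces the same number, so it does not look like a coding slip. For the query features: 'opera' and '4chan' each appear in 2 of the 8 rows (both times under 'good'), 'uk' appears in 1 row (under 'good'), and 7 of the 8 rows are 'good':

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Counts read off the eight inserts above for the query
# {browser => 'opera', host => '4chan', country => 'uk'}.
my $p_class     = 7/8;                        # p(good)
my $conditional = (2/7) * (2/7) * (1/7);      # p(opera|good) p(4chan|good) p(uk|good)
my $evidence    = (2/8) * (2/8) * (1/8);      # p(opera) p(4chan) p(uk)

printf "%.8f\n", $p_class * $conditional / $evidence;   # prints 1.30612245
```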