Posts

Statistical Distributions

Different types of distributions.

Bernoulli distribution: A Bernoulli distribution is a discrete probability distribution with two possible outcomes, usually called "success" and "failure." The probability of success is denoted by p and the probability of failure by q = 1 − p. The Bernoulli distribution can model a variety of events, such as whether a coin toss results in heads, whether a student passes an exam, or whether a customer makes a purchase.

Uniform distribution: A uniform distribution assigns equal probability to all values within a specified range. It can be discrete, as in the roll of a fair die or the draw of a card from a well-shuffled deck, or continuous, as in a completion time that is equally likely to fall anywhere in an interval.

Binomial distribution: A binomial distribution is a discrete probability distribution that describes the number of successes in a sequence of n independent trials, each with the same probability of success p.
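As a quick illustration, here is a minimal sampling sketch for the three distributions above using NumPy's random generator; the parameter values (p = 0.3, a 0–5 range, n = 10 trials) are placeholders chosen for this sketch, not values from the post.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Bernoulli: a single trial with success probability p (failure q = 1 - p).
p = 0.3
coin_flip = rng.random() < p          # True with probability p

# Uniform: every value in the range is equally likely.
task_time = rng.uniform(low=0.0, high=5.0)   # continuous case
die_roll = rng.integers(low=1, high=7)       # discrete case: fair die

# Binomial: number of successes in n independent Bernoulli(p) trials.
successes = rng.binomial(n=10, p=p)

print(coin_flip, task_time, die_roll, successes)
```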

Gini Index & Information Gain in Machine Learning

What is the Gini index? The Gini index is a measure of impurity in a set of data. It is calculated as one minus the sum of the squared class probabilities: Gini = 1 − Σ pᵢ². A lower Gini index indicates a purer set of data.

What is information gain? Information gain is a measure of how much information is gained by splitting a set of data on a particular feature. It is calculated by comparing the entropy of the original set to the size-weighted average entropy of the child sets. A higher information gain indicates that the feature is more effective at splitting the data.

What is impurity? Impurity is a measure of how mixed the classes are in a set of data. A more impure set of data has a higher Gini index (and a higher entropy).

How are the Gini index and information gain related? Both are used to judge the quality of a split, but they are calculated differently: the Gini index is one minus the sum of the squared class probabilities, while information gain is the entropy of the original set minus the weighted entropy of the child sets after the split.
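To make the two formulas concrete, here is a minimal sketch that computes both measures from lists of class labels; the helper names (gini, entropy, information_gain), the binary split, and the base-2 entropy convention are this sketch's own assumptions.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    total = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

def entropy(labels):
    """Shannon entropy: -sum of p * log2(p) over the classes."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

parent = ["a", "a", "a", "b", "b", "b"]
left, right = ["a", "a", "a"], ["b", "b", "b"]
print(gini(parent))                           # 0.5: maximally mixed, two classes
print(information_gain(parent, left, right))  # 1.0: a perfect split
```

Note how the two agree directionally: the perfectly mixed parent has the highest possible two-class Gini index (0.5), and the split that separates the classes completely recovers the full 1 bit of entropy.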