Showing posts with label data science. Show all posts

Tuesday

Calculating Vaccine Effectiveness with Bayes' Theorem


We can use Bayes' Theorem to estimate the probability that a vaccine has no effect for a given person (meaning they get infected after vaccination) for both Covishield and Covaxin, considering a population of 1.4 billion individuals.


Assumptions:


We assume equal distribution of both vaccines in the population (700 million each).


We focus on individual protection probabilities, not overall disease prevalence.


Calculations:


Covishield:


Prior Probability (P(Effect)): Assume 10% of the vaccinated population gets infected (no effect), making P(Effect) = 0.1.


Likelihood (P(No Effect|Effect)): This represents the probability of someone not being infected given they received Covishield. Given its 90% effectiveness, P(No Effect|Effect) = 0.9.


Marginal Probability (P(No Effect)): This needs calculation, considering both vaccinated and unvaccinated scenarios: P(No Effect) = P(No Effect|Vaccinated) * P(Vaccinated) + P(No Effect|Unvaccinated) * P(Unvaccinated). Assuming P(No Effect|Unvaccinated) = 0.5 and equal vaccinated and unvaccinated shares (0.5 each), P(No Effect) = (0.9 * 0.5) + (0.5 * 0.5) = 0.7.


Now, applying Bayes' Theorem:


P(Effect|No Effect) = (P(No Effect|Effect) * P(Effect)) / P(No Effect)

P(Effect|No Effect) = (0.9 * 0.1) / 0.7 ≈ 0.129


Therefore, about 12.9% of people vaccinated with Covishield could still get infected, meaning 700 million * 0.129 ≈ 90.3 million individuals might not have the desired effect from the vaccine.
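The arithmetic above can be reproduced in a few lines of Python. This is only a sketch that mirrors the figures stated in this post (the 0.1 prior, the 0.9 likelihood, the 0.5 figure for unvaccinated individuals and the 0.5 share); the function and variable names are illustrative, and the result is only as reliable as those assumptions.

```python
def posterior(likelihood: float, prior: float, marginal: float) -> float:
    """Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / marginal

# All figures below are the assumptions stated in the text, not measured data.
prior = 0.1             # P(Effect)
likelihood = 0.9        # P(No Effect|Effect), taken from the 90% effectiveness
unvaccinated = 0.5      # assumed P(No Effect|Unvaccinated)
share = 0.5             # assumed P(Vaccinated) = P(Unvaccinated)

marginal = likelihood * share + unvaccinated * share    # P(No Effect)
p = posterior(likelihood, prior, marginal)              # P(Effect|No Effect)
people = 700_000_000 * round(p, 3)                      # Covishield recipients

print(f"P(No Effect)        = {marginal:.3f}")          # 0.700
print(f"P(Effect|No Effect) = {p:.3f}")                 # 0.129
print(f"People (millions)   = {people / 1e6:.1f}")      # 90.3
```

Changing any of the four assumed inputs changes the result directly, which makes it easy to see how sensitive the final count is to them.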


Covaxin:


Similar calculations for Covaxin, with its 78-81% effectiveness range, would yield a range of 19.5% - 22.2% for the "no effect" probability. This translates to potentially 136.5 million - 155.4 million individuals not fully protected by Covaxin in the given population.


Important Note:


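These are hypothetical calculations based on limited assumptions: the 10% prior, the 50% figure for unvaccinated individuals, and the equal split between the two vaccines are illustrative values, not measurements. Real-world effectiveness can vary depending on individual factors, virus strains, and vaccination coverage.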


Conclusion:


Both Covishield and Covaxin offer significant protection against COVID-19, but they are not 100% effective. A significant portion of the vaccinated population might still have some risk of infection. Vaccination remains crucial for reducing disease spread and severe outcomes, but additional precautions like hand hygiene and masks might be advisable. 

Thursday

Activation Function in Machine Learning

 


In machine learning, activation functions are crucial components of artificial neural networks. They introduce non-linearity into the network, enabling it to learn and represent complex patterns in data. Here's a breakdown of the concept and examples of common activation functions:

1. What is an Activation Function?

  • Purpose: Introduces non-linearity into a neural network, allowing it to model complex relationships and make better predictions.
  • Position: Located within each neuron of a neural network, applied to the weighted sum of inputs before passing the output to the next layer.
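As a rough illustration of where the activation sits, here is a minimal single-neuron sketch in plain Python. The inputs, weights and bias are made-up values, and the bias term is an addition not mentioned above (most neuron definitions include one).

```python
def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

def identity(z):
    """No activation: the neuron stays a purely linear function."""
    return z

def relu(z):
    """A simple non-linear activation: negative sums become 0."""
    return max(0.0, z)

# Hypothetical values, chosen only for illustration.
x = [1.0, -2.0]
w = [0.5, 1.0]
b = 0.25

linear_out = neuron(x, w, b, identity)   # 0.5 - 2.0 + 0.25 = -1.25
relu_out = neuron(x, w, b, relu)         # the negative sum is clipped to 0.0
```

Without the non-linear step, stacking such neurons in layers would still compute only a linear function of the inputs.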

2. Common Activation Functions and Examples:

a. Sigmoid:

  • Output: S-shaped curve between 0 and 1.
  • Use Cases: Binary classification, historical use in early neural networks.
  • Example: Predicting if an image contains a cat (output close to 1) or not (output close to 0).

b. Tanh (Hyperbolic Tangent):

  • Output: S-shaped curve between -1 and 1.
  • Use Cases: Similar to sigmoid, often preferred for its centred output.
  • Example: Sentiment analysis, classifying text as positive (close to 1), neutral (around 0), or negative (close to -1).

c. ReLU (Rectified Linear Unit):

  • Output: 0 for negative inputs, x for positive inputs (x = input value).
  • Use Cases: Very popular in deep learning, helps mitigate the vanishing gradient problem.
  • Example: Image recognition, detecting edges and features in images.

d. Leaky ReLU:

  • Output: Small, non-zero slope for negative inputs, x for positive inputs.
  • Use Cases: Variation of ReLU, addresses potential "dying ReLU" issue.
  • Example: Natural language processing, capturing subtle relationships in text.

e. Softmax:

  • Output: Probability distribution over multiple classes (sums to 1).
  • Use Cases: Multi-class classification, often as the final layer of a multi-class neural network.
  • Example: Image classification, assigning probabilities to each possible object in an image.

f. PReLU (Parametric ReLU):

  • Concept: Similar to Leaky ReLU, but the slope for negative inputs is a learnable parameter (α) rather than a fixed constant. Positive inputs pass through unchanged; negative inputs are multiplied by α instead of being set to 0.
  • Benefits: Addresses the "dying ReLU" issue where neurons become inactive due to always outputting 0 for negative inputs.
  • Drawbacks: Increases model complexity due to the additional parameter to learn.
  • Example: Speech recognition tasks, where capturing subtle variations in audio tones might be crucial.

g. SELU (Scaled Exponential Linear Unit):

  • Concept: A scaled version of ELU (Exponential Linear Unit). Two fixed constants make the activations self-normalizing, meaning they tend toward zero mean and unit variance from layer to layer, which reduces the need for explicit normalization techniques.
  • Benefits: Can improve gradient flow and convergence speed and helps counter vanishing gradients, provided the weights use a matching initialization (LeCun normal).
  • Drawbacks: Slightly more computationally expensive than Leaky ReLU due to the exponential calculation.
  • Example: Computer vision tasks where consistent and stable activations are important, like image classification or object detection.

h. SoftPlus:

  • Concept: A smooth approximation of ReLU. A log function maps negative inputs smoothly towards 0, avoiding ReLU's sharp corner at 0.
  • Benefits: Differentiable everywhere (ReLU is not differentiable at 0), with a gradient that is never exactly zero; offers smoother outputs for regression tasks.
  • Drawbacks: More expensive to compute than ReLU because it needs an exponential and a logarithm, never outputs exactly 0 so it does not produce sparse activations, and its gradient approaches 0 for large negative inputs.
  • Example: Regression tasks where predicting smooth outputs with continuous changes is important, like stock price prediction or demand forecasting.

Formulas for the activation functions above

1. Sigmoid:

  • Formula: f(x) = 1 / (1 + exp(-x))
  • Output: S-shaped curve between 0 and 1, with a steep transition around 0.
  • Use Cases: Early neural networks, binary classification, logistic regression.
  • Pros: Smooth and differentiable, provides probabilities in binary classification.
  • Cons: Saturates for large |x|, which causes vanishing gradients in deeper networks; the output is not zero-centred; the exponential makes it more expensive than ReLU.
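A minimal sketch of the sigmoid formula in plain Python follows. The branch on the sign of x is an implementation choice made here to avoid overflow in exp for large negative inputs; it computes the same value as the formula.

```python
import math

def sigmoid(x: float) -> float:
    """f(x) = 1 / (1 + exp(-x)), evaluated without overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)              # x < 0, so exp(x) is at most 1
    return z / (1.0 + z)

def sigmoid_grad(x: float) -> float:
    """Derivative f'(x) = f(x) * (1 - f(x)); its maximum is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

print(sigmoid(0.0))        # 0.5
print(sigmoid_grad(0.0))   # 0.25
```

The gradient shrinks quickly away from 0 (sigmoid_grad(10.0) is below 0.0001), which is the source of the vanishing-gradient issue listed under Cons.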

2. Tanh (Hyperbolic Tangent):

  • Formula: f(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
  • Output: S-shaped curve between -1 and 1, centered around 0.
  • Use Cases: Similar to sigmoid, often preferred for its centred output.
  • Pros: Zero-centred output, and stronger gradients near 0 than sigmoid (maximum slope of 1 versus 0.25).
  • Cons: Still saturates at the extremes, so it remains susceptible to vanishing gradients in deep networks; slightly computationally expensive.
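The tanh formula can be checked against Python's built-in math.tanh with a short sketch. The helper names are illustrative; in practice the library function is preferable because the literal formula overflows for very large |x|.

```python
import math

def tanh_formula(x: float) -> float:
    """Literal form of (exp(x) - exp(-x)) / (exp(x) + exp(-x))."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

def tanh_grad(x: float) -> float:
    """Derivative f'(x) = 1 - tanh(x)^2; its maximum is 1 at x = 0."""
    t = math.tanh(x)
    return 1.0 - t * t

for v in (-2.0, 0.0, 0.5):
    # The literal formula and the library function agree to rounding error.
    print(v, tanh_formula(v), math.tanh(v), tanh_grad(v))
```

The derivative is 1 at x = 0 and drops below 0.001 by x = 5, so tanh also saturates for inputs far from 0.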

3. ReLU (Rectified Linear Unit):

  • Formula: f(x) = max(0, x)
  • Output: Clips negative inputs to 0 and passes positive inputs through unchanged.
  • Use Cases: Popular choice in deep learning, image recognition, and natural language processing.
  • Pros: Mitigates the vanishing gradient problem (the gradient is 1 for every positive input), is computationally efficient, and promotes sparsity.
  • Cons: "Dying ReLU" issue: a neuron whose input stays negative outputs 0 with zero gradient, so it stops learning.
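A small sketch of ReLU and its gradient in plain Python. Returning a gradient of 0 at exactly x = 0 is a convention assumed here, since the function has a corner at that point.

```python
def relu(x: float) -> float:
    """f(x) = max(0, x)."""
    return max(0.0, x)

def relu_grad(x: float) -> float:
    """Gradient: 1 for positive inputs, 0 otherwise (0 is assumed at x = 0)."""
    return 1.0 if x > 0 else 0.0

samples = [-3.0, -0.5, 0.0, 2.0]
activations = [relu(v) for v in samples]       # [0.0, 0.0, 0.0, 2.0]
gradients = [relu_grad(v) for v in samples]    # [0.0, 0.0, 0.0, 1.0]
```

The zero gradient for every negative input is what allows a neuron to "die": if its inputs stay negative, no update reaches its weights.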

4. Leaky ReLU:

  • Formula: f(x) = max(α * x, x) for some small α > 0
  • Output: Similar to ReLU, but allows a small positive slope for negative inputs.
  • Use Cases: Addresses ReLU's "dying" issue, natural language processing, and audio synthesis.
  • Pros: Combines benefits of ReLU with slight negative activation, helps prevent dying neurons.
  • Cons: Introduces another hyperparameter to tune (α), slightly less computationally efficient than ReLU.
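Leaky ReLU differs from ReLU only on the negative side, as this sketch shows. The default α = 0.01 is a commonly used value assumed here for illustration, not one prescribed by the text.

```python
def leaky_relu(x: float, alpha: float = 0.01) -> float:
    """f(x) = x for x >= 0, alpha * x otherwise (same as max(alpha*x, x) for 0 < alpha < 1)."""
    return x if x >= 0 else alpha * x

def leaky_relu_grad(x: float, alpha: float = 0.01) -> float:
    """Gradient: 1 for x >= 0, alpha otherwise, so it is never exactly zero."""
    return 1.0 if x >= 0 else alpha

samples = [-2.0, 0.0, 3.0]
outputs = [leaky_relu(v) for v in samples]        # approximately [-0.02, 0.0, 3.0]
slopes = [leaky_relu_grad(v) for v in samples]    # [0.01, 1.0, 1.0]
```

Because the slope on the negative side is α rather than 0, a neuron with negative inputs still receives a small gradient.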

5. Softmax:

  • Formula: f_i(x) = exp(x_i) / sum_j exp(x_j), where the sum runs over all classes j
  • Output: Probability distribution over multiple classes (sums to 1).
  • Use Cases: Multi-class classification, final layer in multi-class neural networks.
  • Pros: Provides normalized probabilities for each class, and allows for confidence estimation.
  • Cons: Sensitive to scale changes in inputs, computationally expensive compared to other options.
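Here is a sketch of softmax over a plain Python list. Subtracting the largest logit before exponentiating is a standard stability trick assumed here; it leaves the result unchanged because softmax is invariant to adding a constant to every input.

```python
import math

def softmax(logits):
    """f_i(x) = exp(x_i) / sum_j exp(x_j), shifted by max(x) to avoid overflow."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])     # roughly [0.659, 0.242, 0.099]
scaled = softmax([4.0, 2.0, 0.2])    # same logits multiplied by 2
```

Comparing probs with scaled shows the scale sensitivity listed under Cons: doubling the logits pushes more probability onto the largest class.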

6. PReLU (Parametric ReLU):

  • Formula: f(x) = x if x ≥ 0 else αx (equivalent to max(αx, x) when α ≤ 1)
  • Explanation:
    • For x ≥ 0, the output is simply x (same as ReLU).
    • For x < 0, the output is αx, where α is a learnable parameter that adjusts the slope of negative values.
    • The parameter α is initialized to a small value (0.25 in the original PReLU paper) and learned during training, allowing the model to determine the optimal slope for negative inputs.
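To illustrate what "learnable" means for α, the sketch below applies one gradient-descent step to α on a single toy example. The squared-error loss, the learning rate, the input, the target and the starting α are all hypothetical values chosen for the illustration.

```python
def prelu(x: float, alpha: float) -> float:
    """f(x) = x for x >= 0, alpha * x otherwise."""
    return x if x >= 0 else alpha * x

def prelu_grad_alpha(x: float) -> float:
    """Derivative of f with respect to alpha: 0 for x >= 0, x otherwise."""
    return 0.0 if x >= 0 else x

# Hypothetical toy setup: one negative input and a target output.
alpha = 0.25
x, target = -2.0, -1.0
learning_rate = 0.1

pred = prelu(x, alpha)                              # 0.25 * -2.0 = -0.5
grad = 2.0 * (pred - target) * prelu_grad_alpha(x)  # d/d_alpha of (pred - target)^2
new_alpha = alpha - learning_rate * grad            # alpha moves from 0.25 to 0.45
new_pred = prelu(x, new_alpha)                      # closer to the target of -1.0
```

In a real network the same update is performed by the training framework alongside the weight updates; only negative inputs contribute to the gradient of α.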

7. SELU (Scaled Exponential Linear Unit):

  • Formula: f(x) = lambda * x if x >= 0 else lambda * alpha * (exp(x) - 1)
  • Explanation:
    • For x ≥ 0, the output is lambda * x, where lambda is a scaling factor (usually around 1.0507).
    • For x < 0, the output is lambda * alpha * (exp(x) - 1), where alpha is a fixed parameter (usually 1.67326).
    • The scaling and exponential terms help normalize the activations and improve gradient flow, often leading to faster and more stable training.
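The SELU formula translates directly into Python. The constants below are the rounded values quoted above; math.expm1(x) computes exp(x) - 1 and is used here only for better accuracy near 0.

```python
import math

LAMBDA = 1.0507     # scaling factor, rounded as in the text
ALPHA = 1.67326     # fixed parameter, rounded as in the text

def selu(x: float) -> float:
    """f(x) = lambda * x for x >= 0, lambda * alpha * (exp(x) - 1) otherwise."""
    if x >= 0:
        return LAMBDA * x
    return LAMBDA * ALPHA * math.expm1(x)

for v in (-5.0, -1.0, 0.0, 1.0):
    print(v, selu(v))
```

For large negative inputs the output levels off near -lambda * alpha (about -1.758), while positive inputs are simply scaled by lambda.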

8. SoftPlus:

  • Formula: f(x) = ln(1 + exp(x))
  • Explanation:
    • Transforms negative inputs towards 0 using a logarithmic function, resulting in a smooth, continuous curve.
    • Provides a smooth transition between 0 and positive values, avoiding the sharp cutoff of ReLU.
    • Its derivative is the sigmoid function, so the gradient is never exactly zero and changes gradually with the input, making it suitable for tasks where continuous variations are important.
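A sketch of SoftPlus in plain Python follows. The rewritten form max(x, 0) + log(1 + exp(-|x|)) is algebraically equal to ln(1 + exp(x)) and is an implementation choice made here so that large inputs do not overflow.

```python
import math

def softplus(x: float) -> float:
    """ln(1 + exp(x)), computed as max(x, 0) + log1p(exp(-|x|)) for stability."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def softplus_grad(x: float) -> float:
    """The derivative of softplus is the sigmoid function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

for v in (-30.0, 0.0, 30.0):
    print(v, softplus(v), softplus_grad(v))
```

At x = 0 the output is ln 2 (about 0.693) with slope 0.5; at x = 30 the output is almost exactly 30, and at x = -30 it is almost 0.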

Key points to remember:

  • The choice of activation function significantly impacts a neural network's performance and training dynamics.
  • Experimenting with different activation functions and evaluating their performance on your specific task is crucial for finding the best fit.
  • Consider the problem type, network architecture, desired properties (e.g., smoothness, non-linearity, normalization), and computational cost when selecting an activation function.

When weighing these factors for your own task, useful questions include:

  • Problem type: Is it classification, regression, or something else?
  • Network architecture: How deep is the network, and what other activation functions are used?
  • Performance considerations: Do you prioritize faster training or better accuracy?

Azure Data Factory Transform and Enrich Activity with Databricks and Pyspark

In #azuredatafactory, the #transform and #enrich part can be done automatically or written manually with #pyspark. Two examples below one data so...