• Other names: A/B testing, confirmatory analysis, and significance testing.
  • Generally, population parameters (standard deviation, maximum, minimum, and so on) are unknown in practice.
  • However, we do have hypotheses about what the true values are.
  • Hypothesis testing is a set of methods for evaluating a hypothesis about a population parameter based on the available sample statistics.
  • A hypothesis test involves two statements: the null hypothesis and the alternative hypothesis.

Null Hypothesis (H0):

  • A general statement about the population parameters that is assumed to be true unless there is strong evidence for the opposite statement.
  • The default statement is that there is no difference between the measured…
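The idea of keeping the null hypothesis unless the evidence against it is strong can be sketched with a coin. This is a minimal illustration, not a method taken from the article: the fair-coin null hypothesis, the 16-heads-in-20-tosses sample, the 0.05 significance level, and the function name `binomial_two_sided_p_value` are all assumptions chosen for the example.

```python
from math import comb


def binomial_two_sided_p_value(heads: int, tosses: int) -> float:
    """Exact two-sided p-value for H0: the coin is fair (P(heads) = 0.5)."""
    # Under H0 every sequence of tosses is equally likely, so
    # P(X = k) = C(tosses, k) / 2**tosses.
    total = 2 ** tosses
    observed = comb(tosses, heads)
    # Add up the probabilities of every head count that is
    # no more likely than the one actually observed.
    extreme = sum(
        comb(tosses, k)
        for k in range(tosses + 1)
        if comb(tosses, k) <= observed
    )
    return extreme / total


ALPHA = 0.05  # assumed significance level (a common convention)

# Hypothetical sample: 16 heads in 20 tosses.
p_value = binomial_two_sided_p_value(heads=16, tosses=20)
reject_null = p_value < ALPHA

print(f"p-value = {p_value:.4f}, reject H0: {reject_null}")
```

A small p-value means the observed sample would be unusual if the null hypothesis were true, which is what counts as strong evidence for the alternative hypothesis in this sketch.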

t-SNE stands for t-distributed Stochastic Neighbor Embedding

Dimensionality reduction

  • Data with one, two, or three dimensions can be visualized directly, but in data science it is not always possible to work with a dataset that has three or fewer dimensions. We often have to work with higher-dimensional data. A data science professional needs to visualize the working data and draw insights from it to do a better job. Dimensionality reduction techniques were developed to address this.
  • Another popular use case of dimensionality reduction is to reduce the computational complexity while training…

Experiment → an uncertain situation that could have multiple outcomes. A coin toss is an experiment.

Outcome → the result of a single trial. If the coin lands heads up, the outcome of the coin toss experiment is “Heads”.

Event → one or more outcomes from an experiment. “Tails” is one of the possible events for this experiment.

Basic Probability

The chance of something happening or, in academic terms, the “likelihood of an event or sequence of events occurring”. For example:

  • Tossing a coin
  • Rolling a die
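For experiments like these, where every outcome is equally likely, the probability of an event is the number of outcomes in the event divided by the total number of outcomes. A minimal sketch, assuming a fair coin and a fair six-sided die (the helper name `probability` is made up for this example):

```python
from fractions import Fraction


def probability(event, sample_space):
    """P(event) when every outcome in sample_space is equally likely."""
    favorable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favorable), len(sample_space))


coin = ["Heads", "Tails"]   # all outcomes of one coin toss
die = [1, 2, 3, 4, 5, 6]    # all outcomes of one die roll

p_heads = probability({"Heads"}, coin)   # event: the coin shows heads
p_even = probability({2, 4, 6}, die)     # event: the die shows an even number

print(p_heads, p_even)
```

Using `Fraction` keeps the results exact instead of rounding them to floating-point values.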

Conditional Probability

The probability of an event occurring given that another event has already occurred. For example:

  • Picking 3 blue balls from a box has…
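As a separate illustration (not the ball example above), conditional probability can be computed by counting: P(A | B) is the share of the outcomes in B that also belong to A. The sketch below assumes two fair six-sided dice; the function name `conditional_probability` and the chosen events are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely (first die, second die) pairs.
rolls = list(product(range(1, 7), repeat=2))


def conditional_probability(event, given, sample_space):
    """P(event | given) = P(event and given) / P(given), by counting outcomes."""
    given_outcomes = [o for o in sample_space if given(o)]
    both = [o for o in given_outcomes if event(o)]
    return Fraction(len(both), len(given_outcomes))


# Event A: the two dice sum to 8. Event B: the first die is even.
p_sum8_given_even = conditional_probability(
    event=lambda r: r[0] + r[1] == 8,
    given=lambda r: r[0] % 2 == 0,
    sample_space=rolls,
)

# Unconditional probability of A, for comparison.
p_sum8 = Fraction(sum(1 for r in rolls if r[0] + r[1] == 8), len(rolls))

print(p_sum8_given_even, p_sum8)
```

Knowing that the first die is even changes the probability of a sum of 8, which is the sense in which one event conditions the other.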

  • Measure of Central Tendency
  • Measure of Spread
  • Dependence

Measure of Central Tendency:

Mean → Average of a set of data points.

Median → The middle element of the data points when they are sorted in ascending order.

Mode → The data point that appears the greatest number of times in a set of data points.

Measure of Spread:

Standard Deviation (SD) → A measure of the typical distance between the mean and each data point (the square root of the variance).

Variance → The average squared distance of the values in the data set from the mean (the square of the SD).

Range → Maximum value minus Minimum value from a set of data points.

Percentile → Representation of position of a value in…
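The measures of central tendency and spread defined above can be sketched with Python's standard `statistics` module. The data set is made up for illustration, and it is treated as a whole population, which is why `pvariance` and `pstdev` are used; for a sample drawn from a larger population, `variance` and `stdev` would be the usual choice.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set

# Measures of central tendency
mean = statistics.mean(data)      # sum of the values divided by their count
median = statistics.median(data)  # middle of the sorted values
mode = statistics.mode(data)      # most frequent value

# Measures of spread
variance = statistics.pvariance(data)  # average squared distance from the mean
std_dev = statistics.pstdev(data)      # square root of the variance
value_range = max(data) - min(data)    # maximum minus minimum

print(mean, median, mode, variance, std_dev, value_range)
```

Because this data set has an even number of values, the median is the average of the two middle values rather than a single element.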

Ramakrishnan Thiyagu
