Top 10 machine learning interview questions

By admin
August 13, 2021

1. What is the purpose of machine learning?

Making our lives easier is the simplest answer.

In the early days of intelligent applications, many systems used hardcoded rules of if-then-else decisions to process data or adjust to user input. A spam filter is a typical example: its job is to move the appropriate incoming email messages to the spam folder. With machine learning algorithms, however, the system is given enough data to learn and identify patterns on its own.
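The contrast between a hardcoded rule and a learned one can be sketched in a few lines of Python. This is only an illustration: the function names, the blacklist, and the tiny training set are all made up for this example, and the "learning" step is a deliberately simple word count, not a production spam filter.

```python
from collections import Counter

def rule_based_is_spam(message):
    """Hardcoded rule: flag a message only if it contains a blacklisted word."""
    blacklist = {"lottery", "prize"}  # hypothetical hand-written rule
    return any(word in blacklist for word in message.lower().split())

def learn_spam_words(messages, labels):
    """Learn which words occur in more spam messages than non-spam messages."""
    spam_counts, ham_counts = Counter(), Counter()
    for message, is_spam in zip(messages, labels):
        target = spam_counts if is_spam else ham_counts
        target.update(set(message.lower().split()))
    return {word for word, count in spam_counts.items() if count > ham_counts[word]}

def learned_is_spam(message, spam_words):
    """Flag a message when more than half of its words were learned as spammy."""
    words = message.lower().split()
    hits = sum(word in spam_words for word in words)
    return hits * 2 > len(words)

# Made-up labelled examples (True means spam).
messages = [
    "win a free prize now",
    "free lottery tickets",
    "meeting at noon",
    "lunch at noon tomorrow",
    "claim your free voucher",
]
labels = [True, True, False, False, True]

spam_words = learn_spam_words(messages, labels)
print(rule_based_is_spam("free voucher inside"))           # the fixed rule misses it
print(learned_is_spam("free voucher inside", spam_words))  # the learned words catch it
```

Swapping in a different labelled dataset changes what the learned filter flags without touching the code, whereas the rule-based version needs a new rule for every new pattern.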

In machine learning, instead of writing new rules for each problem, we can reuse the same workflow with a different dataset.

Alan Turing asked the question “Can machines think?” in his 1950 paper, “Computing Machinery and Intelligence.” The paper describes the “Imitation Game”, which has three participants: a computer, a human, and another human acting as a judge. The judge converses with both of the other participants, and the computer’s goal is to convince the judge that it is a real person.

The judge must then determine which of the responses came from the computer.

If the judge cannot tell the difference, the computer wins.

Artificial intelligence competitions continue to be held every year in honor of this test.

Goal: persuade judges they are talking to a real person, not an automated computer chatbot.

2. How do machine learning algorithms differ?

Machine learning algorithms come in a variety of shapes, sizes, and types.

In general, they fall into the following categories:

The criteria in the diagram below are not mutually exclusive; we can combine them in any way we like.

3. How does supervised learning work?

Supervised learning is a machine learning approach that infers a function from labelled training data.

The training data consists of a set of training examples.

Example 01:

Given labelled examples of people’s height and weight, a model can learn to predict a person’s gender from those two measurements.

Some of the most popular supervised learning algorithms are support vector machines, regression, naive Bayes, decision trees, and neural networks.
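As a minimal sketch of the height-and-weight example, the code below fits a nearest-centroid classifier, which is one simple supervised method chosen here for brevity and not one the article names. The measurements are invented for illustration, and the helper names `fit_centroids` and `predict` are hypothetical.

```python
def fit_centroids(samples, labels):
    """Training: average the feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Prediction: return the label whose centroid is closest to the input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Invented training examples: (height in cm, weight in kg) with a label each.
samples = [(180, 80), (175, 76), (185, 90), (160, 55), (165, 60), (158, 52)]
labels = ["male", "male", "male", "female", "female", "female"]

centroids = fit_centroids(samples, labels)
print(predict(centroids, (178, 79)))
print(predict(centroids, (162, 57)))
```

The labelled pairs play the role of the training examples, and the learned centroids are the inferred function that maps new measurements to a label.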

4. In what respects does unsupervised learning differ from supervised learning?

Unsupervised learning is a type of machine learning in which the algorithm searches for patterns in a dataset that has no labels.

Because there is no dependent variable or label, the goal is to discover structure in the data rather than to predict a known output.

Unsupervised learning algorithms include clustering, anomaly detection, neural networks, and latent variable models.

Example:

Clustering a set of T-shirts might group them into categories such as “collar style”, “V-neck style”, and “crew neck style”.
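Clustering without labels can be sketched with a small k-means implementation. This is an assumption-laden toy: the article does not name k-means, the two numeric features per item are hypothetical, and the initial centroids are simply the first k points so that the run is deterministic.

```python
def kmeans(points, k, iterations=10):
    """Group points into k clusters; no labels are used at any step."""
    centroids = [list(p) for p in points[:k]]  # naive, deterministic initialisation
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids

# Made-up 2-D feature vectors forming two visibly separate groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (0.9, 1.1), (8.2, 7.9), (7.8, 8.1)]
assignments, centroids = kmeans(points, k=2)
print(assignments)
```

The algorithm only returns cluster indices; giving those clusters names such as "V-neck style" is left to whoever inspects the result.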

What is “naive” in Naive Bayes?

Naive Bayes is a supervised learning method based on Bayes’ theorem. It is called “naive” because it assumes that all features are independent of each other given the class. Given a class variable $y$ and a feature vector $x_1$ through $x_n$, Bayes’ theorem states the following relationship:

$$P(y \mid x_1, \dots, x_n) = \frac{P(y)\, P(x_1, \dots, x_n \mid y)}{P(x_1, \dots, x_n)}$$

Using the naive assumption that each $x_i$ is conditionally independent of the other features given $y$,

$$P(x_i \mid y, x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) = P(x_i \mid y),$$

this relationship simplifies to:

$$P(y \mid x_1, \dots, x_n) = \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \dots, x_n)}$$

Since $P(x_1, \dots, x_n)$ is constant for a given input, we can use the following classification rule:

$$P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y)$$

$$\hat{y} = \arg\max_{y} P(y) \prod_{i=1}^{n} P(x_i \mid y)$$

We can use Maximum A Posteriori (MAP) estimation to estimate $P(y)$ and $P(x_i \mid y)$.

Naive Bayes classifiers differ primarily in the assumptions they make about the distribution of $P(x_i \mid y)$, which can be Bernoulli, binomial, Gaussian, and so forth.
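The classification rule, choosing the class y that maximises P(y) multiplied by the product of the P(xi | y) terms, can be sketched for the Gaussian case as follows. The data is invented, the function names are hypothetical, and the small constant added to each variance is an assumption made here to avoid division by zero. Logarithms are summed in place of multiplying probabilities, which gives the same argmax and is numerically safer.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(samples, labels):
    """Estimate the prior P(y) and a per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)
    model = {}
    for label, rows in grouped.items():
        prior = len(rows) / len(samples)
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (prior, stats)
    return model

def log_gaussian(x, mean, var):
    """Log of the Gaussian density, used as log P(x_i | y)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def predict(model, features):
    """Return argmax over y of log P(y) + sum_i log P(x_i | y)."""
    def log_posterior(label):
        prior, stats = model[label]
        return math.log(prior) + sum(
            log_gaussian(x, mean, var) for x, (mean, var) in zip(features, stats)
        )
    return max(model, key=log_posterior)

# Invented two-feature training data with two classes.
samples = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (3.0, 0.5), (3.2, 0.4), (2.8, 0.6)]
labels = ["a", "a", "a", "b", "b", "b"]

model = fit_gaussian_nb(samples, labels)
print(predict(model, (1.1, 2.0)))
print(predict(model, (3.1, 0.5)))
```

Each feature contributes its own independent term to the sum, which is exactly where the naive independence assumption enters the code.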

5. Why should I care about PCA? What is its purpose?

PCA (principal component analysis) is the most commonly used method for dimensionality reduction. It measures the variation in each variable (or column in the table). Variables or directions with little variation carry little information and can be dropped, as the figure below illustrates, which makes the dataset easier to visualize. PCA is used in fields such as finance, neuroscience, and pharmacology.

As a preprocessing step, it is particularly useful when the features are linearly correlated.
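To make the idea concrete, the sketch below finds the first principal component of a small, linearly correlated dataset and projects the data onto it, reducing two columns to one. It uses power iteration on the covariance matrix, which is one simple way to get the top component and an assumption of this example, not a method the article specifies. The data and the function name are made up.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return the direction of greatest variance and the data projected onto it."""
    n, d = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[v - m for v, m in zip(row, means)] for row in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeated multiplication converges to the top eigenvector,
    # provided the starting vector is not orthogonal to it.
    vec = [1.0] * d
    for _ in range(iterations):
        vec = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    projected = [sum(c * v for c, v in zip(row, vec)) for row in centered]
    return vec, projected

# Invented data in which the second column is roughly twice the first.
rows = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0), (5.0, 10.1)]
component, projected = first_principal_component(rows)
print(component)
print(projected)
```

Because the two columns are almost perfectly correlated, a single projected value per row keeps nearly all of the variation in the original table.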

6. Can you describe the SVM algorithm in detail?

An SVM can perform linear or non-linear classification, regression, and even outlier detection. Suppose we have a set of data points that each belong to one of two classes. In SVM, each data point is viewed as a p-dimensional vector, and we want to know whether we can separate the two classes with a (p-1)-dimensional hyperplane. A classifier that does this is known as a linear classifier.

Many different hyperplanes may classify the data correctly. We choose the hyperplane that gives the greatest separation, or margin, between the two classes.

If such a hyperplane exists, it is known as the maximum-margin hyperplane, and the linear classifier it defines is known as a maximum-margin classifier.

The hyperplane that divides the data best is H3.

We have data $(x_1, y_1), \dots, (x_n, y_n)$, where each $x_i$ is a p-dimensional feature vector and each $y_i$ is either 1 or -1.

The hyperplane H3 is the set of points $x$ satisfying the following equation:

$$w \cdot x - b = 0$$

Here $w$ is the normal vector of the hyperplane, which indicates its orientation.

The parameter $\frac{b}{\lVert w \rVert}$ determines the offset of the hyperplane from the origin along the normal vector $w$.

Each $x_i$ lies on or beyond the margin hyperplane of its own class, the one for 1 or the one for -1.

That is, each $x_i$ satisfies one of the following:

$$w \cdot x_i - b \ge 1 \ \text{(if } y_i = 1\text{)} \quad \text{or} \quad w \cdot x_i - b \le -1 \ \text{(if } y_i = -1\text{)}$$
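The two margin conditions, w·xi − b ≥ 1 for points labelled 1 and w·xi − b ≤ −1 for points labelled −1, combine into the single check yi (w·xi − b) ≥ 1. The sketch below tests that condition for a candidate hyperplane and computes the margin width 2/||w||. It only verifies a given w and b; it does not train an SVM. The points, the candidate parameters, and the function names are assumptions made for illustration.

```python
import math

def decision_value(w, b, x):
    """Compute w . x - b for a single point."""
    return sum(wi * xi for wi, xi in zip(w, x)) - b

def satisfies_margin(w, b, points, labels):
    """Check y_i * (w . x_i - b) >= 1 for every training point."""
    return all(y * decision_value(w, b, x) >= 1 for x, y in zip(points, labels))

def margin_width(w):
    """Distance between the two margin hyperplanes, 2 / ||w||."""
    return 2 / math.sqrt(sum(wi * wi for wi in w))

# Invented, linearly separable points with labels in {1, -1}.
points = [(2, 2), (3, 3), (0, 0), (-1, 0)]
labels = [1, 1, -1, -1]

good = satisfies_margin((0.5, 0.5), 1, points, labels)    # a valid separating hyperplane
weak = satisfies_margin((0.1, 0.1), 0.2, points, labels)  # same direction, scaled too small
print(good, weak, margin_width((0.5, 0.5)))
```

Among all (w, b) pairs that pass the check, an SVM looks for the one with the smallest ||w||, since that gives the widest margin.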

7. What are support vectors in SVM?

A Support Vector Machine (SVM) is an algorithm that tries to fit a line (or plane or hyperplane) between the different classes so that the distance from that line to the nearest points of each class is as large as possible.

In this way, it looks for a robust separation between the classes. As shown in the figure below, the support vectors are the points that lie on the edge of the margin around the dividing hyperplane.

8. In SVM, what are the different kernels?

SVM supports several types of kernels, including:

Linear kernel: used when the data is linearly separable.

Polynomial kernel: used when you have discrete data that lacks a natural notion of smoothness.

Radial basis function (RBF) kernel: creates a decision boundary that can separate two classes much better than the linear kernel.

Sigmoid kernel: used as an activation function for neural networks.
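For reference, four widely used kernels (linear, polynomial, radial basis function, and sigmoid) can be written as plain functions of two vectors. This is a sketch in standard-library Python: the parameter names `degree`, `gamma`, and `coef0` follow the naming used by scikit-learn, while the default values here are illustrative assumptions.

```python
import math

def dot(x, z):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))

def linear_kernel(x, z):
    return dot(x, z)

def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    return (gamma * dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    # Depends only on the squared distance between the two points.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)

x, z = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(x, z))
print(polynomial_kernel(x, z, degree=2))
print(rbf_kernel(x, z))
print(sigmoid_kernel(x, z))
```

Each function returns a similarity score for a pair of points; an SVM uses these scores in place of the plain inner product to obtain non-linear decision boundaries.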

10. What is Cross-Validation and how does it work?

Cross-validation is a method of repeatedly splitting the data into a part used for training and a part held out for testing or validation.

The data is split into k subsets, and the model is trained on k-1 of them. The remaining subset is held out for testing. This is repeated so that each subset is held out once. This procedure is called k-fold cross-validation. The final score is calculated by averaging the scores from all k folds.
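The k-fold procedure can be sketched in plain Python as follows. The splitting helper and the `cross_validate` function are hypothetical names, the majority-class "model" is a stand-in chosen only to keep the example short, and the labels are invented. No shuffling is done, which real use would normally add.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def cross_validate(samples, labels, k, fit, score):
    """Train on k-1 folds, score on the held-out fold, and average the k scores."""
    scores = []
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        scores.append(score(model, [samples[i] for i in test], [labels[i] for i in test]))
    return sum(scores) / len(scores)

def fit_majority(train_samples, train_labels):
    """Stand-in model: always predict the most common training label."""
    return max(set(train_labels), key=train_labels.count)

def accuracy(model, test_samples, test_labels):
    return sum(model == y for y in test_labels) / len(test_labels)

# Invented data: six samples, one of which has the minority label.
samples = list(range(6))
labels = [1, 1, 1, 1, 0, 1]

mean_score = cross_validate(samples, labels, k=3, fit=fit_majority, score=accuracy)
print(mean_score)
```

Passing `fit` and `score` as arguments keeps the splitting logic independent of the model, so the same loop can evaluate any estimator.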

What is bias in machine learning?

Bias in data tells us that the data is inconsistent. The inconsistency can have several different causes.

Example: suppose a company such as Amazon builds an engine to speed up its hiring process. It feeds 100 resumes into the engine, which selects the top 5 candidates to hire.

When the company realised that the software was not producing gender-neutral results, the software was modified to remove this bias.

Consider the following question: What’s the weather forecast for tomorrow?
