This Machine Learning Interview Questions and Answers guide covers the most common beginner-friendly Machine Learning topics—supervised vs. unsupervised learning, model evaluation, overfitting, bias–variance, regularization, and more. Each answer is short and practical so you can revise fast before an interview.

25 Machine Learning Interview Questions and Answers for Freshers

Que 1. What is Machine Learning?

Answer: Machine Learning is a subset of AI where algorithms learn patterns from data to make predictions or decisions without being explicitly programmed for every rule.

Que 2. What is the difference between supervised and unsupervised learning?

Answer:
Supervised: Trains on labeled data (features + target). Used for classification/regression.
Unsupervised: Trains on unlabeled data to discover structure (e.g., clustering, dimensionality reduction).

Que 3. What is classification vs. regression?

Answer:
Classification: Predicts discrete categories (e.g., spam/not spam).
Regression: Predicts continuous values (e.g., house price).

Que 4. What is overfitting?

Answer: Overfitting occurs when a model learns noise or patterns specific to the training data, so it scores well on the training set but performs poorly on new data. In other words, it generalizes badly.

Que 5. How can you reduce overfitting?

Answer:
Use more data or data augmentation.
Apply regularization (L1/L2, dropout).
Simplify the model.
Use cross-validation to detect overfitting and early stopping to halt training before it sets in.

Que 6. What is underfitting?

Answer: Underfitting happens when the model is too simple to capture underlying patterns, leading to poor performance on both training and test data.

Que 7. Explain the bias–variance trade-off.

Answer: Bias is error from overly simple assumptions; variance is error from excessive sensitivity to the particular training data. Reducing one typically increases the other, so a good model sits at the level of complexity that keeps total error low and generalizes well.

Que 8. What is a training set, validation set, and test set?

Answer:
Training: For learning model parameters.
Validation: For tuning hyperparameters and model selection.
Test: For final, unbiased performance evaluation.
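
The three-way split can be sketched in plain Python. This is a minimal illustration, not a standard API: the function name train_val_test_split and the 60/20/20 default fractions are assumptions chosen for the example.

```python
import random

def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows, then cut them into train, validation, and test lists.

    Hypothetical helper; the fractions and seed are illustrative defaults.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(range(10))
print(len(train), len(val), len(test))
```

Shuffling before splitting avoids accidental ordering effects; for time-series data a chronological split is usually more appropriate.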

Que 9. What is cross-validation?

Answer: A technique (e.g., k-fold) that splits data into multiple folds to train and validate repeatedly, providing a more reliable estimate of model performance.
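
A minimal sketch of how k-fold produces its splits. The helper name k_fold_indices is hypothetical, and the sketch assumes the data has already been shuffled, since it cuts folds in index order.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

for train_idx, val_idx in k_fold_indices(6, 3):
    print("train:", train_idx, "validate:", val_idx)
```

In practice you would train a fresh model on each train_idx, score it on the matching val_idx, and average the k scores.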

Que 10. What is a confusion matrix?

Answer: A table for classification that shows true/false positives and negatives, helping derive metrics like accuracy, precision, recall, and F1-score.
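
For a binary problem, the four cells can be counted directly. The sketch below is illustrative only; the function name confusion_counts and the toy labels are made up for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # predicted positive, actually positive
        elif predicted == positive:
            fp += 1  # predicted positive, actually negative
        elif actual == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
print(confusion_counts(y_true, y_pred))
```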

Que 11. Define precision, recall, and F1-score.

Answer:
Precision: TP / (TP + FP) — how many predicted positives are correct.
Recall: TP / (TP + FN) — how many actual positives are found.
F1: Harmonic mean of precision and recall, 2 × (precision × recall) / (precision + recall); balances both.
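
The three formulas translate directly into code. This is a small sketch; returning 0.0 when a denominator is zero is one common convention and is an assumption here, not something the definitions themselves dictate.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # assumed convention for 0/0
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # assumed convention for 0/0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.3f}")
```

With 8 true positives, 2 false positives, and 8 false negatives, precision is 0.8 and recall is 0.5; the F1 score lands closer to the lower of the two, which is the point of using a harmonic mean.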

Que 12. What is accuracy and when can it be misleading?

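Answer: Accuracy = correct predictions / total predictions. It can be misleading on imbalanced datasets where one class dominates: a model that always predicts the majority class scores high accuracy while never detecting the minority class.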
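
A toy example makes the problem concrete. The numbers are invented for illustration: 5 positives out of 100 samples, and a "model" that always predicts the negative class.

```python
# Hypothetical imbalanced dataset: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# A trivial classifier that always predicts the majority class.
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / sum(y_true)

print("accuracy:", accuracy)  # high, even though the model is useless
print("recall:", recall)      # no positive case is ever found
```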

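Que 13. What are the ROC curve and AUC?

Answer: The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) at various classification thresholds. AUC is the area under the ROC curve; 0.5 corresponds to random guessing and 1.0 to perfect separation, so a higher AUC indicates better separability.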
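
One way to build intuition for AUC: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it by brute-force pair comparison; the function name roc_auc is hypothetical and the quadratic loop is for clarity, not efficiency.

```python
def roc_auc(y_true, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In the example, three of the four positive/negative pairs are ranked correctly, giving an AUC of 0.75.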

Que 14. What is regularization (L1/L2)?

Answer: Regularization adds a penalty to the loss to discourage overly complex models. L1 (Lasso) encourages sparsity; L2 (Ridge) shrinks weights smoothly.
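
A minimal sketch of how the penalty enters the loss. The names l1_penalty, l2_penalty, regularized_loss, and the strength parameter lam are assumptions for this example; some formulations also scale the L2 term by 1/2.

```python
def l1_penalty(weights, lam):
    """L1 (Lasso) penalty: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """L2 (Ridge) penalty: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)

def regularized_loss(data_loss, weights, lam, kind="l2"):
    """Add the chosen penalty to an already-computed data loss."""
    if kind == "l1":
        return data_loss + l1_penalty(weights, lam)
    if kind == "l2":
        return data_loss + l2_penalty(weights, lam)
    raise ValueError("kind must be 'l1' or 'l2'")

weights = [0.5, -2.0, 0.0]
print(l1_penalty(weights, lam=0.1), l2_penalty(weights, lam=0.1))
```

Note how the L2 penalty is dominated by the large weight (-2.0), while L1 charges every unit of weight the same amount, which is what pushes small weights all the way to zero.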

Que 15. What is feature scaling and why is it needed?

Answer: Scaling (standardization/normalization) puts features on similar ranges so gradient-based methods and distance-based models work effectively.
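
The two common scalers can be sketched for a single feature column as follows. The function names are illustrative, and returning all zeros for a constant column is an assumed convention to avoid dividing by zero.

```python
import statistics

def standardize(values):
    """Standardization: subtract the mean, divide by the standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return [0.0 for _ in values]  # assumed handling of a constant feature
    return [(v - mean) / std for v in values]

def min_max_scale(values):
    """Normalization: rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # assumed handling of a constant feature
    return [(v - lo) / (hi - lo) for v in values]

incomes = [20000, 40000, 60000]
print(standardize(incomes))
print(min_max_scale(incomes))
```

To avoid leakage, the mean, standard deviation, minimum, and maximum should be computed on the training set only and then reused to transform validation and test data.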

Que 16. What is gradient descent?

Answer: An optimization algorithm that iteratively updates parameters in the direction of the negative gradient of the loss, with a step size set by the learning rate, to find a minimum (in general a local one).
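
The update rule is easiest to see on a one-dimensional toy problem. The sketch below minimizes f(w) = (w - 3)^2, whose gradient is 2(w - 3); the function, learning rate, and step count are all chosen purely for illustration.

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)  # move in the negative gradient direction
    return w

# Toy loss f(w) = (w - 3)**2 has gradient 2 * (w - 3) and its minimum at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(round(w_min, 6))
```

With this loss, a learning rate above 1.0 would make each step overshoot by more than it corrects and the iterates would diverge, which is why the learning rate is a key hyperparameter.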

Que 17. What are hyperparameters vs. parameters?

Answer:
Parameters: Learned from data (e.g., weights).
Hyperparameters: Set before training (e.g., learning rate, depth, C, k) and tuned via validation.

Que 18. What is k-Nearest Neighbors (k-NN)?

Answer: A simple algorithm that classifies a point based on the majority label among its k closest neighbors (distance-based). Works for regression too (average of neighbors).
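
A minimal k-NN classifier fits in a few lines. This sketch assumes Euclidean distance and Python 3.8+ (for math.dist); the function name knn_predict and the toy points are made up for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    top_labels = [label for _, label in neighbors[:k]]
    return Counter(top_labels).most_common(1)[0][0]

points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, query=(2, 2)))
print(knn_predict(points, labels, query=(6, 5)))
```

Because the prediction depends entirely on distances, features should be scaled first; otherwise the feature with the largest numeric range dominates.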

Que 19. What is linear regression and its assumptions?

Answer: Predicts a continuous target as a linear combination of features. Common assumptions: linearity, independent errors, homoscedasticity, and normality of errors (approximate).
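
For a single feature, ordinary least squares has a closed-form solution: the slope is the covariance of x and y divided by the variance of x. The sketch below covers only this one-feature case, and the function name is hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)  # must be non-zero (x not constant)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data generated from y = 2x + 1, so the fit should recover those values.
slope, intercept = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```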

Que 20. What is logistic regression?

Answer: A classification algorithm that models the probability of a class by passing a linear combination of the features through the logistic (sigmoid) function. It outputs probabilities, which are converted to class labels by applying a threshold (commonly 0.5).
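
The prediction step can be sketched as below. The weights and bias are hand-picked rather than learned, and the function names are illustrative; training (for example by gradient descent on the log loss) is omitted.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(features, weights, bias):
    """Probability of the positive class for one example."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

def predict_label(features, weights, bias, threshold=0.5):
    """Turn the probability into a 0/1 label using a threshold."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

weights, bias = [1.0, -1.0], -0.5  # hypothetical, not fitted to any data
print(predict_proba([2.0, 1.0], weights, bias))
print(predict_label([2.0, 1.0], weights, bias))
```

Raising the threshold trades recall for precision, which is useful when false positives are costly.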

Que 21. What is a decision tree and its pros/cons?

Answer:
Decision Tree: Splits data by feature thresholds to form a tree of decisions.
Pros: Interpretable, handles nonlinearity, little preprocessing.
Cons: Prone to overfitting; unstable to small data changes.

Que 22. What are ensemble methods (Bagging, Random Forest, Boosting)?

Answer:
Bagging: Train models on bootstrapped samples and average (reduces variance).
Random Forest: Bagging + random feature selection at splits.
Boosting: Sequentially trains weak learners to correct previous errors (e.g., AdaBoost, XGBoost).

Que 23. What is PCA and when is it used?

Answer: Principal Component Analysis reduces dimensionality by projecting data onto directions with maximum variance, helping visualization, noise reduction, and speed.

Que 24. What is one-hot encoding vs. label encoding?

Answer:
One-hot: Creates binary columns per category (no order).
Label: Assigns an integer ID per category. This implies an order that may not exist; it is usually acceptable for tree models but can mislead linear and distance-based models.
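
The difference is easy to see on a small column. In this sketch the function names are hypothetical, and categories are ordered alphabetically, which is an arbitrary choice made for the example.

```python
def label_encode(values):
    """Map each category to an integer ID (alphabetical order, by assumption)."""
    categories = sorted(set(values))
    mapping = {category: i for i, category in enumerate(categories)}
    return [mapping[v] for v in values], mapping

def one_hot_encode(values):
    """Create one binary column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == category else 0 for category in categories] for v in values]
    return rows, categories

colors = ["red", "green", "blue", "green"]
print(label_encode(colors))
print(one_hot_encode(colors))
```

With label encoding, "red" (2) looks numerically larger than "blue" (0), an ordering that carries no meaning; one-hot encoding avoids this at the cost of one extra column per category.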

Que 25. What is the difference between epochs, batches, and batch size?

Answer:
Epoch: One full pass over the training data.
Batch: A subset of the training data processed in one forward/backward pass before the parameters are updated.
Batch size: Number of samples in one batch.
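
How the three terms relate shows up in the shape of a typical training loop. The sketch below only counts parameter updates; the helper iterate_batches and the sizes are illustrative, and the actual model update is left as a comment.

```python
def iterate_batches(data, batch_size):
    """Yield consecutive slices of data; the last batch may be smaller."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

data = list(range(10))  # 10 training samples
epochs = 2
batch_size = 4

updates = 0
for epoch in range(epochs):                        # one epoch = one full pass
    for batch in iterate_batches(data, batch_size):
        # A real loop would compute the loss on `batch` and update parameters here.
        updates += 1

print("batches per epoch:", len(list(iterate_batches(data, batch_size))))
print("total updates:", updates)
```

With 10 samples and a batch size of 4, each epoch has 3 batches (sizes 4, 4, and 2), so 2 epochs produce 6 parameter updates.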
