Data Analytics Interview Questions and Answers | Common and Technical Questions Asked in Top IT MNCs (such as TCS, Infosys, Accenture, Zoho, Wipro, IBM, and Deloitte)
Basic Level (1–15)
1. What is Data Analytics?
Data Analytics is the process of examining datasets to draw conclusions about the information
they contain, often with the help of specialised tools and software.
2. What are the main types of Data Analytics?
- Descriptive: What happened
- Diagnostic: Why it happened
- Predictive: What will happen
- Prescriptive: What should be done
3. What are the key steps in a data analysis project?
- Define objective
- Collect data
- Clean and preprocess data
- Analyse data
- Visualise results
- Conclude and make recommendations
4. What is the difference between structured and unstructured data?
- Structured: Organised in rows/columns (e.g., SQL tables)
- Unstructured: Raw data like images, emails, videos, and social media text
5. What is data cleaning, and why is it important?
It’s the process of correcting or removing inaccurate, incomplete, or irrelevant data to
ensure quality analysis.
6. What is the role of SQL in data analytics?
SQL is used to query, filter, aggregate, and manage structured data stored in databases.
7. Explain the difference between INNER JOIN and LEFT JOIN in SQL.
- INNER JOIN: Returns only the rows that have a match in both tables
- LEFT JOIN: Returns all rows from the left table plus the matching rows from the right;
  where there is no match, the right-hand columns are NULL
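The difference is easiest to see on a small example. The sketch below uses Python's built-in sqlite3 module with two hypothetical tables, customers and orders (the table names, columns, and rows are invented for illustration), and runs the same query once with each join type.

```python
import sqlite3

# Hypothetical tables: customers and their orders (all names and rows are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO orders VALUES (1, 250), (1, 100), (3, 75);
""")

# INNER JOIN: only customers that have at least one order.
inner_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id, o.amount
""").fetchall()

# LEFT JOIN: every customer; amount is NULL (None in Python) when there is no order.
left_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id, o.amount
""").fetchall()
conn.close()

print(inner_rows)  # Ben is absent: he has no orders
print(left_rows)   # Ben is present with amount None
```

Ben has no orders, so he disappears from the INNER JOIN result but is kept by the LEFT JOIN with a NULL amount.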
8. How do you find duplicate records in SQL?
SELECT column_name, COUNT(*)
FROM table_name
GROUP BY column_name
HAVING COUNT(*) > 1;
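To try the query, it can be run against a throwaway in-memory database. The following is a minimal sketch using Python's sqlite3 module; the users table and its email column are hypothetical stand-ins for table_name and column_name.

```python
import sqlite3

# Hypothetical table with one repeated value (names and rows are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (email TEXT);
    INSERT INTO users VALUES ('a@example.com'), ('b@example.com'),
                             ('a@example.com'), ('c@example.com'),
                             ('a@example.com');
""")

# Same pattern as the query above: group by the column, keep groups seen more than once.
duplicates = conn.execute("""
    SELECT email, COUNT(*)
    FROM users
    GROUP BY email
    HAVING COUNT(*) > 1
""").fetchall()
conn.close()

print(duplicates)  # [('a@example.com', 3)]
```

To check for duplicates across several columns, list all of them in both the SELECT and the GROUP BY clauses.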
9. What libraries in Python are used for data analysis?
- Pandas – Data manipulation
- NumPy – Numerical operations
- Matplotlib / Seaborn – Visualisation
- Scikit-learn – Machine learning
10. What is a DataFrame in Pandas?
A two-dimensional, labelled data structure with rows and columns, similar to an Excel sheet,
used to handle and analyse data in Python.
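As a quick illustration, the sketch below builds a small DataFrame from a dictionary and applies a few common operations. The column names and values are invented for the example.

```python
import pandas as pd

# Hypothetical employee data: each dictionary key becomes a column label.
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen"],
    "dept": ["Sales", "IT", "Sales"],
    "salary": [50000, 65000, 55000],
})

n_rows, n_cols = df.shape                            # (3, 3)
sales_only = df[df["dept"] == "Sales"]               # boolean filter on rows
mean_by_dept = df.groupby("dept")["salary"].mean()   # aggregate per group

print(df.head())
print(mean_by_dept)
```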
11. How do you handle missing data in Pandas?
Missing values appear as NaN (or None) and can be detected with isna(). Common ways to handle them:
- dropna() → Remove rows that contain missing values
- fillna() → Fill missing values, for example with the column mean, median, or mode
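Both approaches can be sketched on a small DataFrame. The columns and values below are invented, and the choice of median for the numeric column and mode for the text column is only one reasonable option.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({
    "age": [25, np.nan, 31, 40],
    "city": ["Pune", "Chennai", None, "Pune"],
})

missing_per_column = df.isna().sum()   # count of missing values per column

dropped = df.dropna()                  # keeps only rows with no missing values

filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())      # numeric: median
filled["city"] = filled["city"].fillna(filled["city"].mode()[0])  # text: most frequent value

print(missing_per_column)
print(dropped)
print(filled)
```

Dropping rows is simple but discards data; filling keeps every row at the cost of introducing estimated values.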
12. What is the difference between variance and standard deviation?
Variance is the average squared deviation of the data points from the mean; standard deviation
is the square root of variance, so it is expressed in the same units as the data.
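A small worked example, using Python's standard statistics module on an invented dataset. Note the assumption being made explicit here: pvariance and pstdev divide by n (population formulas), while variance divides by n - 1 (sample formula).

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # invented values; the mean is 5

pop_variance = statistics.pvariance(data)    # sum of squared deviations / n
pop_std = statistics.pstdev(data)            # square root of the population variance
sample_variance = statistics.variance(data)  # sum of squared deviations / (n - 1)

print(pop_variance)  # 4
print(pop_std)       # 2.0
print(math.isclose(pop_std ** 2, pop_variance))  # True
```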
13. What is correlation in data analytics?
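Correlation measures the strength and direction of the linear relationship between two
variables. The correlation coefficient ranges from -1 (perfect negative) through 0 (no linear
relationship) to +1 (perfect positive).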
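The sketch below computes the Pearson correlation coefficient by hand, which is one common way to measure correlation. The helper name pearson_r and the sample data are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

hours = [1, 2, 3, 4, 5]
score_up = [52, 58, 61, 70, 74]   # rises with hours
score_down = [10, 8, 6, 4, 2]     # falls by exactly 2 per hour

r_up = pearson_r(hours, score_up)      # close to +1
r_down = pearson_r(hours, score_down)  # -1.0 (perfect negative linear relationship)

print(round(r_up, 3), r_down)
```

In Pandas, the corr() method on a DataFrame gives the same kind of coefficient for every pair of numeric columns.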
14. What is normalisation?
It’s the process of scaling numeric data into a specific range (e.g., 0–1) so that features
measured on different scales contribute comparably to a model.
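Min-max scaling is one common way to do this: each value is mapped to (value - min) / (max - min). The helper name min_max_scale and the income figures below are invented for the sketch.

```python
def min_max_scale(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

incomes = [20000, 35000, 50000, 80000]
scaled = min_max_scale(incomes)

print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```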
15. What is outlier detection, and how do you handle it?
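Outliers are data points that lie far from the rest of the data. They can be detected using
methods like IQR (Interquartile Range) or z-score analysis, and then handled by removing,
capping, or transforming them, depending on whether they are errors or genuine values.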
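Both detection methods can be sketched with the standard statistics module. The data is invented, and the cut-offs used (1.5 × IQR beyond the quartiles, and an absolute z-score above 3) are common conventions, not fixed rules.

```python
import statistics

# Invented measurements with one obviously extreme value.
data = [10, 12, 11, 13, 12, 11, 14, 12, 13, 11, 12, 10, 13, 12, 95]

# IQR method: flag points more than 1.5 * IQR outside the quartiles.
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = [x for x in data if x < lower or x > upper]

# z-score method: flag points more than 3 standard deviations from the mean.
mean = statistics.fmean(data)
std = statistics.pstdev(data)
z_outliers = [x for x in data if abs((x - mean) / std) > 3]

# One simple way to handle them: drop the flagged points.
cleaned = [x for x in data if lower <= x <= upper]

print(iqr_outliers, z_outliers)
```

On very small samples the z-score method can miss extreme values, because the outlier itself inflates the standard deviation; the IQR method is less affected by this.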