1. What is a data analyst, and what are their primary responsibilities?
Ans:
A data analyst collects, processes, and interprets data to help businesses make informed decisions. Responsibilities include data cleaning, querying databases, creating reports, visualizing trends, and supporting data-driven strategies.
2. Explain the differences between data analysis and data science.
Ans:
- Data analysis focuses on interpreting existing data to draw insights.
- Data science includes data analysis but also involves building predictive models, machine learning, and advanced algorithms to uncover deeper insights.
3. What tools and technologies do you use for data analysis?
Ans:
Common tools include:
- SQL for querying databases
- Excel for quick analysis and reporting
- Python (Pandas, NumPy) or R for deeper analysis
- Tableau or Power BI for visualization
- ETL tools like Talend, Alteryx, or Excel Power Query for data preparation
4. What is SQL, and how is it used in data analysis?
Ans:
SQL (Structured Query Language) is used to interact with relational databases. It allows analysts to extract, filter, join, and manipulate data needed for analysis.
5. What is a primary key, and why is it important in a database?
Ans:
A primary key uniquely identifies each row in a table, ensuring data integrity and enabling reliable joins between tables.
6. What are some common data formats used in data analysis?
Ans:
- CSV (Comma-Separated Values)
- Excel (.xlsx)
- JSON (JavaScript Object Notation)
- XML
- SQL databases
- Parquet and ORC (for big data)
7. How do you define and measure data quality?
Ans:
Data quality refers to the accuracy, completeness, consistency, timeliness, and validity of data. Measured using metrics like null counts, duplicate ratios, outlier frequency, and validation rule checks.
8. Describe the role of a data analyst in a project team.
Ans:
A data analyst collaborates with business stakeholders, engineers, and data scientists to gather requirements, prepare and analyze data, visualize insights, and inform decisions with evidence-based findings.
9. What is the significance of data-driven decision-making?
Ans:
It enables businesses to base decisions on objective insights rather than intuition, leading to improved performance, reduced risk, and better strategic planning.
10. How do you ensure the accuracy and integrity of your data?
Ans:
By validating data sources, applying data cleaning steps, checking for inconsistencies, performing spot checks, and documenting all transformations and assumptions.