Real-time Data Processing and Optimization project
Summary
Engineered a real-time data processing and optimization pipeline on GCP Dataproc, utilizing Apache Spark and HDFS for e-commerce data analytics and reporting.
Data professional with experience in ETL development, data pipeline construction, and cloud-based solutions on Azure and GCP. Uses Python, SQL, Azure (ADF, Databricks, ADLS), and Apache Spark to design and optimize scalable data workflows that deliver actionable insights and improve business processes. Seeking an Azure Data Engineer role to apply problem-solving skills to building high-impact data solutions and improving data accessibility.
Associate (Data & Systems Support)
Kolkata, West Bengal, India
Summary
Currently serves as an Associate at Wipro, responsible for designing, implementing, and optimizing ETL pipelines and data integration workflows to support critical business operations and decision-making.
Highlights
Designed and implemented robust ETL pipelines using Python and SQL, processing over 25,000 daily inventory records to ensure accurate and timely order fulfillment.
Automated data ingestion and reporting workflows with Python scripts and Power BI dashboards, delivering actionable insights to 10+ stakeholders and reducing manual workload by 18+ hours per week.
Developed and maintained scalable data pipelines on Azure (ADF, Databricks) to extract, transform, and load data from diverse sources including MySQL, SQL Server, CSV, Excel, and APIs into structured datasets.
Implemented data validation checks with Python (Pandas) and optimized SQL queries for aggregations, joins, and anomaly detection, improving overall data quality by 27% and enhancing business analytics.
Collaborated with business and technical stakeholders to translate requirements into functional specifications, streamlining data workflows and reducing delivery delays by 13%.
Bachelor of Technology
Mechanical Engineering
Issued by: Grow Data skills platform
Issued by: Udemy
Python, SQL, PySpark, Spark.
Azure Data Factory (ADF), Azure Databricks (ADB), Azure Data Lake Storage (ADLS), Google Cloud Platform (GCP), ETL, Data Pipelines, Data Modelling, Data Cleansing, HDFS, MongoDB, MySQL, SQL Server.
Power BI, Apache Spark, Airflow, CETAS.
Communication, Presentation, Problem-Solving, Analytical Thinking, Collaboration, Stakeholder Management.
Summary
Developed a comprehensive Azure-based data engineering pipeline leveraging ADF, ADLS Gen2, PySpark, and Power BI for ingesting, transforming, and analyzing large-scale datasets.