Bhavin Tandel
Senior Data Engineer
Summary
Bhavin brings a broad range of expertise in Hadoop and cloud technologies, from building data applications to productionizing them. He has consulted for a variety of clients, helping them advance their data journeys using open-source technologies and managed cloud services.
Contact
Certification
- AWS Solutions Architect - Associate
- Hortonworks Data Platform Certified Administrator
Skills
Data Engineering
- Spark
- HDFS
- YARN
- Hive
- Kafka
- Athena
- Glue
- Lambda
- S3
- EC2
DevOps
- Bash
- Ansible
- Python
- CloudFormation
- GitLab CI
Programming Languages
- Python
- Scala
Education
University of Leicester, Leicester, UK (2014 – 2015)
Master’s in Data Analysis for Business Intelligence
Maharashtra Institute of Technology, Pune, India (2009 – 2013)
Bachelor’s in Electronics and Telecommunication Engineering
Work Experience
Senior Data Engineer, Cloudwick
May 2019 – present
Amorphic Datalake (May 2019 - present)
Building a serverless data lake platform on Amazon Web Services using cloud-native tools such as Lambda, API Gateway, CloudFormation and DynamoDB. Following Agile methodology with continuous integration using GitLab CI/CD.
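As a minimal sketch of the serverless pattern described above (an API Gateway-backed Lambda persisting records to DynamoDB) — the `datasets` table and field names are illustrative assumptions, not Amorphic's actual schema:

```python
import json

def make_handler(table):
    """Build a Lambda handler for API Gateway proxy events.

    `table` is a DynamoDB Table resource, e.g.
    boto3.resource("dynamodb").Table("datasets") — injected here
    so the handler can be exercised locally without AWS.
    """
    def handler(event, context):
        # API Gateway proxy integration delivers the request body as a string.
        body = json.loads(event["body"])
        item = {"dataset_id": body["dataset_id"], "name": body["name"]}
        table.put_item(Item=item)
        return {"statusCode": 201, "body": json.dumps(item)}
    return handler
```

Injecting the table rather than creating the boto3 client at import time keeps the handler unit-testable and avoids per-invocation setup cost.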
Big Data Consultant, Cloudwick
Mar 2016 – May 2019
Banking Services (Apr 2019 – May 2019)
- Defined the plan and time estimates for the migration tasks.
- Migrated data (~300 TB) between two Hadoop clusters with minimal downtime.
- Developed utilities to validate the integrity of Hive data after migration.
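One way such a validation utility might work — assuming per-table row counts and column checksums collected on both clusters (e.g. via `SELECT COUNT(*), SUM(hash(...))` in Hive); the shape of the stats is an assumption for illustration:

```python
def compare_tables(source_stats, target_stats):
    """Return the tables whose stats differ between clusters.

    Each argument maps table name -> (row_count, checksum).
    A table missing from the target also counts as a mismatch.
    """
    mismatches = []
    for table, stats in source_stats.items():
        if target_stats.get(table) != stats:
            mismatches.append(table)
    return sorted(mismatches)
```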
Financial Services (Sep 2018 – Feb 2019)
- Built a streaming ETL application leveraging state-of-the-art open-source technologies (Apache Spark Streaming and Kafka).
- Implemented a back-off retry strategy to handle failures and backpressure.
- Implemented a watermarking mechanism for the ETL pipeline using Spark stateful streaming.
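The back-off retry strategy mentioned above can be sketched as a small exponential back-off decorator — a generic illustration under assumed parameters, not the client's actual implementation:

```python
import time

def with_backoff(attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry decorator with exponential back-off.

    Calls the wrapped function up to `attempts` times, sleeping
    base_delay * 2**n between tries, and re-raises the final error.
    `sleep` is injectable for testing.
    """
    def decorate(fn):
        def wrapped(*args, **kwargs):
            for n in range(attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if n == attempts - 1:
                        raise
                    sleep(base_delay * (2 ** n))
        return wrapped
    return decorate
```

Backing off exponentially gives a struggling downstream system (e.g. an overloaded Kafka broker) progressively more time to recover instead of hammering it at a fixed rate.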
Electric Meter Services (Apr 2018 – Aug 2018)
- Delivered a fully serverless reporting application on AWS using AWS serverless technologies.
- Developed an ETL pipeline using AWS Lambda and Glue (PySpark), replacing manual report generation.
- Handled daily incremental loads in Amazon S3, avoiding the cost of deploying an RDBMS.
- Operationalized the workflow using GitHub, AWS Lambda, AWS CodeBuild and CloudFormation.
- Built Power BI dashboards on top of Athena and managed access using roles.
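A sketch of how the daily incremental load above might detect unprocessed data, assuming date-partitioned S3 keys like `reports/dt=2018-06-01/part-0.csv` — the prefix and layout are illustrative, not the client's actual bucket structure:

```python
def new_partitions(listed_keys, processed_dates):
    """Return the sorted dt= partition dates not yet processed.

    `listed_keys` is an S3 key listing; `processed_dates` is the set
    of dates already loaded (e.g. tracked in a bookmark table).
    """
    dates = set()
    for key in listed_keys:
        for part in key.split("/"):
            if part.startswith("dt="):
                dates.add(part[len("dt="):])
    return sorted(dates - set(processed_dates))
```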
Insurance Services (Mar 2017 – Apr 2018)
- Designed and delivered a data science platform on Hadoop using JupyterHub, Zeppelin, Anaconda and Git.
- Implemented custom monitoring scripts in Python for JupyterHub/Zeppelin and integrated them with Grafana for visualization.
- Worked alongside the data engineering team to optimize data ingestion tasks and ETL jobs.
- Supported a third-party application provider in integrating various tools with the Hadoop platform.
- Assessed the requirements and effort needed to enable the platform for sensitive personal data.
- Supported installation and upgrades of the Hadoop Data Platform.
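A monitoring script like the one described above might emit service health as Graphite plaintext-protocol lines (`path value timestamp`), which Grafana can chart from a Graphite data source — the metric path and service names here are hypothetical:

```python
import time

def graphite_line(service, up, now=None):
    """Format one Graphite plaintext metric for a service health probe.

    `up` is the boolean result of probing the service (e.g. an HTTP
    check against JupyterHub); the "hadoop.<service>.up" path is an
    illustrative naming convention, not a standard.
    """
    ts = int(now if now is not None else time.time())
    return "hadoop.%s.up %d %d" % (service, 1 if up else 0, ts)
```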
Chemical Producing Company (Oct 2016 - Dec 2016)
- Built and configured a secured Hadoop cluster for a multi-tenant environment.
- Implemented a multi-layer security architecture using Active Directory, Kerberos, Ranger, Knox and wire encryption (SSL/TLS).
- Deployed a NiFi cluster to ingest data from legacy systems into the Hadoop data layer.