Bhavin Tandel
Senior Data Engineer
Summary
Bhavin brings a broad range of expertise in Hadoop and cloud technologies, from building data applications to productionizing them. He has consulted for a variety of clients, helping them advance their data journeys using open-source technologies and managed cloud services.
Contact
Certification
- AWS Solutions Architect - Associate
- Hortonworks Data Platform Certified Administrator
Skills
Data Engineering
- Spark
- HDFS
- YARN
- Hive
- Kafka
- Athena
- Glue
- Lambda
- S3
- EC2
DevOps
- Bash
- Ansible
- Python
- CloudFormation
- GitLab CI
Programming Languages
- Python
- Scala
Education
University of Leicester, Leicester, UK (2014 – 2015)
Master’s in Data Analysis for Business Intelligence
Maharashtra Institute of Technology, Pune, India (2009 – 2013)
Bachelor’s in Electronics and Telecommunication Engineering
Work Experience
Senior Data Engineer, Cloudwick
May 2019 – present
Amorphic Datalake (May 2019 - present)
Building a serverless data lake platform on Amazon Web Services using cloud-native tools such as Lambda, API Gateway, CloudFormation and DynamoDB. Following Agile methodology with continuous integration using GitLab CI/CD.
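As a minimal sketch of the serverless pattern described above (an API Gateway-backed Lambda persisting records to DynamoDB) — the `datasets` table and field names are illustrative assumptions, not Amorphic's actual schema:

```python
import json

def make_handler(table):
    """Build a Lambda handler for API Gateway proxy events.

    `table` is a DynamoDB Table resource, e.g.
    boto3.resource("dynamodb").Table("datasets") — injected here
    so the handler can be exercised locally without AWS.
    """
    def handler(event, context):
        # API Gateway proxy integration delivers the request body as a string.
        body = json.loads(event["body"])
        item = {"dataset_id": body["dataset_id"], "name": body["name"]}
        table.put_item(Item=item)
        return {"statusCode": 201, "body": json.dumps(item)}
    return handler
```

Injecting the table rather than creating the boto3 client at import time keeps the handler unit-testable and avoids per-invocation setup cost.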
Big Data Consultant, Cloudwick
Mar 2016 – May 2019
Banking Services (Apr 2019 – May 2019)
- Defined the plan and time estimates for the migration tasks.
- Migrated data (~300 TB) between two Hadoop clusters with minimal downtime.
- Developed utilities to validate the integrity of Hive data after migration.
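One way such a validation utility might work — assuming per-table row counts and column checksums collected on both clusters (e.g. via `SELECT COUNT(*), SUM(hash(...))` in Hive); the shape of the stats is an assumption for illustration:

```python
def compare_tables(source_stats, target_stats):
    """Return the tables whose stats differ between clusters.

    Each argument maps table name -> (row_count, checksum).
    A table missing from the target also counts as a mismatch.
    """
    mismatches = []
    for table, stats in source_stats.items():
        if target_stats.get(table) != stats:
            mismatches.append(table)
    return sorted(mismatches)
```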
Financial Services (Sep 2018 – Feb 2019)
- Built a streaming ETL application leveraging state-of-the-art open-source technologies (Apache Spark Streaming and Kafka).
- Implemented a back-off retry strategy to handle failures and backpressure.
- Implemented a watermarking mechanism for the ETL pipeline using Spark stateful streaming.
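The back-off retry strategy mentioned above can be sketched as a small exponential back-off decorator — a generic illustration under assumed parameters, not the client's actual implementation:

```python
import time

def with_backoff(attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry decorator with exponential back-off.

    Calls the wrapped function up to `attempts` times, sleeping
    base_delay * 2**n between tries, and re-raises the final error.
    `sleep` is injectable for testing.
    """
    def decorate(fn):
        def wrapped(*args, **kwargs):
            for n in range(attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if n == attempts - 1:
                        raise
                    sleep(base_delay * (2 ** n))
        return wrapped
    return decorate
```

Backing off exponentially gives a struggling downstream system (e.g. an overloaded Kafka broker) progressively more time to recover instead of hammering it at a fixed rate.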
Electric Meter Services (Apr 2018 – Aug 2018)
- Delivered a fully serverless reporting application on AWS using AWS serverless technologies.
- Developed an ETL pipeline using AWS Lambda and Glue (PySpark), replacing manual report generation.
- Handled daily incremental loads in Amazon S3, avoiding the cost of deploying an RDBMS.
- Operationalized the workflow using GitHub, AWS Lambda, AWS CodeBuild and CloudFormation.
- Built Power BI dashboards on top of Athena and managed access using roles.
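A sketch of how the daily incremental load above might detect unprocessed data, assuming date-partitioned S3 keys like `reports/dt=2018-06-01/part-0.csv` — the prefix and layout are illustrative, not the client's actual bucket structure:

```python
def new_partitions(listed_keys, processed_dates):
    """Return the sorted dt= partition dates not yet processed.

    `listed_keys` is an S3 key listing; `processed_dates` is the set
    of dates already loaded (e.g. tracked in a bookmark table).
    """
    dates = set()
    for key in listed_keys:
        for part in key.split("/"):
            if part.startswith("dt="):
                dates.add(part[len("dt="):])
    return sorted(dates - set(processed_dates))
```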
Insurance Services (Mar 2017 – Apr 2018)
- Designed and delivered a data science platform on Hadoop using JupyterHub, Zeppelin, Anaconda and Git.
- Implemented custom monitoring scripts in Python for JupyterHub/Zeppelin and integrated them with Grafana for visualization.
- Worked alongside the data engineering team to optimize data ingestion tasks and ETL jobs.
- Supported a third-party application provider in integrating various tools with the Hadoop platform.
- Assessed the requirements and effort needed to enable the platform for sensitive personal data.
- Supported installation and upgrades of the Hadoop Data Platform.
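A monitoring script like the one described above might emit service health as Graphite plaintext-protocol lines (`path value timestamp`), which Grafana can chart from a Graphite data source — the metric path and service names here are hypothetical:

```python
import time

def graphite_line(service, up, now=None):
    """Format one Graphite plaintext metric for a service health probe.

    `up` is the boolean result of probing the service (e.g. an HTTP
    check against JupyterHub); the "hadoop.<service>.up" path is an
    illustrative naming convention, not a standard.
    """
    ts = int(now if now is not None else time.time())
    return "hadoop.%s.up %d %d" % (service, 1 if up else 0, ts)
```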
Chemical Producing Company (Oct 2016 - Dec 2016)
- Built and configured a secured Hadoop cluster for a multi-tenant environment.
- Implemented a multi-layer security architecture using Active Directory, Kerberos, Ranger, Knox and wire encryption (SSL/TLS).
- Deployed a NiFi cluster to ingest data from legacy systems into the Hadoop data layer.