Experience in database programming with SQL and Python (see the first sketch below)
Understand data, analytical, and functional needs and translate them into technical requirements
Build, maintain, and deploy scalable data pipelines to support large-scale data management projects
Ensure alignment with the data strategy and data-processing standards
Experience in the Big Data ecosystem, whether on-premises (Hortonworks/MapR) or in the cloud (Dataproc/EMR/HDInsight)
Experience in Hadoop, Pig, SQL, Hive, Sqoop, and Spark SQL
Experience with any orchestration/workflow tool such as Airflow or Oozie for scheduling pipelines (see the Airflow sketch below)
Exposure to the latest cloud ETL tools such as Glue/ADF/Dataflow
Understand and operate in-memory distributed computing frameworks such as Spark (and/or Databricks), including parameter tuning and writing optimized Spark queries (see the tuning sketch below)
Hands-on experience with Spark Streaming, Kafka, and HBase (see the streaming sketch below)
BCA/BSc/BE/BS/MTech/MS in Computer Science, or equivalent work experience.
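
As a concrete illustration of the SQL-and-Python database programming called for above, here is a minimal sketch using Python's standard-library sqlite3 module; the orders table and its columns are hypothetical, not part of the role's actual schema.

```python
# Minimal sketch of database programming with Python and SQL;
# the "orders" table and its columns are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(10.5,), (23.0,)])

# Parameterized queries keep SQL and data separate
total, = conn.execute("SELECT SUM(amount) FROM orders").fetchone()
print(f"Total order amount: {total}")
conn.close()
```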
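For the orchestration requirement, a minimal sketch of an Airflow DAG that schedules a two-step daily pipeline, assuming Airflow 2.4+; the DAG id and the task callables are hypothetical placeholders.

```python
# Minimal sketch of a daily pipeline scheduled with Airflow 2.4+;
# dag_id and the extract/load callables are hypothetical.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("extract step")

def load():
    print("load step")

with DAG(
    dag_id="daily_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="load", python_callable=load)
    t1 >> t2  # run extract before load
```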
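For the Spark requirement, a minimal PySpark sketch of parameter tuning plus a query shaped so the optimizer can prune work; the input path, column names, and specific config values are hypothetical examples, not recommended settings.

```python
# Minimal sketch of Spark parameter tuning and an optimizer-friendly
# query; the path, columns, and config values are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("tuning-sketch")
    # Common tuning knobs: shuffle parallelism and adaptive execution
    .config("spark.sql.shuffle.partitions", "200")
    .config("spark.sql.adaptive.enabled", "true")
    .getOrCreate()
)

# Select columns and filter early so Parquet column pruning and
# predicate pushdown cut the data actually read
df = (
    spark.read.parquet("/data/events")
    .select("user_id", "event_date", "revenue")
    .filter(F.col("event_date") >= "2024-01-01")
    .groupBy("user_id")
    .agg(F.sum("revenue").alias("total_revenue"))
)
df.explain()  # inspect the optimized physical plan
```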
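For the streaming requirement, a minimal sketch of consuming a Kafka topic with Spark Structured Streaming; the broker address and topic name are hypothetical, the spark-sql-kafka connector is assumed to be on the classpath, and since writing to HBase needs a separate connector, the sketch writes to the console instead.

```python
# Minimal sketch of reading a Kafka topic with Spark Structured
# Streaming; broker and topic are hypothetical, and the
# spark-sql-kafka connector must be on the classpath.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
    .select(F.col("value").cast("string").alias("payload"))
)

query = (
    events.writeStream
    .format("console")  # swap in an HBase/Parquet sink in practice
    .option("checkpointLocation", "/tmp/checkpoints/events")
    .start()
)
query.awaitTermination()
```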