Hands-on experience implementing data integration processes, designing and developing data models, and building detailed ETL/ELT processes or programs.
Contributed to at least two phases of the SDLC, with experience in Big Data, data warehouse, data analytics, data migration, change management, and/or other Information Management (IM) related projects.
Experience with Hadoop technologies such as HDFS/MapR-FS, MapReduce (MRv2), advanced HDFS ACLs, Hive, HBase, Cassandra, Impala, Spark, Sqoop, Kafka, NiFi, Flink, Druid, ZooKeeper, and the zkCli tool.
Good understanding of Cloudera or Hortonworks distributions.
Experience working with RDBMS technologies such as Oracle, Microsoft SQL Server, PostgreSQL, DB2, MySQL, MariaDB, etc.
Hands-on experience with Spark, Spark SQL, HiveQL, Impala, Spark DataFrames, and Flink CEP as ETL frameworks (see the batch ETL sketch after this list).
Strong knowledge of Big Data stream ingestion and stream processing using Kafka, Spark Structured Streaming, and Flink (see the streaming sketch after this list).
Good understanding of Spark memory management, both with and without YARN (see the configuration sketch after this list).
Should have experience designing and developing with one or more NoSQL database technologies such as Cassandra, MongoDB, HBase, CouchDB/Couchbase, Elasticsearch, etc. (see the connector sketch after this list).
Should have a good working knowledge of HCatalog and Hive metadata.
Should have a working knowledge of Kerberos authentication.
Good knowledge of data warehouse and data management implementation methodologies.
Knowledge of and experience with data visualisation concepts, using tools such as Tableau, Microsoft Power BI, or QlikView, will be an advantage.
Ability to pick up new tools and work independently with minimal guidance from project leads/managers.
Hands-on programming skills in Scala/Python using the Spark/Flink frameworks.
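To illustrate the batch ETL requirement, here is a minimal PySpark sketch tying together the DataFrame API, Spark SQL/HiveQL, and a write to a Hive-metastore table. The paths, column names, and table names are hypothetical placeholders, not part of the requirements.

```python
# Minimal batch ETL sketch with PySpark DataFrames and Spark SQL.
# Paths, column names, and the target table are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders-etl")
    .enableHiveSupport()  # lets saveAsTable/HiveQL use the Hive metastore
    .getOrCreate()
)

# Extract: read raw CSV files from HDFS (the path is an assumption).
raw = spark.read.option("header", True).csv("hdfs:///landing/orders/*.csv")

# Transform: type casts, filtering, and a derived column via the DataFrame API.
orders = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("amount") > 0)
       .withColumn("order_date", F.to_date("order_ts"))
)

# The same transform can be expressed in Spark SQL / HiveQL over a temp view.
orders.createOrReplaceTempView("orders")
daily = spark.sql("""
    SELECT order_date, SUM(amount) AS total_amount
    FROM orders
    GROUP BY order_date
""")

# Load: write the aggregate to a partitioned Hive table.
daily.write.mode("overwrite").partitionBy("order_date").saveAsTable("dw.daily_orders")
```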
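For the stream ingestion requirement, the sketch below reads a Kafka topic with Spark Structured Streaming and computes windowed counts. The broker address, topic name, and checkpoint path are hypothetical, and the spark-sql-kafka connector package must be on the classpath.

```python
# Stream-ingestion sketch: Kafka source -> Spark Structured Streaming.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("clickstream-ingest").getOrCreate()

# Read a Kafka topic as an unbounded DataFrame (broker/topic are assumptions).
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")
    .option("subscribe", "clickstream")
    .option("startingOffsets", "latest")
    .load()
)

# Kafka keys/values arrive as bytes; cast, then count per user per 5-minute window.
counts = (
    events.selectExpr("CAST(key AS STRING) AS user", "timestamp")
    .withWatermark("timestamp", "10 minutes")
    .groupBy(F.window("timestamp", "5 minutes"), "user")
    .count()
)

# Emit incremental updates; the checkpoint directory tracks stream progress.
query = (
    counts.writeStream
    .outputMode("update")
    .format("console")
    .option("checkpointLocation", "hdfs:///checkpoints/clickstream")
    .start()
)
query.awaitTermination()
```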
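For the Spark memory management item, the sketch below shows the standard unified-memory settings alongside the overhead setting that matters when YARN enforces container limits. The values are illustrative only, not tuning recommendations.

```python
# Sketch of Spark memory settings; all values are illustrative assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("memory-tuning-demo")
    # Unified memory management inside each executor JVM:
    .config("spark.executor.memory", "4g")            # heap per executor
    .config("spark.memory.fraction", "0.6")           # share for execution + storage
    .config("spark.memory.storageFraction", "0.5")    # storage share within that pool
    # Relevant when running on YARN, which kills containers over their limit:
    .config("spark.executor.memoryOverhead", "512m")  # off-heap headroom per container
    .getOrCreate()
)
```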
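For the NoSQL item, one way to work with such stores from Spark is via a connector; the sketch below uses the DataStax spark-cassandra-connector as an example. The host, keyspace, and table names are hypothetical, and the connector JAR must be supplied (e.g. via --packages).

```python
# Sketch of reading/writing Cassandra from Spark via the spark-cassandra-connector.
# Host, keyspace, and table names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("cassandra-demo")
    .config("spark.cassandra.connection.host", "cassandra-host")
    .getOrCreate()
)

# Read a Cassandra table as a DataFrame.
users = (
    spark.read
    .format("org.apache.spark.sql.cassandra")
    .options(keyspace="app", table="users")
    .load()
)

# Write a derived copy back; append mode keeps existing rows.
users.write \
    .format("org.apache.spark.sql.cassandra") \
    .options(keyspace="app", table="users_enriched") \
    .mode("append") \
    .save()
```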