Distributed Data Platforms HDFS, Knowledge of data cleaning, wrangling, visualization and reporting, with an understanding of the best, most efficient use of associated tools and applications to complete these tasks.
Experience in MapReduce is a plus.
Deep knowledge of data mining, machine learning, natural language processing, or information retrieval.
Experience processing large amounts of structured and unstructured data, including integrating data from multiple sources.
A willingness to explore new alternatives or options to solve data mining issues, and utilize a combination of industry best practices, data innovations and your experience to get the job done.
Good to have Elasticsearch, Splunk, Casandra; Good to have Cluster Compute Technologies Spark, Akka, NoSql DB
Good to have Distributed data; Hortonworks toolset, Zookeeper, HDFS
To be responsible for providing technical guidance or solutions ;define, advocate, and implement best practices and coding standards for the team.
To ensure process compliance in the assigned module, and participate in technical discussion sor review as a technical consultant for feasibility study (technical alternatives, best packages, supporting architecture best practices, technical risks, breakdown into components, estimations).
To develop and guide the team members in enhancing their technical capabilities and increasing productivity
To prepare and submit status reports for minimizing exposure and risks on the project or closure of escalations.