Job Description:
Responsibilities:
· Design and build reusable components, frameworks, and libraries at scale to support analytics products.
· Design and implement product features in collaboration with business and technology stakeholders
· Identify and solve issues concerning data management to improve data quality
· Clean, prepare and optimize data for ingestion and consumption
· Collaborate on the implementation of new data management projects and the restructuring of the current data architecture
· Implement automated workflows and routines using workflow scheduling tools
· Build continuous integration, test-driven development, and production deployment frameworks
· Analyze and profile data for designing scalable solutions
· Troubleshoot data issues and perform root cause analysis to proactively resolve product and operational issues
Requirements:
Experience:
· Strong understanding of data structures and algorithms
· Strong understanding of solution and technical design
· Strong problem-solving and analytical mindset
· Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders
· Able to quickly pick up new programming languages, technologies, and frameworks
· Experience building scalable, real-time, high-performance data lake solutions in the cloud
· Fair understanding of how complex data solutions are developed
· Experience working on end-to-end solution design
· Willing to learn new skills and technologies
· Has a passion for data solutions
Required and Preferred Skill Sets:
· Hands-on experience with AWS: EMR (Hive, PySpark), S3, and Athena, or equivalent services on another cloud platform
· Familiarity with Spark Structured Streaming
· Working experience with the Hadoop stack, handling huge volumes of data in a scalable fashion
· Hands-on experience with SQL, ETL, data transformation, and analytics functions
· Hands-on Python experience, including batch scripting, data manipulation, and building distributable packages
· Experience with batch orchestration tools such as Apache Airflow or equivalent; Airflow preferred
· Experience with code versioning tools such as GitHub or Bitbucket; expert-level understanding of repository design and best practices
· Familiarity with deployment automation tools such as Jenkins
· Hands-on experience designing and building ETL pipelines; expertise in data ingestion, change data capture, and data quality; hands-on experience with API development
· Experience designing and developing relational database objects; knowledge of logical and physical data modelling concepts; some experience with Snowflake
· Familiarity with Tableau or Cognos use cases
· Familiarity with Agile methodologies; working experience preferred