Data Scientist

Company:
Location: Kraków, małopolskie

*** Mention DataYoshi when applying ***

For a project with our client 1010data (https://1010data.com/), we are looking for several Data Scientists.


1010data travels at the speed of thought to make Big Data discovery easy, powering sub-second responses to analyses run on billions of rows of data. 1010data is defining the way the world interacts with data. An essential tool for more than 700 of the world's top retail, manufacturing, telecom, government, and financial services enterprises, including Shell, Nespresso, Dollar General, P&G, and RiteAid, the 1010data platform is a highly differentiated product that is becoming the industry standard for Big Data Discovery and Data Sharing. With more than 30 trillion rows of data in a private cloud, 1010data is designed to scale to the largest volumes of granular data, the most disparate and varied data sets, and the most complex advanced analytics, all while delivering lightning-quick system performance.

We are seeking several Data Scientists with solid skills in data analysis to create large-scale analytical solutions (data products and applications) for client-driven business problems. At the core of 1010data's technology stack is a fast parallel processing database that powers all of our product offerings. Your work will involve using this core capability to design high-performance data pipelines and the corresponding operational processes. You will be responsible for conducting analysis to further the features, functionality, and data quality of the analytical solutions.

You have keen critical thinking, deep analytical skills, an intuitive understanding of statistical methods, and experience handling large, nuanced datasets. You can manage your own time and project priorities to deliver under pressure. Interest in and willingness to learn new tools and languages are crucial for the success of any Analyst at 1010data. You will primarily use the 1010data XML Macro language to interact with our core database. You will also use Airflow, Scala, Spark, and Python to varying degrees.

You will be a part of a high-performing and nimble team building data products using billions of rows of data. The Analyst role requires collaboration with the product management team, rapid prototyping of analytical solutions, building enhancements to existing data products, creating and maintaining data engineering pipelines, and handling client support. You will need to develop domain expertise on data products and be able to propose solutions to challenging problems.

What you will take on:

  • Become an expert user of the 1010data XML Macro language to analyze and process massive datasets
  • Help solve real-world analytical problems by translating them into ad-hoc analyses to produce concrete findings and share them with your team
  • Develop an understanding of the domain toward which 1010data's products are geared and use your knowledge of the database and platform to solve data-driven problems
  • Maintain production data pipelines to produce our well-known products with high data quality standards
  • Find and implement optimizations, improvements, and design modifications that address data engineering challenges, using primarily the 1010data XML Macro language along with other technologies like Airflow, Scala, Spark, and Python
  • Collaborate with data engineering and data quality teams by conducting root cause analysis and taking corrective measures to improve the quality of data products
  • Participate in agile development sprints and share progress updates regularly
  • Support clients, sales, and marketing teams as needed

What you already have:

Education & Experience

  • BS/MS in a highly analytical discipline (Computer Science, Physics, Mathematics, Econometrics) or equivalent
  • 1-3 years of professional experience in data analysis or practical experience building analytical products

Skills

  • Database experience (understanding of database structures and query languages such as SQL)
  • Demonstrated experience with scripting languages and statistical software (R, SAS, SPSS, MATLAB)
  • Solid understanding of statistical concepts

Desired

  • Experience developing data products using consumer spend data
  • Experience working with parallel processing frameworks like Spark
  • Experience constructing data pipelines using Airflow
  • Background in vector/matrix arithmetic is a plus
  • Experience with list/vector-based languages

Communication and collaboration

  • Have strong, positive interpersonal skills
  • Able to communicate clearly, consider options presented by others, and reach an informed, balanced technical opinion
  • Create clear, concise memos, summaries, design documentation, and presentations

As the client is based in the US, the role requires alignment with the EST time zone.
