Sanmina

Data Scientist – Remote

Job description


Viking Enterprise Solutions is a supplier of storage array systems, providing solutions to customers who are seeking a flexible, scalable and resilient platform on which to build a storage, public cloud or private cloud service. We are looking for an experienced software developer to work on robust and scalable storage and AI/ML management solutions.


Job Purpose


This individual contributor is primarily responsible for designing and developing data pipelines and automation for acquiring and ingesting raw data from multiple sources and formats, and for transforming, cleansing, and storing that data for consumption. The role is also responsible for developing detailed problem statements that outline hypotheses and their effect on target clients and customers; analyzing and investigating complex data sets and summarizing their key characteristics; selecting, manipulating, and transforming data into features used in machine learning algorithms; training statistical models; deploying and maintaining reliable and efficient models through production; verifying model performance; and collaborating with internal and external stakeholders across domains to develop and deliver statistically driven outcomes.


The successful candidate will work in a team responsible for architecting, building and maintaining management applications for our storage and AI/ML systems utilizing open source and third-party software.


Nature Of Duties/Responsibilities


  • Familiarity with data platforms and applications
  • Solid foundation in machine learning, applied stats, and/or experimentation
  • Experience with statistical and data programming languages and frameworks, including PySpark / Spark / Python
  • Experience with medium-to-large data sets (>1 M rows)
  • ML Experience
  • Applied experience with frameworks such as PyTorch, Keras, and TensorFlow
  • Comfortable operating in an SDLC environment and deploying production engineering models and code


Education And Experience


Skills and knowledge: Essential


  • Completes work assignments autonomously and supports business-specific projects by applying expertise in subject area and business knowledge to generate creative solutions; encourages team members to adapt to and follow all procedures and policies. Collaborates cross-functionally and/or externally to achieve effective business decisions; provides recommendations and solves complex problems; escalates high-priority issues or risks, as appropriate; monitors progress and results.
  • Designs and develops data pipelines and automation for data acquisition and ingestion of raw data from multiple data sources and data formats by transforming, cleansing, and storing data for consumption by downstream processes; writing and optimizing diverse Python code and SQL queries; and demonstrating knowledge of database fundamentals.
  • Analyzes and investigates complex data sets and summarizes key characteristics by employing data visualization methods; and determining how best to manipulate data sources to discover patterns, spot anomalies, test hypotheses, and/or check assumptions.
  • Selects, manipulates, and transforms data into features used in machine learning algorithms by leveraging techniques to conduct dimensionality reduction, feature importance, and feature selection.
  • Trains statistical models by using algorithms and data mining techniques; testing models with various algorithms to assess the input dataset and related features; and applying techniques to prevent overfitting such as cross-validation.
  • Deploys and maintains reliable and efficient models through production.
  • Verifies model performance by demonstrating expertise in the practice of a variety of model validation techniques to assess and discriminate the goodness of model fit; and leveraging feedback and output to manage and strengthen model performance.
  • Collaborates with internal and external stakeholders across domains to develop and deliver statistically driven outcomes by delivering insights and value from heterogeneous data to investigate complex problems for multiple use cases; driving informed decision-making; and presenting findings to both technical and non-technical audiences.


Additional Requirements


  • Experience working with data visualization methods.
  • Machine learning and/or algorithmic experience.
  • Statistical analysis and modeling experience.
  • Minimum of one (1) year of Python coding experience.
  • Bachelor's degree in Mathematics, Statistics, Computer Science, Engineering, Economics, Public Health, or a related field.
  • 3-5 years of experience in data science or a directly related field.
  • Additional equivalent work experience in a directly related field may be substituted for the degree requirement. Advanced degrees may be substituted for the work experience requirements.


Sanmina is an Equal Opportunity Employer – M/F/Veteran/Disability/Sexual Orientation/Gender Identity


Salary range (annual): $100,000 – $150,000 per year


In addition, Sanmina provides a variety of benefits, including health insurance coverage, life and disability insurance, a savings plan, company-paid holidays, and paid time off (PTO) for vacation and/or personal business.
