Sr Data Engineer with TS/SCI

Company: Vertical Knowledge
Location: United States

*** Mention DataYoshi when applying ***

Description:


OUR MISSION

Vertical Knowledge is a leader in the Alternative Data industry and provider of an end-to-end public data platform. Our Robotic Process Automation (RPA) framework automates internet-based tasks that support accurate, reliable, and compliant alternative data from outside the enterprise. The Vertical Knowledge RPA framework is unique in offering the full range of RPA, Intelligent Automated Software (IAS), and Intelligent Automated Workflow (IAW) solutions to discover, connect, collect, enrich, and integrate data from outside the enterprise.

We enable our clients to access and understand information that is publicly available but very challenging to discover, capture, and curate. Our clients demand excellence, innovation, and trust in our proprietary technology and methodologies to deliver them world-class data solutions for actionable insight.

We are currently looking for an experienced Senior Data Engineer with a current TS/SCI security clearance with a CI polygraph to join our mission-critical team located in Washington, DC. All candidates must have the knowledge and experience to effectively analyze and produce intelligence analysis products. A competitive candidate should have the following experience and knowledge, which must be clearly described in the resume, along with current security clearance status.

CORE RESPONSIBILITIES

  • Support the identification, prioritization, and scheduling of data modeling and processing requirements with users
  • Report the status of all data extraction, transformation, and load activities
  • Reconstruct data provided in XML, delimited text, email (e.g. eml, mbox, pst), and a variety of database systems (SQL Server, Oracle, PostgreSQL, MySQL)
  • Apply semantic data modeling techniques to classify, aggregate, and generalize data stored in hierarchical, network, or relational database management systems to define the meaning of data within the context of its interrelationships with other data
  • Validate semantic data models with users
  • Transform semantic data models into physical database designs
  • Design physical database management systems to represent semantic data models, including relational and object-relational databases (e.g. PostgreSQL, SQL Server, MySQL), key-value stores, inverted indexes (Lucene, Elasticsearch), and distributed file systems (e.g. Tachyon, HDFS)
  • Write software code and scripts, and use COTS, GOTS, and open source software to extract objects (e.g. entities, events, documents, and relationships) from structured and unstructured data and multimedia (e.g. EXIF)
  • Create and maintain a repository of software code and scripts (e.g. Java and Python), for rapidly extracting, transforming, and loading a variety of structured and unstructured data sources
  • Integrate software code and scripts for the automation of repeatable extract, transform, and load (ETL) activities
  • Provide technical support for data services and data management in a multi-cloud and multi-domain environment
  • Lead the design of physical database management systems to represent semantic data models, including relational and object-relational databases (PostgreSQL, SQL Server, MySQL), key-value stores, inverted indexes (Lucene, Elasticsearch), and distributed file systems (e.g. Tachyon, HDFS)
  • Lead the design and execution of semantic data modeling techniques to classify, aggregate, and generalize data stored in hierarchical, network, or relational database management systems to extract context through interrelationships with other data
  • Direct the execution of data science methods using parallel computing frameworks (e.g. deeplearning4j, Torch, TensorFlow, Caffe, Neon, NVIDIA CUDA Deep Neural Network library (cuDNN), and OpenCV) and distributed data processing frameworks (e.g. Hadoop (including HDFS, HBase, Hive, Impala, Giraph, Sqoop) and Spark (including MLlib, GraphX, SQL, and DataFrames))
  • Support multiple simultaneous projects and take open-ended or high-level guidance, independently and collaboratively make discoveries that are mission-relevant, and package and deliver the findings to both technical and non-technical audiences
  • Help develop the requisite team of data engineers for delivering database performance
  • Collaborate with other team members to engineer database performance as part of the total data enterprise
  • Lead the definition of the multi-model or polyglot data storage technologies required to achieve mission value
  • Collaborate with other tech teams to implement advanced analytics algorithms that exploit our rich datasets for statistical analysis, prediction, clustering and machine learning
  • Manage technical staff and technical resources to meet priority needs
  • Contribute to professional development and work culture for high performing teams
Requirements:

REQUIRED SKILLS & EXPERIENCE

  • Bachelor’s degree or higher in a quantitative or analytical field such as Computer Science, Mathematics, Economics, Statistics, Engineering, Physics, or Computational Social Science; or Master’s degree or equivalent graduate degree including certificate-based advanced training courses
  • 10+ years of industry experience in software development, data engineering, business intelligence, data science, or related field with a track record of manipulating, processing, and extracting value from large datasets
  • Current TS/SCI security clearance with a CI polygraph
  • Excellent oral and written communication skills
  • Highly analytical, with a knack for analysis, math, and statistics, and with the critical thinking and problem-solving skills essential for interpreting data
  • Possess both an understanding of real-world mission/business objectives and a working grasp of software development practices and technologies
  • Significant experience in one or more programming or scripting languages such as R (statistics), Python, Scala, or Java
  • Experience working with a hybrid team of analysts, engineers, and developers to conduct research and to build and deploy complex but easy-to-use analytical platforms
  • Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
  • Proficient with descriptive and inferential statistics to describe data and make predictions about the data, including statistical tests to determine confidence for a hypothesis
  • Familiarity with natural language processing, computer vision, signal processing, and speaker and speech recognition algorithms to identify objects in text, image, video, and audio files
  • Knowledge of software engineering best practices across the development lifecycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations
  • Ability to analyze and assess software development or data acquisition requirements and determine optimum, cost-effective solutions
  • Significant experience dealing with open source, publicly available information (PAI) data sources and types
  • Travel: less than 10%
  • Location: Washington, DC – customer proprietary
  • This is a full-time, benefits-eligible opportunity with VK and is contingent on customer contract

PREFERRED QUALIFICATIONS

  • A significant share of data analytics experience in direct support of military or intelligence community customers, demonstrating progressive technical development and mission-focused outcomes
  • Significant experience dealing with open source, publicly available information (PAI) data sources and types
  • Significant experience performing quantitative analysis to support military or intelligence community operational activities
  • Previous experience developing predictive algorithms
  • Experience supporting cyber security or network security operations
  • Familiarity with social network analysis, supply chain analysis, forensic accounting, pattern of life, natural language processing, social media analysis, classification algorithms, and/or image processing
  • Experience blending analytical methodologies and leveraging existing COTS/GOTS/OS tools in unconventional manners
  • Travel: less than 10%
  • Location: Washington, DC – customer proprietary
  • This is a full-time, benefits-eligible opportunity with VK and is contingent on customer contract

COMPETITIVE BENEFITS & SALARY

  • Company-paid premiums for employees and their dependents, such as medical, dental, vision, disability, and life
  • Employer Health Savings Account (HSA) contributions
  • 401(k) retirement plan with financial planning
  • Unlimited vacation days
  • Flexible work schedules
  • Fully stocked kitchen with snacks and beverages (HQ only)
  • A casual work environment and dog friendly office (HQ only)

Vertical Knowledge LLC is an Equal Opportunity and Affirmative Action Employer. VK is committed to a policy of equal employment opportunity in recruitment, hiring, career advancement, and all other personnel practices. VK will not discriminate on the basis of race, color, sex, national origin, religion, age, marital status, personal appearance, sexual orientation, gender identity or expression, family responsibilities, matriculation, political affiliation, genetic information, disability, past or current military service, or any other legally protected characteristic.

Reasonable Accommodations for Applying

Applicants who qualify under the Americans with Disabilities Act, as amended, may be eligible for a reasonable accommodation in Vertical Knowledge LLC's application and selection processes. If you require an accommodation in applying through our online system, please send an email to Careers@vk.ai. Please note that this email address is not to be used for checking on the status of an application.
