Jr. Data Engineer

Sayari
Full-time
Remote
United States
$85,000 - $100,000 USD yearly
Engineering, Data Analytics

Sayari is looking for an Entry-Level Data Engineer to join our Data team located in Washington, DC. The Data team is an integral part of our Engineering division and works closely with our Software & Product teams, as well as other key stakeholders across the business.

JOB RESPONSIBILITIES

  • Write and deploy crawling scripts to collect source data from the web
  • Write and run data transformers in Scala Spark to standardize bulk data sets
  • Write and run modules in Python to parse entity references and relationships from source data
  • Diagnose and fix bugs reported by internal and external users
  • Analyze and report on internal datasets to answer questions and inform feature work
  • Work collaboratively within and across teams of engineers using basic agile principles
  • Give and receive feedback through code reviews
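To give a concrete sense of the parsing work described above, here is a minimal Python sketch of a module that extracts entity references and relationships from a source record. The record shape and all field names (`entities`, `links`, `from`, `to`, `kind`) are hypothetical, chosen only for illustration; real source data varies by dataset.

```python
# Minimal sketch of an entity-parsing module of the kind described above.
# The source-record shape and field names are hypothetical examples.

def parse_entities(record):
    """Extract entity references and relationships from one source record."""
    entities = [
        {"id": e["id"], "name": e["name"].strip(), "type": e.get("type", "unknown")}
        for e in record.get("entities", [])
    ]
    relationships = [
        {"source": r["from"], "target": r["to"], "kind": r.get("kind", "related_to")}
        for r in record.get("links", [])
    ]
    return entities, relationships

# Example input: one record with two entities and one relationship.
record = {
    "entities": [
        {"id": "e1", "name": " Acme Corp "},
        {"id": "e2", "name": "Jane Doe", "type": "person"},
    ],
    "links": [{"from": "e1", "to": "e2", "kind": "officer_of"}],
}
entities, relationships = parse_entities(record)
```

In practice a module like this would also handle missing or malformed fields and emit records in a standardized schema downstream.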

SKILLS & EXPERIENCE

Required Skills & Experience

  • Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related technical field — or equivalent hands-on experience
  • Working knowledge of SQL and relational databases (such as Postgres)
  • Experience writing code in Python (e.g., pandas, NumPy, Scrapy) or Java/Scala
  • Familiarity with data processing frameworks like Apache Spark, or strong interest in learning them on the job
  • Understanding of object-oriented programming principles and collaborative development in shared repositories
  • Ability to work closely with data scientists, analysts, and engineers to help solve complex problems across large, diverse datasets
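As a rough illustration of the SQL-plus-Python baseline listed above, the following sketch runs a simple relational query from Python using the standard-library `sqlite3` module. The table and column names are invented for the example; production work would target a database such as Postgres.

```python
import sqlite3

# Illustrative only: a tiny relational query combining SQL with Python.
# Table and column names here are made up for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (id TEXT PRIMARY KEY, name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO companies VALUES (?, ?, ?)",
    [("c1", "Acme Corp", "US"), ("c2", "Globex", "DE"), ("c3", "Initech", "US")],
)

# Parameterized query: count companies registered in a given country.
us_count = conn.execute(
    "SELECT COUNT(*) FROM companies WHERE country = ?", ("US",)
).fetchone()[0]
```

Comfort with parameterized queries and basic joins like this is the level of SQL fluency the role calls for.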

Desired Skills & Experience

  • Exposure to workflow orchestration tools such as Apache Airflow and CI/CD pipelines
  • Familiarity with graph, search, or NoSQL databases
  • Experience contributing to data ingestion, transformation, or ETL pipelines
  • Comfort working with containerized applications (e.g., Docker)
  • Experience using cloud-based data tools in AWS or GCP environments
  • Introductory experience or coursework involving machine learning, especially in distributed systems like Spark
  • Awareness of entity resolution concepts or interest in learning how entities are linked across data sources
  • Experience working with international or non-English datasets