As a Data Engineer, you will be responsible for all things related to the collection, extraction, transformation, and correlation of business data across the Subsplash platform. You will report to the Site Reliability Engineering Manager. You will be an expert on our production data sources and on how to administer and tune data systems for optimal performance. You will also work regularly with our data warehousing/data lake environments to provide our business analysis and intelligence team with data marts. Data collection, extraction, and transformation are often achieved through Python code development and maintenance. In a remote-first, distributed team environment, you will work well with other team members to deliver working data engineering solutions early and often.
Key Outcomes in Year 1
- Work with the lead Data Engineer to assume primary responsibility for operating and monitoring Extract-Load-Transform (ELT) data pipelines, handling routine ELT pipeline add/change tasks, and driving continual improvement. Success in this outcome will be measured by the capacity freed up for the lead Data Engineer.
- Serve as a point of escalation for questions related to SQL query performance and the optimization of Subsplash data systems. This may include analyzing slow query reports and execution plans, as well as fielding related questions from software engineers.
- Enable product teams, business analysis teams, and other stakeholders to integrate data into the Snowflake data warehouse, and then access and analyze their data via Sigma and/or Tableau.
- Work with Data Engineering and Site Reliability Engineering (SRE) to continually improve observability and proactive alerting for ELT data pipelines.
Key Responsibilities
- Operate and maintain the Subsplash data warehousing environment, consisting of DBT, Python, Terraform, AWS DMS, Snowpipe, and related ELT tools, running on AWS Kubernetes infrastructure and maintained in GitLab SCM and CI/CD
- Ensure PII and other sensitive data are handled properly, both within the data warehouse and while being transformed and loaded into it
- Build and maintain our ELT pipelines from production data stores into the data warehouse
- Monitor and optimize production data stores
- Assist in building and maintaining data visualizations, both internal-facing and customer-facing
- Collaborate with business analysts, product managers, and software engineers to build and verify hypotheses related to business intelligence
Qualifications
- 2+ years of experience as a Data Engineer or in a similar role
- Experience with data modeling, data warehousing, and building ETL pipelines
- Extremely comfortable with SQL
- Excellent analytical abilities
- Comfortable with ambiguity in requirements; a self-starter
- Excellent communication (verbal and written) and interpersonal skills, including the ability to communicate with both business and technical teams
- Experience working with Snowflake or similar data platforms (e.g. AWS Redshift, BigQuery)
- Strong knowledge of relational databases (e.g. MySQL, MariaDB, PostgreSQL, Aurora) and document-oriented databases (e.g. MongoDB, DynamoDB)
- Strong organizational skills and the ability to learn new technologies quickly
Preferred Qualifications
- Knowledge of a programming language (Go, Python, JavaScript)
- Familiarity with ELT tools such as DBT, Fivetran, and Meltano
- Data Science experience (e.g. machine learning, artificial intelligence)
- Bachelor's degree in Computer Science, Mathematics, Statistics, or a related field