Founding Machine Learning Engineer

Remote (United States) / San Francisco, CA

Company Website:

Company/Founders’ Location: Menlo Park, California

Hiring Location: Local to the SF Bay Area, California, preferred

About Bluesky:

Bluesky is a platform that helps organizations optimize costs and maximize returns on their Snowflake investment. Although Snowflake has made analytics accessible to many and enabled faster business decisions, it has also created inefficiencies at scale. Our platform offers budgeting and forecasting tools, workload optimization, query analysis, anomaly detection, and monitoring and alerting features that provide customers with deep visibility. With Bluesky, customers can expect 500X query improvements and 30% cost reductions, which makes it a no-brainer for any enterprise.

Our team consists of big data domain experts and other industry thought leaders who have solved similar problems at companies such as Google, Vertica, Uber, Stripe, Facebook, Pure Storage, dbt, and Snowflake. Our company has received funding from top-tier VCs, angels, and various thought leaders, including the founders of Cloudera and Qubole. We are committed to building a world-class team that will develop systems with a profound impact on how people use their data clouds.

We're looking for a Founding Software Engineer passionate about tackling big data challenges in next-generation data clouds. As a crucial early team member, you'll play a significant role in making architectural decisions that will shape our technology stack for years to come. You'll also have the chance to make an immediate impact on the big data stack at the many well-known tech and non-tech companies that are our customers. Additionally, you'll receive generous equity compensation with significant growth potential.


Responsibilities:

  • Performance Tuning - Analyze and improve the performance and efficiency of ETL pipelines by identifying bottlenecks, finding root causes, applying best practices such as incremental and parallel computation, and dogfooding Bluesky products.
  • Development - Develop high-performance and scalable ETL pipelines, leveraging a powerful stack of modern tools and frameworks for data integration, transformation, and orchestration.
  • Management - Effectively configure, monitor, and scale production environments involving Snowflake (or a similar data warehouse), dbt, Airbyte, and Prefect (or any modern data frameworks and tools), ensuring seamless operation and optimal performance.
  • Stewardship - Own or support the data definitions and lineage across our entire data warehouse.
  • Mentoring - Teach other team members about data architecture and act as a consultant for developers who need help with data.
  • Learning - Stay current on technical knowledge and tooling to help the team utilize new technologies.

Must Haves:

  • Bachelor’s degree in a technical field or equivalent
  • 5+ years of software engineering experience at a senior/staff level
  • Rock solid engineering fundamentals with an understanding of query processing internals
  • Practical experience in data engineering and data warehousing
  • Fluent in object-oriented programming and SQL optimization
  • Proven track record of leading large projects
  • Strong problem-solving and communication skills

Nice to Haves:

  • 1+ years of experience with Snowflake
  • 1+ years of experience with dbt
  • Experience with other big data platforms: Databricks, BigQuery, Redshift, Spark, Presto
  • Experience working with metadata stores or APIs in big data platforms
  • Experience with cloud deployments and containers

Why Join Bluesky:

We raised $8.8M to solve the biggest challenges of big data cloud adoption: cost management and workload optimization. We are backed by Greylock. Our founder, Mingsheng Hong, has decades of experience in the big data industry and is behind well-known big data technologies such as Apache Hive and Vertica. Bluesky has also officially joined the Snowflake Partner Network, achieving Select Tier status less than a month after coming out of stealth. Please see TechCrunch and the Data Engineering Podcast for details.
