Senior Software Engineer
LanceDB · HQ
ABOUT LANCEDB

LanceDB (http://lancedb.com/) is the preeminent data platform for multimodal AI use cases. From hyper-scalable vector search to advanced retrieval for RAG, from streaming training data to interactive exploration of large-scale AI datasets, LanceDB is the best foundation for your AI application, and powers some of the most groundbreaking applications and challenging requirements today.

ABOUT THE ROLE

We’re looking for a Senior Software Engineer to help expand the reach of Lance and LanceDB within the broader data infrastructure ecosystem. You’ll work at the intersection of high-performance computing, big data, and open-source systems. You will contribute scale and performance improvements, integrations with the wider data and AI ecosystem, simpler distributed operations, and usability and maintainability enhancements.

YOU’LL BE RESPONSIBLE FOR

- Designing and maintaining efficient distributed Lance dataset operations
- Building efficient indices to enable predicate pushdown and accelerate queries in Spark, Ray, or Trino
- Working on table formats, data encodings, and various aspects of the Lance format in Rust
- Driving open-source community efforts to integrate the Lance format with Spark, Hive Metastore, Presto, Trino, Ray, and other data infrastructure systems
- Operating and improving internal data processing infrastructure
- Promoting the Lance format in open-source communities and at Big Data conferences

REQUIREMENTS

- 10+ years of experience building high-performance databases, big data systems, or large-scale data services
- Deep understanding of the internals of open-source Big Data or AI training systems (e.g., Hadoop, Spark, Flink, Ray, Iceberg, Delta Lake, Hudi, ClickHouse, Trino, Presto, PyTorch, or JAX)
- Strong experience with high-performance computing in C++, Java, and/or Scala
- Experience with Rust (or willingness to learn it)
- Proven ability to move fast, work independently, and collaborate with a high-