Negotiable
Summary: The AI/ML Site Reliability Engineer role focuses on ensuring the reliability, scalability, and automation of database services, particularly those involving vector databases and AI technologies. The position requires expertise in managing and optimizing database performance while collaborating with AI/ML and data engineering teams. The role is fully remote and emphasizes a strong understanding of GenAI Foundation Models and various vector database technologies. Candidates should be adept in an Agile DevOps environment, ensuring high availability and security of database systems.
Key Responsibilities:
- Take ownership of the reliability, scalability, automation, and uptime of database services.
- Understand and leverage GenAI Foundation Models and Vector DB technologies.
- Evaluate different database technologies for AI and RAG capabilities.
- Install, configure, and maintain vector databases on hybrid infrastructure platforms.
- Design and manage embedding storage architectures for high-dimensional vector search.
- Monitor and improve performance and scalability of vector databases for large-scale deployments.
- Manage data ingestion, indexing strategies, and re-indexing tasks.
- Ensure data consistency, replication, backups, and disaster recovery plans.
- Implement security best practices including access controls and encryption.
- Collaborate with AI/ML and data engineering teams for integration with NLP, CV, and recommendation systems.
- Document configurations, architecture decisions, and operational procedures.
- Thrive in an Agile DevOps environment managing database availability and reliability.
Key Skills:
- Strong understanding of GenAI Foundation Models and Vector DB technologies.
- Experience with vector databases such as Pinecone, Weaviate, Memgraph, and Milvus.
- Knowledge of embedding storage architectures and high-dimensional vector search.
- Ability to monitor and improve database performance and scalability.
- Experience with data ingestion and indexing strategies.
- Knowledge of disaster recovery and data consistency practices.
- Familiarity with security best practices in database management.
- Collaboration skills with AI/ML and data engineering teams.
- Documentation skills for configurations and operational procedures.
- Experience in an Agile DevOps environment.
Salary (Rate): undetermined
City: undetermined
Country: USA
Working Arrangements: remote
IR35 Status: outside IR35
Seniority Level: undetermined
Industry: IT
Role: AI/ML Site Reliability Engineers
Location: 100% Remote
Weaviate DB and Memgraph
Site Reliability Engineers are responsible for, and take ownership of, the reliability, scalability, automation, and other aspects of the uptime and availability of our database services. You will need strong skills in the following areas:
- Understanding of GenAI Foundation Models and Vector DB: leverage foundation models and vector database technologies for advanced AI capabilities.
- Evaluate different database technologies for AI and RAG capabilities.
- Install, configure, and maintain vector databases such as Pinecone, Weaviate, Memgraph, Milvus, etc., on hybrid infrastructure platforms.
- Design and manage embedding storage architectures optimized for high-dimensional vector search.
- Monitor and improve the performance and scalability of vector databases for large-scale deployments.
- Manage data ingestion, indexing strategies (e.g., HNSW, IVF, Annoy), and re-indexing tasks.
- Ensure data consistency, replication, backups, and disaster recovery plans are in place.
- Implement security best practices, including access controls, encryption, and audit logging.
- Collaborate with AI/ML and data engineering teams to integrate vector databases with NLP, CV, and recommendation system pipelines; assess large and varied data sources and help development teams design RAG applications.
- Document configurations, architecture decisions, and operational procedures.
- Thrive in an Agile DevOps environment, with responsibility for managing database availability, scalability, and reliability through automation.