Site Reliability Engineer
Fractal - San Mateo, CA
Job Description
Fractal Analytics is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets. An ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite empowers imagination with intelligence. And that it will be such Fractalites that will continue to build the company for the next 100 years.

Please visit Fractal | Intelligence for Imagination for more information about Fractal.

Please Note: This role is located in the San Francisco Bay Area. You will need to work onsite Monday through Friday. We offer paid relocation.

Role Overview

As a Site Reliability Engineer with Fractal, you will be dedicated to ensuring the highest levels of system availability and performance. This role involves comprehensive monitoring, addressing complex technical issues, automating solutions to recurring problems, and contributing to the development of resilient system architectures and deployment strategies.
You will work closely with our Services and Engineering teams, playing a crucial role in optimizing our platforms and infrastructures.

Responsibilities

Ensure maximum uptime and system availability to meet or exceed functional and performance SLAs.
Implement thorough end-to-end monitoring and alerting on all critical components to ensure quick detection and response.
Tackle complex challenges affecting critical services, focusing on automating problem resolution to prevent future occurrences.
Drive the development of innovative designs, architectures, standards, and methodologies to support and enhance our platform.
Lead scripting and automation efforts, aiming to refine system update and upgrade processes.
Design and configure essential infrastructure, tools, and frameworks to enhance the deployment lifecycle.
Collaborate effectively with cross-functional teams within Services and Engineering.

Qualifications

Interest in and ability to become certified on the end client's AI platform (we will provide all necessary training and support).
Bachelor's or master's degree in computer science, a related field, or equivalent professional experience.
Minimum of 10 years of relevant experience.
Proven experience in deploying, managing, and optimizing scalable, fault-tolerant Linux/Kubernetes/JVM infrastructure across various cloud platforms like AWS, GCP, and Azure.
Deep expertise in Linux operating systems, networking principles, and database management.
Practical experience with Cassandra or similar NoSQL technologies.
Proficiency with major cloud service providers, notably AWS, Azure, and GCP.
Familiarity with configuration management tools such as Ansible or Terraform.
Proficiency in programming languages like Ruby or Python, particularly for system automation and monitoring.
Strong problem-solving abilities, critical thinking skills, and effective communication capabilities.
Prior experience in a DevOps or system administration role, ideally supporting commercial SaaS solutions.

Pay:

The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions, including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role, and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $110,000 - $160,000. In addition, you may be eligible for a discretionary bonus for the current performance period.

Benefits:

As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a "free time" PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.

Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
Created: 2025-01-25