Senior Site Reliability Engineer Job at Sustainable Talent, Santa Clara County, CA

  • Sustainable Talent
  • Santa Clara County, CA

Job Description

Sustainable Talent is partnering with NVIDIA, a global leader that has been transforming computer graphics, PC gaming, and accelerated computing for over 25 years. We are looking for an SRE & DevOps Engineer to support our client's Infrastructure, Planning and Processes organization.

This is a W-2, full-time, onsite contract position based in Santa Clara, CA. We offer competitive pay based on factors such as experience, education, and location, and provide full benefits, PTO, and an amazing company culture!

NVIDIA is looking for a seasoned SRE & DevOps Engineer to join its multifaceted and fast-paced Infrastructure, Planning and Processes organization. The position will be part of a crew that develops and maintains NVIDIA's sophisticated internal infrastructure products. The team works with various other business units within NVIDIA Software, such as Graphics Processors, Mobile Processors, Deep Learning, Artificial Intelligence, and Driverless Cars, to cater to their infrastructure and systems needs.

As an SRE & DevOps engineer, you’ll also be working in conjunction with various teams such as software engineering to deploy these new products and manage our infrastructure, associated processes and systems. Keen attention to detail, problem-solving abilities, and a solid knowledge base are essential.

What you’ll be doing:

  • Working on systems deployed in NVIDIA's internal infrastructure products and keeping them available and reliable for our end users.
  • Monitoring system performance and troubleshooting issues related to the NVIDIA hardware and software stack.
  • Providing high-quality user support.
  • Monitoring KPIs and making sure that the team's SLAs are met.
  • Managing and maintaining production Kubernetes clusters and Jenkins pipelines.
  • Driving automation of monitoring to gain more insight into application and system health.

What we need to see:

  • Experience maintaining cloud and on-prem CI/CD infrastructure and highly available production environments.
  • Expert-level proficiency in CI/CD systems like Argo CD, Jenkins, GitLab CI, GitHub Actions, etc.
  • Background in SQL databases (e.g., MySQL) and time-series databases like Prometheus.
  • Experience with data analytics/visualization tools like ELK, Grafana, Splunk, etc., and alerting tools like Zabbix, Alertmanager, and PagerDuty.
  • Proficiency with Ansible, Kubernetes, containers, and virtualization platforms.
  • 5+ years of proven experience along with a Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent experience.

Ways to stand out from the crowd:

  • Previous experience with SRE teams managing on-prem infrastructure.
  • Experience managing NVIDIA hardware like GPUs and Tegras.
  • Thrives in a multi-tasking environment with constantly evolving priorities.
  • Prior experience with a large-scale operations team.

Sustainable Talent is a M/F+, disabled, and veteran equal employment opportunity and affirmative action employer.

Job Tags

Full time, Contract work
