Accepting Applications
Full-time
Remote
Posted 5 days, 13 hours ago
Job Description
**Position:**
Site Reliability Engineer
**Type:**
Hourly contract
**Compensation:**
$40 \- $70/hour
**Location:**
Remote
**Commitment:**
10-40 hours/week
**Role Responsibilities**
* Deploy, monitor, and recover containerized AI training environments.
* Troubleshoot infrastructure bottlenecks and resolve system failures in real time.
* Build and manage resilient systems for stability and performance optimization.
* Collaborate with engineering teams to improve CI/CD pipelines and automation.
* Manage filesystem structures, storage, and process scheduling in containerized environments.
* Replan dynamically when runtime issues or system failures occur.
* Document system processes, solutions, and best practices.
**Requirements**
* Strong experience with terminal\-based system administration and troubleshooting.
* Expertise in containerized environments such as Docker or Kubernetes.
* Strong Python skills for scripting, automation, and debugging.
* Proficiency in Bash and familiarity with additional programming languages.
* Strong understanding of infrastructure, build systems, and version control.
* Ability to manage dynamic infrastructure recovery in high\-pressure scenarios.
* Excellent written and verbal communication skills.
**Application Process (Takes 20 Min)**
* Easy Apply on LinkedIn
* Check email for next steps
* Participate in the resume evaluation and interview stages