SRE Intern

  • Industry Other
  • Category IT&Telecommunication
  • Location Lalitpur, Nepal
  • Expiry date Jul 24, 2025
Job Description
Join our team as a Site Reliability Engineer Intern and gain hands-on experience in monitoring, automation, and infrastructure management. You'll work with tools like Grafana, Prometheus, and GitLab, assist in troubleshooting, and contribute to CI/CD pipelines. Ideal for students passionate about system reliability, scripting, and DevOps practices.

Key Responsibilities

Monitoring & Metrics:

  • Monitor network and system metrics using tools like Grafana, Prometheus, and Zabbix.

Infrastructure Management:

  • Perform day-to-day infrastructure tasks and troubleshooting.

  • Update and maintain VM servers, such as the GitLab server.

  • Work hands-on with Linux systems, Windows servers, and virtualization environments.

Automation & Scripting:

  • Write automation scripts using Python and Bash.

Logging & Incident Handling:

  • Manage and analyze logs using Graylog.

  • Collaborate with team members to troubleshoot and resolve basic incidents and system alerts.

CI/CD & Deployment:

  • Create, maintain, and improve CI/CD pipelines to automate testing, deployment, and delivery of applications.

Documentation & Recovery:

  • Test and document Disaster Recovery and Failover procedures.

  • Document processes, system architecture, and technical solutions.


Learning Opportunities


  • Real-world exposure to Site Reliability Engineering principles.

  • Understanding of how monitoring and alerting systems work.

  • Hands-on experience with automation and monitoring tools like Prometheus, Grafana, and Zabbix.

  • Gain experience working with SonarQube and understanding code quality tools.

  • Learn to troubleshoot and support Linux and Windows-based infrastructure.

  • Mentorship and guidance from experienced professionals.

  • Guidance on handling system issues and improving uptime.

  • Insights into incident response, performance tuning, and system resilience.


Qualifications


  • Pursuing or recently completed a bachelor's degree in computer science, IT, or a related field.

  • Familiarity with Linux systems, basic networking concepts, and Bash scripting.

  • Enthusiastic about learning DevOps automation tools.

  • Good communication skills, willingness to take initiative, and a problem-solving mindset.


Preferred Skills (Optional)


  • Basic knowledge of Docker, Git, or CI/CD pipelines.

  • Understanding of monitoring concepts and system health metrics.

  • Interest in security and working with InfoSec teams.