Site Reliability Engineer

Keywords / Skills : Site Reliability Engineer, Production Support, Docker, Java, Python, Ruby, Ansible, Chef, Packer, serverless, Kubernetes

5 - 10 years
Posted: 2019-01-14

IT/ Computers - Software
Team Leader/ Technical Leader
Job Description:

Position:- Site Reliability Engineer
Experience:- 5+ Years
Work Location:- Hyderabad

Responsibilities :
  • Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation and refinement.
  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
  • Practice sustainable incident response and blameless post-mortems.
  • Detection: proactively detect issues and publish SLA metrics after code releases.
  • Identify trends and patterns based on generic indicators such as response times above 5 seconds, traffic patterns, etc.
  • Perform root cause analysis of issues related to the above components; responsible for connecting the dots on major incidents.
Requirements :
  • Degree in Computer Science or related field and 8 years of relevant experience
  • 5+ years of Systems/Applications automation in 24x7 Production Services environments
  • Experience deploying and using a cloud platform, preferably Google Cloud
  • Experience designing, implementing and deploying Docker, Kubernetes and serverless (Lambdas)
  • Hands-on experience in Unix/Linux from kernel to shell: file systems, client-server protocols, ability to read a packet capture/tcpdump, etc.
  • Solid programming experience in one or more of Java, Python, Go, Ruby
  • Experience in designing, analysing, and troubleshooting large-scale distributed systems
  • Experience with one or more orchestration and deployment tools: CloudFormation, Terraform, Ansible, Packer, Chef
  • Experience with one or more CI tools: Jenkins, TeamCity, Bamboo, Artifactory
  • Experience with monitoring and alerting using technologies like Splunk, Nagios, Datadog, PagerDuty, Dynatrace, etc.
  • Thorough understanding of SLOs/SLAs/SLIs and error budgeting
  • Certification in Google Cloud Platform is a huge plus

About Company

GSPANN Technologies Inc. GSPANN is a US consulting services provider based in the California Bay Area, focused on implementations in Enterprise Content Management, Business Intelligence and Mobile Solution initiatives. More than 90% of our current clientele are Fortune 1000 organizations. We specialize in strategy, architecture, delivery and support of solutions in the ECM, BI and Mobility space. Since 2004, our consultants have successfully served over 150 Fortune 500 clients. We have implemented solutions in Enterprise Content Management, Business Intelligence, Data Warehousing, Information Management, Big Data, Mobility, QA Automation and Project Management. Our team is a good mix of subject matter experts, project managers, and senior and junior consultants. We have a successful, proven and cost-effective onsite, offsite and offshore model. Our consultants help our clients with strategy development, architecture build-outs, delivery services and post-delivery services (maintenance and support). Our specialties: Enterprise Content Management, Information Management (BI and DW), Web Strategy and Solution, QA Automation, Creative and Design Management, Interactive Marketing Solutions, Content Authoring and Migration, 24x7 Application Support, Mobile App Development, Omni-Channel Strategy.