Site Reliability Engineer (SRE)

Tech Stack
  • AppDynamics
  • AlertSite
  • Nagios
  • Grafana
  • Prometheus
  • Kibana
  • Datadog
  • CloudWatch
  • GitLab
  • SVN
  • Bitbucket
  • RemedyForce
  • PagerDuty
  • Microservices
  • API

ABOUT THE JOB

Workflow is an extensive platform that unifies many web and mobile applications.
As an SRE, you will be responsible for one or two applications within the Workflow platform and will work with the corresponding development team.

Key Areas of Focus:
• Reducing Technical Debt
• Reducing Toil
• Observability/System Monitoring
• Incident Response throughout SDLC
• Problem Management

First 6 months in the position:
• Do cleanup work, fix bugs, and lay the groundwork for future SRE work
• Automate any tasks or parts of the system that are currently performed manually
• Configure and maintain the monitoring tooling for the target application
• Monitor application/infrastructure and take steps to improve overall system software performance, availability, and reliability by incorporating changes through defined feedback loops within the software delivery lifecycle
• Document tribal knowledge as you acquire it over time by creating runbooks/playbooks and ensuring critical system information is readily available to those who need it through dashboards

After the first 6 months in the position:
• Work closely with software developers and testers to ensure the product meets non-functional requirements such as security, performance, and availability
• Resolve NOC escalations and help prevent incidents from recurring by creating processes and automation
• Be a key part of our response to high-severity internal customer incidents, ensuring we meet all SLAs and SLOs
• Help build an SRE culture by sharing best practices, approaches, documentation, and code with other engineering teams across the organization
• Assist the product development team with managing its error budget
• Embrace failures and treat incidents as learning opportunities by conducting blameless postmortems
• Participate in product engineering stand-ups and related design activities
• Coach other team members to ensure systems are supported by following SRE best practices

Job Location

Remote (Hungary, Poland)

ABOUT THE COMPANY

This company is a top player in the vehicle lifecycle game! They're all about helping the people who make, insure, repair, and replace cars step up their transportation game using some seriously rad tech, like mobile, artificial intelligence, and connected car stuff. They've built a huge network of over 350 insurance companies, 24,000 repair facilities, OEMs, tons of parts suppliers, and other data and service providers to help their clients make better decisions, work faster, and create an awesome experience for their customers. And get this – they're a pretty big deal! They were ranked #17 in the Top 100 Digital Companies in Chicago in 2020 by Built in Chicago (which is like THE online community for digital tech entrepreneurs in Chicago), and Forbes named them one of the best mid-sized companies to work for in 2019. With over 2,600 full-time employees (plus 350 contractors), they're keeping things real in their sweet downtown Chicago headquarters at the historic Merchandise Mart building. Plus, it's totally eco-friendly – it's LEED certified and a total tech hub in the city. We've won some pretty sweet awards too, like the Innovation Championship by Zurich, where we snagged 1st place out of 1,300 solutions from all over the world. We also won the Global Silver Award for Innovation in Insurance out of 359 innovations from 45 countries. And to top it all off, we were voted one of "the 3 best innovations at a global level" in InsurTech. Plus, Plug and Play Insurance Partners voted us the #1 InsurTech. We're pretty proud of all that!

ABOUT THE CANDIDATE

We're on the hunt for a top-notch Site Reliability Engineer (SRE) to join our product development team! As our SRE, you'll be the go-to person for ensuring our applications run smoothly and are always available for our users. You're a master troubleshooter who loves getting to the bottom of any problem that pops up, fixing it, and making sure our teams learn from it. If you're passionate about keeping things running smoothly and thrive in a fast-paced environment, then we want you on our team!

Requirements

• Experience with monitoring and data visualization tools: AppDynamics, AlertSite, Nagios, Grafana, Prometheus, Kibana, Datadog, and cloud-native monitoring services such as CloudWatch
• Experience with source code management tools: GitHub, GitLab, SVN, Bitbucket
• Experience with incident management tools: RemedyForce, PagerDuty
• Experience with collaboration tools: Teams, Confluence, Microsoft Office 365
• Experience with project management: Version One, JIRA
• Solid understanding of microservices and APIs
• Proficiency in system management, monitoring, and analysis to identify opportunities for improving service health, manageability, and reliability
• Proven ability to dig through metrics, logs, and available sources to triage and resolve an incident at any time
• Eager to problem-solve and troubleshoot issues that may arise day-to-day
• Ability to document solutions, SRE architectural patterns, and best practices to ensure that teams have guidance as needed
• Experience and interest in working in an Agile environment
• Effective communication and interpersonal skills

Nice To Have Skills
  • DevOps

Benefits

When you join our stellar team, you'll get tons of cool benefits, like:

• Building your skills with our Client Engagement team, who can help with all kinds of projects.
• Joining our awesome community of like-minded folks.
• Becoming a mentor or speaker and getting rewarded for it – both emotionally and financially!
• Attending meetups as a speaker or listener to learn and grow.
• We're all about broadening our horizons and sharing knowledge – so don't be afraid to ask questions and get curious!

CONTRACT TERMS

This is a full-time job opportunity in which you’d work on projects lasting 12 months on average. At the end of that period, you can continue as a Pro Consultant by being assigned to another exciting project. Motion Software guarantees the continuity of your permanent employment, with all social and additional benefits included.

ATANAS ATANASOV
Senior Software Engineer
Motion Software provides a unique work environment that allows for remote/hybrid working, providing the best of both worlds. Projects that I've worked on are both exciting and challenging and have helped me grow both professionally and personally. The company frequently organizes team-building events and creates a fun and energetic work environment that fosters camaraderie and collaboration among employees. I like that people in Motion Software are fun, easy-going and very active. Working in Motion Software feels a lot more like a cool gathering with your friends, than just a job.
VICTOR VICTOROV
Full-Stack Developer
Be able to work from any point in the world. Friendly and communicative team members and crew. Be able to speak freely and open to anyone from the company. Helpful and understanding staff and members.
MARIYA TSVETANOVA
Remote Work Advisor
Fully remote, flex hours, great benefits and community around the company. Great working place for people with different lifestyles, mum-friendly and with a great vibe.
