
Site Reliability Engineer (m/f/d)

Hybrid
  • Sofia, Sofia (stolitsa), Bulgaria
IT, Systems and Infrastructure

Job description

Chaos is the world’s largest 3D visualization software company, and for over 20 years has empowered artists, designers, and architects to visualize anything they can imagine. Chaos offers intuitive and powerful workflows for creatives across the entire design spectrum, including architecture, engineering, construction, product design, manufacturing, and media and entertainment. Research and development at Chaos is leading the way towards a truly comprehensive end-to-end visualization ecosystem to meet the evolving needs of existing and new customers. In 2022 Chaos merged with Enscape and acquired Cylindo. For more information, visit chaos.com, enscape3d.com, and cylindo.com.

Site Reliability Engineer


***This position is based in Sofia, Bulgaria with hybrid working options. Applicants must have a work/residence permit for the respective location.***


Main Responsibilities:

  • Build and operate our cloud infrastructure on GCP, ensuring high availability, horizontal scalability, and a high level of automation.

  • Utilize Stackdriver for monitoring, logging, and debugging.

  • Create and maintain our monitoring setup using Prometheus and Grafana to catch problems early and improve system performance.

  • Work with development teams to ensure a high level of reliability for our systems through automated service usage pattern management, performance optimization, and fault management.

  • Execute the defined ERM for incidents, including but not limited to troubleshooting, communication, and post-incident review.

  • Mentor junior SRE team members and teach them reliability engineering best practices.

  • Continuously assess new technologies and tools on the market to optimize and strengthen the architecture of our platform.

  • Manage and operate Kubernetes workloads following best practices.

Job requirements


  • Five years of hands-on experience in reliability engineering is required; working knowledge of SRE principles is a plus.

  • Skilled in Google Cloud Platform products and administration.

  • Knowledge of monitoring tools, including Stackdriver, Prometheus, and Grafana.

  • Experience with SQL and NoSQL database bottleneck analysis, performance tuning, and optimization.

  • Familiar with Sentry (sentry.io) as an issue-tracking tool.

  • Solid understanding of and experience with programming/scripting languages such as Go and Python.

  • Competence with infrastructure-as-code tools (Terraform, Ansible, etc.).

  • A documented track record of designing and developing scalable, highly available, and fault-tolerant systems in GCP.

  • Good analytical skills, problem-solving ability, and teamwork.

  • Experience in running large infrastructure deployments and enhancements from scratch.

  • A bachelor’s degree in Computer Science, Computer Engineering, or another area.


Why Chaos?

• Working for a company globally recognized for its cutting-edge products and honored with an Academy Award for its contribution to motion pictures
• Working alongside talented people in an environment that fosters learning and knowledge-sharing
• Supplemental health insurance
• Flexible working hours and additional days off
• Competitive remuneration package
• Technical training and certifications
• Play and relax area in the office
• Special discounts

We welcome people who value teamwork, stick to their agreements, and are curious to explore new ways of achieving mastery.
If you think your profile is a good match for this role at Chaos, send us your resume and projects you have worked on.

Please familiarize yourself with our Privacy Notice before you apply for the job.

Only short-listed candidates will be contacted.

Confidentiality of all applications is assured.
