Responsibility and Authority:
• Recommend, drive, and implement continuous-improvement IT initiatives that contribute to business strategy and growth
• Identify, communicate, and manage risks in the existing IT environment that threaten operational stability
• Work with the Datacenter team to drive cluster needs
• Enable the development team by providing automated build and test solutions in simulation environments using AWS, Docker, and physical deep learning machines
• Maintain version control schemas to track development, staging, and production code using git/BitBucket
• Collaborate with multiple teams and domain experts to integrate multiple platforms via designated APIs
• Work with third-party service and support providers on deployments, system upgrades, and other BAU and project-related tasks
• Diagnose and resolve occurring, latent, and systemic reliability issues across the entire stack: hardware, software, application, and network
• Provide periodic performance reports
• Ensure that IT policies and procedures are implemented, whilst presenting their benefits to the local business community
• Drive implementation of standard IT solutions and reduce locally managed services
• Document all maintenance contracts and local application vendors, and identify ways to integrate with standard (or in-country) services to optimise cost efficiency
• Understand the IT cost of services delivered and, if required, explain it to the local business (with assistance). Look for ways to drive cost reduction
• Understand high-risk operational infrastructure issues and make recommendations to resolve them
• Lead local infrastructure projects where necessary, delivering within budget and timeline requirements
• Participate in supplier performance management for infrastructure services against SLA requirements
• Run operations according to the Code of Conduct, Environment Health & Safety (EHS) and Quality Policies, and work according to the Quality Management System. If the operation is EHS certified, work according to the EHS Management System
Requirements:
• 5+ years of IT hosting support experience
• 2+ years of database experience with MS SQL, MySQL, Oracle, etc.
• Solid technical foundation in automation, hybrid cloud infrastructure operation and orchestration, including experience with at least one orchestration system (Aurora, Kubernetes, etc.)
• Experience with cloud automation tools (Ansible, Terraform, etc.)
• Experience with microservices
• AWS: EC2, S3, RDS, ECS, CloudFront, VPC, or equivalents in Azure, Alicloud, etc.
• Programming: Python and Bash/shell scripting; Java would be a plus
• Linux: excellent knowledge of containers and Docker, package management, system
• Networking: Firewall, WAF, Proxy and policy-based routing
• Excellent data analysis skills and the demonstrated ability to solve complex issues involving multiple software or hardware components
• Understanding of IT infrastructure techniques (cloud ops and on-prem ops) and how to apply them to achieve business growth
• Self-motivated and able to work at a remote location with the business community
• Excellent communication skills and fluency in both spoken and written English
• Bachelor’s Degree in Computer Science, Computer Engineering, Information Technology or related technical discipline with 5+ years of working experience
To learn about more job opportunities, please visit: https://www.rgf-professional.com.cn/zh/jobs