- As a Platform Operations & Site Reliability Engineer, you will be part of a highly skilled cross-functional team of individuals delivering modern application platforms as a managed service for the United States Air Force Air Operations Center (AOC).
- The Platform Operations & Site Reliability Engineer position is responsible for installation, maintenance, upgrades, monitoring, and management of Cloud Foundry and Kubernetes technologies as a managed service for the AOC enterprise in production operational environments. Working in a cross-functional team, you will be responsible for the entire deployment and operations lifecycle of these and supporting technologies, including their integration with external enterprise core services. All of these technologies will be hosted in a VMware vSphere environment, so experience working with VMware is necessary.
- The role of Platform Operations Engineer will require working very closely with the infrastructure, network, and security operations teams across one or more geographic theaters of AOC operations to ensure that platforms in these distributed environments, on both classified and unclassified networks, are stable, reliable, and always available for applications to run.
- As a Platform Operations Engineer, you will also support the application development teams who are building software products for the AOC mission. You will share the status of these environments, answer questions about the environments, their available services, and their configuration, and offer recommendations on operational aspects that applications need to accommodate in their architecture. You will also implement continuous integration and continuous delivery processes to streamline the delivery of apps to production environments while fostering a culture of continuous process improvement.
- This core team will be based out of the AOC Global Platform Ops Center located at Langley AFB in Hampton, VA. The work will include limited travel for on-the-job enablement and occasional site support in different areas of the AOC operational theaters when necessary.
This role will include, but is not limited to, tasks such as the following:
● Create, implement and manage Pivotal Cloud Foundry deployments in a production environment to ensure high availability and reliability.
● Provide operational expertise for Pivotal Cloud Foundry in monitoring, management, disaster recovery, security compliance/auditing, networking, and storage.
● Develop and deliver the configuration and deployment automation required to improve the functionality, availability, and manageability of the platform and associated services.
● Integrate the platform with external enterprise core services such as Active Directory Federation Services and DNS.
● Work with Service Desk, Incident Management, Configuration Management and Change Management Processes for the designated environments.
● Set up administrator and service accounts and maintain up-to-date documentation.
● Work closely with developers and hardware personnel to troubleshoot issues.
● Monitor usage of infrastructure resources and perform capacity planning to maintain platform uptime.