PostMortem

Victor Zuluaga
Feb 7, 2021

Context

On Monday, 16 November 2020 (GMT-5), I was working on servers and was responsible for setting up, installing, and maintaining the web server. I installed the Nginx web server, configured it, and installed all the required packages and dependencies.

Issue Summary

The issue was detected on Tuesday, 17 November 2020 (GMT-5) at 9:00 AM, when I tried to log in to the server again and it did not respond. The server was unreachable over the network for 4 hours 15 minutes, until 1:15 PM the same day. I now understand how important the flexibility to install new resources and scale up is for our users, and I apologize for this incident.

When I tried to open a connection to the OpenSSH server, the attempt failed.

  • Error message: Connection timeout for ssh server.
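A connection timeout, as opposed to a refused connection, usually means packets to the port are being dropped somewhere on the way. A quick way to confirm this from another machine is to probe the TCP port directly. The snippet below is only a minimal sketch: the function name `ssh_port_reachable` and the example address are illustrative and are not part of the original incident tooling.

```python
import socket


def ssh_port_reachable(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection performs the TCP handshake; the context manager
        # closes the socket again as soon as the check is done.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (packets silently dropped), refused connections,
        # and unreachable hosts.
        return False


# Example usage (203.0.113.10 is a placeholder documentation address):
# print(ssh_port_reachable("203.0.113.10"))
```

If this check fails for port 22 while other ports on the same host (for example the web server port) still answer, a firewall rule is a likely suspect.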

Impact

The impact was a setback in the process of deploying a new application. Fortunately, it was an internal error and was detected before it reached production.

Root cause

The issue was triggered by the firewall. As part of the setup, I enabled the firewall, which by default does not open port 22/TCP for SSH connections. The firewall must be told explicitly to allow remote connections on port 22/TCP. Otherwise, it denies every SSH connection attempt.
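The ordering problem described here can be checked before any command is run. The post does not name the firewall, so the sketch below assumes `ufw` (the default on Ubuntu) and its `ufw allow 22/tcp` and `ufw enable` commands. The function `enables_firewall_safely` is a hypothetical helper that scans a planned list of setup commands and reports whether SSH is allowed before the firewall is switched on.

```python
from typing import Sequence


def enables_firewall_safely(commands: Sequence[str], ssh_port: int = 22) -> bool:
    """Return True if no `ufw enable` appears before a rule allowing SSH.

    Assumption: the firewall is ufw. Only a few common spellings of the
    allow rule are recognised; this is a sketch, not a full ufw parser.
    """
    allow_rules = {f"ufw allow {ssh_port}/tcp", f"ufw allow {ssh_port}"}
    if ssh_port == 22:
        # `ufw allow ssh` resolves the service name to port 22.
        allow_rules.add("ufw allow ssh")

    ssh_allowed = False
    for command in commands:
        # Collapse repeated whitespace and ignore a leading `sudo`.
        normalized = " ".join(command.split())
        if normalized.startswith("sudo "):
            normalized = normalized[len("sudo "):]

        if normalized in allow_rules:
            ssh_allowed = True
        elif normalized in ("ufw enable", "ufw --force enable"):
            if not ssh_allowed:
                # Enabling now would lock out remote SSH sessions.
                return False
    return True


# Example usage:
# enables_firewall_safely(["sudo ufw allow 22/tcp", "sudo ufw enable"])  # safe order
# enables_firewall_safely(["sudo ufw enable", "sudo ufw allow 22/tcp"])  # unsafe order
```

A check like this only looks at the commands in the list. It does not know about rules that already exist on the server, so it is a guard for fresh setups like the one in this incident.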

Remediation and prevention

To resolve the issue, the engineer had to destroy the server, start a new one, and set up and install everything again. To prevent a recurrence, take special care whenever the firewall on a server is changed.
