11 January 2011

GoDaddy - 1st downtime

I was transferring some files from the VPS to my dedicated servers... when the connection just dropped. Now I have two ideas in mind:

1. GoDaddy closed my VPS for some reason
2. The dedicated server where my VPS is hosted died for some other reason

Whatever it is, I just hope that they will honor the 99% availability agreed when I bought the VPS...
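To put the 99% figure in perspective, here is a rough sketch of how much downtime such a promise leaves room for. The period lengths (1, 30 and 365 days) and the function name are assumptions made for illustration; the actual agreement may measure availability over a different window.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime permitted in the period at the given availability."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - availability_percent) / 100.0


# Assumed measurement windows: one day, a 30-day month, a 365-day year.
for days in (1, 30, 365):
    minutes = allowed_downtime_minutes(99.0, days)
    print(f"{days:>3} day(s): {minutes:.1f} minutes of downtime allowed")
```

With a 30-day window, 99% works out to 432 minutes, or a little over 7 hours per month.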

Update 1: Server is back online (06:47 AM) after a downtime of 17 minutes.
Update 2: Server is down again (06:49 AM). I'm going to sleep, my dedicated servers will monitor this...
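For anyone wanting to do the same kind of monitoring from another machine, a minimal sketch could look like the one below. It is not the script running on my dedicated servers: the TCP check on port 22, the 30-second polling interval and all function names are assumptions chosen for illustration.

```python
import socket
import time
from datetime import datetime


def is_reachable(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name resolution failed
        return False


def outages(samples):
    """Turn (timestamp, is_up) samples into (down_since, up_again) pairs.

    An outage still in progress at the last sample has up_again set to None.
    """
    result, down_since = [], None
    for ts, up in samples:
        if not up and down_since is None:
            down_since = ts                  # first failed probe of an outage
        elif up and down_since is not None:
            result.append((down_since, ts))  # first good probe after an outage
            down_since = None
    if down_since is not None:
        result.append((down_since, None))
    return result


def monitor(host: str, interval: float = 30.0) -> None:
    """Poll forever and print a line whenever the state flips."""
    last = None
    while True:
        up = is_reachable(host)
        if up != last:
            print(f"{datetime.now():%H:%M:%S} {host} is {'UP' if up else 'DOWN'}")
            last = up
        time.sleep(interval)
```

Calling `monitor()` with the VPS address would print one line per state change. `outages()` can be run afterwards on recorded samples to get the start and end of each outage.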

4 comments:

  1. Discussion with GoDaddy support (I'm stefan):

    Joseph N. - Server Concierge: Please note: All Live Chat sessions are logged and may be monitored for security and quality assurance purposes.

    Thank you for contacting Live Chat support for Virtual and Dedicated Servers. This is Joseph. How can I help you?
    stefan: Morning... my server is not responding. Do you have any idea why?
    Joseph N. - Server Concierge: Lots of things can cause this, tried a reboot yet?
    stefan: no
    stefan: i don't want to reboot my vps
    stefan: it's a network issue
    Joseph N. - Server Concierge: reboot is a first step to resolving unresponsiveness.
    stefan: i'm doing MTR from 3 different locations and all packets are stopping at ip-208-109-115-170.ip.secureserver.net
    stefan: reboot is a first step to resolve everything on windows... on linux i need an explanation :)
    Joseph N. - Server Concierge: That is normal traceroute behavior, we set it like that.
    stefan: not really... about 15 minutes ago MTR was stopping at my server...
    Joseph N. - Server Concierge: did the trace go to complete?
    stefan: 15 minutes ago.. yes
    stefan: now it's up again
    stefan: strange... everything seems ok...
    stefan: it must be a network error, as the VPS has the same uptime, so it wasn't down
    stefan: now it's down again :))
    stefan: are you still there ?
    Joseph N. - Server Concierge: Looking.
    stefan: ok...
    Joseph N. - Server Concierge: The server isn't online, give it a reboot, it usually clears this up; if it does not, let us know in 30 minutes.
    stefan: why reboot and why 30 minutes? :)
    stefan: this is clearly not a server issue... from my point of view
    stefan: let me ask you something else
    stefan: is there any limit of data that i can transfer for 1-2-3-4-5 minutes ?
    stefan: from/to my VPS
    Joseph N. - Server Concierge: There is not.
    stefan: strange...
    stefan: except reboot, do you have other ideas ?
    Joseph N. - Server Concierge: Considering I can't log into anything, not at the moment.
    stefan: but you can login to the dedicated server where the VPS is located, right?
    Joseph N. - Server Concierge: I cannot regrettably.
    stefan: so... you're asking me to do a reboot to a server that you can't access... and then wait 30 minutes? :))
    stefan: this is fun
    Joseph N. - Server Concierge: A reboot fixes most of these issues; if it's still down in 30 minutes, we will send a ticket up for administrator review.
    stefan: ok... can i ask why 30 minutes instead of 5 ?
    Joseph N. - Server Concierge: It's usually a few minutes, sometimes a max of 30.
    stefan: ok... can you reboot my VPS? or should i request a power cycle via web interface ?
    Joseph N. - Server Concierge: I'd do the same thing.
    Joseph N. - Server Concierge: so either way
    stefan: please do it... it's easier to speak with a person than a PC :)
    Joseph N. - Server Concierge: Your power cycle request has been submitted. The power cycle will take up to 30 minutes to complete. If your server is not accessible after this time, please contact us so that we may investigate the issue.
    stefan: great... that's why 30 minutes
    stefan: well, i guess i'll be back if the issue is still present
    stefan: thanks for your time for now ;)
    Joseph N. - Server Concierge: Thanks again for using Live Chat; have a great evening.
    stefan: (it's 7:18 AM here)
    stefan: hava a great evening u2
    stefan: bye

  2. discussion with GoDaddy support - part 2

    Thank you for contacting Live Chat support for Virtual and Dedicated Servers. This is Michaela. How can I help you?
    stefan: Morning :)
    stefan: i have a problem with my VPS starting from this morning
    Michaela Z. - Server Concierge: What is the issue.
    stefan: every time when i SCP some files out of my VPS, after couple of megabytes, my VPS is down...
    stefan: one of your colleagues asked me to reboot it... i did that, but it didn't help
    Michaela Z. - Server Concierge: Are you seeing any errors in the error logs?
    stefan: i asked him if there are any file transfer limits, but he told me not
    stefan: are there any error logs for the SCP transfers?
    stefan: or what logs are you referring to ?
    Michaela Z. - Server Concierge: I am not familiar with that. I was referring to the main error logs for clues as to why the server is becoming unavialbale.
    Michaela Z. - Server Concierge: sorry, "unavailable"
    stefan: no, i didn't check those... i have a script which is transferring the files, so i manage to get access to my VPS for one or two minutes... then it's down again for 5 minutes or so
    stefan: now it's up for example
    Michaela Z. - Server Concierge: If it keeps going up and down I would review your error logs for why it is down.
    stefan: there is nothing in the logs
    stefan: i just checked
    stefan: however, my script returns the following:
    stefan: lost connection
    ssh: ush.ro: Temporary failure in name resolution
    lost connection
    ssh: ush.ro: Temporary failure in name resolution
    stefan: ush.ro - is my dedicated server (which is available)
    Michaela Z. - Server Concierge: Try using the IP in your script.
    stefan: server is down again
    stefan: i configured bind as cache DNS... and all requests are going via it, so there is no DNS issue
    stefan: it's like there is a limit on file transfers on your host
    stefan: and i'm reaching that limit...
    stefan: after 5 minutes or so, the VPS comes up without any issues and it's working just fine
    Michaela Z. - Server Concierge: "Temporary failure in name resolution" suggests issues with DNS. What happens when you try with the IP?
    stefan: i will tell you in 5-10 minutes after the VPS is up again
    stefan: as an idea, if i can transfer 10-20 files to the same host, then i get an error in name resolution and the vps is down... i guess it's not DNS issue

  3. ... continuing due to google message limit...

    stefan: can you ping my VPS ? or access pictures4.net ?
    Michaela Z. - Server Concierge: I am not able to ping it or view the site.
    Michaela Z. - Server Concierge: Has it been coming back up on its own?
    stefan: yes
    stefan: i don't have access to it now... so just waiting is auto-fixing the issue somehow
    stefan: any other ideas?
    Michaela Z. - Server Concierge: I would ensure logging is running and check the logs. Usually when a server goes down and comes back up there is something in the logs.
    stefan: logging is running, because i stopped some services and i could see them there
    stefan: for example: Jan 11 00:13:17 ip-188-121-37-169 saslauthd[12172]: server_exit : master exited: 12172
    stefan: this is normal, as i stopped saslauthd
    Michaela Z. - Server Concierge: Okay.
    stefan: so....
    Michaela Z. - Server Concierge: Can you provide steps to duplicate this issue in a trouble ticket so that we may review further?
    stefan: it's auto replicable... as i have ~2gb of files to transfer from my VPS to my dedicated server
    stefan: just ping the server and you'll see how it's going up and down each couple of minutes
    Michaela Z. - Server Concierge: We need steps to be able to see the cause of it going down.
    Michaela Z. - Server Concierge: Not just that it is down.
    stefan: ok... here are the steps
    stefan: 1. login to the server
    stefan: 2. do as root: screen -r (to see the script output)
    stefan: 3. wait...
    stefan: server is up again
    stefan: i started the script again with the IP instead of the hostname as you suggested... should go down any minute now :)
    Michaela Z. - Server Concierge: To Create a Trouble Ticket
    ...
    stefan: ok :)
    stefan: it's down again :))
    stefan: well... thanks for not asking me to reboot the server
    stefan: i'll open the trouble ticket
    Michaela Z. - Server Concierge: Thanks.
    Michaela Z. - Server Concierge: Is there anything else I can help you with?
    stefan: really anything or only server related? :P
    stefan: no, that's ok... thanks ;)
    Michaela Z. - Server Concierge: Thanks again for using Live Chat; have a great day.
    stefan: take care, bye
    Your session has ended. You may now close this window.

  4. Please wait while we find an agent to assist you.....Thank you for your patience.
    You are currently at position number 1 in the queue. Thank you for your patience.
    You have been connected to Michaela Z. - Server Concierge.
    Michaela Z. - Server Concierge: Please note: All Live Chat sessions are logged and may be monitored for security and quality assurance purposes.

    Thank you for contacting Live Chat support for Virtual and Dedicated Servers. This is Michaela. How can I help you?
    stefan: morning Michaela
    stefan: we just spoke about an hour ago regarding my VPS which keeps going down
    Michaela Z. - Server Concierge: Okay.
    stefan: i created a trouble ticket... actually 2
    stefan: how can i check the status of them?
    Michaela Z. - Server Concierge: They are still open.
    Michaela Z. - Server Concierge: Tickets can take up to 72 hours. Once they are resolved we email you.
    stefan: ahh... ok
    stefan: can i put an update on them?
    Michaela Z. - Server Concierge: If you provide the ticket number and what you wish to have added to them I can update them for you.
    stefan: but there is no way for me to actually update them...
    stefan: i was just asking :)
    stefan: well... it seems that i just have to wait until it's solved...
    Michaela Z. - Server Concierge: You cannot update them.
    Michaela Z. - Server Concierge: I am sorry.
    stefan: no worries... i'm sure it's not your fault :)
    stefan: is my 99% uptime still guaranteed ?
    Michaela Z. - Server Concierge: We only offer 99% uptime on our network not on the actual server.
    stefan: can you elaborate on that ?
    Michaela Z. - Server Concierge: You have access to the server and can make configuration changes. As we do not have sole control of the environment, we cannot guarantee that it will not go down. However we can guarantee our network.
    stefan: but now i don't have access to my server
    stefan: and i can't make configuration changes...
    Michaela Z. - Server Concierge: Is your server currently down?
    stefan: yes
    stefan: for exactly 15 minutes...
    stefan: if you check my 2nd trouble ticket, you will see that it stays up for couple of minutes... then it goes down for 15 minutes
    stefan: this happens only when i have my sync process started
    stefan: now just went up
    Michaela Z. - Server Concierge: And it keeps coming back up and then the sync process causes it to go down again.
    stefan: exactly... why would a SCP command kill one server for exactly 15 minutes ?
    Michaela Z. - Server Concierge: As I advised before you can review the logs or allow us to review the server on the ticket. After it comes back up if you run the "uptime" command what do you get?
    stefan: i've added a 5 seconds delay between each transfer... i'm wondering if it will help
    stefan: up 2:41... but this is not relevant
    Michaela Z. - Server Concierge: I am trying to determine if the server is actually going down or just not available.
    stefan: just not available... like the network is down (but only between my VPS and the master host)
    Michaela Z. - Server Concierge: Okay. Once we have more information we will email you on your ticket.
    stefan: ok... there are 9,7 more megabytes to be transferred
    stefan: very small files... so this should happen in couple of minutes
    stefan: i guess after the SCP is finished, the issue will disappear
    stefan: but if i put my website in production... then i can't afford 15 minutes of downtime each 2-5 minutes
    Michaela Z. - Server Concierge: We will look into the issue with the trouble ticket.
    stefan: ok
    stefan: well thanks for now
    stefan: i'll be back in case of other issues
    stefan: take care
    Michaela Z. - Server Concierge: Thanks again for using Live Chat; have a great day.
    Your session has ended. You may now close this window.
