Affecting Other - Network
We received notification from our internal monitoring several minutes ago of a network issue in our Allentown, Pennsylvania location. We are currently failing over to our backup circuit, which will route traffic through our Liberty Lake, Washington location. Until the issue is resolved, additional latency may be experienced since traffic is now being hauled over a non-ideal path while the issue with the local circuit is under review.
UPDATE: The issue was upstream and has since been resolved. We're routing traffic as normal again, so latency and speeds should return to normal. Closing this issue out.
Affecting System - EU-NL-08
UPDATE 11/24/2023:
This has occurred again so we are having the datacenter replace the power cable and PDU port that this server is connected to. Will update shortly.
UPDATE:
The PDU cable has been replaced and run to another PDU within the same rack. All VMs are coming back up right now.
----
The EU-NL-08 VPS node was reported by internal monitoring as offline. Upon review, it appeared to be powered off at the datacenter. It was promptly powered back on, and each VPS is now coming back online.
We are investigating this further, including reaching out to the datacenter to have them review their end. Internal monitoring on our end shows no hardware concerns or resource constraints that would force a physical node to power off. At a glance, we observe no high temperatures, CPU load sits at about 35% under normal constant use, and there are no storage or array alerts.
Will report back when there is more.
UPDATE:
The iDRAC logs didn't show any sort of power outage, and the PDU port that the server is connected to only shows a 'power on' entry (us powering the server back on) in its logs, no 'power off'. Datacenter staff checked to make sure the power cable was firmly connected, and it is. We may schedule a PDU cable replacement and have it run to a different PDU in the same rack in case either of these is or was defective.
Affecting Server - nl-01.incoghost.com
Be advised that we are migrating shared hosting clients in the Netherlands to new hardware. This is part of a larger network and infrastructure upgrade in this location. We will post any related updates regarding this migration here.
EDIT: Migration is complete, with our monitoring showing 1 hour and 5 minutes of downtime.
Affecting Other - Network
Hello,
We've experienced a couple of short outages in our US-West (Liberty Lake, Washington) location today. Our datacenter is currently replacing some network hardware, as this issue is at their core and is impacting multiple customers, not just us. We will update this once we have received an incident report from them.
From our datacenter:
INCIDENT REPORT
Description: Router hardware issues and emergency repairs on October 13, 2023 at approximately the following times:
- 11:03PST until 11:04PST
- 15:12PST until 15:13PST
- 16:12PST until 16:24PST
- 16:28PST until 16:32PST
Root cause: Our master routing engine within edge1.LBLKWA experienced an unplanned reboot, accompanied by an unusual error code at 11:03PST. Subsequently, the backup routing engine assumed control, and by 11:04PST network traffic resumed following the re-establishment of BGP sessions. To diagnose and resolve this unexpected issue, our NOC immediately engaged with a Juniper JTAC representative. The advice received was to pursue two critical actions: a firmware upgrade on both routing engines and the replacement of the problematic routing engine and control board. Notably, the error message exhibited by the router was abnormal and had not been seen by JTAC.
Resolution: Our NOC diligently executed the recommended measures, which included the firmware upgrades and the replacement of the malfunctioning routing engine and additional control board in our core router. During these procedures, there was a series of brief outages as the mastership role transitioned between the master and backup routing engines, leading to momentary disruptions in BGP sessions which subsequently re-established. After the repairs were completed network traffic stability and core redundancy within our edge infrastructure were re-established.
We understand that this incident caused inconveniences to our customers, and we do sincerely apologize for the disruptions it caused. We will immediately restock our spare/replacement critical hardware and continue to actively monitor for any signs of trouble.
Affecting System - DNS (NS1/NS2)
On 06/28/2023 both the NS1 and NS2 nameserver failed, resulting in a DNS outage for our shared web hosting customers in all locations.
Our response to this was delayed due to a mishap with an internal notification system.
When discovered, it appeared that both of the individual nameservers used by our shared hosting customers had run out of disk space. The level of monitoring used on our other in-production servers, which would typically alert us of this, had not been applied to these nameservers; the monitoring we had in place only checked that their IPs were responding to ping (which they both were).
Once this was resolved, we re-synced the DNS records/zones from the individual shared hosting servers in the US and EU which allowed domain names to resolve again.
We've added additional monitoring and safeguards to prevent this from occurring again, as well as created new monitoring rules for alerts.
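The gap described above was a ping check passing while the disks silently filled. A minimal sketch of the kind of disk-usage check that closes that gap, using only Python's standard library (the path and 90% threshold are illustrative assumptions, not our actual configuration):

```python
import shutil

def disk_usage_percent(path="/"):
    """Return used disk space as a percentage for the given mount point."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100

def check_disk(path="/", threshold=90.0):
    """Return an alert message when usage crosses the threshold, else None."""
    pct = disk_usage_percent(path)
    if pct >= threshold:
        return f"ALERT: {path} is {pct:.1f}% full (threshold {threshold}%)"
    return None
```

A check like this, run on a schedule and wired into the alerting system, catches a full disk long before a service that depends on writable storage (such as a DNS server) fails while still answering ping.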
Impacted customers can contact us for service credit.
The NL-01 VPS node in the Netherlands is currently down. We are working with the datacenter to bring it back, as it's not responding to remote commands. Internal monitoring doesn't indicate a hardware failure of any sort at this time; however, we will continue to investigate and will bring this back up as soon as possible.
Update 1: Internal monitoring showed a large spike in CPU temperature right before the server went offline. The datacenter is reviewing it now and will take appropriate action. More updates to come as they're known to us.
Update 2: Datacenter states the CPU fan has failed and are replacing it. Should be resolved soon.
Update 3: The datacenter has replaced the CPU cooler, the VPS node has been booted back up and is now once again reachable. We'll continue to monitor for any signs of potential failure or overheating.
Affecting Other - CDA Network
We are aware of a service-impacting issue in our Idaho, USA location and are investigating it.
Affecting System - VPS Control Panel
We're aware of extremely slow OS installs/reinstalls on most VPS nodes. OS templates are stored on the master server, which is hosted outside our network with a third party, and that network is experiencing very poor conditions right now. This may delay or prevent OS reinstalls for VPS customers.
We host some of our public-facing infrastructure outside of our own network so that not all eggs are in the same basket. Things like our website mirrors, portal, VPS control panel, and DNS cluster are spread out so that in the event of network issues on our end, the main points of contact and access aren't impacted.
We will be re-doing the setup in the near future, but for now, we wait for them to fix their network issues.
EDIT: It's not fully resolved, but greatly improved. The issue is on Lumen's network and we have faith that full performance will be restored soon.
Affecting Other - IDAHO USA KVM VPS
We are aware of network issues impacting the stability of connectivity in our Idaho, USA location. On Monday (01/30/2023) we will be replacing a switch to correct the issues currently being experienced.
Affecting System - VPS Control Panel
Be advised that we are undergoing some planned maintenance of the customer facing VPS control panel. This upgrade moves the system onto dedicated hardware for some increased performance as well as on a network offering superior protection against DDoS attacks. This update also allows us to increase the number of offered OS templates and ISOs available for your use.
During this period you may encounter issues logging into your VPS control panel or see SSL errors. This should all be resolved soon; most of it is caused by DNS propagation.
Contact us via the helpdesk if you encounter any issues.
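While DNS propagates, different resolvers may still return the old address for the panel's hostname, which explains the login and SSL errors above. A quick way to see what your own resolver currently returns for any hostname (a sketch using Python's standard library):

```python
import socket

def resolved_ips(hostname):
    """Ask the local resolver for the IPv4 addresses of the given hostname."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # getaddrinfo returns (family, type, proto, canonname, sockaddr) tuples;
    # sockaddr[0] is the IP address string.
    return {info[4][0] for info in infos}
```

If the set returned for the panel's hostname still contains the old server's IP, your resolver simply hasn't picked up the change yet; waiting for the record's TTL to expire usually resolves it.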
Affecting Other - NL-02
UPDATE: 3:20PM EST
It would appear at this time that replacing the motherboard did the trick. The node has been stable since the emergency maintenance, and as such this issue is being closed. We are still monitoring the situation and plan to have additional capacity within the next 24 hours to accommodate migrations if need be.
Thanks for your patience if you were impacted by this today, or previous stability issues on the NL-02 node in the past month.
UPDATE: 11:42AM EST
Within the last hour the motherboard has been replaced, and early observations show things have stabilized. All VMs are online, no data has been lost, and we are still monitoring the node for further issues. Furthermore, we will have additional capacity within the next 24 hours to accommodate a migration of all containers should the motherboard replacement not solve the issue.
Not exactly how we wanted to kick off the New Year. ( -_-)
(Downgrading priority to Medium)
09:54AM EST
The Netherlands-based KVM host node NL-02 is experiencing service-impacting issues. We are awaiting the delivery and setup of a new hardware node and will undergo emergency maintenance to bring the server to a stable condition. In previous weeks we completed a thorough checklist of tests, including replacing the RAM, yet the issue persists. Datacenter staff are scheduling a replacement of the motherboard.
Right now the physical hardware node is rebooting sporadically which is preventing emergency migration efforts to other available nodes. Once service is stabilized we will contact impacted customers for options on moving forward, which include migrating to a new host node.
Thank you for your patience.
Affecting System - nl-02.incogvps.com
UPDATE: 12/09/2021: Scheduling a memtest tonight which will cause some expected downtime. Will share results when the test is complete.
UPDATE: 12/08/2021: A review and test of the hardware has shown no obvious defects. A new hardware node will become available in the next week, and we may migrate all VMs from NL-02 to the new node. We are reviewing options with our server provider to possibly replace the PSU and do a more thorough check of the physical hardware after we have migrated customers off of it. Thank you for your patience.
On 12/04/2021 we rebooted the NL-02 host node after a period of software maintenance to diagnose an issue causing the containers on the host node to reboot every few days. After review, it appeared the kernel was crashing, so some updates were applied that we believed would fix this issue and we scheduled a reboot to apply the changes in hopes it would resolve the concern moving forward.
On 12/07/2021, the issue appeared to be unresolved. We have scheduled a full hardware check on the host node that will require some planned, scheduled downtime.
This scheduled hardware check will occur tomorrow, 12/08/2021, sometime after 8AM Eastern Standard Time. The estimated duration is 30 - 120 minutes.
Affecting System - nl-02.incogvps.com
Please be advised that we will be performing a scheduled reboot of the NL-02 VPS node to correct an issue that was causing containers on the node to reboot unexpectedly. This scheduled reboot of the hardware node should correct the issue, with no further unexpected reboots of the virtual containers expected going forward.
UPDATE:
We will continue to monitor the node to ensure that the updates prevent the original issue that this maintenance period aims to correct.
Affecting System - VPS Control Panel
UPDATE: Maintenance has been re-scheduled to a future date and has not yet been completed. We will re-announce when this is to be completed.
When: 09/30/2021
Duration: 2 hours~
Why?:
In an effort to constantly improve our ability to serve you we are migrating our VPS Control panel ( https://control.incogvps.com:4083 ) to a new server. This new server will allow us to:
Potential Issues:
Affecting Other - DNS
We were alerted of an issue with our PowerDNS setup, which syncs user-submitted rDNS/PTR records for VPS hostnames from the Virtualizor Control Panel to our small DNS cluster. We believe the issue is now resolved.
Some users may be required to recreate their rDNS records via their control panel.
Customers using our Finland location will still need to ticket us for manual creation of their rDNS records. We own the IPs in our Netherlands location, which allows us to automate the process, whereas in Finland they're leased. Shoot us a ticket and we'll be happy to get you set up.
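After recreating an rDNS record, you can verify it from your own machine. A minimal sketch using Python's standard library (192.0.2.1 is a documentation-range placeholder, not one of our IPs): a PTR record for an IPv4 address lives under a reversed name in the `in-addr.arpa` zone, and `gethostbyaddr` performs the reverse lookup.

```python
import ipaddress
import socket

def ptr_name(ip):
    """Return the in-addr.arpa / ip6.arpa name where the PTR record for ip lives."""
    return ipaddress.ip_address(ip).reverse_pointer

def lookup_ptr(ip):
    """Resolve the PTR record for ip via the local resolver, or None if unset."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        return None
```

For example, `ptr_name("192.0.2.1")` yields `1.2.0.192.in-addr.arpa`, and `lookup_ptr` on your VPS IP should return the hostname you submitted once the record has synced and propagated.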
Affecting Other - VPN Network
A configuration change on our behalf temporarily made this location unavailable. The issue was reported immediately via automatic alerting, and it was found, corrected, and resolved in about 10 minutes. During this time the VPN location/node was unavailable. Service has returned to normal.