tag:cloudgov.statuspage.io,2005:/historycloud.gov Status - Incident History2024-03-26T15:59:52-04:00cloud.govtag:cloudgov.statuspage.io,2005:Incident/199404332024-03-26T15:59:52-04:002024-03-26T15:59:52-04:00Out of memory issues in running and staging apps<p><small>Mar <var data-var='date'>26</var>, <var data-var='time'>15:59</var> EDT</small><br><strong>Resolved</strong> - Since rolling out stemcell version 1.404 to the platform last week, we have received no further reports of out of memory issues and our own internal metrics show a decline in these errors, so this incident is resolved.<br /><br />If you are still experiencing issues with your applications, please contact support@cloud.gov.</p><p><small>Mar <var data-var='date'>19</var>, <var data-var='time'>10:42</var> EDT</small><br><strong>Update</strong> - After further testing and debugging, members of the CloudFoundry community were able to isolate the cause of the "out of memory" issues to an incompatibility between Linux cgroups v1 and version 6.5 of the Linux kernel, both of which were used by the latest stemcells. cgroups are a process isolation mechanism often used to manage container processes, including customer applications running on cloud.gov.<br /><br />To fix the out of memory issues, the CloudFoundry community has released a new stemcell version, 1.404, which uses version 5.15 of the Ubuntu Jammy kernel. 
Version 5.15 is the long-term supported release of Ubuntu Jammy (https://ubuntu.com/about/release-cycle#ubuntu-kernel-release-cycle), so this release will continue to receive security patches and other fixes.<br /><br />We are rolling out stemcell version 1.404 to the platform today and expect to see a reduction in memory use across our platform, including the possible resolution of all "out of memory" issues for customer applications.<br /><br />Even though we only expect these changes to benefit our platform and our customers, we will still be closely monitoring our platform for stability as we roll out these changes. If you experience any issues, don't hesitate to contact us at support@cloud.gov.</p><p><small>Feb <var data-var='date'>20</var>, <var data-var='time'>12:04</var> EST</small><br><strong>Update</strong> - Since the changes that were deployed last week to increase the number of VMs available to host customer applications and to double the amount of memory available for staging operations, customers have reported a reduction in the frequency of "out of memory" issues, but are still experiencing them.<br /><br />The cloud.gov team has continued to investigate the cause of these issues. After consulting with the CloudFoundry community, we believe that these issues may be caused by faulty memory allocation in the Linux kernel which is built-in to the stemcells for CloudFoundry VMs. This GitHub issue is being used to track investigation and resolution of the stemcell memory issues: https://github.com/cloudfoundry/bosh-linux-stemcell-builder/issues/318<br /><br />One of the recommendations from the community to resolve the "out of memory" issues was to roll back to stemcell version 1.340, however every stemcell release includes fixes for a number of CVEs (https://github.com/cloudfoundry/bosh-linux-stemcell-builder/releases), so rolling back would expose the platform and our customers to CVEs that are patched in the current stemcell version. 
At this time, cloud.gov does not plan to roll back our stemcell version given the potential security risk.<br /><br />Another recommendation from the community was to increase the amount of memory available for staging, which we did last week when we increased the value from 1024 MB to 2048 MB, but customers continue to experience issues.<br /><br />At this point, the plan for mitigation is to pursue ad hoc memory increases for applications that are still experiencing issues until a fix for the kernel/stemcell issue is released from upstream and can be deployed to our platform.<br /><br />If your applications are still experiencing issues, please contact us at support@cloud.gov so we can work to resolve them for you.<br /><br />Thank you for being a cloud.gov customer!</p><p><small>Feb <var data-var='date'>13</var>, <var data-var='time'>14:37</var> EST</small><br><strong>Update</strong> - The cloud.gov team is continuing to investigate the causes of "out of memory" errors that are being seen for some customer applications.<br /><br />In order to address these errors, at approximately 10:48 AM ET, the cloud.gov team deployed two changes to our production environment:<br /><br />- Increased the number of VMs available to host customer applications<br />- Doubled the amount of memory available for staging applications from 1024 MB to 2048 MB<br /><br />Customers experiencing "out of memory" errors for their applications should try restaging their applications via 'cf restage' or 'cf restage --strategy rolling' to see if the issue is resolved. <br /><br />Please contact support@cloud.gov if you have further questions or concerns.</p><p><small>Feb <var data-var='date'>12</var>, <var data-var='time'>13:40</var> EST</small><br><strong>Update</strong> - Due to an on-going security incident, we have temporarily paused internal cloud.gov platform deployments. 
This pause will continue to impact the time to resolution for the Out Of Memory (OOM) issue that we are still addressing. We are continuing to mitigate the OOM issue in the meantime. <br /><br />Please reach out to support@cloud.gov if you are experiencing issues with your applications and we will assist you with mitigations while we work to resolve these incidents.</p><p><small>Feb <var data-var='date'> 8</var>, <var data-var='time'>18:26</var> EST</small><br><strong>Monitoring</strong> - The cloud.gov team has deployed a fix and is monitoring the result. Customers whose applications are failing with memory-related errors should restage their applications with `cf restage` or `cf restage --strategy rolling` and reach out to cloud.gov support via support@cloud.gov if they continue to experience errors.</p><p><small>Feb <var data-var='date'> 8</var>, <var data-var='time'>12:25</var> EST</small><br><strong>Update</strong> - We believe the OOM errors may be caused by a bug in the latest stemcell version pushed to production on January 30. We are deploying an updated version which contains a fix. Deployment is expected to complete after east coast close of business. We will monitor the rollout and post updates as we have them.</p><p><small>Feb <var data-var='date'> 8</var>, <var data-var='time'>10:54</var> EST</small><br><strong>Investigating</strong> - Some applications on cloud.gov have been experiencing intermittent out-of-memory errors while staging or running on the platform starting on January 30. The cloud.gov team is investigating the issue. 
For apps experiencing OOM errors, 'cf restage' or 'cf restage --strategy rolling' may temporarily resolve the issue.</p>tag:cloudgov.statuspage.io,2005:Incident/201971082024-03-08T19:35:28-05:002024-03-13T10:44:40-04:00Log cache component not returning logs<p><small>Mar <var data-var='date'> 8</var>, <var data-var='time'>19:35</var> EST</small><br><strong>Resolved</strong> - The log cache system has been updated with the renewed certificate. Our testing indicates that real-time logs can now be successfully retrieved using the "cf logs" CLI commands. <br /><br />As with all incidents, the cloud.gov team will conduct a post-mortem analysis of this incident in the coming days and post our findings here as an update. <br /><br />Thank you for being a cloud.gov customer!</p><p><small>Mar <var data-var='date'> 8</var>, <var data-var='time'>17:23</var> EST</small><br><strong>Update</strong> - We have renewed the certificate for the log cache component and we have started a full redeployment of our production system to apply the renewed certificates to the log cache.<br /><br />It may take several hours for the renewed certificate to roll out through the system, but we will post an update once we can confirm the updated certificate has been applied.</p><p><small>Mar <var data-var='date'> 8</var>, <var data-var='time'>17:01</var> EST</small><br><strong>Identified</strong> - We have received reports from customers that using "cf logs" CLI command to retrieve logs from their applications is either not working or not showing recent logs.<br /><br />Customers have confirmed that real-time logs are still being received in the customer logs Elasticsearch/Kibana instance at https://logs.fr.cloud.gov and are being sent correctly through log drains.<br /><br />Our team has already identified the possible cause of this issue as an expired certificate for the Log Cache component, which is the component that the "cf logs" CLI command uses to retrieve logs. 
The certificate expired at approximately 1:18 PM ET. We are working to remediate the issue.</p>tag:cloudgov.statuspage.io,2005:Incident/192266972024-03-01T00:00:56-05:002024-03-01T00:00:56-05:00AWS RDS deprecating MySQL 5.7.43 - 5.7.44 - Upgrade your databases<p><small>Mar <var data-var='date'> 1</var>, <var data-var='time'>00:00</var> EST</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Feb <var data-var='date'>29</var>, <var data-var='time'>00:00</var> EST</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Nov <var data-var='date'>27</var>, <var data-var='time'>09:40</var> EST</small><br><strong>Scheduled</strong> - AWS RDS is deprecating support for MySQL 5.7, versions 5.7.43 - 5.7.44: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/MySQL.Concepts.VersionMgmt.html.<br /><br />Please see our website for guidance on how to upgrade your brokered MySQL 5.7 databases to a supported version: https://cloud.gov/2023/06/05/aws-ending-support-mysql-57/.<br /><br />Any databases that you do not upgrade will get auto-upgraded by AWS in the next maintenance window.<br /><br />If you have any questions or concerns, please contact support@cloud.gov.</p>tag:cloudgov.statuspage.io,2005:Incident/200409972024-02-29T09:43:55-05:002024-02-29T09:43:55-05:00Cloud Foundry Database Upgrade<p><small>Feb <var data-var='date'>29</var>, <var data-var='time'>09:43</var> EST</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Feb <var data-var='date'>29</var>, <var data-var='time'>09:00</var> EST</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. 
We will provide updates as necessary.</p><p><small>Feb <var data-var='date'>21</var>, <var data-var='time'>14:37</var> EST</small><br><strong>Scheduled</strong> - We’re planning routine maintenance for the cloud.gov API to upgrade the underlying database. <br /><br />This DOES NOT impact your user-facing applications. All applications and their databases will continue to run as normal.<br /><br />During this maintenance window, any developer requests that use the cloud.gov API will not work, including:<br /><br />* CF command-line interface (CLI) commands<br />* cloud.gov dashboard actions<br />* cloud.gov API requests<br /><br />We will send out a notice once the upgrade is complete and developer requests are functional again.<br /><br />If you have any questions or concerns, please contact us at support@cloud.gov.</p>tag:cloudgov.statuspage.io,2005:Incident/200319372024-02-27T13:00:34-05:002024-02-27T13:00:34-05:00cloud.gov Pages builds caching error for Jekyll sites<p><small>Feb <var data-var='date'>27</var>, <var data-var='time'>13:00</var> EST</small><br><strong>Resolved</strong> - Builds are proceeding as normal. If you experience issues building Jekyll with the error "libffi.so.7: cannot open shared object file", please contact pages-support@cloud.gov for resolution.</p><p><small>Feb <var data-var='date'>20</var>, <var data-var='time'>14:25</var> EST</small><br><strong>Monitoring</strong> - cloud.gov Pages customers using Jekyll may be experiencing failing builds due to a caching error on our application. We have identified the fix and are resetting the cache for affected sites. 
Please contact us if you are experiencing an unexpected Jekyll build error</p>tag:cloudgov.statuspage.io,2005:Incident/199504902024-02-21T09:26:56-05:002024-02-21T09:26:56-05:00Temporary pause for internal cloud.gov platform deployments due to security incident<p><small>Feb <var data-var='date'>21</var>, <var data-var='time'>09:26</var> EST</small><br><strong>Resolved</strong> - This incident has been resolved and all impacted internal development tools are operating as expected.</p><p><small>Feb <var data-var='date'>12</var>, <var data-var='time'>13:23</var> EST</small><br><strong>Update</strong> - cloud.gov is continuing to manage a security incident. This incident does not impact platform availability or customer access to cloud.gov. We do not believe at this time that this security incident impacts customer data or application security posture. <br /><br />We are working to resume normal internal pipeline activities. Until fully resolved, this will continue to impact the time to resolution for the Out Of Memory (OOM) issue that we are still addressing. We are continuing to work to fix the OOM issue in the meantime. <br /><br />Please reach out to support@cloud.gov if you are experiencing issues with your applications and we will assist you with mitigations while we work to resolve these incidents.</p><p><small>Feb <var data-var='date'> 9</var>, <var data-var='time'>13:47</var> EST</small><br><strong>Monitoring</strong> - Cloud.gov has become aware of and is managing a security incident which is not impacting platform availability or customer access to cloud.gov. We do not believe at this time that this security incident impacts customer data or application security posture. <br /><br />As a security measure, we have temporarily paused internal cloud.gov platform deployments. This pause will impact the time to resolution for the Out Of Memory (OOM) issue that we are still addressing. We are investigating additional mitigations for the OOM issue in the meantime. 
<br /><br />Please reach out to support@cloud.gov if you are experiencing issues with your applications and we will assist you with mitigations while we work to resolve these incidents.</p>tag:cloudgov.statuspage.io,2005:Incident/192266872024-01-17T00:00:08-05:002024-01-17T00:00:08-05:00AWS RDS deprecating MySQL 5.7.37 - 5.7.42 - Upgrade your databases<p><small>Jan <var data-var='date'>17</var>, <var data-var='time'>00:00</var> EST</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Jan <var data-var='date'>16</var>, <var data-var='time'>00:01</var> EST</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Nov <var data-var='date'>27</var>, <var data-var='time'>09:39</var> EST</small><br><strong>Scheduled</strong> - AWS RDS is deprecating support for MySQL 5.7, versions 5.7.37 - 5.7.42: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/MySQL.Concepts.VersionMgmt.html.<br /><br />Please see our website for guidance on how to upgrade your brokered MySQL 5.7 databases to a supported version: https://cloud.gov/2023/06/05/aws-ending-support-mysql-57/.<br /><br />Any databases that you do not upgrade will get auto-upgraded by AWS in the next maintenance window.<br /><br />If you have any questions or concerns, please contact support@cloud.gov.</p>tag:cloudgov.statuspage.io,2005:Incident/191805212024-01-11T10:29:54-05:002024-01-11T10:29:54-05:00Limited support for cloud.gov<p><small>Jan <var data-var='date'>11</var>, <var data-var='time'>10:29</var> EST</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Nov <var data-var='date'>20</var>, <var data-var='time'>10:07</var> EST</small><br><strong>Scheduled</strong> - With many cloud.gov team members taking time off for the holiday season between December 25, 2023 and January 1, 2024, customers may experience slower response times when 
contacting support@cloud.gov. Customers should expect responses to their emails within 1 business day. <br /><br />Happy holidays!</p>tag:cloudgov.statuspage.io,2005:Incident/196790732024-01-09T11:00:00-05:002024-01-11T12:30:20-05:00Failures for application rolling restarts<p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>11:00</var> EST</small><br><strong>Resolved</strong> - On January 9 around 11 AM ET, the cloud.gov team began deploying an update to CloudFoundry, which is the underlying technology for our platform. About an hour later at 12 PM ET, some customers who were attempting to do a rolling restart of their applications began reporting failures.<br /><br />After some investigation, the cloud.gov team determined that there was a bug in the part of the CloudFoundry release that updates database schemas, which in turn was causing application rolling restarts to fail. <br /><br />Once the team recognized the issue, they rolled back our deployment of CloudFoundry to the previous stable version. Customers reported that application restarts had stabilized around 2 PM ET.<br /><br />As an aside, while cloud.gov does have development and staging environments where platform updates are tested before pushing to production, this bug was not caught in those environments because there are no applications that do rolling restarts in those environments.<br /><br />As a follow-up, the cloud.gov team reported this deployment bug to the CloudFoundry team and they have opened an issue to resolve it: https://github.com/cloudfoundry/cloud_controller_ng/issues/3592. 
The cloud.gov team will keep our deployment of CloudFoundry using the latest known stable version until this bug is fixed.<br /><br />As always, we apologize for the inconvenience and we will work to ensure a similar incident does not recur in the future.<br /><br />Thanks for being a cloud.gov customer!</p>tag:cloudgov.statuspage.io,2005:Incident/195001352024-01-09T07:30:25-05:002024-01-09T07:30:25-05:00Database upgrades for cloud.gov Pages<p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>07:30</var> EST</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>07:00</var> EST</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Dec <var data-var='date'>21</var>, <var data-var='time'>13:13</var> EST</small><br><strong>Scheduled</strong> - We will be upgrading the cloud.gov Pages database for general maintenance to the platform. There will be some downtime while the database goes through the upgrade process. We anticipate roughly 5 to 10 minutes and it will only affect the platform application and any running site builds. <br /><br />This will NOT affect any live customer sites.</p>tag:cloudgov.statuspage.io,2005:Incident/193600222023-12-19T10:00:15-05:002023-12-19T10:00:15-05:00Deprecation of node v16 for Pages site builds<p><small>Dec <var data-var='date'>19</var>, <var data-var='time'>10:00</var> EST</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Dec <var data-var='date'>19</var>, <var data-var='time'>09:00</var> EST</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. 
We will provide updates as necessary.</p><p><small>Dec <var data-var='date'> 6</var>, <var data-var='time'>18:32</var> EST</small><br><strong>Scheduled</strong> - As part of our ongoing reliability and stability enhancements to cloud.gov Pages, use of node v16 will be deprecated as of December 19, 2023. Builds that specify an unsupported version in .nvmrc will fail with the following message in the build logs:<br /><br />"Unsupported node major version specified in .nvmrc. Please upgrade to LTS major version 18 or 20, see https://nodejs.org/en/about/releases/ for details."<br /><br />In addition, the default node version, when one is not specified in .nvmrc, will be v18.<br /><br />More information is available in our documentation at https://cloud.gov/pages/documentation/node-on-pages/.<br /><br />If you have any questions, please contact pages-support@cloud.gov.</p>tag:cloudgov.statuspage.io,2005:Incident/189927822023-11-01T15:21:00-04:002023-11-03T15:51:49-04:00Partial outage for CDN-based traffic<p><small>Nov <var data-var='date'> 1</var>, <var data-var='time'>15:21</var> EDT</small><br><strong>Resolved</strong> - Customers whose traffic passes through a CloudFront CDN may have experienced intermittent failures when trying to load their applications or webpages. <br /><br />The cause of the problem was an incorrect header value name used when applying rate limits to traffic. The problem has been resolved and customers using a CDN should no longer see degraded performance or failures from their applications.</p>tag:cloudgov.statuspage.io,2005:Incident/189897752023-11-01T13:13:09-04:002023-11-01T13:13:09-04:00Email to cloud.gov support may be delayed<p><small>Nov <var data-var='date'> 1</var>, <var data-var='time'>13:13</var> EDT</small><br><strong>Resolved</strong> - This issue is resolved. 
Since we didn't observe any impact to customer service, we will not be providing further updates.</p><p><small>Nov <var data-var='date'> 1</var>, <var data-var='time'>11:15</var> EDT</small><br><strong>Update</strong> - We are receiving email from external domains, but we have not received an "all clear" from GSA IT, so we don't have evidence that the issue is resolved.<br /><br />What seems most expedient in this case: If your email to support@cloud.gov has not received a response today, please call (redacted) to reach a member of our team.</p><p><small>Nov <var data-var='date'> 1</var>, <var data-var='time'>10:12</var> EDT</small><br><strong>Monitoring</strong> - GSA IT has informed us they are working on an issue causing emails from external (non-GSA) senders to be delayed up to several hours before reaching gsa.gov inboxes. <br /><br />We are currently receiving some email to support@cloud.gov, but we are not able to determine if we are receiving all mail sent to us in a timely manner.<br /><br />If you've attempted to reach cloud.gov, you may want to try again. If this issue persists for another hour, we will provide alternative means for reaching our support team.</p>tag:cloudgov.statuspage.io,2005:Incident/189480102023-10-27T11:30:00-04:002023-11-01T17:10:09-04:00Cloud.gov and api.fr.cloud.gov Outage<p><small>Oct <var data-var='date'>27</var>, <var data-var='time'>11:30</var> EDT</small><br><strong>Resolved</strong> - From approximately 11:34 AM ET to 1:38 PM ET, while attempting to mitigate previous DDoS attacks, new WAF rules were added to the platform load balancers. This caused some traffic targeting api.fr.cloud.gov to be blocked. An additional change at 1:34 PM ET caused access to a majority of the platform to be blocked until 1:38 PM ET.<br /><br />The outage was resolved when the WAF rule changes were reverted and deployed into production at 1:38 PM EDT. 
<br /><br />Timeline:<br />11:31 AM ET: An internal cloud.gov tool began failing and alerting the platform team to investigate the failure.<br /><br />12:15 PM ET: A small subset of cloud.gov customers connecting to api.fr.cloud.gov from within the platform began to notice failures to connect.<br /><br />1:30 PM ET: The platform team began to investigate the latest changes to the WAF rules as a possible problem.<br /><br />1:35 PM ET: Customers notified us they could no longer access cloud.gov or api.fr.cloud.gov.<br /><br />1:38 PM ET: The WAF rule changes were reverted and platform functionality was restored.<br /><br />Update to this incident - After this notice was posted, some additional customers notified us that access to their applications was lost across a large portion of the platform and was later restored. This happened during the 1:34 to 1:38 PM EDT window.</p>tag:cloudgov.statuspage.io,2005:Incident/189025732023-10-21T20:30:00-04:002023-11-01T14:34:37-04:00Some customers received HTTP 5XX errors accessing their applications<p><small>Oct <var data-var='date'>21</var>, <var data-var='time'>20:30</var> EDT</small><br><strong>Resolved</strong> - The cloud.gov platform experienced a burst of traffic which caused some customers to receive HTTP 5XX errors while trying to access their applications. This occurred on 10/21 from 18:30 to 19:30 and again from 20:00 to 22:00 EDT.<br /><br />At this time the cloud.gov support team has identified the issue, is monitoring for future events, and is working on implementing additional solutions into production to mitigate future events. 
The platform is now fully available. During the event no customer applications were down; a subset of applications was simply not accessible from the internet.<br /><br />If you have any questions or concerns, please reach out to cloud.gov support at support@cloud.gov.</p>tag:cloudgov.statuspage.io,2005:Incident/188576592023-10-19T13:30:00-04:002023-10-19T15:43:37-04:00Some customers receiving HTTP 5XX errors accessing their applications<p><small>Oct <var data-var='date'>19</var>, <var data-var='time'>13:30</var> EDT</small><br><strong>Resolved</strong> - Today the cloud.gov platform experienced a burst of traffic which caused some customers to receive HTTP 5XX errors while trying to access their applications. This occurred between 13:23 and 13:25 EDT today, October 19th.<br /><br />At this time the cloud.gov support team has identified the issue, is monitoring for future events, and is working on implementing a solution into production to mitigate future events. The platform is now fully available. During the event no customer applications were down; a subset of applications was simply not accessible from the internet.<br /><br />If you have any questions or concerns, please reach out to cloud.gov support at support@cloud.gov.</p>tag:cloudgov.statuspage.io,2005:Incident/189267962023-10-13T17:00:00-04:002023-10-25T15:45:09-04:00Brokered S3 credentials written to internal logs<p><small>Oct <var data-var='date'>13</var>, <var data-var='time'>17:00</var> EDT</small><br><strong>Resolved</strong> - What happened?<br /><br />On Friday, October 13, a cloud.gov engineer found that the cloud.gov S3 broker, which manages the S3 buckets that customers broker using the platform, was printing AWS secret access keys in clear text to its logs. 
Specifically, whenever a customer created a service key or created a binding between an S3 service instance and one of their applications, the newly created key was written to the broker log and indexed by our log management system.<br /><br />How has cloud.gov responded?<br /><br />We have modified the S3 broker to stop printing these credentials and we have removed all log lines containing the keys from our active log indices. However, the log lines will remain in our long-term log archives due to our retention obligations.<br /><br />We emailed the managers for all organizations on cloud.gov on October 17 to notify them of the incident. Per our incident response process, we are cross-posting the information to Statuspage now so we can publish a public postmortem.<br /><br />What is the scope of the issue?<br /><br />The relevant code and configuration was introduced more than six years ago. Because of the time span, it is likely that all current and past brokered S3 buckets are affected.<br /><br />Next steps<br /><br />Because of the factors listed above, although there was no public compromise of your credentials, we recommend rotating your access keys at this time. <br /><br />You can rotate your keys by un-binding and re-binding your S3 buckets to each bound application, and deleting and re-creating any service keys associated with the service instance. Full instructions have been sent to the managers of all cloud.gov organizations. If you did not receive this email or require additional assistance, please reach out to support@cloud.gov for help.<br /><br />Is my data at risk?<br /><br />Cloud.gov does not believe your data is at risk. The logging system is designed to segregate logs from customer tenant access, and we have no indications of unauthorized access. 
We are continually monitoring our system and will reach out to your team directly if we detect any suspicious activity.<br /><br />Why did this happen?<br /><br />Cloud.gov has held an incident postmortem and will post it to Statuspage shortly.<br /><br />Please feel free to reach out to our support team at support@cloud.gov if you have questions or concerns.</p>tag:cloudgov.statuspage.io,2005:Incident/186697442023-10-05T08:48:29-04:002023-10-05T08:48:29-04:00Increased 504 responses to customer applications<p><small>Oct <var data-var='date'> 5</var>, <var data-var='time'>08:48</var> EDT</small><br><strong>Resolved</strong> - Since we implemented the second production fix yesterday, the platform has been stable and working as expected. We are closing this incident.</p><p><small>Oct <var data-var='date'> 4</var>, <var data-var='time'>07:59</var> EDT</small><br><strong>Update</strong> - 8 AM EDT update - the cloud.gov support team has deployed the additional fix to production and is now monitoring the system.</p><p><small>Oct <var data-var='date'> 3</var>, <var data-var='time'>16:17</var> EDT</small><br><strong>Update</strong> - 16:15 EDT update - the cloud.gov team is still seeing some spikes in traffic after the production roll-out. The team is working on an additional fix that will be deployed once it passes testing in lower environments. We will update this incident once this additional fix is deployed to production.</p><p><small>Oct <var data-var='date'> 3</var>, <var data-var='time'>12:33</var> EDT</small><br><strong>Monitoring</strong> - The cloud.gov support team has deployed a fix to production and will be monitoring the system for the rest of the day.</p><p><small>Oct <var data-var='date'> 3</var>, <var data-var='time'>07:49</var> EDT</small><br><strong>Update</strong> - 7:45 EDT update - the cloud.gov support team is aware of another traffic spike yesterday evening and is working on a solution to the issue. 
Currently that solution is in deployment/testing in lower environments and the team expects to deploy the fix into production later today.</p><p><small>Oct <var data-var='date'> 2</var>, <var data-var='time'>15:22</var> EDT</small><br><strong>Identified</strong> - Twice today the cloud.gov platform experienced high volumes of internal traffic, which caused some customers to receive HTTP 504 responses while accessing their applications. These were brief periods of time, around 4 minutes each, and the platform recovered automatically. At no time did customers' applications stop or go down on the platform.<br /><br />At this time the cloud.gov support team has identified the issue, is monitoring for future events, and is working on implementing a solution into production to mitigate future events. The platform is currently fully available.</p>tag:cloudgov.statuspage.io,2005:Incident/172697842023-09-28T15:00:58-04:002023-09-28T15:00:58-04:00Deprecation Notice for cflinuxfs3 stack<p><small>Sep <var data-var='date'>28</var>, <var data-var='time'>15:00</var> EDT</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Sep <var data-var='date'>28</var>, <var data-var='time'>09:00</var> EDT</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>May <var data-var='date'>16</var>, <var data-var='time'>15:21</var> EDT</small><br><strong>Scheduled</strong> - The base OS image used by your cloud.gov applications is called a “stack”. The stack cloud.gov has provided to date is called cflinuxfs3, and it’s based on Ubuntu 18.04 LTS, released originally in mid 2018 with continuous security updates since then. cflinuxfs4 is a new OS image based on Ubuntu 22.04 LTS, and is now the default stack in cloud.gov. 
<br /><br />Important Dates<br /><br />Ubuntu 18.04 will likely no longer receive security updates in May, so cloud.gov will stop supporting the cflinuxfs3 stack and buildpacks. <br /><br />On September 28th, 2023, all support for cflinuxfs3 will end, and all applications still on this stack will stop and cannot be started unless migrated to cflinuxfs4.<br /><br />Who is impacted?<br /><br />If you push your Cloud Foundry applications as Docker containers with the command "cf push --docker-image", these changes do not impact you.<br /><br />However, most cloud.gov customers deploy their applications using buildpacks, and their apps don’t have any dependency on the particular OS version that runs them. If that describes you and you have existing applications running on cloud.gov, this upgrade will impact you and you’ll need to update the stack on your applications.<br /><br />What should you do now for existing apps?<br /><br />For existing applications which were created under cflinuxfs3, you will need to update the stack declaration to cflinuxfs4. There are two common ways of doing this, detailed below. 
Each option below only has to be run once for each application on cflinuxfs3. Once the stack is set for an application, it persists until changed with either of these two options:<br /><br />Option 1:<br /><br />Push the app manually and specify the stack with the cf cli command: "cf push MY-APP -s cflinuxfs4"<br /><br />Option 2:<br /><br />Use the stack-auditor cf cli plugin to change the stack without having to push the application.<br /> <br />Documentation for using this plugin is at https://docs.cloudfoundry.org/adminguide/stack-auditor.html#change-stacks<br /><br />The basic workflow is:<br /><br />* Install the plugin<br />* Use the cf cli to target the org and space for your existing application<br />* Run command: "cf change-stack APP-NAME cflinuxfs4" to change the app to the cflinuxfs4 stack<br /><br />The cf change-stack command takes about a minute or so per application, depending on the size of the droplet. Please note that if this changeover succeeds, the effect on your application's availability is the same as if you had done a cf push of your application. <br /><br />If this changeover fails, the process will automatically roll back to the best of the tool's ability, but the rollback might itself fail, and any rollback will cause an outage for your application. It is in your best interest to validate the stack change in a lower environment like dev, stage, or QA. 
<br /><br />What should you do now for new apps?<br /><br />For any new applications, simply run a cf push to pick up the new cflinuxfs4 stack:<br /><br />Command: "cf push MY-APP"<br /><br /><br />If you have any questions or concerns, please contact us at support@cloud.gov.</p>tag:cloudgov.statuspage.io,2005:Incident/184695992023-09-12T09:39:49-04:002023-09-12T09:39:49-04:00Cloud Foundry cf CLI v7 apt downloads - GPG error<p><small>Sep <var data-var='date'>12</var>, <var data-var='time'>09:39</var> EDT</small><br><strong>Resolved</strong> - The Cloud Foundry CLI team has communicated that the upstream issue has been resolved. Downloads from apt should now succeed without issue. If you encounter any further issues, please reach out to support@cloud.gov.</p><p><small>Sep <var data-var='date'>11</var>, <var data-var='time'>13:27</var> EDT</small><br><strong>Update</strong> - The Cloud Foundry CLI team is working to resolve the issue. See update: https://github.com/cloudfoundry/cli/issues/2571#issuecomment-1714271869</p><p><small>Sep <var data-var='date'>11</var>, <var data-var='time'>10:24</var> EDT</small><br><strong>Identified</strong> - The issue was identified in the original message and we await resolution from the Cloud Foundry team.</p><p><small>Sep <var data-var='date'>11</var>, <var data-var='time'>09:14</var> EDT</small><br><strong>Investigating</strong> - Customers who install the Cloud Foundry (cf) CLI using the apt package manager are encountering the following error:<br /><br />W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. 
GPG error: https://cf-cli-debian-repo.s3.amazonaws.com stable InRelease: The following signatures were invalid: EXPKEYSIG 172B5989FCD21EF8 CF CLI Team <cf-cli-eng@pivotal.io><br />W: Failed to fetch https://packages.cloudfoundry.org/debian/dists/stable/InRelease The following signatures were invalid: EXPKEYSIG 172B5989FCD21EF8 CF CLI Team <cf-cli-eng@pivotal.io><br />W: Some index files failed to download. They have been ignored, or old ones used instead.<br /><br />This problem is being tracked upstream in this issue: https://github.com/cloudfoundry/cli/issues/2571<br /><br />The cloud.gov team recommends installing the CLI using an alternate installation method while the Cloud Foundry team works to resolve the issue. Instructions for installing directly from an installer or compressed binary are available here: https://github.com/cloudfoundry/cli/blob/main/doc/installation-instructions/installation-instructions-v7.md#installers-and-compressed-binaries</p>tag:cloudgov.statuspage.io,2005:Incident/181871522023-08-17T14:30:00-04:002023-08-18T14:23:29-04:00Partial outage: cloud.gov customer applications<p><small>Aug <var data-var='date'>17</var>, <var data-var='time'>14:30</var> EDT</small><br><strong>Resolved</strong> - From approximately 2:30pm to 3:00pm ET, a subset of customer applications running only one instance may have been unavailable due to a failing platform virtual machine. The VM was recreated and applications were rescheduled to other nodes in the pool at that point. <br /><br />As a reminder, customers are encouraged to run multiple instances of their production applications to avoid downtime during platform maintenance and partial outages. 
See https://cloud.gov/docs/management/multiple-instances/ for more.</p>tag:cloudgov.statuspage.io,2005:Incident/178992842023-07-22T16:30:14-04:002023-07-22T16:30:14-04:00GSA SSO Maintenance<p><small>Jul <var data-var='date'>22</var>, <var data-var='time'>16:30</var> EDT</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Jul <var data-var='date'>22</var>, <var data-var='time'>09:00</var> EDT</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Jul <var data-var='date'>19</var>, <var data-var='time'>07:47</var> EDT</small><br><strong>Scheduled</strong> - GSA Customers<br /><br />For GSA users, cloud.gov uses GSA's SecureAuth as the IDP for access to the platform. GSA will be doing maintenance during this period that can cause limited service interruption to SecureAuth. During this window GSA users might fail to authenticate to the cloud.gov platform. If you are one of these users and need cloud.gov platform access during this maintenance window, we recommend that you authenticate to SecureAuth before the window starts to maintain access. <br /><br />GSA applications running on the platform that use SecureAuth for end user authentication will also experience limited interruption in communications with SecureAuth during this maintenance window.<br /><br />All Other Customers<br /><br />This only affects GSA users of the platform. 
All other customers using their agency IDP or the cloud.gov IDP are not affected by this maintenance.</p>tag:cloudgov.statuspage.io,2005:Incident/178992742023-07-22T00:30:11-04:002023-07-22T00:30:11-04:00GSA SSO Maintenance<p><small>Jul <var data-var='date'>22</var>, <var data-var='time'>00:30</var> EDT</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Jul <var data-var='date'>21</var>, <var data-var='time'>21:00</var> EDT</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Jul <var data-var='date'>19</var>, <var data-var='time'>07:46</var> EDT</small><br><strong>Scheduled</strong> - GSA Customers<br /><br />For GSA users, cloud.gov uses GSA's SecureAuth as the IDP for access to the platform. GSA will be doing maintenance during this period that can cause limited service interruption to SecureAuth. During this window GSA users might fail to authenticate to the cloud.gov platform. If you are one of these users and need cloud.gov platform access during this maintenance window, we recommend that you authenticate to SecureAuth before the window starts to maintain access. <br /><br />GSA applications running on the platform that use SecureAuth for end user authentication will also experience limited interruption in communications with SecureAuth during this maintenance window.<br /><br />All Other Customers<br /><br />This only affects GSA users of the platform. 
All other customers using their agency IDP or the cloud.gov IDP are not affected by this maintenance.</p>tag:cloudgov.statuspage.io,2005:Incident/178918042023-07-18T14:06:26-04:002023-08-03T11:16:52-04:00cloud.gov logins using idp.fr.cloud.gov failing<p><small>Jul <var data-var='date'>18</var>, <var data-var='time'>14:06</var> EDT</small><br><strong>Resolved</strong> - This incident was resolved by the time we posted it.<br /><br />Logins were failing from 12:55 pm Eastern until 2:01 pm Eastern.<br /><br />We are investigating why the certificate was not automatically renewed and why alerting did not catch this before expiration.<br /><br />We will post a retrospective in the coming days.</p><p><small>Jul <var data-var='date'>18</var>, <var data-var='time'>14:03</var> EDT</small><br><strong>Identified</strong> - cloud.gov Pages and cloud.gov Platform developers are unable to log in using the cloud.gov IDP due to an expired TLS (SSL) certificate. We will resolve this shortly.</p>tag:cloudgov.statuspage.io,2005:Incident/172697502023-06-29T12:36:35-04:002023-06-29T12:36:35-04:00Deprecation Notice for cflinuxfs3 Buildpacks<p><small>Jun <var data-var='date'>29</var>, <var data-var='time'>12:36</var> EDT</small><br><strong>Completed</strong> - The removal of cflinuxfs3 buildpacks is complete.</p><p><small>Jun <var data-var='date'>29</var>, <var data-var='time'>09:00</var> EDT</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>May <var data-var='date'>16</var>, <var data-var='time'>15:18</var> EDT</small><br><strong>Scheduled</strong> - The base OS image used by your cloud.gov applications is called a “stack”. The stack cloud.gov provided to date is called cflinuxfs3, and it’s based on Ubuntu 18.04 LTS, originally released in mid-2018 with continuous security updates since then. cflinuxfs4 is a new OS image based on Ubuntu 22.04 LTS, and is now the default stack in cloud.gov. 
<br /><br />Important Dates<br /><br />Ubuntu 18.04 will likely no longer receive security updates in May, so cloud.gov will stop supporting the cflinuxfs3 stack and buildpacks. <br /><br />On June 29th, 2023, the platform will no longer provide cflinuxfs3 buildpacks. Applications will need to reference an external buildpack to continue to push updated versions of cflinuxfs3 applications. Existing cflinuxfs3 applications will continue to restart without intervention.<br /><br />Who is impacted?<br /><br />If you push your Cloud Foundry applications as Docker containers with the command "cf push --docker-image", these changes do not impact you.<br /><br />However, most cloud.gov customers deploy their applications using buildpacks, and their apps don’t have any dependency on the particular OS version that runs them. If that describes you and you have existing applications running on cloud.gov, this upgrade will impact you and you’ll need to update the stack on your applications.<br /><br />What should you do now for existing apps?<br /><br />For existing applications which were created under cflinuxfs3, you will need to update the stack declaration to cflinuxfs4. There are two common ways of doing this, detailed below. 
Each option below only has to be run once for each application on cflinuxfs3. Once the stack is set for an application, it persists until changed with either of these two options:<br /><br />Option 1:<br /><br />Push the app manually and specify the stack with the cf cli command: "cf push MY-APP -s cflinuxfs4"<br /><br />Option 2:<br /><br />Use the stack-auditor cf cli plugin to change the stack without having to push the application.<br /> <br />Documentation for using this plugin is at https://docs.cloudfoundry.org/adminguide/stack-auditor.html#change-stacks<br /><br />The basic workflow is:<br /><br />* Install the plugin<br />* Use the cf cli to target the org and space for your existing application<br />* Run command: "cf change-stack APP-NAME cflinuxfs4" to change the app to the cflinuxfs4 stack<br /><br />The cf change-stack command takes about a minute or so per application, depending on the size of the droplet. Please note that if this changeover succeeds, the effect on your application's availability is the same as if you had done a cf push of your application. <br /><br />If this changeover fails, the process will automatically roll back to the best of the tool's ability, but the rollback might itself fail, and any rollback will cause an outage for your application. It is in your best interest to validate the stack change in a lower environment like dev, stage, or QA. <br /><br />What should you do now for new apps?<br /><br />For any new applications, simply run a cf push to pick up the new cflinuxfs4 stack:<br /><br />Command: "cf push MY-APP"<br /><br />How do you push a cflinuxfs3 app with an external buildpack?<br /><br />Until September 28th, 2023, you can use an external buildpack to push apps to the cflinuxfs3 stack by referencing a URL in a `cf push` command. 
<br /><br />As an example, to push a Ruby app using Ruby 2.7.6 on cflinuxfs3:<br /><br />Command: "cf push MY-APP -b https://github.com/cloudfoundry/ruby-buildpack/releases/download/v1.9.4/ruby-buildpack-cflinuxfs3-v1.9.4.zip -s cflinuxfs3"<br /><br />Many of the external buildpacks can be found on GitHub at https://github.com/cloudfoundry?q=buildpacks&type=all&language=&sort= <br /><br /><br />If you have any questions or concerns, please contact us at support@cloud.gov.</p>
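For teams with several applications to migrate, the stack-auditor workflow described in the notice above could be scripted. The following is a minimal sketch, not an official cloud.gov tool: it only builds and prints the `cf target` and `cf change-stack` commands quoted in the notice and does not run them. The function names, the org and space names, and the app names are hypothetical placeholders, and the sketch assumes the stack-auditor plugin is already installed.

```python
import shlex

# Target stack named in the deprecation notice.
TARGET_STACK = "cflinuxfs4"


def change_stack_commands(org, space, apps, stack=TARGET_STACK):
    """Build the cf CLI invocations for the workflow as argument lists.

    Hypothetical helper: first target the org and space, then run
    `cf change-stack` once per application. Nothing is executed here.
    """
    commands = [["cf", "target", "-o", org, "-s", space]]
    for app in apps:
        commands.append(["cf", "change-stack", app, stack])
    return commands


def render(commands):
    """Turn argument lists into shell-safe command strings for review."""
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    # Placeholder org, space, and app names; start with a lower environment.
    for line in render(change_stack_commands("my-org", "dev", ["MY-APP"])):
        print(line)
```

Printing the commands first makes it possible to review them and to try the change in a lower environment, as the notice recommends, before running anything against production.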