Incident History

May 11, 2024

Wave Endpoint Issues

Minor Outage

Opened: May 11, 2024, 12:48 PM UTC

Duration: 2h 39m 59s


  • Investigating
    May 11, 2024, 12:48 PM UTC

    Wave is reporting errors that make it unavailable for a subset of Nextflow/Platform users.

  • Identified
    May 11, 2024, 3:15 PM UTC

    The issue was traced to a Redis cache problem affecting only customers using the old Platform endpoint, https://api.tower.nf.

  • Resolved
    May 11, 2024, 3:28 PM UTC

    No further issues should exist.

April 30, 2024

Wave Container Issues

Minor Outage

Opened: Apr 30, 2024, 2:00 PM UTC

Duration: 6h

Affected services: Wave


  • Identified
    Apr 30, 2024, 2:00 PM UTC

    A recent update to Wave caused inconsistencies in how container names are composed, which may have left some users unable to run pipelines relying on Wave.

  • Resolved
    Apr 30, 2024, 8:00 PM UTC

    This change has been rolled back and should no longer cause issues.

April 25, 2024

Site Unreachable

Outage

Opened: Apr 25, 2024, 8:23 AM UTC

Duration: 33s

Affected services: License Manager


  • Opened
    Apr 25, 2024, 8:23 AM UTC

    Issue opened on monitor failure.

  • Resolved
    Apr 25, 2024, 8:24 AM UTC

    Issue resolved on monitor recovery.

Site Unreachable

Outage

Opened: Apr 25, 2024, 8:20 AM UTC

Duration: 1m 39s

Affected services: Cloudinfo


  • Opened
    Apr 25, 2024, 8:20 AM UTC

    Issue opened on monitor failure.

  • Resolved
    Apr 25, 2024, 8:22 AM UTC

    Issue resolved on monitor recovery.

March 22, 2024

Wave Connection Issues

Degraded Performance

Opened: Mar 22, 2024, 11:20 PM UTC

Duration: 43h 10m

Affected services: Wave


  • Opened
    Mar 22, 2024, 11:20 PM UTC

    Start of Wave errors. Retroactively, it was discovered that transient spikes in Wave container errors had occurred, causing temporary degradation of service and impacting a limited number of customers.

  • Investigating
    Mar 24, 2024, 7:37 AM UTC

    A potential issue with the Wave service has been discovered impacting a limited number of clients.

  • Identified
    Mar 24, 2024, 11:56 AM UTC

    Wave issues have been confirmed and a potential cause has been pinpointed.

  • Resolved
    Mar 24, 2024, 6:30 PM UTC

    A change has been pushed to the Seqera platform that has remedied the problem. The initial issue was caused by a misconfiguration that was exhausting the available connection pool.

January 11, 2024

Job Submissions Stalling

Minor Outage

Opened: Jan 11, 2024, 3:00 PM UTC

Duration: 1h 30m


  • Identified
    Jan 11, 2024, 3:00 PM UTC

    An issue has been identified that is causing jobs in Platform Cloud to stall in the "Submitted" state.

  • Resolved
    Jan 11, 2024, 4:30 PM UTC

    The problem was caused by a stuck thread in our job scheduler component that was preventing job submissions from being performed correctly. We restarted the service and implemented a code fix to prevent the scenario from happening again. All stalled jobs should now be proceeding as normal.

November 27, 2023

Wave HTTP Errors

Minor Outage

Opened: Nov 27, 2023, 1:55 AM UTC

Duration: 6h 33m 59s

Affected services: Wave


  • Investigating
    Nov 27, 2023, 1:55 AM UTC

    A timeout issue has been identified that is impacting users of Wave containers [HTTP status=408].

  • Resolved
    Nov 27, 2023, 8:29 AM UTC

    The issue has been identified and resolved.

November 14, 2023

Issue with quay.io container registry

Minor Outage

Opened: Nov 14, 2023, 9:30 PM UTC

Duration: 2h 19m 59s


  • Opened
    Nov 14, 2023, 9:30 PM UTC

    An incident with the quay.io container registry is impacting the launch of data pipelines on Seqera Enterprise on-prem installations. We are monitoring the problem. Please contact Seqera support for more details.

  • Resolved
    Nov 14, 2023, 11:50 PM UTC

    The incident was resolved by the quay.io team.

November 1, 2023

AWS Batch users: "Invalid Certificate Error"

Degraded Performance

Opened: Nov 1, 2023, 4:10 AM UTC

Duration: 5h 54m


  • Investigating
    Nov 1, 2023, 4:10 AM UTC

  • Resolved
    Nov 1, 2023, 10:04 AM UTC

    An "Expired License Error" prevented a small number of users running pipelines on AWS Batch from launching pipelines between midnight and 09:00 CET on November 1st. We have now fixed the issue and we will be implementing solutions to prevent this from happening in the future. We apologize for any inconvenience this may have caused.

October 11, 2023

Tower system temporary connection failure

Minor Outage

Opened: Oct 11, 2023, 8:50 PM UTC

Duration: 40m


  • Investigating
    Oct 11, 2023, 8:50 PM UTC

    Some users have reported a temporary connection failure. We are investigating.

  • Resolved
    Oct 11, 2023, 9:30 PM UTC

    A failure during the rollout of a system upgrade prevented some backend replicas from reaching the ready status, resulting in temporary system unreliability.

October 2, 2023

Incident with Wave service

Minor Outage

Opened: Oct 2, 2023, 12:20 PM UTC

Duration: 0s

Affected services: Wave


  • Resolved
    Oct 2, 2023, 12:20 PM UTC

    An issue has been identified in the Wave service that caused a partial system outage from Fri, September 30 15:30 until Mon, October 2 12:20. The problem is related to a bug in the Java HTTP client used by the Wave backend to access remote repositories (https://bugs.openjdk.org/browse/JDK-8308144), which caused the system to consume Java VM heap memory in an uncontrolled manner and exhaust the underlying thread pool handling incoming client requests. A solution has been delivered to prevent this problem.

October 1, 2023

Site Unreachable

Outage

Opened: Oct 1, 2023, 8:35 AM UTC

Duration: 8m 16s

Affected services: Wave


  • Opened
    Oct 1, 2023, 8:35 AM UTC

    Issue opened on monitor failure.

  • Resolved
    Oct 1, 2023, 8:44 AM UTC

    Issue resolved on monitor recovery.

Site Unreachable

Outage

Opened: Oct 1, 2023, 5:27 AM UTC

Duration: 7m 49s

Affected services: Wave


  • Opened
    Oct 1, 2023, 5:27 AM UTC

    Issue opened on monitor failure.

  • Resolved
    Oct 1, 2023, 5:35 AM UTC

    Issue resolved on monitor recovery.

Site Unreachable

Outage

Opened: Oct 1, 2023, 1:33 AM UTC

Duration: 9m 4s

Affected services: Wave


  • Opened
    Oct 1, 2023, 1:33 AM UTC

    Issue opened on monitor failure.

  • Resolved
    Oct 1, 2023, 1:42 AM UTC

    Issue resolved on monitor recovery.

September 30, 2023

Site Unreachable

Outage

Opened: Sep 30, 2023, 2:30 PM UTC

Duration: 53s

Affected services: Wave


  • Opened
    Sep 30, 2023, 2:30 PM UTC

    Issue opened on monitor failure.

  • Resolved
    Sep 30, 2023, 2:30 PM UTC

    Issue resolved on monitor recovery.

September 29, 2023

Site Unreachable

Outage

Opened: Sep 29, 2023, 1:58 PM UTC

Duration: 15m 28s

Affected services: Wave


  • Opened
    Sep 29, 2023, 1:58 PM UTC

    Issue opened on monitor failure.

  • Resolved
    Sep 29, 2023, 2:14 PM UTC

    Issue resolved on monitor recovery.