
[Troubleshooting] Service incident (SI) that may cause problems across multiple Microsoft 365 services [MO449914] [PIR]

  • 2022/10/25
  • Masahiro

Microsoft Office

Microsoft 365 Service Health reports a service incident (SI), MO449914, that may cause problems with multiple Microsoft 365 services that rely on Exchange Online.

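Summary of MO449914

Impact
Current status
  • ・After additional monitoring of the newly re-introduced infrastructure, the service remained stable, so Microsoft is returning the remaining infrastructure to service within 24 hours.
  • ※ Based on the mitigation steps taken and positive telemetry signals, the issue is not expected to recur.
  • ・Investigation of the root cause of the issue continues.
Cause
  • ・A portion of the Exchange Online infrastructure serving users in Japan experienced a larger-than-expected increase in resource consumption.
  • As a result, DNS traffic exceeded processing thresholds and the service was affected.
  • ・The root cause is still under investigation, and details will be reported in the final PIR (Post Incident Report).
Scope of impact
Users in Japan attempting to connect to Exchange Online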


Users may be experiencing issues with multiple Microsoft 365 services – MO449914

Services: Exchange Online, Microsoft 365 suite, Microsoft Teams

Status: Service restored

User impact: Users may be experiencing issues with multiple Microsoft 365 services that leverage Exchange Online
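
Tenant administrators who prefer to read these fields programmatically can, as far as I know, fetch a single advisory from the Microsoft Graph service announcement API. The following is a minimal sketch, not a definitive implementation: it assumes the `/admin/serviceAnnouncement/issues/{id}` endpoint, an access token that already carries the `ServiceHealth.Read.All` permission, and the field names `id`, `status`, and `service`. The helper names are hypothetical, and everything should be checked against the current Graph documentation.

```python
import json
import urllib.request

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def issue_url(issue_id: str) -> str:
    """Build the Graph URL for one service health issue (endpoint path is an assumption)."""
    return f"{GRAPH_BASE}/admin/serviceAnnouncement/issues/{issue_id}"


def fetch_issue(issue_id: str, access_token: str) -> dict:
    """Fetch the issue as a dict. Needs a valid token, so it is not called in this sketch."""
    request = urllib.request.Request(
        issue_url(issue_id),
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize(issue: dict) -> str:
    """One-line summary; the field names are assumed, missing ones print as None."""
    return f"{issue.get('id')}: {issue.get('status')} ({issue.get('service')})"


# Hypothetical response fragment, shaped after the fields listed in this article.
sample = {"id": "MO449914", "status": "serviceRestored", "service": "Microsoft 365 suite"}
print(issue_url("MO449914"))
print(summarize(sample))
```

Acquiring the token (for example through an app registration) is left out on purpose, since it depends on how the tenant is set up.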

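According to Microsoft 365 Service Health (MO449914), a service incident (SI) is under way that may cause problems with multiple Microsoft 365 services that rely on Exchange Online.

The incident is specific to Exchange Online, but it may also affect some services that rely on Exchange Online, such as Microsoft Teams and SharePoint Online.

At present, additional monitoring of the newly re-introduced infrastructure shows that service health has remained stable, so the remaining infrastructure will be returned to service within 24 hours.

Based on the mitigation steps taken and positive telemetry signals, Microsoft has determined that the issue will not recur, and it is continuing to investigate the root cause.

The incident occurred because a portion of the Exchange Online infrastructure serving users in Japan experienced a larger-than-expected increase in resource consumption, which pushed DNS traffic past processing thresholds and affected the service.

The root cause is still under investigation and will be reported in the final PIR (Post Incident Report).

The PIR will be published within five business days.

The issue occurs when users in Japan attempt to connect to Exchange Online by any connection method.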

Post Incident Report (PIR)

Incident Information

Important Note
  • This is a preliminary Post Incident Report (PIR) that is being delivered to provide early insight into details of the issue.
  • The information in this PIR is preliminary and subject to change.
  • A final PIR will be provided within five (5) business days from full event resolution and will supersede this document upon publication.
Incident ID
MO449914, EX449908
Incident Title
Users may be unable to connect to Exchange Online via any connection method
Service(s) Impacted
Exchange Online and dependent services.
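
The two incident IDs above share a pattern: a two-letter prefix followed by digits. A minimal sketch for pulling such IDs out of free text (for example, a forwarded advisory email) might look like the following. The six-digit length and the prefix-to-service mapping are assumptions drawn only from the two IDs in this report, and `find_incident_ids` is a hypothetical helper name.

```python
import re

# Assumed mapping, limited to the two prefixes that appear in this report.
SERVICE_PREFIXES = {
    "MO": "Microsoft 365 suite",
    "EX": "Exchange Online",
}

# Two uppercase letters followed by six digits (length is an assumption).
ID_PATTERN = re.compile(r"\b([A-Z]{2})(\d{6})\b")


def find_incident_ids(text: str) -> list[tuple[str, str]]:
    """Return (incident_id, service_name) pairs found in the text, in order."""
    return [
        (match.group(0), SERVICE_PREFIXES.get(match.group(1), "unknown"))
        for match in ID_PATTERN.finditer(text)
    ]


for incident_id, service in find_incident_ids("MO449914, EX449908"):
    print(incident_id, "->", service)
```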

User Impact

  • Some users were unable to connect to Exchange Online and dependent services through any connection method.
  • Dependent services included Microsoft Teams and SharePoint Online.
  • Users may have been unable to view calendar items, create events, or view events within the Microsoft Teams service.
  • Other SharePoint Online and Teams features may have also been impacted if they relied on Exchange.
  • During the recovery process some users may have experienced partial relief.
  • While we saw connectivity restored for most users, some residual impact persisted, causing email send and receive delays.

Scope of Impact

Some users residing in or routed through infrastructure within the Japan region would have experienced this issue.

Incident Start Date and Time

Monday, October 24, 2022, at 11:20 AM UTC

Incident End Date and Time

Tuesday, October 25, 2022, at 12:00 PM UTC
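
Taking the start and end times exactly as stated in this section, the length of the impact window can be worked out with a few lines of date arithmetic. This is only a sketch of the calculation; the variable names are illustrative, and the result is only as accurate as the two timestamps the preliminary PIR gives.

```python
from datetime import datetime, timezone

# Timestamps as stated in the "Incident Start/End Date and Time" fields (UTC).
start = datetime(2022, 10, 24, 11, 20, tzinfo=timezone.utc)
end = datetime(2022, 10, 25, 12, 0, tzinfo=timezone.utc)

duration = end - start

# Split the total into whole hours and leftover minutes.
hours, remainder = divmod(int(duration.total_seconds()), 3600)
minutes = remainder // 60

print(f"Impact window: {hours} h {minutes} min")  # Impact window: 24 h 40 min
```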

Root Cause

  • A portion of Exchange Online infrastructure serving users within Japan experienced an unexpected increase in consumption of resources.
  • This caused DNS traffic to exceed processing thresholds resulting in service impact.
  • The underlying cause of the issue is under investigation and additional details will be provided in the final PIR.

Actions Taken (All times UTC)

October 25
  • ・11:20 AM – Availability first started to decrease.
  • ・12:08 AM – Communications were posted to the SHD (Microsoft 365 Service Health Dashboard) under EX449908 based on our monitoring alerts indicating an availability issue with the service.
  • We began investigating the impact on our services and users.
  • ・12:27 AM – We confirmed that system monitoring first identified a drop in availability within the Japan region on Monday, October 24, 2022, at 11:20 PM UTC.
  • ・12:45 AM – We determined that this availability drop was primarily affecting access to Exchange Online through all connection methods. We investigated directory service errors to isolate the cause.
  • ・1:27 AM – We identified that several domain controllers within Japan were in maintenance mode.
  • ・2:27 AM – We confirmed that some Exchange-dependent features, such as meetings, calendar, and the GAL, were also impacted and may have been inaccessible.
  • As these features may be accessed through SharePoint Online and Microsoft Teams, an additional SHD post was created under MO449914 to reflect this impact.
  • ・2:49 AM – We began restarting select domain controllers to confirm whether the failures stopped on those machines.
  • ・3:30 AM – We began rerouting traffic through alternate infrastructure as a potentially more expedient way to mitigate impact.
  • ・4:45 AM – We continued with both mitigation actions and started to see some recovery.
  • ・6:00 AM – We completed the majority of the restarts and confirmed that most previously affected users were seeing relief.
  • We began targeted restarts to mitigate impact for the remaining users still in a degraded state.
  • ・6:15 AM – We determined that most residual impact was related to delays sending and receiving email, as most users were now able to connect to the service.
  • ・7:30 AM – We continued with our targeted restarts to address residual impact.
  • ・8:00 AM – The majority of our targeted restarts had been completed, though we continued to see some very limited impact and worked to understand whether additional factors were contributing to the problem.
  • ・10:00 AM – At this time we expected a small set of users to be experiencing some intermittent residual impact.
  • We continued our work to address this remaining impact.
  • ・11:00 AM – We performed load balancing operations as part of our effort to address the residual impact.
  • Additionally, we implemented a configuration change to disable some failover and routing logic we believed contributed to the issue.
  • ・1:10 PM – We completed our configuration change and continued to monitor the service to confirm the issue was fully resolved.
  • ・2:00 PM – Our monitoring reflected that the issue had been mitigated, as availability remained healthy.
  • At this time traffic had decreased in Japan, and we decided to perform extended monitoring to ensure the issue did not recur during peak traffic in Japan (core business hours).
October 26
  • ・6:00 AM – After extended monitoring through peak traffic we confirmed that the issue was fully mitigated.
  • We began returning the remaining infrastructure to active service. We also continued our investigation into the underlying cause of the issue.
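
All times in the timeline above are UTC, while affected users in Japan were working in JST (UTC+9, no daylight saving time). A small sketch for converting a few of the milestones follows; `utc_to_jst` and the milestone labels are illustrative names, and a fixed +9 offset is used so that no time zone database is needed.

```python
from datetime import datetime, timedelta, timezone

# Japan Standard Time as a fixed offset; Japan does not observe daylight saving time.
JST = timezone(timedelta(hours=9), name="JST")


def utc_to_jst(year: int, month: int, day: int, hour: int, minute: int) -> datetime:
    """Interpret the arguments as a UTC wall-clock time and convert it to JST."""
    return datetime(year, month, day, hour, minute, tzinfo=timezone.utc).astimezone(JST)


# A few timeline entries (UTC), copied from the list above.
milestones = [
    ("EX449908 posted", (2022, 10, 25, 0, 8)),
    ("MO449914 posted", (2022, 10, 25, 2, 27)),
    ("Issue mitigated", (2022, 10, 25, 14, 0)),
]

for label, parts in milestones:
    print(f"{label}: {utc_to_jst(*parts):%Y-%m-%d %H:%M} JST")
```

The second entry converts to 11:27 JST on October 25, which matches the timestamp of the earliest MO449914 status update reproduced in this article.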

Next Steps

Findings
  • A portion of Exchange Online infrastructure serving users within Japan experienced an unexpected increase in consumption of resources.
  • This caused DNS traffic to exceed processing thresholds, resulting in service impact.
  • The underlying cause of the issue is under investigation and additional details will be provided in the final PIR.
Action
We’re performing an extensive investigation to determine the underlying factors that resulted in impact.
Completion Date
11/1/2022
*TBD – Pending item 1
*TBD – Pending item 1

※ Next Steps are still being investigated as part of our post-incident investigation.


October 28, 2022 7:49 AM – Service restored

A post-incident report has been published.

October 26, 2022 2:58 PM – Service restored

  • ・Title : Users may have experienced issues with multiple Microsoft 365 services
  • ・User Impact : Users may have experienced issues with multiple Microsoft 365 services that leverage Exchange Online
  • ・Final status : After additional monitoring of the newly re-introduced infrastructure, service health continued to remain stable, so we’re moving ahead with returning the remaining infrastructure back into service over the next 24 hours.
  • Based on the mitigation steps taken and positive telemetry signals, we are confident that the issue will not reoccur.
  • In parallel, we’re continuing to investigate the underlying cause of the incident.
  • ・Scope of impact : This issue impacted users within the Japan region when attempting to connect to Exchange Online via any connection method.
  • ・Start time : Monday, October 24, 2022, 8:20 PM (11:20 AM UTC)
  • ・End time : Tuesday, October 25, 2022, 9:00 PM (12:00 PM UTC)
  • ・Preliminary Root cause : A portion of Exchange Online infrastructure serving users within Japan experienced an unexpected increase in consumption of resources.
  • This caused DNS traffic to exceed processing thresholds resulting in service impact.
  • The underlying cause of the issue is under investigation and additional details will be provided in the final PIR.
  • ・Next steps : – For a more comprehensive list of next steps and actions, please refer to the Post Incident Review document.
  • We’ll publish a post-incident report within five business days.

October 26, 2022 12:45 PM – Service restored

  • ・Title : Users may be experiencing issues with multiple Microsoft 365 services
  • ・User Impact : Users may be experiencing issues with multiple Microsoft 365 services that leverage Exchange Online
  • ・Current status : We’ve monitored service health over the past two hours and it has remained stable as expected.
  • We’ve identified additional mitigation steps that should recover backend infrastructure that was taken out of rotation during the course of our mitigation steps and we’re beginning to re-introduce that infrastructure back into service over the next couple hours.
  • ・Scope of impact : This issue impacts users within the Japan region when attempting to connect to Exchange Online via any connection method.
  • ・Next update by : Wednesday, October 26, 2022, 3:00 PM (6:00 AM UTC)

October 26, 2022 10:45 AM – Service restored

  • ・Title : Users may be experiencing issues with multiple Microsoft 365 services
  • ・User Impact : Users may be experiencing issues with multiple Microsoft 365 services that leverage Exchange Online
  • ・Current status : Our telemetry indicates that service health remains stable as we enter core Japanese business hours.
  • We’ll continue to monitor service health over the next couple hours to validate that the service is working as expected. Additionally, our investigation into the underlying root cause is ongoing.
  • ・Scope of impact: This issue impacts users within the Japan region when attempting to connect to Exchange Online via any connection method.
  • ・Next update by : Wednesday, October 26, 2022, 1:00 PM (4:00 AM UTC)

October 26, 2022 12:01 AM – Service restored

  • ・Title : Users may be experiencing issues with multiple Microsoft 365 services
  • ・User Impact : Users may be experiencing issues with multiple Microsoft 365 services that leverage Exchange Online
  • ・Current status : The service has remained healthy and we’re entering a period of extended monitoring throughout the night and into Japanese core business hours to ensure that the issue does not reoccur.
  • Furthermore, we’re analyzing dump files collected during impact from affected back-end components to investigate the underlying root cause.
  • ・Scope of impact : This issue impacts users within the Japan region when attempting to connect to Exchange Online via any connection method.
  • ・Next update by : Wednesday, October 26, 2022, 11:00 AM (2:00 AM UTC)

October 25, 2022 11:10 PM · Quick update – Service restored

  • ・Current status : We’re continuing to monitor service telemetry, which indicates that the service performance remains within acceptable thresholds.
  • Additionally, we continue to investigate the underlying root cause of impact.
  • This quick update is designed to give the latest information on this issue.

October 25, 2022 10:37 PM · Quick update – Service restored

  • ・Current status : Monitoring indicates that the service is performing within acceptable thresholds; however, we’re continuing to monitor service telemetry to ensure there is no residual impact.
  • This quick update is designed to give the latest information on this issue.

October 25, 2022 8:29 PM · Quick update – Service restored

  • ・Current status : The configuration change to disable some failover and routing logic is progressing as expected and we’re monitoring it as it deploys through the affected environment.
  • This quick update is designed to give the latest information on this issue.

October 25, 2022 8:01 PM – Service restored

  • ・Title : Users may be experiencing issues with multiple Microsoft 365 services
  • ・User Impact : Users may be experiencing issues with multiple Microsoft 365 services that leverage Exchange Online
  • ・Current status : We continue to perform targeted reboots of the affected infrastructure and load balancing operations to provide relief.
  • Additionally, we’re implementing a configuration change to disable some failover and routing logic we believe is contributing to impact.
  • Our telemetry continues to indicate that the majority of the users have now recovered and we’re monitoring the service availability to ensure full recovery.
  • ・Scope of impact : This issue impacts users within the Japan region when attempting to connect to Exchange Online via any connection method.
  • ・Next update by : Tuesday, October 25, 2022, 10:00 PM (1:00 PM UTC)

October 25, 2022 7:30 PM · Quick update – Service restored

  • ・Current status : We’re continuing to perform additional reboots to expedite the mitigation efforts.
  • Our monitoring indicates that availability is improving; however, some users may still experience residual impact.
  • This quick update is designed to give the latest information on this issue.

October 25, 2022 6:53 PM · Quick update – Service restored

  • ・Current status : We’re performing additional reboots to expedite the mitigation efforts.
  • At this time, some users may experience some residual impact and we’re investigating this further.
  • This quick update is designed to give the latest information on this issue.

October 25, 2022 6:10 PM · Quick update – Service restored

  • ・Current status : While we continue to perform targeted mitigation actions to ensure complete recovery, our telemetry indicates that the vast majority of the users have now recovered.
  • This quick update is designed to give the latest information on this issue.

October 25, 2022 5:31 PM · Quick update – Service restored

  • ・Current status : Our monitoring telemetry indicates that the majority of the users have recovered.
  • We’ll continue to monitor the service to ensure full recovery and will perform mitigation actions as needed.
  • This quick update is designed to give the latest information on this issue.

October 25, 2022 4:48 PM – Service restored

  • ・Title : Users may be experiencing issues with multiple Microsoft 365 services
  • ・User Impact : Users may be experiencing issues with multiple Microsoft 365 services that leverage Exchange Online
  • ・More info : Impact may be intermittent in nature.
  • Additionally, users may also notice delays when sending and receiving emails.
  • While impact is primarily specific to Exchange Online, some services that leverage Exchange Online may also be impacted.
  • These services could include but are not limited to, Microsoft Teams and SharePoint Online.
  • ・Current status : We’re continuing to perform targeted reboots within the affected environments.
  • Our telemetry indicates that the majority of the users have now recovered and we’re continuing to monitor the service to ensure full recovery.
  • Meanwhile, we’re continuing our analysis into the service telemetry to fully understand the underlying cause and fully resolve the issue.
  • ・Scope of impact : This issue impacts users within the Japan region when attempting to connect to Exchange Online via any connection method.
  • ・Next update by: Tuesday, October 25, 2022, 8:00 PM (11:00 AM UTC)

October 25, 2022 4:07 PM · Quick update – Service restored

  • We’re continuing to perform targeted restarts on the affected environments while monitoring the service for recovery.
  • We’re also investigating the underlying cause of the issue.
  • This quick update is designed to give the latest information on this issue.

October 25, 2022 3:27 PM · Quick update – Service restored

  • We’re completing targeted restarts on affected environments, and we are monitoring service availability as we do so.
  • Concurrently, we are still in the process of isolating the source of the issue to ensure we prevent it from reoccurring.
  • This quick update is designed to give the latest information on this issue.

October 25, 2022 3:01 PM – Service restored

  • ・Title : Users may be experiencing issues with multiple Microsoft 365 services
  • ・User Impact : Users may be experiencing issues with multiple Microsoft 365 services that leverage Exchange Online
  • ・More info : While impact is primarily specific to Exchange Online, some services that leverage Exchange Online may also be impacted.
  • These services could include but are not limited to, Microsoft Teams and SharePoint Online.
  • Further, we’ve received reports that some users are unable to view calendar items, create events, or view events within the Microsoft Teams service.
  • ・Current status : Our telemetry indicates that a subset of users are in a recovered state; however, some users may still experience residual intermittent impact.
  • To address this residual impact, we’ll continue to perform targeted reboots to the remaining affected environments, and we’ll carefully monitor the service availability.
  • Additionally, we’ll continue analyzing available service logs to help further determine what is causing the problem and resolve fully.
  • ・Scope of impact : This issue impacts users within the Japan region when attempting to connect to Exchange Online via any connection method.
  • Additionally, this issue may impact services and/or applications that leverage Exchange Online.
  • ・Next update by : Tuesday, October 25, 2022, 5:00 PM (8:00 AM UTC)

October 25, 2022 2:24 PM · Quick update – Service restored

  • We’re continuing our investigation to pinpoint the root cause of impact.
  • We’re monitoring our recent mitigation efforts to determine their efficacy.
  • At this time, we are still seeing residual impact and are investigating further.
  • This quick update is designed to give the latest information on this issue.

October 25, 2022 1:38 PM – Service restored

  • ・Title : Users may be experiencing issues with multiple Microsoft 365 services
  • ・User Impact : Users may be experiencing issues with multiple Microsoft 365 services that leverage Exchange Online
  • ・More info : While impact is primarily specific to Exchange Online, some services that leverage Exchange Online may also be impacted.
  • These services could include but are not limited to, Microsoft Teams and SharePoint Online.
  • Further, we’ve received reports that some users are unable to view calendar items, create events, or view events within the Microsoft Teams service.
  • ・Current status : We’ve completed our traffic redirection to healthy infrastructure and are seeing some signs of recovery; however, some users still have intermittent impact.
  • We’ll continue rebooting some of our affected infrastructure while we further isolate the root cause of the issue.
  • We are also reviewing other services that could compound the issue and continue exploring alternate mitigation strategies.
  • ・Scope of impact : This issue impacts users within the Japan region when attempting to connect to Exchange Online via any connection method.
  • Additionally, this issue may impact services and/or applications that leverage Exchange Online.
  • ・Next update by : Tuesday, October 25, 2022, 4:00 PM (7:00 AM UTC)

October 25, 2022 12:46 PM · Quick update – Service restored

  • Our traffic redirection to healthy infrastructure is complete. Our reboots are continuing and we’re seeing some recovery for some environments as well.
  • We’ll revert traffic once we’ve confirmed our infrastructure remains at optimal levels.
  • This quick update is designed to give the latest information on this issue.

October 25, 2022 12:20 PM – Service restored

  • ・Title : Users may be experiencing issues with multiple Microsoft 365 services
  • ・User Impact : Users may be experiencing issues with multiple Microsoft 365 services that leverage Exchange Online.
  • ・More info : While impact is primarily specific to Exchange Online, some services that leverage Exchange Online may also be impacted.
  • These services could include but are not limited to, Microsoft Teams and SharePoint Online.
  • Further, we’ve received reports that some users are unable to view calendar items, create events, or view events within the Microsoft Teams service.
  • ・Current status : We’ve identified a potential alternate mitigation option by redirecting traffic to reduce any traffic flowing through the affected environments.
  • We’ll continue to monitor the environment to confirm this action has the intended effect.
  • Once confirmed, we’ll attempt to incrementally re-introduce the traffic upon the determination the impacted infrastructure has recovered.
  • We’re still investigating what is causing these machines to go into an unhealthy state.
  • ・Scope of impact : This issue impacts users within the Japan region when attempting to connect to Exchange Online via any connection method.
  • Additionally, this issue may impact services and/or applications that leverage Exchange Online.
  • ・Next update by : Tuesday, October 25, 2022, 3:00 PM (6:00 AM UTC)

October 25, 2022 11:44 AM · Quick update – Service restored

  • We continue to reboot servers as a potential mitigation step while we work on identifying the root cause of the issue.
  • This quick update is designed to give the latest information on this issue.

October 25, 2022 11:27 AM – Service restored

  • ・Title : Users may be experiencing issues with multiple Microsoft 365 services
  • ・User Impact : Users may be experiencing issues with multiple Microsoft 365 services that leverage Exchange Online.
  • ・More info : While impact is primarily specific to Exchange Online, some services that leverage Exchange Online may also be impacted.
  • These services could include but are not limited to, Microsoft Teams and SharePoint Online.
  • Further, we’ve received reports that some users are unable to view calendar items, create events, or view events within the Microsoft Teams service.
  • ・Current status : We’ve narrowed down the issue to a smaller portion of service infrastructure that appears to have entered an unhealthy state and is resulting in impact.
  • We’re continuing our efforts to isolate what caused these machines to fail initially.
  • Concurrently, we’ve rebooted some affected infrastructure as a potential mitigation effort.
  • ・Scope of Impact : This issue impacts users within the Japan region when attempting to connect to Exchange Online via any connection method.
  • Additionally, this issue may impact services and/or applications that leverage Exchange Online.
  • ・Next update by : Tuesday, October 25, 2022, 2:00 PM (5:00 AM UTC)
