Preparation is the difference between unexpected PACS downtime and a nightmare, Michael D. Toland told his audience in Seattle on May 17 at the 2008 annual meeting of the Society for Imaging Informatics in Medicine. Toland, who is PACS administrative team manager for the University of Maryland Medical System, Baltimore, presented "PACS Worst Case Scenarios: Understanding the Implications of Major Downtimes and Avoiding Them."
It is vital to understand the effect that PACS downtime has on the entire enterprise, not just the operation of a single department or service. It is particularly important to note the clinical, business, and risk-management implications of PACS downtime, Toland says, especially as PACS use spreads further beyond radiology.
The effects of downtime should be limited in advance through the development of policies for communication and for escalation of staff intervention in the event of system failure. Of course, procedures should also be established to reduce downtime in the first place. The development of strong relationships with vendors will be a valuable step in this direction.
The Cascade of Failure
Toland recalls that his horror story began when he had been working for just 7 months as a PACS administrator at a 250-bed community hospital that performed about 70,000 exams per year. Toland had come to the hospital from an IT background and had no previous experience in a hospital setting. The hospital’s IT group, which had been active in the PACS acquisition, was still providing support for the system’s hardware.
When one of the system’s storage-array controllers began generating errors, Toland called the vendor for support and was told that the controller should be replaced. Because there was a second controller in operation, and it was adequate to manage the demands placed on the whole system, there was no disruption in PACS operation. This allowed the replacement of the failing controller to be scheduled conveniently (along with appropriate data backups preceding it).
Because the PACS hardware was covered by a service contract, no expense for the replacement was anticipated. Just in case there might be some unexpected difficulties, Toland scheduled the replacement for a Tuesday, after hours, when PACS use was likely to be lightest. After the controller had been replaced, its working status and correct configuration would then be verified.
Unfortunately, Toland’s careful planning was to no avail. Without his knowledge, the vendor’s support technician dropped in at the hospital at 10 am on Friday, having decided not to wait until the following Tuesday to replace the controller. The hospital’s IT staff gave that technician access to the data center without informing Toland. When the bad controller was replaced—before the planned backups had been completed—its blank configuration overwrote the configuration that had been in place on the good controller. Because all array configuration was lost, all PACS data disappeared.
Because Toland’s scheduled full backup had not yet been made when the controller was replaced, the PACS data had to be restored from the previous week’s full backup plus the daily differential backups. The most recent of those backups had been made 9 hours before the system failed, so exams performed during that window had to be recovered by resending them from the modalities and the RIS to the PACS.
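The restoration logic described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dates, the 1 am backup time, and the names Backup and plan_restore are hypothetical and do not come from Toland's presentation. Only the pattern (last full backup, later differentials, and a 9-hour gap to be resent from the modalities and the RIS) follows the account above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Backup:
    kind: str            # "full" or "differential"
    finished: datetime   # when the backup completed


def plan_restore(backups, failure_time):
    """Pick the restore chain for a failure at failure_time.

    Returns (full, differentials, gap), where gap is the window of
    data that no backup covers and that must be resent to the PACS.
    """
    usable = sorted(
        (b for b in backups if b.finished <= failure_time),
        key=lambda b: b.finished,
    )
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup precedes the failure")
    full = fulls[-1]
    diffs = [
        b for b in usable
        if b.kind == "differential" and b.finished > full.finished
    ]
    newest = diffs[-1] if diffs else full
    return full, diffs, failure_time - newest.finished


# Hypothetical dates: a Friday 10 am failure, a full backup from the
# previous week, and daily differentials finishing at 1 am.
failure = datetime(2007, 6, 15, 10, 0)
backups = [Backup("full", datetime(2007, 6, 9, 1, 0))]
backups += [Backup("differential", datetime(2007, 6, d, 1, 0)) for d in range(10, 16)]

full, diffs, gap = plan_restore(backups, failure)
print("Restore full backup from", full.finished)
print("Differentials available:", len(diffs))
print("Hours of exams to resend:", gap.total_seconds() / 3600)
```

A differential backup normally holds every change since the last full backup, so depending on the backup software, only the newest differential may actually need to be applied; the sketch lists all of them for completeness.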
The PACS had to remain entirely down during a from-scratch rebuilding of the storage array, which took 8 hours. Data restoration from backup tapes took 27 hours, and 24 hours were required to complete study reconciliation. The outage affected not only the community hospital, but also an outpatient facility, a large medical center, and a regional trauma center that were all linked to the PACS.
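To put those figures together, the short Python sketch below tallies the three recovery phases. It assumes the phases ran back to back with no overlap, which the account does not state; if any work overlapped, the real elapsed time would have been shorter. The names PHASES and total_hours are illustrative only.

```python
# Recovery phases and durations in hours, as reported in the account.
PHASES = [
    ("storage array rebuild", 8),
    ("restore from backup tapes", 27),
    ("study reconciliation", 24),
]


def total_hours(phases):
    """Sum phase durations, assuming strictly sequential phases."""
    return sum(hours for _, hours in phases)


elapsed = 0
for name, hours in PHASES:
    elapsed += hours
    print(f"{name}: {hours} h (cumulative {elapsed} h)")

print(f"Total: {total_hours(PHASES)} h, about {total_hours(PHASES) / 24:.1f} days")
```

Under that assumption, full recovery took 59 hours, or roughly two and a half days, of which only the first 8 hours were a complete outage; the remaining time was spent working without full access to prior studies.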
The unscheduled downtime, and the lack of access to prior studies that persisted until data restoration was complete, affected more than PACS functions alone. Clinical care suffered, and this had an impact on the facility’s risk management. Likewise, business operations were compromised by the lack of PACS availability.
Because all hardware is capable of failing, PACS failures are inevitable, Toland says. While this is beyond the facility’s control, the impact of a failure can be controlled with adequate planning and good policies. When unscheduled downtime takes place, rapid response is vital; the