Early last year, we were alerted to an increase in RMAs for our RED 50 Remote Ethernet Devices.
During our root cause analysis, we found that the RED firmware current at the time did not correctly handle bad blocks on the RED 50's flash drive when writing new firmware during an upgrade.
What is a bad block?
A bad block or bad sector is an area on any storage medium which becomes unusable. Usually such sectors are worn from use, making them unreliable or unusable after a certain number of write and erase cycles. This is a normal process and most storage media will develop some bad blocks over time or even have some when new. Usually they are just skipped during processing.
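To illustrate what "skipped during processing" means, here is a minimal Python sketch of writing an image to a block device while stepping over bad blocks. It is a simplified simulation, not the RED 50 firmware code: the function names, the tiny block size, and the dictionary standing in for the flash drive are all assumptions made for this example.

```python
from typing import Dict, List, Set

BLOCK_SIZE = 4  # bytes per block; unrealistically small, for illustration only


def split_image(image: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Cut the image into block-sized chunks (the last one may be shorter)."""
    return [image[i:i + block_size] for i in range(0, len(image), block_size)]


def write_image(image: bytes, total_blocks: int, bad_blocks: Set[int]) -> Dict[int, bytes]:
    """Write chunks to consecutive good blocks, skipping any block marked bad.

    The returned dict (block number -> data) is a hypothetical stand-in
    for the flash drive.
    """
    flash: Dict[int, bytes] = {}
    block = 0
    for chunk in split_image(image):
        # Step over bad blocks instead of writing data into them.
        while block < total_blocks and block in bad_blocks:
            block += 1
        if block >= total_blocks:
            raise OSError("not enough good blocks to hold the image")
        flash[block] = chunk
        block += 1
    return flash


def read_image(flash: Dict[int, bytes], total_blocks: int, bad_blocks: Set[int]) -> bytes:
    """Read the image back, skipping the same bad blocks the writer skipped."""
    return b"".join(
        flash[b] for b in range(total_blocks) if b not in bad_blocks and b in flash
    )


if __name__ == "__main__":
    image = b"FIRMWARE-IMG"  # 12 bytes, so 3 chunks
    bad = {1}                # pretend block 1 is worn out
    flash = write_image(image, total_blocks=5, bad_blocks=bad)
    print(sorted(flash))     # [0, 2, 3]
    assert read_image(flash, 5, bad) == image
```

A writer that ignores the bad block list would place the second chunk in block 1, and the image read back would be corrupt. That is the general kind of failure that can leave a device unable to boot.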
For the RED 50, mishandling bad blocks during an upgrade meant that the newly written firmware could no longer be booted and the device was bricked. In response, we developed a new RED 50 firmware that handles bad blocks correctly. This firmware was released both to the RED provisioning server and as part of Sophos UTM 9.605 in July 2019.
Unfortunately, despite this update, we continue to see an increased volume of RMAs for the RED 50. This is partly because writing the new firmware that contains the ‘fix’ still relies on the old firmware, which is affected by the bad block handling issue. As a result, some customers have run into the issue while upgrading to the newer firmware.
We have also had reports of customers whose RED 50 devices ran without issue for a period of time after upgrading to the newer firmware, only to fail at a later date. After further engineering analysis of some devices sent for RMA, we were able to confirm that a) they did not experience the original upgrade issue, meaning the fix worked, and b) their flash drives developed bad blocks over time, causing the firmware to become unbootable.
We are now working with the supply chain and the hardware team to ascertain whether we’re experiencing a higher-than-expected failure rate on these drives in the RED 50. Additionally, we continue to analyze units sent for RMA to identify whether there are other potential causes for the increased RMA rate.
What advice should you give your customers?
Customers facing this issue should contact support. The support team will select the most appropriate course of action on a case-by-case basis.
Does this issue only affect customers using SG UTM?
So far, we have only received reports of this issue in combination with Sophos UTM.