An ECC-based Fault Tolerance Approach for DNNs
Publication year: 1403 (Solar Hijri)
Document type: Conference paper
Language: English
Length: 8 pages (PDF)
National scientific document ID:
AISOFT02_063
Indexing date: 17 Farvardin 1404
Abstract:
Deep Neural Networks (DNNs) have achieved great success in solving a wide range of machine learning problems. Recently, they have been deployed in datacenters (potentially for business-critical or industrial applications) and in safety-critical systems such as self-driving cars. Their correct functionality in the presence of bit-flip errors in DNN parameters stored in memory is therefore key to their applicability in safety-critical applications. In this paper, a fault tolerance approach based on Error Correcting Codes (ECC), called SPW, is proposed to ensure the correct functionality of DNNs in the presence of bit-flip faults. In the proposed approach, an error is detected using the stored ECC; a single-bit error is corrected, and otherwise the weight is set to zero (i.e., masked). A statistical fault injection campaign is proposed and used to evaluate the efficacy of the approach. The experimental results show that the accuracy of the DNN increases by more than 300% at a Bit Error Rate of 10^-1 in comparison to the case where the ECC technique is applied, at the expense of just 47.5% area overhead.
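The correct-or-mask behaviour described in the abstract can be illustrated with a minimal sketch. The abstract does not state which code or weight width SPW uses, so everything specific here is an assumption: weights are taken to be 8-bit quantized values, the code is an extended Hamming (13,8) SECDED code, and the names `encode` and `read_weight` are hypothetical. It is not the authors' implementation.

```python
# Assumption: 8-bit weights protected by an extended Hamming (13,8) SECDED code.
# Bit 0 of the codeword is the overall parity; bits 1..12 form Hamming(12,8).
PARITY_POS = (1, 2, 4, 8)  # Hamming parity-bit positions
DATA_POS = tuple(p for p in range(1, 13) if p not in PARITY_POS)  # 8 data positions


def _syndrome(code: int) -> int:
    """XOR of the positions (1..12) of all set bits; zero for a valid codeword."""
    s = 0
    for pos in range(1, 13):
        if (code >> pos) & 1:
            s ^= pos
    return s


def encode(weight: int) -> int:
    """Encode an 8-bit weight pattern into a 13-bit SECDED codeword."""
    assert 0 <= weight < 256
    code = 0
    for i, pos in enumerate(DATA_POS):
        if (weight >> i) & 1:
            code |= 1 << pos
    # Set the parity bits so that the syndrome of the codeword becomes zero.
    s = _syndrome(code)
    for p in PARITY_POS:
        if s & p:
            code |= 1 << p
    # Overall parity bit makes the total number of set bits even.
    if bin(code).count("1") % 2:
        code |= 1
    return code


def read_weight(code: int) -> int:
    """Return the stored weight: corrected on a single-bit error, 0 (masked) otherwise."""
    s = _syndrome(code)
    odd = bin(code).count("1") % 2
    if odd:
        if s > 12:
            return 0          # inconsistent syndrome: uncorrectable, mask the weight
        if s:
            code ^= 1 << s    # single-bit error at position s: flip it back
        # s == 0 means only the overall parity bit flipped; data is intact
    elif s:
        return 0              # even parity but nonzero syndrome: double error, mask
    return sum(((code >> pos) & 1) << i for i, pos in enumerate(DATA_POS))
```

For instance, `read_weight(encode(w) ^ (1 << 5))` recovers `w`, while flipping two bits of the codeword yields a weight of 0. Masking to zero rather than using a corrupted value reflects the abstract's stated policy for errors that cannot be corrected.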
Keywords:
Embedded systems, Uncertainty, Reliability, Performance, Deep convolutional neural network, Error correction
Authors
Mohsen Raji
School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
Mohammad Zaree
School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
Kimia Soroush
School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran