Reliability assessment of deep neural networks implemented in hardware
DOI:
https://doi.org/10.26507/paper.4346
Keywords:
Deep Neural Networks, Fault Injection, Saboteur, Reliability Assessment
Abstract
Deep neural networks (DNNs) have been adopted to develop mission-critical applications, including autonomous vehicles, life-support medical equipment, and aerospace systems, among others. In these applications, a malfunction can have serious consequences for human life and the environment and can damage high-cost equipment. Implementing artificial neural networks in hardware brings advantages such as high operating speed, lower energy consumption, and greater portability. These characteristics have motivated the scientific community to propose new hardware architectures and to work on ensuring their reliability. In this context, precisely identifying vulnerabilities in highly complex neural networks synthesized in hardware becomes highly relevant. To this end, fault injection mechanisms are used. They consist of specialized circuits known as 'saboteurs', which emulate faults in hardware components such as registers, memory units, processing blocks, and interconnections, so that the impact of those faults on the accuracy and resilience of the system can be evaluated. Fault models can be permanent (stuck-at type) or transient (such as bit flip or coupling faults, among others) and can be applied at different levels of the DNN architecture, ranging from individual neurons and weights to entire layers. One of the most widely used platforms for this type of study is the FPGA (Field Programmable Gate Array), thanks to its flexibility, low power consumption, speed, and processing capacity. This work presents the development of an environment for evaluating the reliability of DNNs implemented in hardware through a fault injection system that emulates permanent and transient faults, supported by a set of tools for analyzing and verifying the impact of each test.
Preliminary results reveal that DNNs exhibit varying degrees of vulnerability to hardware faults and that faults in certain critical layers and parameters can seriously degrade the performance of each model. The findings underline the importance of understanding the dynamics of fault propagation within DNNs to ensure the design of more reliable hardware and software architectures.
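The two fault models named in the abstract, transient bit flips and permanent stuck-at faults, can be illustrated with a small software sketch. This is not the authors' environment, which relies on hardware saboteur circuits on an FPGA; it only models the effect of a single-bit fault on one stored weight. The 8-bit two's-complement weight format, the single ReLU neuron, and all function names (`bit_flip`, `stuck_at`, `inject`, `neuron`) are assumptions introduced for illustration.

```python
from typing import Callable, List

WIDTH = 8  # assumed word size: 8-bit two's-complement weights


def to_unsigned(value: int, width: int = WIDTH) -> int:
    """Return the raw bit pattern of a signed integer."""
    return value & ((1 << width) - 1)


def to_signed(word: int, width: int = WIDTH) -> int:
    """Interpret a raw bit pattern as a two's-complement integer."""
    return word - (1 << width) if word & (1 << (width - 1)) else word


def bit_flip(word: int, bit: int, width: int = WIDTH) -> int:
    """Transient fault model: invert one bit of the stored word."""
    return (word ^ (1 << bit)) & ((1 << width) - 1)


def stuck_at(word: int, bit: int, value: int, width: int = WIDTH) -> int:
    """Permanent fault model: force one bit to 0 or 1."""
    mask = 1 << bit
    forced = (word | mask) if value else (word & ~mask)
    return forced & ((1 << width) - 1)


def inject(weights: List[int], index: int,
           fault: Callable[[int], int]) -> List[int]:
    """Return a copy of `weights` with `fault` applied to one entry."""
    faulty = list(weights)
    faulty[index] = to_signed(fault(to_unsigned(weights[index])))
    return faulty


def neuron(weights: List[int], inputs: List[int]) -> int:
    """Integer dot product followed by ReLU."""
    return max(0, sum(w * x for w, x in zip(weights, inputs)))


weights = [3, -2, 5, 1]
inputs = [1, 2, 1, 4]
golden = neuron(weights, inputs)  # fault-free reference output

# Bit flip on bit 6 of weight 0, stuck-at-0 on the sign bit of weight 1,
# and a stuck-at-1 on a bit that is already 1 (the fault is masked).
flipped = inject(weights, 0, lambda w: bit_flip(w, 6))
stuck = inject(weights, 1, lambda w: stuck_at(w, 7, 0))
masked = inject(weights, 0, lambda w: stuck_at(w, 0, 1))

for name, faulty in [("bit flip", flipped), ("stuck-at-0", stuck),
                     ("masked stuck-at-1", masked)]:
    print(f"{name}: weights={faulty} "
          f"output={neuron(faulty, inputs)} golden={golden}")
```

Comparing each faulty output with the fault-free (golden) output shows why impact depends on location: a fault in a high-order or sign bit changes the result far more than one in a low-order bit, and a stuck-at fault that matches the stored value has no effect at all.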
License
Copyright (c) 2025 Asociación Colombiana de Facultades de Ingeniería - ACOFI

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.


