
A Feature-Level Fusion-Based Multimodal Analysis of Recognition and Classification of Awkward Working Postures in Construction

Material type: Article
Description: p. 1-17
ISSN: 0733-9364
In: ASCE Journal of Construction Engineering and Management
Summary: Developing approaches for the recognition and classification of awkward working postures is of great significance for proactive management of safety risks and work-related musculoskeletal disorders (WMSDs) in construction. Previous efforts have concentrated on wearable sensor-based or computer vision-based monitoring. However, both approaches have limitations that warrant further investigation. First, wearable sensor-based studies lack reliability because the sensors are vulnerable to environmental interference. Second, conventional computer vision-based recognition is inaccurate under adverse environmental conditions, such as insufficient illumination and occlusion. To address these limitations, this study presents an innovative, automated approach for recognizing and classifying awkward working postures. The approach leverages multimodal data collected from various sensors and apparatuses, allowing for a comprehensive analysis of different modalities. A feature-level fusion strategy is employed to train deep learning-based networks, including a multilayer perceptron (MLP), a recurrent neural network (RNN), and a long short-term memory (LSTM) network. Among these networks, the LSTM model achieves the best performance, with an accuracy of 99.6% and an F1-score of 99.7%. A comparison of metrics between single-modality and multimodal-fused training demonstrates that multimodal fusion significantly enhances classification performance. Furthermore, the study examines the performance of the LSTM network under adverse environmental conditions: the model's accuracy remains consistently above 90%, indicating that its generalizability is enhanced by the multimodal fusion strategy.
In conclusion, this study contributes to the body of knowledge on proactive prevention of safety and health risks in the construction industry by offering an automated approach that adapts well to adverse conditions. Moreover, this attempt at integrating diverse data through multimodal fusion may inspire further advances in future studies.
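The feature-level fusion strategy described in the summary can be illustrated with a minimal sketch: per-modality feature sequences are aligned on the time axis and concatenated along the feature dimension before being fed to a sequence classifier such as an LSTM. The modality names, window length, and feature sizes below are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

# Hypothetical per-modality feature windows (shapes are assumptions for
# illustration): T time steps per observation window.
T = 50
imu_features = np.random.rand(T, 6)      # e.g., wearable inertial signals
vision_features = np.random.rand(T, 34)  # e.g., 17 body keypoints (x, y)

# Feature-level fusion: align the modalities on the time axis and
# concatenate their feature dimensions into one fused input sequence.
fused = np.concatenate([imu_features, vision_features], axis=1)

print(fused.shape)  # one fused (T, 6 + 34) sequence for a recurrent classifier
```

The fused array would then serve as a single training sample for the MLP, RNN, or LSTM models compared in the study; this differs from decision-level fusion, where each modality is classified separately and only the predictions are combined.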
Holdings
Item type: Articles
Current library: Periodical Section
Vol info: Vol. 149, No. 12 (December 2023)
Status: Available
