
Preventive detection of failures


Solution context

The client manages the production and maintenance of jet engines. Each engine is characterized by a set of anonymized metrics and has between 20 and 50 sensors (depending on the engine type) attached to its internal components. The sensor data are time-stamped, but their meaning is anonymized and their scale modified to protect industrial secrets. This solution gave the customer the ability to:

  • ●  Perform exploratory analysis of the different engine and sensor data in the context of the history of maintenance interventions carried out.

  • ●  Build either a general model or one model per engine type to predict failures 100 operating cycles ahead (anonymized time scale).

Technical Process

Data Acquisition:

During this phase, we transformed the different data formats available in the client’s data lake into a coherent dataset. Our approach was to set up automated jobs that collect the data and map it onto a unified schema.

The jobs developed are of two types:

  • ●  Configuration jobs: These collect data on the engines, their configuration, and their operating environments.

  • ●  Streaming jobs: These handle the near-real-time loading of sensor data while the machines are operating.

    The goal is to filter and transform the anonymized data to have a batch large enough to perform analysis and prediction.
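As an illustration of the unified-schema idea, the sketch below maps two differently shaped inputs onto one record type and filters out unusable values. The field names, the two input layouts, and the function names are all hypothetical; the actual formats in the client's data lake are not described here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from math import isfinite


@dataclass(frozen=True)
class SensorReading:
    """One row of the unified schema (field names are assumptions)."""
    engine_id: str
    sensor_id: str
    timestamp: datetime  # always stored as timezone-aware UTC
    value: float


def from_csv_row(row: dict) -> SensorReading:
    """Map a row of a hypothetical CSV export (ISO 8601 timestamps with offset)."""
    return SensorReading(
        engine_id=row["engine"].strip(),
        sensor_id=row["sensor"].strip(),
        timestamp=datetime.fromisoformat(row["ts"]).astimezone(timezone.utc),
        value=float(row["val"]),
    )


def from_stream_event(event: dict) -> SensorReading:
    """Map a hypothetical streaming event (epoch milliseconds) to the same schema."""
    return SensorReading(
        engine_id=str(event["engineId"]),
        sensor_id=str(event["sensorId"]),
        timestamp=datetime.fromtimestamp(event["epochMs"] / 1000, tz=timezone.utc),
        value=float(event["reading"]),
    )


def keep_valid(readings):
    """Filter step: drop readings whose value is NaN or infinite."""
    return [r for r in readings if isfinite(r.value)]


csv_row = {"engine": " E1 ", "sensor": "s07",
           "ts": "2021-03-01T12:00:00+00:00", "val": "0.42"}
event = {"engineId": "E1", "sensorId": "s07",
         "epochMs": 1614600000000, "reading": 0.42}
unified = keep_valid([from_csv_row(csv_row), from_stream_event(event)])
```

Once both sources produce the same record type, downstream analysis no longer needs to know which job a reading came from.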

Exploratory Analysis

The collected data went through the following analyses:

    • ●  Univariate analysis: Explores the values taken by each variable, studies their distributions, and identifies outliers.

    • ●  Bivariate analysis: Extracts the intrinsic relationships between the variables and studies their correlations, so that tightly correlated variables are not fed to the model together and do not introduce bias.

    • ●  Multivariate analysis: Applied to the engine and environmental data, this analysis reduced the dimensions of the problem and provided additional variables to complement the sensor data.

Time series analysis: This analysis concerns the sensor data; it was carried out to extract the modes of operation and the different cycles apparent in the data.

Result:

The colors correspond to the different engine configurations.
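The correlation screening mentioned in the bivariate analysis can be sketched as follows. This is a minimal illustration, not the procedure actually used: the Pearson coefficient, the 0.95 threshold, the greedy keep-first strategy, and the sensor names are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant (correlation undefined).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.95):
    """Greedily keep a variable only if it is not tightly correlated
    (|r| >= threshold) with any variable already kept."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical anonymized sensors: s2 is almost a rescaled copy of s1.
sensors = {
    "s1": [1, 2, 3, 4, 5],
    "s2": [2, 4, 6, 8, 10.1],
    "s3": [5, 1, 4, 2, 3],
}
selected = drop_correlated(sensors)
```

Here `s2` is discarded because it carries nearly the same information as `s1`, while `s3` is kept. Which member of a correlated pair survives depends on iteration order, so a real pipeline might prefer a domain-driven choice.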

Prediction Model

The goal of prediction is to recognize a failure at least 100 operating cycles in advance. We started with classical time series prediction models (moving average, SARIMA, etc.). The fundamental problem with these methods is the lack of seasonality in the data: the resulting model ends up converging to the overall average of the future data.
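One common way to move from time series forecasting to a failure classifier is to label each cycle according to whether the failure occurs within the horizon, and to summarize recent sensor behavior with rolling features. The sketch below assumes run-to-failure histories with a known failure cycle per engine; how the target was actually built is not stated here, and the function names are hypothetical.

```python
def label_within_horizon(cycles, failure_cycle, horizon=100):
    """Return 1 for each cycle at which the failure is at most `horizon`
    cycles away (and has not yet passed), else 0."""
    return [1 if 0 <= failure_cycle - c <= horizon else 0 for c in cycles]


def rolling_mean(values, window):
    """Trailing mean over the last `window` values (shorter at the start)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical engine observed at four cycles, failing at cycle 250.
cycles = [1, 150, 200, 250]
labels = label_within_horizon(cycles, failure_cycle=250)
smoothed = rolling_mean([1, 2, 3, 4], window=2)
```

With this framing, any classifier can be trained on the per-cycle features and labels, and the absence of seasonality is no longer an obstacle.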

The boosting model with feedback reached a precision of 85% in detecting failures at the 100-cycle horizon. It also made it possible to highlight the relative importance of the different sensors (below).

The artificial neural network achieved 97% accuracy for the 100-cycle prediction.
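The two figures above are reported as precision and accuracy. If these denote the standard classification metrics, they measure different things, which matters when failures are rare. The sketch below computes both from the same predictions; the data are made up for illustration and are not the project's results.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = failure)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision(tp, fp):
    """Share of predicted failures that were real failures."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)


# Made-up example: 2 real failures among 10 observations.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
prec = precision(tp, fp)
acc = accuracy(tp, fp, fn, tn)
```

In this toy case accuracy is 0.8 while precision is only 0.5, so comparing models is safest when the same metric is computed for each.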

Residual Analysis and Integration

After conducting the residual error study, we noticed that the errors occur mainly for certain engine configurations that are rarely used and therefore poorly represented in the data lake. The client chose to exclude these configurations from the study, since they offer no strategic advantage.

The developed models were integrated into a web-based dashboard that supports both the collection of new sensor data and the running of simulations with user-set parameters. The application also makes it possible to create alerts for planning preventive maintenance actions. For model maintenance, we installed an MLflow pipeline in the back office to feed the models with new data and relaunch training.
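The kind of breakdown behind the residual error study can be sketched by grouping prediction errors per engine configuration. The record layout and configuration names are hypothetical, and the numbers are invented for illustration.

```python
from collections import defaultdict


def error_rate_by_config(records):
    """Group (config, y_true, y_pred) records by configuration.

    Returns {config: (error_rate, sample_count)} so that high error rates
    can be read alongside how well each configuration is represented.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for config, y_true, y_pred in records:
        totals[config] += 1
        if y_true != y_pred:
            errors[config] += 1
    return {c: (errors[c] / totals[c], totals[c]) for c in totals}


# Invented example: configuration "B" is rare and poorly predicted.
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 0, 0), ("A", 1, 1),
    ("B", 1, 0), ("B", 0, 0),
]
report = error_rate_by_config(records)
```

Reporting the sample count next to the error rate makes it easier to tell a genuinely hard configuration from one that is simply underrepresented.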