
What KPIs can be used to measure the success of artificial intelligence projects?

王林
Release: 2023-04-10 09:21:05

A research report released by the research firm IDC in June 2020 showed that approximately 28% of artificial intelligence projects fail. The reasons cited in the report were a lack of expertise, a lack of relevant data, and the absence of a sufficiently integrated development environment. To establish a process of continuous improvement for machine learning and avoid getting stuck, identifying key performance indicators (KPIs) is now a priority.


Upstream in the project, data scientists can define the technical performance indicators of the model. These vary depending on the type of algorithm used. In the case of a regression aimed at predicting someone's height as a function of their age, for example, one can use the linear coefficient of determination (R²).

This coefficient measures the quality of the prediction: if the square of the correlation coefficient is zero, the regression line explains 0% of the distribution of the points; if, on the other hand, it is equal to 1, the line explains 100% of it, which indicates that the quality of the predictions is very good.
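To make this concrete, here is a minimal sketch (assuming scikit-learn and made-up age/height data, which are not from the article) that fits a linear regression and reports R²:

```python
# Minimal sketch: coefficient of determination (R²) for a regression
# predicting height from age. Data values are hypothetical.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

age = np.array([[2], [5], [8], [11], [14], [17]])        # years
height = np.array([85.0, 110.0, 128.0, 145.0, 163.0, 172.0])  # cm

model = LinearRegression().fit(age, height)
predictions = model.predict(age)

# R² = 1 means the line explains all of the variance; R² = 0 means it explains none.
print("R²:", r2_score(height, predictions))
```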

Deviation of predictions from reality

Another metric for evaluating a regression is the least-squares criterion, which serves as the loss function. It quantifies the error as the sum of the squared deviations between the actual values and the predicted line; the model is then fitted by minimizing this squared error. In the same logic, one can use the mean absolute error, which consists of averaging the absolute values of the deviations.
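A minimal sketch of the two error measures described above, on made-up values (not from the article):

```python
# Squared error (least squares) versus mean absolute error on toy data.
import numpy as np

actual = np.array([85.0, 110.0, 128.0, 145.0])
predicted = np.array([88.0, 107.0, 130.0, 141.0])

mse = np.mean((actual - predicted) ** 2)    # mean of squared deviations
mae = np.mean(np.abs(actual - predicted))   # mean of absolute deviations

print("MSE:", mse)
print("MAE:", mae)
```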

Charlotte Pierron-Perlès, who heads strategy, data and artificial intelligence services at the French consultancy Capgemini, sums it up: "In any case, this amounts to measuring the gap with what we are trying to predict."

In classification algorithms, for example for spam detection, it is necessary to look at false positives and false negatives. Pierron-Perlès explains: "For example, we developed a machine learning solution for a cosmetics group to optimize the efficiency of a production line. The aim was to identify, at the start of the line, defective cosmetics that could cause production interruptions. We worked closely with the factory operators, and the discussions led us toward a model that favors exhaustive detection even at the cost of false positives, that is, conforming cosmetics occasionally flagged as defective."

Built on the notions of false positives and false negatives, three other metrics allow classification models to be evaluated:

(1) Recall (R) measures the model's sensitivity. It is the ratio of correctly identified true positives (for example, COVID-19 tests correctly reported as positive) to all the actual positives that should have been detected (tests correctly reported positive plus tests reported negative that were in fact positive): R = true positives / (true positives + false negatives).

(2) Precision (P) measures accuracy. It is the ratio of correct true positives (COVID-19 tests correctly reported as positive) to all the results reported as positive (true positives plus tests wrongly reported as positive): P = true positives / (true positives + false positives).

(3) The harmonic mean (F-score) measures the model's ability to give correct predictions and to reject the others: F = 2 × precision × recall / (precision + recall).
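A minimal sketch (hypothetical labels, scikit-learn assumed) computing the three metrics just defined:

```python
# Recall, precision and F-score on a toy defect-detection example.
from sklearn.metrics import precision_score, recall_score, f1_score

# 1 = defective product (positive class), 0 = conforming product.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("Recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("F-score:  ", f1_score(y_true, y_pred))         # 2PR / (P + R)
```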

Generalization of the model

David Tsang Hin Sun, lead senior data scientist at the French IT services company Keyrus, emphasizes: "Once a model is built, its ability to generalize becomes a key indicator."

So how can it be estimated? By measuring the difference between predictions and expected results, and then tracking how that difference evolves over time. He explains: "After a period of time, we may see a divergence appear. This may be due to under-learning (underfitting), when the training data set is insufficient in quality or quantity."
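One simple way to follow this gap over time is sketched below, on made-up batches of production data and a hypothetical alert threshold (neither comes from the article):

```python
# Track how the prediction error evolves across successive batches and
# flag a divergence when it grows well beyond the initial level.
import numpy as np

def mean_absolute_error(actual, predicted):
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(predicted))))

# (actual values, predicted values) for successive months; values are hypothetical.
monthly_batches = [
    ([10, 12, 11], [10.5, 11.8, 11.2]),   # month 1
    ([13, 9, 14], [12.8, 9.3, 13.7]),     # month 2
    ([15, 8, 16], [11.0, 11.5, 12.0]),    # month 3: the gap widens
]

baseline = mean_absolute_error(*monthly_batches[0])
for i, (actual, predicted) in enumerate(monthly_batches, start=1):
    mae = mean_absolute_error(actual, predicted)
    flag = "  <-- divergence, investigate the training data" if mae > 2 * baseline else ""
    print(f"month {i}: MAE = {mae:.2f}{flag}")
```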

So what is the solution? In the case of image recognition models, for example, generative adversarial networks (GANs) can be used to increase the number of images learned, through rotations or distortions. Another technique, applicable to classification algorithms, is synthetic minority oversampling (SMOTE), which consists of increasing the number of under-represented examples in the data set through oversampling.
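As an illustration of the second technique, here is a minimal sketch assuming the imbalanced-learn package and a made-up, strongly imbalanced data set:

```python
# Synthetic minority oversampling (SMOTE) on a toy imbalanced classification set.
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Hypothetical dataset where the minority class represents only ~5% of examples.
X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=0)
print("before:", Counter(y))

# SMOTE generates synthetic minority examples until the classes are balanced.
X_resampled, y_resampled = SMOTE(random_state=0).fit_resample(X, y)
print("after: ", Counter(y_resampled))
```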

Divergence can also occur in the case of over-learning (overfitting). In this configuration, the model does not restrict itself to the expected correlations after training; through over-specialization, it also captures the noise present in the field data and produces inconsistent results. David Tsang Hin Sun points out: "It is then necessary to check the quality of the training data set and possibly adjust the weighting of the variables."
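One common way to curb such over-specialization, in the spirit of the weight adjustment mentioned above, is to penalize large variable weights through regularization. A minimal sketch assuming scikit-learn (this is not the article's own method):

```python
# L2 (ridge) regularization shrinks the weights of noisy variables toward zero,
# which limits overfitting when there are few samples and many variables.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 10))                        # few samples, many variables
y = X[:, 0] * 2.0 + rng.normal(scale=0.5, size=30)   # only the first variable matters

plain = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)

print("unregularized weights:", np.round(plain.coef_, 2))
print("ridge weights:        ", np.round(ridge.coef_, 2))
```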

Then there remain the economic key performance indicators (KPIs). Stéphane Roder, CEO of the French consulting firm AI Builders, believes: "We have to ask ourselves whether the error rate is compatible with the business challenges. For example, the insurance company Lemonade has developed a machine learning brick that, within 3 minutes of a customer submitting a claim with the supporting information (including photos), pays out the insurance benefit. Taking into account the savings achieved, a certain error rate has a cost that must be weighed. It is very important to check this measure over the entire life cycle of the model, especially against its total cost of ownership (TCO), from development through maintenance."

Adoption Level

Even within the same company, the expected key performance indicators (KPIs) may vary. Charlotte Pierron-Perlès of Capgemini notes: "We developed a consumption forecasting engine for a French retailer with an international footprint. It turned out that the precision expected of the model differed between products already sold in stores and new products, whose sales dynamics depend on factors, notably related to market reaction, that are by definition less controllable."

The final key performance indicator is the level of adoption. Charlotte Pierron-Perlès says: "Even a good-quality model is not enough on its own. It also requires building an artificial intelligence product with a user-oriented experience that the business can take up, so that the promise of machine learning is realized."

Stéphane Roder concludes: "This user experience will also allow users to provide feedback, which helps feed the artificial intelligence with knowledge beyond the daily flow of production data."


Source: 51cto.com