After all these years, most of us are convinced that ML can match, if not outperform, pre-ML solutions almost everywhere. Faced with a set of rule constraints, for example, we immediately wonder whether a tree-based ML model could replace them. But the world is not black and white: machine learning certainly has its place in solving problems, yet it is not always the best solution. Rule-based systems can even outperform machine learning, especially in areas where explainability, robustness, and transparency are critical.
In this article, I will introduce some practical cases and show how combining manual rules with ML can make our solutions better.
A rule-based system provides support for decision-making through predefined rules. The system evaluates data based on stored rules and performs specific operations based on mappings.
Here are a few examples:
Fraud Detection: A rule-based system can quickly flag suspicious transactions for investigation based on predefined rules.
Take chess cheaters as an example. Their basic approach is to run a computer chess application in another window and let the program play for them. Whatever the complexity, each move takes about 4-5 seconds to complete. A "threshold" is therefore applied to the time the player spends on each move: if the times barely fluctuate, the player may be flagged as a cheater.
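The timing rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `looks_like_engine`, the input format (seconds per move), and the thresholds `max_std` and `min_moves` are hypothetical and not taken from any real anti-cheating system.

```python
from statistics import pstdev


def looks_like_engine(move_times, max_std=0.5, min_moves=10):
    """Flag a player whose per-move thinking time barely fluctuates.

    move_times: seconds spent on each move (assumed input format).
    max_std and min_moves are illustrative thresholds, not tuned values.
    """
    if len(move_times) < min_moves:
        return False  # too few moves to judge
    # Small standard deviation means the move times hardly fluctuate.
    return pstdev(move_times) <= max_std


# Made-up timings: one suspiciously steady player, one irregular player.
steady = [4.2, 4.8, 4.5, 4.1, 4.9, 4.4, 4.6, 4.3, 4.7, 4.5]
irregular = [1.0, 12.5, 3.2, 30.0, 2.1, 8.4, 0.9, 45.0, 5.5, 2.0]
print(looks_like_engine(steady), looks_like_engine(irregular))
```

A single threshold like this is easy to explain and audit, which is the main appeal of the rule-based approach.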
Healthcare: Rule-based systems can be used to manage prescriptions and prevent medication errors. They can also help doctors order additional tests for patients based on previous results.
Supply Chain Management: Rule-based systems can be used to generate low-inventory alerts and to help manage expiration dates or new product launches.
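The idea of stored rules mapped to operations can be sketched as follows, using the inventory case as the example. The `Rule` class, the field names (`stock`, `reorder_level`, `days_to_expiry`), and the action names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # evaluated against one record
    action: str                        # operation this rule maps to


# Hypothetical rules and field names, for illustration only.
RULES = [
    Rule("low_inventory", lambda item: item["stock"] < item["reorder_level"], "send_alert"),
    Rule("near_expiry", lambda item: item["days_to_expiry"] <= 7, "mark_down"),
]


def evaluate(item: dict) -> list:
    """Return the actions of every rule whose condition holds for this item."""
    return [rule.action for rule in RULES if rule.condition(item)]


print(evaluate({"stock": 3, "reorder_level": 10, "days_to_expiry": 30}))
```

Because each rule is a named, separate object, it is straightforward to report which rule fired for a given decision.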
Machine learning (ML) systems use algorithms to learn from data and make predictions or take actions without being explicitly programmed. Machine learning systems use knowledge gained through training on large amounts of data to make predictions and decisions about new data. ML algorithms can improve their performance as more data is used for training. Machine learning systems include natural language processing, image and speech recognition, predictive analytics, and more.
Fraud Detection: Banks may use machine learning systems to learn from past fraudulent transactions and identify potentially fraudulent activity in real time. Alternatively, they may approach the problem from the other direction and look for transactions that appear highly "abnormal."
Healthcare: Hospitals may use ML systems to analyze patient data and predict a patient's likelihood of developing a certain disease based on certain X-rays.
Rule-based systems and ML systems each have their own advantages and disadvantages.
The advantages of rule-based systems are obvious:
Disadvantages:
The advantages of ML-based systems are also obvious:
Disadvantages:
Comparing the two, we find that their advantages and disadvantages do not conflict but complement each other. So is there a way to combine their advantages?
Hybrid systems, which combine rule-based systems with machine learning algorithms, have become increasingly popular recently. They can provide more robust, accurate, and efficient results, especially when dealing with complex problems.
Let’s take a look at a hybrid system that can be implemented using the rental dataset:
Feature Engineering: Convert the number of floors into one of three categories (high, medium, or low), depending on how many floors the building has. This can improve the efficiency of ML models.
Hard-coded rules can be used as part of the feature engineering process to identify and extract important features in the input data. When the problem domain is clear and unambiguous, rules can be defined easily and accurately, and they can then create new features or modify existing ones to improve the performance of the machine learning model. Hard-coded rules and feature engineering are two different techniques, but they work well together: rules produce the features that domain knowledge can state explicitly, while other feature engineering methods extract features that hard-coded rules do not easily capture.
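The floor-bucketing rule could look like the sketch below. The cut-offs (5 and 15 floors) and the function name `floor_category` are assumptions made for illustration; the article does not state the actual thresholds.

```python
def floor_category(floors: int) -> str:
    """Map a building's floor count to a coarse category.

    The cut-offs below are hypothetical: up to 5 floors is "low",
    6 to 15 is "medium", and anything taller is "high".
    """
    if floors <= 5:
        return "low"
    if floors <= 15:
        return "medium"
    return "high"


# Made-up floor counts standing in for a column of the rental data.
floor_counts = [3, 9, 25, 5, 16]
print([floor_category(n) for n in floor_counts])
```

With pandas, the same function can be applied to a column through `Series.map`, producing a new categorical feature for the model.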
Post-processing: round or normalize the final result.
Hard-coded rules can be used in the post-processing stage to modify the output of the machine learning model. For example, if the model outputs predictions that are inconsistent with known rules or constraints, hard-coded rules can adjust those predictions so that they comply. Post-processing techniques such as filtering or smoothing can refine the model's output by removing noise or errors, improving the overall accuracy of the predictions. These techniques are particularly effective when there is uncertainty in the model's probabilistic predictions or in the input data. In some cases, post-processing techniques can also be used to enhance the input data with additional information. For example, if a machine learning model is trained on a limited data set, additional features can be extracted from external sources (such as social media or news feeds) to improve the accuracy of predictions.
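As a sketch of rule-based post-processing for a rent prediction, the function below clamps raw model outputs to a minimum and rounds them to a fixed price step. The name `postprocess_rent`, the floor of 0, and the step of 50 are assumptions for illustration, not values from the rental dataset.

```python
def postprocess_rent(predictions, min_rent=0.0, step=50):
    """Clamp raw model outputs to a floor and round them to a price step.

    min_rent and step are illustrative assumptions.
    """
    cleaned = []
    for value in predictions:
        value = max(value, min_rent)                # rule: rent cannot fall below the floor
        cleaned.append(round(value / step) * step)  # rule: quote prices in fixed steps
    return cleaned


# Made-up raw predictions, including one impossible negative value.
raw = [1234.7, -20.0, 980.2]
print(postprocess_rent(raw))
```

The model stays unchanged; the rules only guarantee that whatever it outputs respects constraints known in advance.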
Let’s look at the data on heart disease:
If we use random forest to predict the target class:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

clf = RandomForestClassifier(n_estimators=100, random_state=random_seed)
# Features are all columns except the last; the last column is the target.
X_train, X_test, y_train, y_test = train_test_split(
    df.iloc[:, :-1], df.iloc[:, -1], test_size=0.30, random_state=random_seed
)
clf.fit(X_train, y_train)
One reason for choosing a random forest here is that it provides feature importances. Below you can see the importance of the features used for training:
Look at the results:
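import pandas as pd
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

# Keep the test index so that rules can later be applied row by row.
y_pred = pd.Series(clf.predict(X_test), index=y_test.index)
cm = confusion_matrix(y_test, y_pred, labels=clf.classes_)
conf_matrix = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=clf.classes_)
conf_matrix.plot()
f1_score(y_test, y_pred): 0.74
recall_score(y_test, y_pred): 0.747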
Now suppose a cardiologist looks at the model. Based on his experience and domain knowledge, he believes that the thalassemia feature (thal) is much more important than shown above. So we build a histogram and look at the results.
We then add a mandatory rule:
y_pred[X_test[X_test["thal"] == 2].index] = 1
The resulting confusion matrix becomes like this:
f1_score(y_test, y_pred): 0.818
recall_score(y_test, y_pred): 0.9
The results have improved considerably. This is where domain knowledge plays an important role in assessing patients.
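The override pattern used above (let the model predict, then force the label wherever an expert rule fires) can be written as a small reusable helper. This is a minimal sketch on toy lists; the function name `apply_override` and the sample values are hypothetical and are not taken from the heart disease data.

```python
def apply_override(predictions, feature_values, trigger_value, forced_label):
    """Return a copy of predictions where rows matching the rule get forced_label."""
    return [
        forced_label if value == trigger_value else pred
        for pred, value in zip(predictions, feature_values)
    ]


# Toy values for illustration only.
model_preds = [0, 1, 0, 0]
thal = [2, 3, 1, 2]
print(apply_override(model_preds, thal, trigger_value=2, forced_label=1))
```

Since such a rule overwrites the model unconditionally, it is worth checking it on data that was not used to come up with the rule.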
The next data set contains fraudulent bank transactions.
The data set is highly imbalanced:
df["Class"].value_counts()
0    28431
1     4925
To create the rules, we look at the box plot of the distribution of the features:
We are going to build our own hybrid estimator for the manual rules, using FunctionClassifier from the hulearn (human-learn) package:
import numpy as np
import pandas as pd
from hulearn.classification import FunctionClassifier

# Each rule maps a column to (comparison operator, threshold).
rules = {
    "V3": ("<=", -2),
    "V12": ("<=", -3),
    "V17": ("<=", -2),
}

def create_rules(data: pd.DataFrame, rules):
    filtered_data = data.copy()
    for col in rules:
        # Replace each column with a boolean: does this row satisfy the rule?
        filtered_data[col] = eval(f"filtered_data[col] {rules[col][0]} {rules[col][1]}")
    # min over boolean columns is True only if every rule holds (logical AND).
    result = np.array(filtered_data[list(rules.keys())].min(axis=1)).astype(int)
    return result

hybrid_classifier = FunctionClassifier(create_rules, rules=rules)
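The rule function above builds its comparisons with `eval` on strings. As an alternative sketch that gives the same result for these rules, the comparison strings can be looked up in a table of functions from the standard `operator` module. The names `OPS` and `create_rules_no_eval` and the toy rows are assumptions for illustration.

```python
import operator

import pandas as pd

# Map the operator strings used in the rules dict to real comparison functions.
OPS = {"<=": operator.le, "<": operator.lt, ">=": operator.ge, ">": operator.gt}


def create_rules_no_eval(data: pd.DataFrame, rules: dict):
    """Return 1 for rows that satisfy every rule, else 0."""
    checks = pd.DataFrame(
        {col: OPS[op](data[col], threshold) for col, (op, threshold) in rules.items()}
    )
    # all(axis=1) is the logical AND across the per-rule boolean columns.
    return checks.all(axis=1).astype(int).to_numpy()


rules = {"V3": ("<=", -2), "V12": ("<=", -3), "V17": ("<=", -2)}
# Made-up rows: the first satisfies all three rules, the second fails on V3.
toy = pd.DataFrame({"V3": [-3.0, -1.0], "V12": [-4.0, -5.0], "V17": [-2.5, -6.0]})
print(create_rules_no_eval(toy, rules))
```

An unknown operator string raises a `KeyError` here instead of being executed, which makes mistakes in the rules dictionary easier to spot.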
We can now compare the results of the purely rule-based system with those of the kNN method. kNN is used here because it can cope with imbalanced data:
As we can see, with only three rules written, the rule-based system performs better than the kNN model.
Our example here may not be very rigorous, but it is enough to illustrate that hybrid models provide practical benefits such as fast implementation, robustness to outliers, and increased transparency. They are especially useful when business logic has to be combined with machine learning. For example, hybrid rule-ML systems in healthcare can diagnose diseases by combining clinical rules with machine learning algorithms that analyze patient data. Machine learning can achieve excellent results on many tasks, but it also needs to be supplemented with domain knowledge, which helps models understand the data better and predict and classify more accurately.
Hybrid models can help us combine domain knowledge and machine learning models. Hybrid models are usually composed of multiple sub-models, each of which is optimized for specific domain knowledge. These sub-models can be models based on hard-coded rules, models based on statistical methods, or even models based on deep learning.
Hybrid models can use domain knowledge to guide the learning process of machine learning models, thereby improving the accuracy and reliability of the model. For example, in the medical field, hybrid models can combine a doctor’s expertise with the power of a machine learning model to diagnose a patient’s disease. In the field of natural language processing, hybrid models can combine linguistic knowledge and the capabilities of machine learning models to better understand and generate natural language.
In short, hybrid models can help us combine domain knowledge and machine learning models, thereby improving the accuracy and reliability of the model, and have a wide range of applications in various tasks.