A loss function is a method of estimating how well your machine learning algorithm models your data set. In other words, loss functions are a good measure of how well your model performs in terms of predicting the expected outcome.
The terms cost function and loss function refer to the same concept (i.e., a learning process that uses feedback to reduce the error between the actual outcome and the expected outcome). The cost function is calculated as the average of the loss function values over all training samples, while the loss function is calculated for each individual sample relative to its actual value.
The loss function is directly related to the predictions of the model you have built. If the value of your loss function is low, your model will give good results. The loss function (or rather, the cost function) that you use to evaluate a model should be minimized to improve its performance.
What are the loss functions in machine learning?
A loss function estimates how well your machine learning algorithm models your data set; it is a measure of how close your model's predictions come to the expected outcome.
Broadly speaking, loss functions can be divided into two main categories depending on the type of problem we face in the real world: classification and regression. In classification problems, our task is to predict probabilities for all the problem's classes. In regression, on the other hand, our task is to predict a continuous value from a set of independent features given to the learning algorithm.
- n / m – number of training examples
- i – the i-th training example in the data set
- y(i) – actual value for the i-th training example
- y_hat(i) – predicted value for the i-th training example
Classification losses
Types of classification losses
- Binary cross-entropy loss / log loss
- Hinge loss
1. Binary cross-entropy loss / log loss
This is the most common loss function used in classification problems. The cross-entropy loss decreases as the predicted probability approaches the actual label. It measures the performance of a classification model whose expected output is a probability value between 0 and 1.
When the number of classes is 2, it is binary classification.
When the number of classes is more than 2, it is multi-class classification.
We derive the cross-entropy loss formula from the likelihood function by taking logarithms (i.e., it is the negative log-likelihood).
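As a rough sketch, binary cross-entropy can be implemented in a few lines of NumPy (the function name and the epsilon clamp are my own additions, not part of any specific library):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-15):
    """Binary cross-entropy (log loss), averaged over all samples."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # guard against log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1, 0, 1, 0])
y_pred = np.array([0.9, 0.1, 0.8, 0.2])  # confident, mostly correct probabilities
print(binary_cross_entropy(y_true, y_pred))  # low loss, roughly 0.16
```

Note how a confident correct prediction (0.9 for class 1) contributes little to the loss, while a confident wrong prediction would be penalized heavily through the logarithm.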
2. Hinge loss
The second most common loss function used for classification problems, and an alternative to the cross-entropy loss, is the hinge loss, which was primarily developed for support vector machine (SVM) models.
The hinge loss penalizes wrong predictions as well as correct predictions that are not confident. It is mainly used with SVM classifiers with class labels -1 and 1. Make sure you convert your negative class labels from 0 to -1 before using it.
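A minimal NumPy sketch of the hinge loss (function name my own), assuming labels in {-1, +1} and raw classifier scores:

```python
import numpy as np

def hinge_loss(y_true, scores):
    """Hinge loss; y_true must use labels -1 and +1, scores are raw model outputs."""
    return np.mean(np.maximum(0.0, 1.0 - y_true * scores))

y_true = np.array([-1, 1, 1])
scores = np.array([-0.8, 0.9, -0.3])  # the third prediction is on the wrong side
print(hinge_loss(y_true, scores))  # roughly 0.53
```

The wrong prediction (score -0.3 for label +1) incurs the largest penalty, but the correct predictions also pay a small cost because their margins are below 1.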
Types of regression losses
- Mean squared error / quadratic loss / L2 loss
- Mean absolute error / L1 loss
- Huber loss / smooth mean absolute error
- Log-cosh loss
- Quantile loss
1. Mean squared error / quadratic loss / L2 loss
We define the MSE loss function as the mean of the squared differences between the actual and predicted values. This is the most common regression loss function.
The corresponding cost function is the mean of these squared errors (MSE). The MSE loss function penalizes the model for making large errors by squaring them, and this property makes MSE less robust to outliers. So you shouldn't use it if the data is prone to many outliers.
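MSE is a one-liner in NumPy (a sketch with illustrative data of my own):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average of the squared differences."""
    return np.mean((y_true - y_pred) ** 2)

y_true = np.array([3.0, 5.0, 2.0])
y_pred = np.array([2.5, 5.0, 3.0])
print(mse(y_true, y_pred))  # roughly 0.417; the error of 1.0 dominates after squaring
```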
2. Mean absolute error / L1 loss
We define the MAE loss function as the mean of the absolute differences between the actual and predicted values. This is the second most common regression loss function. It measures the average magnitude of errors in a set of predictions without considering their direction.
The corresponding cost function is the mean of these absolute errors (MAE). The MAE loss function is more robust to outliers than the MSE loss function. So you should use it if the data is prone to many outliers.
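Using the same illustrative data as above, MAE weights every error proportionally instead of quadratically (sketch, function name my own):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute differences."""
    return np.mean(np.abs(y_true - y_pred))

y_true = np.array([3.0, 5.0, 2.0])
y_pred = np.array([2.5, 5.0, 3.0])
print(mae(y_true, y_pred))  # 0.5; the error of 1.0 counts only linearly
```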
3. Huber loss / smooth mean absolute error
The Huber loss function is defined as a combination of the MSE and MAE loss functions: it approaches MAE as 𝛿 ~ 0 and MSE as 𝛿 ~ ∞ (large values). It is essentially the mean absolute error, which becomes quadratic when the error is small. How small the error must be to become quadratic is controlled by the hyperparameter 𝛿 (delta), which you can tune.
Choosing a good delta value is important because it determines what you are willing to consider an outlier. Depending on the value of this hyperparameter, the Huber loss function can therefore be less sensitive to outliers than the MSE loss function, so you can use it if the data is prone to outliers. In addition, the hyperparameter delta needs to be tuned, which is an iterative process.
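The piecewise definition above translates directly into NumPy (a sketch; function name and sample values are my own):

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    """Huber loss: quadratic for errors within delta, linear beyond it."""
    err = np.abs(y_true - y_pred)
    quadratic = 0.5 * err ** 2                # MSE-like branch for small errors
    linear = delta * (err - 0.5 * delta)      # MAE-like branch for large errors
    return np.mean(np.where(err <= delta, quadratic, linear))

y_true = np.array([1.0, 2.0, 4.0])
y_pred = np.array([1.5, 2.0, 1.0])
print(huber_loss(y_true, y_pred, delta=1.0))  # 0.875; the error of 3.0 is penalized linearly, not squared
```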
4. Log-cosh loss
The log-cosh loss function is defined as the logarithm of the hyperbolic cosine of the prediction error. It is another function used in regression tasks that is much smoother than the MSE loss. It has all the advantages of the Huber loss and, unlike the Huber loss, it is twice differentiable everywhere; this matters because some training algorithms like XGBoost use Newton's method to find the optimum and therefore need the second derivative (Hessian).
“log(cosh(x)) is approximately equal to (x ** 2) / 2 for small x and to abs(x) - log(2) for large x. This means that ‘logcosh’ works mostly like the mean squared error, but will not be so strongly affected by the occasional wildly incorrect prediction.”
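The quoted behavior is easy to verify numerically (a sketch with my own sample values):

```python
import numpy as np

def log_cosh_loss(y_true, y_pred):
    """Log-cosh loss: log of the hyperbolic cosine of the prediction error."""
    return np.mean(np.log(np.cosh(y_pred - y_true)))

# For a small error, log(cosh(x)) behaves like x**2 / 2 (MSE-like);
# for a large error, it behaves like abs(x) - log(2) (MAE-like).
print(log_cosh_loss(np.array([0.0]), np.array([0.1])))  # close to 0.1**2 / 2 = 0.005
print(log_cosh_loss(np.array([0.0]), np.array([5.0])))  # close to 5 - log(2)
```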
5. Quantile loss
A quantile is the value below which a given fraction of the samples in a group falls. Machine learning models work by minimizing (or maximizing) an objective function. As the name suggests, we use the quantile regression loss function to predict quantiles. For a set of predictions, the loss is their average.
The quantile loss function is useful when we are interested in predicting intervals rather than just point predictions.
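One common form of the quantile loss is the so-called pinball loss, sketched below (function name and data are my own; q is the target quantile):

```python
import numpy as np

def quantile_loss(y_true, y_pred, q=0.9):
    """Pinball (quantile) loss: asymmetric penalty controlled by quantile q."""
    err = y_true - y_pred
    return np.mean(np.maximum(q * err, (q - 1) * err))

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.5])
# With q = 0.9, under-predicting (y_pred below y_true) costs 9x more than over-predicting.
print(quantile_loss(y_true, y_pred, q=0.9))
```

Minimizing this loss for, say, q = 0.1 and q = 0.9 gives two models whose outputs bracket the target, which is how quantile regression produces prediction intervals.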