You are reading the article **Ridge And Lasso Regression Explained** updated in December 2023 on the website Daihoichemgio.com. We hope that the information we have shared is helpful to you. If you find the content interesting and meaningful, please share it with your friends and continue to follow and support us for the latest updates. *Suggested January 2024 Ridge And Lasso Regression Explained*

Ridge and lasso regression are two popular regularization methods for linear regression models. They address overfitting, which arises when a model is overly complex and fits the training data too closely, leading to worse performance on new data. Ridge regression shrinks the coefficients and reduces overfitting by adding a penalty term to the linear regression cost function; this penalty is proportional to the sum of the squared coefficients. Lasso regression, in contrast, adds a penalty proportional to the sum of the absolute values of the coefficients. This drives some coefficients to exactly zero, which effectively removes the corresponding features from the model. In this post we examine both approaches in more detail, discuss how they differ, and look at how to apply them in Python with scikit-learn.

Ridge Regression

Ridge regression is a regularization technique that combats overfitting in linear regression models. It adds a penalty term to the linear regression cost function that is proportional to the sum of the squared coefficients, which limits the magnitude of the coefficients. As the penalty strength increases, the coefficients shrink toward zero, lowering the model’s variance at the cost of some bias.

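Ridge regression minimizes the following cost function −

$$J(w) = \sum_{i=1}^{n} \left(y_i - h(y_i)\right)^2 + \alpha \sum_{j=1}^{p} w_j^2$$

where y is the actual value, h(y) denotes the predicted value, w denotes the feature coefficient, and α ≥ 0 is the regularization strength.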

Ridge regression works best when all features are relevant and many of them have small to medium-sized coefficients. It is also computationally efficient, because the penalized least-squares problem has a closed-form solution. Its primary drawback is that it never removes a feature entirely, which can be a disadvantage when some features are irrelevant. Whether to use ridge or another regularization approach depends on the problem and the properties of the data.
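As a minimal sketch of the shrinkage effect, consider the special case of a single feature and no intercept (an assumption made here only to keep the algebra short). For the cost Σ(y − w·x)² + α·w², setting the derivative to zero gives w = Σxy / (Σx² + α). The toy data and the helper name `ridge_coefficient_1d` are hypothetical and serve only as an illustration.

```python
def ridge_coefficient_1d(xs, ys, alpha):
    """Closed-form ridge coefficient for one feature and no intercept.

    Minimizes sum((y - w*x)**2) + alpha * w**2.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)


# Hypothetical toy data, roughly y = 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# The coefficient shrinks toward zero as alpha grows, but never reaches it
for alpha in (0.0, 1.0, 10.0, 100.0):
    w = ridge_coefficient_1d(xs, ys, alpha)
    print(f"alpha={alpha:6.1f}  w={w:.4f}")
```

Because alpha only enlarges the denominator, the coefficient gets smaller but stays nonzero, which matches the observation that ridge does not eliminate features.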

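Program

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Generate some random data
n_samples, n_features = 100, 10
X = np.random.randn(n_samples, n_features)
w_true = np.random.randn(n_features)
y = X.dot(w_true) + 0.5*np.random.randn(n_samples)

# Split the data into training and testing sets (80/20)
train_size = int(n_samples * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

# Create the Ridge regression object and fit the model
alpha = 0.1
ridge = Ridge(alpha=alpha)
ridge.fit(X_train, y_train)

# Evaluate on the testing set
y_pred = ridge.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean squared error: {mse:.2f}")
```

Output

```
Mean squared error: 0.36
```

In this example, we generate random data (100 samples and 10 features) and split it 80/20 into training and testing sets by slicing the arrays. Because no random seed is set, the exact error varies from run to run. The features are drawn from a standard normal distribution, so they already share a comparable range and distribution; with real data, you would typically scale the features first (for example with StandardScaler), because the penalty depends on the scale of each coefficient.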

We then create a ridge regression model using scikit-learn’s Ridge class and set the regularization strength with the alpha parameter. A larger alpha results in stronger regularization.

We use the fit method to fit the model to the training data and the predict method to make predictions on the testing data. Finally, we evaluate the model with the mean squared error, which computes the average squared difference between the predicted values and the actual values.

Note that ridge regression does not always improve the performance of a linear regression model, and alternative regularization methods such as Lasso or Elastic Net may be better suited in some circumstances. Moreover, the regularization strength alpha should be tuned with cross-validation to find a value that balances model complexity against generalization performance.
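One way to do this tuning is scikit-learn's `RidgeCV`, which fits the model for each candidate alpha and keeps the one with the best cross-validated score. The sketch below is illustrative only: the alpha grid, the seed, and the 5-fold setting are assumptions, not recommendations from this article.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Synthetic data in the same style as the example above (seeded for repeatability)
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
w_true = rng.standard_normal(10)
y = X @ w_true + 0.5 * rng.standard_normal(100)

# Candidate regularization strengths (an arbitrary, hypothetical grid)
alphas = [0.01, 0.1, 1.0, 10.0, 100.0]

# 5-fold cross-validation over the grid
ridge_cv = RidgeCV(alphas=alphas, cv=5)
ridge_cv.fit(X, y)

print("Selected alpha:", ridge_cv.alpha_)
```

The selected value is available as `ridge_cv.alpha_` after fitting, and the fitted object can be used for prediction like a regular `Ridge` model.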

Lasso Regression

Lasso regression, commonly referred to as L1 regularization, is a method for reducing overfitting in linear regression models by adding a penalty term to the cost function. In contrast to ridge regression, the penalty is the sum of the absolute values of the coefficients rather than the sum of the squared coefficients.

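Lasso regression minimizes the following cost function −

$$J(w) = \sum_{i=1}^{n} \left(y_i - h(y_i)\right)^2 + \alpha \sum_{j=1}^{p} \lvert w_j \rvert$$

where y is the actual value, h(y) denotes the predicted value, w denotes the feature coefficient, and α ≥ 0 is the regularization strength.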

Lasso regression can shrink some coefficients to exactly zero, which in effect performs feature selection. This is especially helpful for high-dimensional datasets in which many features may be unnecessary or redundant. The resulting model is simpler and easier to interpret, and by reducing overfitting, it frequently exhibits improved predictive performance.
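To see why the absolute-value penalty can produce exact zeros, consider again a single feature with no intercept (a simplifying assumption). For the cost Σ(y − w·x)² + α·|w|, the minimizer is a soft-thresholded version of the least-squares solution: w = S(Σxy, α/2) / Σx². The helper names and toy data below are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, returning exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coefficient_1d(xs, ys, alpha):
    """Minimizer of sum((y - w*x)**2) + alpha * abs(w) for one feature."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return soft_threshold(sxy, alpha / 2.0) / sxx


# Hypothetical toy data with only a weak relationship between x and y
xs = [1.0, 2.0, 3.0, 4.0]
ys = [0.3, 0.1, 0.4, 0.2]

for alpha in (0.0, 2.0, 4.0, 6.0):
    print(f"alpha={alpha:4.1f}  w={lasso_coefficient_1d(xs, ys, alpha):.4f}")
```

Once alpha/2 exceeds |Σxy|, the coefficient is set to zero outright, whereas a squared penalty would only make it smaller.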

Program

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error

# Generate some random data
n_samples, n_features = 100, 10
X = np.random.randn(n_samples, n_features)
w_true = np.random.randn(n_features)
y = X.dot(w_true) + 0.5*np.random.randn(n_samples)

# Split the data into training and testing sets
train_size = int(n_samples * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

# Set the regularization strength
alpha = 0.1

# Create the Lasso regression object and fit the model
lasso = Lasso(alpha=alpha)
lasso.fit(X_train, y_train)

# Make predictions on the testing set
y_pred = lasso.predict(X_test)

# Calculate the mean squared error
mse = mean_squared_error(y_test, y_pred)

# Print the mean squared error
print(f"Mean squared error: {mse:.2f}")
```

Output

```
Mean squared error: 0.43
```

In this code, we first generate some random data (100 samples and 10 features). We then split the data 80/20 into training and testing sets. Next, we set the regularization strength to 0.1 and create a Lasso regression object. We use the fit() method to fit the model to the training data and the predict() method to make predictions on the testing data; the mean squared error between the predicted and actual values is calculated using scikit-learn’s mean_squared_error() function. Finally, the mean squared error is printed.

It is worth noting that the Lasso regression model performs feature selection by setting some of the coefficients to zero. This makes it effective when there are many features and we want to find the most important ones for predicting the target variable. However, if we believe that all of the features are relevant for prediction, it may not be the best option; ridge regression may be a better choice in such cases.

Difference between Ridge and Lasso Regression

| Ridge Regression | Lasso Regression |
| --- | --- |
| Shrinks the coefficients toward zero | Encourages some coefficients to be exactly zero |
| Adds a penalty term proportional to the sum of squared coefficients | Adds a penalty term proportional to the sum of absolute values of coefficients |
| Does not eliminate any features | Can eliminate some features |
| Suitable when all features are important | Suitable when some features are irrelevant or redundant |
| More computationally efficient | Less computationally efficient |
| Requires setting a hyperparameter | Requires setting a hyperparameter |
| Performs better when there are many small to medium-sized coefficients | Performs better when there are a few large coefficients |

Conclusion

Ridge and lasso regression are powerful techniques for regularizing linear regression models and preventing overfitting. Both add a penalty term to the cost function, but in different ways: ridge regression shrinks the coefficients toward zero, while lasso regression encourages some of them to be exactly zero. Both can be implemented easily in Python using scikit-learn, which makes them accessible to a wide audience. By understanding and applying ridge and lasso regression, you can improve the performance of your linear regression models and make more accurate predictions on new data.


## Memory Sizes: Gigabytes, Terabytes, And Petabytes Explained

If you’re new to computers (or even if you’re not), the names that get applied to different memory sizes can seem strange.

Whether you’re talking about an 8-megabyte memory card, a 500-gigabyte hard drive, or a 1 terabyte SSD drive, the terms always seem abstract and random.


How exactly do you gauge just how much space a gigabyte, a terabyte, or even a petabyte describes?

What Is a Byte?

To understand how the larger blocks of memory work, it’s important to build an appreciation for the smaller blocks of space that those larger ones are made from.

In simple terms, a single byte is typically eight binary digits. A binary digit is a 1 or a 0, which in very old computers literally represented a switch that was on or off.

There are some computer systems that have bytes of other lengths, but most modern computers are based on an eight-bit byte.

Those eight bits (a byte) usually represent a character like a letter or number. Bytes can also represent symbols that represent one piece of a larger object like an image.

Since a byte is the smallest addressable unit of data, other names are needed for larger units of data made up of many bytes. The important thing to keep in mind is that all the larger units are made up of a fixed number of bytes, and each byte typically contains eight bits.

As you start stacking up more bytes, you can determine the name of the unit based on the number of bytes.

A Kilobyte is 1,024 Bytes

You would think that since the prefix “kilo” typically means 1,000, a kilobyte would have 1,000 bytes.

The reality is that since computers store data using the binary system, and the binary system is based on powers of 2, the actual number of bytes is 1,024.

You can see this when you look at the powers of 2.

2^0 = 1

2^1 = 2

2^2 = 4

2^3 = 8

2^4 = 16

2^5 = 32

2^6 = 64

2^7 = 128

2^8 = 256

2^9 = 512

2^10 = 1024

The first power of 2 that reaches 1,000 is 2^10 = 1,024. Therefore, a kilobyte contains 1,024 bytes.

You can estimate the storage that information would require based on the number of characters in that data. Take a 200-page book as an example. Typically, each page in a book has about 300 words. That means the entire book is about 60,000 words.

An average word is about 6 characters. That means a 60,000-word book has about 360,000 characters.

Storing this book electronically, at one byte per character, would require about 360,000 bytes.

You can represent this in kilobytes (KB) by dividing 360,000 bytes by 1024. This means a 60,000-word book would require about 351.56 kilobytes of digital storage.
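The arithmetic above can be reproduced in a few lines. This is only a sketch of the estimate in the text; the one-byte-per-character figure is an assumption that holds for plain ASCII text.

```python
pages = 200
words_per_page = 300
chars_per_word = 6      # average word length used in the text
bytes_per_char = 1      # assumption: one byte per character

total_words = pages * words_per_page
total_bytes = total_words * chars_per_word * bytes_per_char
kilobytes = total_bytes / 1024   # 1 KB = 1,024 bytes

print(total_words)
print(total_bytes)
print(round(kilobytes, 2))
```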

What is a Gigabyte?

In the metric system, the prefix “Giga” means a unit of measure of 10 to the power of 9, or 1,000,000,000. But remember, to represent this in the computer binary system, it needs to take the binary factor of 2 into account.

So, working up to a gigabyte using powers of 2, we’ll need to go all the way to 2^30 to get the first number over 1 billion, which is 1,073,741,824 bytes.

So far you know that a kilobyte is 1,024 bytes. What about everything between 1,024 and 1,073,741,824 ?

Kilobyte (KB): A thousand bytes, or a kilobyte, is 1,024 bytes.

Megabyte (MB): A million bytes, or a megabyte, is represented as 1,024 kilobytes.

Gigabyte (GB): A billion bytes, or a gigabyte, is represented as 1,024 megabytes.
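The unit ladder above can be written down directly, since each step is a factor of 1,024. The helper name `describe` below is hypothetical and only shows how a raw byte count maps onto these units.

```python
# Binary storage units as used in the text: each step is a factor of 1,024
KB = 2 ** 10          # 1,024 bytes
MB = 1024 * KB        # 2**20 bytes
GB = 1024 * MB        # 2**30 bytes


def describe(num_bytes):
    """Express a byte count in the largest of KB/MB/GB that fits."""
    for name, size in (("GB", GB), ("MB", MB), ("KB", KB)):
        if num_bytes >= size:
            return f"{num_bytes / size:.2f} {name}"
    return f"{num_bytes} bytes"


print(GB)                  # 1073741824, the value of 2^30 quoted above
print(describe(360_000))   # the 60,000-word book from the earlier example
print(describe(5 * GB))
```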

To put the size of a gigabyte into perspective, consider that a single gigabyte can store about 230 music tracks, or almost 600 five-megapixel photographs. You could even store a standard 1.5-hour movie on 1 gigabyte.

What Is a Terabyte?

What is the next power of 10 number greater than a billion? That would be a trillion.

The prefix for a trillion is “tera”. A terabyte is 10 to the power of 12 bytes, represented in binary.

That means 1 terabyte (TB) is 1024 gigabytes. Most modern hard drives store half of this amount of data. A terabyte, a trillion bytes, is a lot of information.

In recent years, manufacturers have started releasing new computers with one- or two-terabyte drives. It would be very difficult for any user to fill up such a hard drive, unless they’re producing many hours of high-definition video every day.

Consider that a standard floppy disk in the 1990s could hold only about 1.44 megabytes. A CD-ROM could store 700 megabytes, and a DVD-ROM could store 4.7 GB. But the hard drives of today can store trillions of bytes. A 1 terabyte drive could store 217 DVD-ROMs’ worth of data. We’ve come a long way.
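The DVD comparison can be checked with a short calculation. This sketch follows the text's binary convention (1 TB = 1,024 GB) and takes the quoted disc capacities at face value; the variable names are illustrative.

```python
GB_PER_TB = 1024        # binary convention used in the text
MB_PER_GB = 1024
DVD_CAPACITY_GB = 4.7   # DVD-ROM capacity quoted above
CD_CAPACITY_MB = 700    # CD-ROM capacity quoted above

# Whole discs needed to match one terabyte
dvds_per_terabyte = int(GB_PER_TB / DVD_CAPACITY_GB)
cds_per_terabyte = (GB_PER_TB * MB_PER_GB) // CD_CAPACITY_MB

print(dvds_per_terabyte)
print(cds_per_terabyte)
```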

What Is a Petabyte?

The next storage unit to consider is what’s known as a petabyte.

The prefix “peta” is the measurement unit for one quadrillion, or 10 to the power of 15.

Since this is 1,000 units of one trillion (tera), then one petabyte is equivalent to 1,024 terabytes. That’s one quadrillion bytes.

You might think this volume of information could never be used. However hard it may be to believe, petabytes of information flow through computer systems and networks today.

Consider the following modern applications of petabyte-sized technology:

Google processes over 24 petabytes of information every day.

Mobile phone networks transmit over 20 petabytes to and from users every day.

The Blue Waters supercomputer has over 500 petabytes of tape storage.

The United States Library of Congress contains over 7 petabytes of digital data in its archives.

World of Warcraft servers require over 1.5 petabytes of storage to run its online game.

The scale of a petabyte is hard to wrap your head around, but once you consider the scenarios above, it becomes quite clear just how much data is involved.

A single petabyte could store over 10,000 hours of television programming. If you filled an entire four-drawer filing cabinet with documents filled with text, you could fit 20 million of those file cabinets into a petabyte.

In fact, you could store every single written manuscript created by humanity since the beginning of recorded history in 50 petabytes.

That’s a lot of data.

Understanding Memory Terminology

It’s important to understand the units of memory because they’re used everywhere there’s technology these days. Any time you buy a computer, a mobile phone, or a tablet, the specifications are all written in terms of memory storage and how much data the technology can transmit.

If you understand all these terms, then you’ll know just how much better one computer is than another. You’ll appreciate how much better a 4G mobile network is than a 3G one. You’ll appreciate how much more you’ll be able to store on a 1 terabyte memory card rather than a 500 megabyte one.

## AI With Python – Supervised Learning: Regression

Regression is one of the most important statistical and machine learning tools. We would not be wrong to say that the journey of machine learning starts from regression. It may be defined as a parametric technique that allows us to make predictions from data by learning the relationship between input and output variables. Here, the output variables, which depend on the input variables, are continuous-valued real numbers. In regression, the relationship between input and output variables matters, and it helps us understand how the value of the output variable changes when an input variable changes. Regression is frequently used for prediction of prices, economics, variations, and so on.
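To make "learning the relationship between input and output variables" concrete, here is a minimal sketch of least-squares fitting for a single input variable, written without any library. The data and the helper name `fit_line` are hypothetical; the scikit-learn regressors used later perform the same kind of fit in a more general form.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a single input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that follows y = 2x + 1 exactly
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print("slope:", slope, "intercept:", intercept)
print("prediction for x = 5:", slope * 5 + intercept)
```

The slope describes how much the output changes for a unit change in the input, which is the relationship the paragraph above refers to.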

Building Regressors in Python

In this section, we will learn how to build single-variable as well as multivariable regressors.

Linear Regressor/Single Variable Regressor

Let us import a few required packages −

```python
import numpy as np
from sklearn import linear_model
import sklearn.metrics as sm
import matplotlib.pyplot as plt
```

Now, we need to provide the input data. We have saved our data in the file named linear.txt.

```python
input = 'D:/ProgramData/linear.txt'
```

We need to load this data by using the np.loadtxt function.

```python
input_data = np.loadtxt(input, delimiter=',')
X, y = input_data[:, :-1], input_data[:, -1]
```

The next step is to train the model. Let us split the data into training and testing samples.

```python
training_samples = int(0.6 * len(X))
testing_samples = len(X) - training_samples

X_train, y_train = X[:training_samples], y[:training_samples]
X_test, y_test = X[training_samples:], y[training_samples:]
```

Now, we need to create a linear regressor object.

```python
reg_linear = linear_model.LinearRegression()
```

Train the object with the training samples.

```python
reg_linear.fit(X_train, y_train)
```

We need to do the prediction with the testing data.

```python
y_test_pred = reg_linear.predict(X_test)
```

Now plot and visualize the data.

```python
plt.scatter(X_test, y_test, color = 'red')
plt.plot(X_test, y_test_pred, color = 'black', linewidth = 2)
plt.xticks(())
plt.yticks(())
plt.show()
```

The output is a scatter plot of the test points (red) with the fitted regression line (black).

Now, we can compute the performance of our linear regression as follows −

```python
print("Performance of Linear regressor:")
print("Mean absolute error =", round(sm.mean_absolute_error(y_test, y_test_pred), 2))
print("Mean squared error =", round(sm.mean_squared_error(y_test, y_test_pred), 2))
print("Median absolute error =", round(sm.median_absolute_error(y_test, y_test_pred), 2))
print("Explain variance score =", round(sm.explained_variance_score(y_test, y_test_pred), 2))
print("R2 score =", round(sm.r2_score(y_test, y_test_pred), 2))
```

Output

Performance of Linear Regressor −

```
Mean absolute error = 1.78
Mean squared error = 3.89
Median absolute error = 2.01
Explain variance score = -0.09
R2 score = -0.09
```

In the above code, we have used the small dataset shown below. If you want a bigger dataset, you can import one from sklearn.datasets.

```
2,4.8
2.9,4.7
2.5,5
3.2,5.5
6,5
7.6,4
3.2,0.9
2.9,1.9
2.4,3.5
0.5,3.4
1,4
0.9,5.9
1.2,2.58
3.2,5.6
5.1,1.5
4.5,1.2
2.3,6.3
2.1,2.8
```

Multivariable Regressor

First, let us import a few required packages −

```python
import numpy as np
from sklearn import linear_model
import sklearn.metrics as sm
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
```

Now, we need to provide the input data. We have saved our data in the file named Mul_linear.txt.

```python
input = 'D:/ProgramData/Mul_linear.txt'
```

We will load this data by using the np.loadtxt function.

```python
input_data = np.loadtxt(input, delimiter=',')
X, y = input_data[:, :-1], input_data[:, -1]
```

The next step is to train the model; we will split the data into training and testing samples.

```python
training_samples = int(0.6 * len(X))
testing_samples = len(X) - training_samples

X_train, y_train = X[:training_samples], y[:training_samples]
X_test, y_test = X[training_samples:], y[training_samples:]
```

Now, we need to create a linear regressor object.

```python
reg_linear_mul = linear_model.LinearRegression()
```

Train the object with the training samples.

```python
reg_linear_mul.fit(X_train, y_train)
```

Finally, we need to do the prediction with the testing data.

```python
y_test_pred = reg_linear_mul.predict(X_test)

print("Performance of Linear regressor:")
print("Mean absolute error =", round(sm.mean_absolute_error(y_test, y_test_pred), 2))
print("Mean squared error =", round(sm.mean_squared_error(y_test, y_test_pred), 2))
print("Median absolute error =", round(sm.median_absolute_error(y_test, y_test_pred), 2))
print("Explain variance score =", round(sm.explained_variance_score(y_test, y_test_pred), 2))
print("R2 score =", round(sm.r2_score(y_test, y_test_pred), 2))
```

Output

Performance of Linear Regressor −

```
Mean absolute error = 0.6
Mean squared error = 0.65
Median absolute error = 0.41
Explain variance score = 0.34
R2 score = 0.33
```

Now, we will create a polynomial of degree 10 and train the regressor. We will also provide a sample data point.

```python
polynomial = PolynomialFeatures(degree = 10)
X_train_transformed = polynomial.fit_transform(X_train)
datapoint = [[2.23, 1.35, 1.12]]
# Reuse the already fitted transformer for the new data point
poly_datapoint = polynomial.transform(datapoint)

poly_linear_model = linear_model.LinearRegression()
poly_linear_model.fit(X_train_transformed, y_train)
print("\nLinear regression:\n", reg_linear_mul.predict(datapoint))
print("\nPolynomial regression:\n", poly_linear_model.predict(poly_datapoint))
```

Output

Linear regression −

```
[2.40170462]
```

Polynomial regression −

```
[1.8697225]
```

In the above code, we have used the small dataset shown below. If you want a bigger dataset, you can import one from sklearn.datasets.

```
2,4.8,1.2,3.2
2.9,4.7,1.5,3.6
2.5,5,2.8,2
3.2,5.5,3.5,2.1
6,5,2,3.2
7.6,4,1.2,3.2
3.2,0.9,2.3,1.4
2.9,1.9,2.3,1.2
2.4,3.5,2.8,3.6
0.5,3.4,1.8,2.9
1,4,3,2.5
0.9,5.9,5.6,0.8
1.2,2.58,3.45,1.23
3.2,5.6,2,3.2
5.1,1.5,1.2,1.3
4.5,1.2,4.1,2.3
2.3,6.3,2.5,3.2
2.1,2.8,1.2,3.6
```

## Machine Learning Using C++: A Beginner’s Guide To Linear And Logistic Regression

Why C++ for Machine Learning?

The applications of machine learning transcend boundaries and industries so why should we let tools and languages hold us back? Yes, Python is the language of choice in the industry right now but a lot of us come from a background where Python isn’t taught!

The computer science faculty in universities are still teaching programming in C++ – so that’s what most of us end up learning first. I understand why you should learn Python – it’s the primary language in the industry and it has all the libraries you need to get started with machine learning.

But what if your university doesn’t teach it? Well – that’s what inspired me to dig deeper and use C++ for building machine learning algorithms. So if you’re a college student, a fresher in the industry, or someone who’s just curious about picking up a different language for machine learning – this tutorial is for you!

In this first article of my series on machine learning using C++, we will start with the basics. We’ll understand how to implement linear regression and logistic regression using C++!

Let’s begin!

Note: If you’re a beginner in machine learning, I recommend taking the comprehensive Applied Machine Learning course.

Linear Regression using C++

Let’s first get a brief idea about what linear regression is and how it works before we implement it using C++.

Linear regression models are used to predict the value of one factor based on the value of another factor. The value being predicted is called the dependent variable and the value that is used to predict the dependent variable is called an independent variable. The mathematical equation of linear regression is:

Y = B0 + B1 * X

Here,

X: Independent variable

Y: Dependent variable

B0: Represents the value of Y when X=0

B1: Regression Coefficient (this represents the change in the dependent variable based on the unit change in the independent variable)

For example, we can use linear regression to understand whether cigarette consumption can be predicted based on smoking duration. Here, your dependent variable would be “cigarette consumption”, measured in terms of the number of cigarettes consumed daily, and your independent variable would be “smoking duration”, measured in days.

Loss Function

The loss is the error in our predicted value of B0 and B1. Our goal is to minimize this error to obtain the most accurate value of B0 and B1 so that we can get the best fit line for future predictions.

For simplicity, we will use the below loss function:

e(i) = p(i) - y(i)

Here,

e(i) : error of ith training example

p(i) : predicted value of ith training example

y(i): actual value of ith training example

Overview of the Gradient Descent Algorithm

Gradient descent is an iterative optimization algorithm to find the minimum of a function. In our case here, that function is our Loss Function.

Here, our goal is to find the minimum value of the loss function (that is quite close to zero in our case). Gradient descent is an effective algorithm to achieve this. We start with initial values of our coefficients B0 and B1 and, based on the error on each instance, update their values.

Here’s how it works:

Initially, let B1 = 0 and B0 = 0. Let L be our learning rate. This controls how much the values of B0 and B1 change with each step. L could be a small value like 0.01 for good accuracy.

We calculate the error for the first point: e(1) = p(1) – y(1)

We’ll update B0 and B1 according to the following equation:

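b0(t+1) = b0(t) - L * error

b1(t+1) = b1(t) - L * error * x

Here, x is the input value of the current instance. B0 has no input of its own, so its update uses the error alone. We’ll do this for each instance of the training set. This completes one epoch. We’ll repeat this for more epochs to get more accurate predictions.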
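The update steps above can be sketched as follows. This is written in Python rather than C++, purely as an illustration of the per-instance update; the training data is hypothetical (it is not the dataset used in this tutorial), and the function names are made up for this sketch.

```python
def train_linear_sgd(xs, ys, learning_rate=0.01, epochs=4):
    """Per-instance gradient descent for y = b0 + b1 * x, starting from zero."""
    b0, b1 = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            error = (b0 + b1 * x) - y          # e(i) = p(i) - y(i)
            b0 -= learning_rate * error        # bias: input is implicitly 1.0
            b1 -= learning_rate * error * x    # slope: scaled by the input x
    return b0, b1


def mean_squared_error(xs, ys, b0, b1):
    """Average squared prediction error for the given coefficients."""
    return sum(((b0 + b1 * x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training data, roughly y = x
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.1, 3.9, 5.1]

b0, b1 = train_linear_sgd(xs, ys)
print("b0 =", b0, "b1 =", b1)
print("MSE before training:", mean_squared_error(xs, ys, 0.0, 0.0))
print("MSE after training: ", mean_squared_error(xs, ys, b0, b1))
```

With 5 instances and 4 epochs the inner update runs 20 times; raising `epochs` moves the coefficients closer to the least-squares fit.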


Implementing Linear Regression in C++

Initialization phase:

We’ll start by defining our dataset. For the scope of this tutorial, we’ll use this dataset:

We’ll train our model on the first 5 values and test on the last value:

View the code on Gist.

Next, we’ll define our variables:

View the code on Gist.

Training Phase

Our next step is the gradient descent algorithm:

View the code on Gist.

Since there are 5 values and we run the whole algorithm for 4 epochs, our iterative function runs 20 times. The variable p holds the predicted value of each instance, and the variable err holds the error of each instance. We then update the values of b0 and b1 as explained in the gradient descent section above. Finally, we push the error into the error vector.

As you will notice, B0 does not have any input. This coefficient is often called the bias or the intercept and we can assume it always has an input value of 1.0. This assumption can help when implementing the algorithm using vectors or arrays.

Finally, we’ll sort the error vector to get the minimum value of error and corresponding values of b0 and b1. At last, we’ll print the values:

View the code on Gist.

Testing Phase:

View the code on Gist.

We’ll enter the test value which is 6. The answer we get is 4.9753 which is quite close to 5. Congratulations! We just completed building a linear regression model with C++, and that too with good parameters.

Full Code for final implementation:

View the code on Gist.

Logistic Regression with C++

Logistic Regression is one of the most famous machine learning algorithms for binary classification. This is because it is a simple algorithm that performs very well on a wide range of problems.

The name of this algorithm is logistic regression because of the logistic function that we use in this algorithm. This logistic function is defined as:

predicted = 1 / (1 + e^-x)

Gradient Descent for Logistic Regression

We can apply stochastic gradient descent to the problem of finding the coefficients for the logistic regression model as follows:

Let us suppose for the example dataset, the logistic regression has three coefficients just like linear regression:

output = b0 + b1*x1 + b2*x2

The job of the learning algorithm will be to discover the best values for the coefficients (b0, b1, and b2) based on the training data.

Given each training instance:

Calculate a prediction using the current values of the coefficients. prediction = 1 / (1 + e^(-(b0 + b1*x1 + b2*x2)))

Calculate new coefficient values based on the error in the prediction. The values are updated according to the below equation: b = b + alpha * (y - prediction) * prediction * (1 - prediction) * x

Here, b is the coefficient we are updating, x is the input value associated with that coefficient, and prediction is the output of the model for the current instance. Alpha is a parameter that you must specify at the beginning of the training run. This is the learning rate, and it controls how much the coefficients (and therefore the model) change each time they are updated.

Like we saw earlier when talking about linear regression, B0 does not have any input. This coefficient is called the bias or the intercept and we can assume it always has an input value of 1.0. So while updating, we’ll multiply with 1.0.

The process is repeated until the model is accurate enough (e.g. error drops to some desirable level) or for a fixed number of iterations.
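The procedure described above can be sketched as follows, in Python rather than C++ and purely for illustration. The dataset is hypothetical (it is not the one used in this tutorial), and the learning rate of 0.3 and the 10 epochs are assumed values.

```python
import math


def predict(b, x1, x2):
    """Logistic prediction 1 / (1 + e^-(b0 + b1*x1 + b2*x2))."""
    return 1.0 / (1.0 + math.exp(-(b[0] + b[1] * x1 + b[2] * x2)))


def train_logistic_sgd(rows, alpha=0.3, epochs=10):
    """Apply the update rule from the text once per training instance."""
    b = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x1, x2, y in rows:
            p = predict(b, x1, x2)
            step = alpha * (y - p) * p * (1.0 - p)
            b[0] += step * 1.0   # bias: input is assumed to be 1.0
            b[1] += step * x1
            b[2] += step * x2
    return b


# Hypothetical, linearly separable data: (x1, x2, class)
rows = [
    (-2.0, -1.0, 0), (-1.5, -2.0, 0), (-1.0, -1.5, 0),
    (1.0, 1.5, 1), (2.0, 1.0, 1), (1.5, 2.0, 1),
]

b = train_logistic_sgd(rows)
for x1, x2 in ((1.5, 1.5), (-1.5, -1.5)):
    p = predict(b, x1, x2)
    print(x1, x2, "-> class", 1 if p >= 0.5 else 0)
```

A prediction of 0.5 or more is mapped to class 1 and anything below to class 0, which is the usual threshold for a binary classifier of this kind.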

For a beginner’s guide to logistic regression, check this out – Simple Guide to Logistic Regression.

Implementing Logistic Regression in C++

Initialization phase

We’ll start by defining the dataset:

We’ll train on the first 10 values and test on the last value:

View the code on Gist.

Next, we’ll initialize the variables:

View the code on Gist.

Training Phase

View the code on Gist.

Since there are 10 values, we’ll run one epoch that takes 10 steps. We’ll calculate the predicted value according to the equation as described above in the gradient descent section:

prediction = 1 / (1 + e^(-(b0 + b1*x1 + b2*x2)))

Next, we’ll update the variables according to the update equation described above:

b = b + alpha * (y - prediction) * prediction * (1 - prediction) * x

Finally, we’ll sort the error vector to get the minimum value of error and the corresponding values of b0, b1, and b2, and then print the values:

View the code on Gist.

Testing phase:

View the code on Gist.

When we enter x1=7.673756466 and x2= 3.508563011, we get pred = 0.59985. So finally we’ll print the class:

View the code on Gist.

So the class printed by the model is 1. Yes! We got the prediction right!

Final Code for full implementation

View the code on Gist.

One of the more important steps, in order to learn machine learning, is to implement algorithms from scratch. The simple truth is that if we are not familiar with the basics of the algorithm, we can’t implement that in C++.


## 500 Internal Server Error On Youtube Explained

YouTube, the world’s top video-sharing website, occasionally goes down around the globe. At such times, regular YouTube users complain of getting a persistent 500 Internal Server Error.

When this happens, users typically face the 500 Internal Server Error for an hour or two, which stops them from accessing their favorite videos. During one outage, users were so annoyed by the error that they tweeted about it at a rate of two tweets per minute, all asking about the 500 Internal Server Error and nothing else.

YouTube faces various problems from time to time, and a major one among them is piracy. Some experts say that this error once occurred because YouTube was trying to implement new security features to stop piracy and other threats to the site. These changes might include stopping users from downloading videos, removing copyrighted material, and so on. Rolling them out could cause server downtime resulting in a 500 Internal Server Error.

YouTube 500 Internal Server Error

The 500 Internal Server Error, which annoys YouTube users, is a general error response. The server returns it when it cannot route a request, cannot perform the requested task, or cannot identify the root cause of the problem.

Here is some more info on 500 internal server errors:

The Web server (running the Web Site) encountered an unexpected condition that prevented it from fulfilling the request by the client (e.g. your Web browser) for access to the requested URL.

This is a ‘catch-all’ error generated by the Web server. Basically, something has gone wrong, but the server can not be more specific about the error condition in its response to the client. In addition to the 500 error notified back to the client, the Web server should generate some kind of internal error log which gives more details of what went wrong. It is up to the operators of the Web server site to locate and analyze these logs.

500 errors in the HTTP cycle:Any client (e.g. your Web browser) goes through the following cycle when it communicates with the Web server:

Obtain an IP address for the site from its host name (a DNS lookup).

Open an IP socket connection to that IP address.

Write an HTTP data stream through that socket.

Receive an HTTP data stream back from the Web server in response. This data stream contains status codes whose values are determined by the HTTP protocol. Parse this data stream for status codes and other useful information.

This error occurs in the final step above when the client receives an HTTP status code that it recognizes as ‘500’.
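The cycle above can be sketched with Python's standard `socket` module. This is an illustration under stated assumptions rather than how any particular browser works: the helper names `build_request`, `parse_status`, and `fetch_status` are hypothetical, the request is a bare HTTP/1.1 `GET` over plain port 80, and the host name lookup is included because the socket step needs an IP address to connect to.

```python
import socket


def build_request(host, path="/"):
    """Build the HTTP data stream the client writes through the socket."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def parse_status(raw):
    """Parse the status code and reason phrase from a raw HTTP response."""
    status_line = raw.split(b"\r\n", 1)[0].decode("iso-8859-1")
    _version, code, *reason = status_line.split(" ", 2)
    return int(code), (reason[0] if reason else "")


def fetch_status(host, port=80, path="/", timeout=5.0):
    """Run the whole cycle against a live server (needs network access)."""
    ip = socket.gethostbyname(host)               # host name -> IP address
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))   # write the request
        chunks = []
        while True:                               # read the response stream
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_status(b"".join(chunks))         # parse for the status code
```

A client that gets back a code of `500` from `parse_status` has reached the point in the cycle where the error becomes visible; nothing earlier in the cycle reveals it.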

Fixing 500 internal server errors

This error can only be resolved by fixes to the Web server software. It is not a client-side problem. It is up to the operators of the Web server site to locate and analyze the logs which should give further information about the error.

List of 5xx server errors:

500 Internal Server Error: A generic error message, given when no more specific message is suitable.

501 Not Implemented: The server either does not recognize the request method, or it lacks the ability to fulfill the request.

502 Bad Gateway: The server was acting as a gateway or proxy and received an invalid response from the upstream server.

503 Service Unavailable: The server is currently unavailable (because it is overloaded or down for maintenance). Generally, this is a temporary state.

504 Gateway Timeout: The server was acting as a gateway or proxy and did not receive a timely response from the upstream server.

505 HTTP Version Not Supported: The server does not support the HTTP protocol version used in the request.

506 Variant Also Negotiates (RFC 2295): Transparent content negotiation for the request results in a circular reference.

507 Insufficient Storage (WebDAV) (RFC 4918)

509 Bandwidth Limit Exceeded (Apache bw/limited extension): This status code, while used by many servers, is not specified in any RFCs.

510 Not Extended (RFC 2774): Further extensions to the request are required for the server to fulfill it.

530 User access denied.
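Several of the codes in this list can be looked up programmatically. The sketch below uses Python's standard `http.HTTPStatus` enum; the helper name `describe_5xx` and its fallback text are hypothetical. Codes that the enum does not know, such as 509 and 530 from the list above, raise `ValueError` on lookup, so the helper reports them as unregistered instead.

```python
from http import HTTPStatus


def describe_5xx(code):
    """Return the standard reason phrase for a 5xx status code, if known."""
    if not 500 <= code <= 599:
        raise ValueError(f"{code} is not a 5xx server error code")
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        # Not in Python's HTTPStatus enum (e.g. vendor-specific codes).
        return "Unregistered 5xx code"


if __name__ == "__main__":
    for code in (500, 502, 503, 504, 509, 530):
        print(code, describe_5xx(code))
```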


## How To Watch Discovery Plus On Fire TV: Download And Casting Methods Explained!

The good news is that Discovery Plus is available on a long list of devices. You are not limited to the screen of your computer or iOS/Android device; you can stream Discovery Plus on your television. If you do not have a smart television, you can still get the service with Amazon Fire TV. All Fire TV Sticks, the Fire TV Cube, and Fire TV Edition televisions support Discovery Plus. Here is how you can easily install and stream the new service on Amazon Fire TV.

To play content from Discovery Plus, you first need to subscribe to it, because it is a premium service that you pay for (it is free in one case). Skip this part if you are already subscribed to Discovery Plus.

To subscribe, choose the plan that you want and then create your account. If you already have an account, enter your email and password to sign in.

For now, Discovery Plus is available on Amazon’s Fire TV, and there are plans for it to become available as an Amazon Prime Video Channel as well. The date of availability has not been announced yet. We just have to wait and watch for updates.

For now, all Amazon Fire TV users can stream Discovery network content using the new Discovery Plus app.

How to watch Discovery Plus on Amazon Fire TV or Fire TV Stick

There are basically two ways to watch Discovery Plus on your Fire TV or Fire TV Stick. The first is to download the Discovery Plus app directly on your device; the second is to cast content from the Discovery Plus app or website on your mobile phone or PC to your device. Let’s explore both options.

Method #1: Download and install the app to watch Discovery Plus

Here is how you can get Fire TV to stream Discovery Plus content:

Option 1.1: Download on Amazon Fire TV device

First, find the Discovery Plus app on your television. Navigate to the Search section and type “Discovery Plus” using the onscreen keyboard.

Once you select Discovery Plus you will be taken to the app’s home page.

Alternatively, you can do a voice search. Hold down the microphone button and say “Discovery Plus”. This too will take you to the app. From there you can download the app.

Once the app has downloaded and installed you can open it.

Option 1.2: Get the app for your Fire TV from Amazon

Aside from getting the app directly on your Fire TV device, you can also add the app from Amazon.

Go to the Amazon website and search for Discovery Plus.

On the right side of the screen, you can choose the device you want to get the app on.

After selecting the device you can get the app delivered to it.

When you boot your Amazon Fire TV device the app will automatically get installed.

1.3 Play content on your Fire TV device

Open the Discovery Plus app on your Fire TV device now. Sign in if you have not done so already, and you are ready to play content on the device. That’s all.

Method #2: Cast content to Fire TV from your mobile app or web on PC

Make sure you have set up your Fire TV or Fire TV Stick, and that both your phone/PC and your Fire TV device are connected to the same network.

Now, open the Discovery Plus app on your phone, or visit the Discovery Plus website on PC/Mac, and play any video. Then tap or click the cast icon on your phone/PC.

Now, select the Fire TV device.

Once you do that, the video will start playing from your phone or PC on your Fire TV. Done.
