Machine Learning In Cyber Security — Malicious Software Installation

Updated December 2023.


Monitoring user activities performed by local administrators is always a challenge for SOC analysts and security professionals. Most security frameworks recommend implementing a whitelist mechanism.

However, the real world is often not ideal. There will always be developers or users with local administrator rights who can bypass the specified controls. Is there a way to monitor local administrator activities?

Let’s talk about the data source

An example of what the dataset looks like; the 3 entries listed above refer to the same software

We have a regular batch job that retrieves the software installed on each workstation, and the workstations are located in different regions. Most of the installed software is displayed in the local language (it could be Japanese, French, Dutch, and so on). So you can end up with the same whitelisted software appearing under 7 different names. On top of that, we have thousands of devices.

Attributes of the dataset

    Hostname — The hostname of the devices

    Publisher Name — the software publisher

    Software Name — Software Name in Local Language and different versions number

    Is there a way we could identify non-standard installations?

    My idea is that legitimate software used in the company should have more than one installation, and the software names may differ across those installations (for example, by language or version). In such a case, I believe it will be effective to use machine learning to help a user classify the software and highlight any outliers.
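As a point of comparison, the "more than one installation" idea can be sketched without any machine learning. The example below is only a baseline under stated assumptions: the inventory rows, the `normalize` helper, and the rule of cutting the name at the first parenthesis are all hypothetical and are not part of the original workflow.

```python
# Hypothetical inventory rows: (hostname, publisher, software name).
inventory = [
    ("host-01", "Mozilla", "Mozilla Firefox (x64 en-US)"),
    ("host-02", "Mozilla", "Mozilla Firefox (x64 ja)"),
    ("host-03", "Mozilla", "Mozilla Firefox (x64 fr)"),
    ("host-02", "Unknown", "FreeScreenRecorder"),
]

def normalize(name):
    """Crude normalization: lowercase and keep the text before the first '('."""
    return name.lower().split("(")[0].strip()

def single_install_software(rows):
    """Return normalized software names that appear on exactly one host."""
    hosts_per_name = {}
    for hostname, _publisher, name in rows:
        hosts_per_name.setdefault(normalize(name), set()).add(hostname)
    return sorted(n for n, hosts in hosts_per_name.items() if len(hosts) == 1)

print(single_install_software(inventory))
```

A rule like this breaks down as soon as the same product is listed under fully localized names, which is the motivation for learning character-level features instead.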

    Char processing using Term Frequency — Inverse Document Frequency (TF-IDF)

    TF-IDF is a statistical measure that evaluates how relevant a word is to a document in a collection of documents. This is done by multiplying two metrics: how many times a word appears in a document, and the inverse document frequency of the word across a set of documents.
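To make the multiplication of the two metrics concrete, here is a minimal character-level sketch using only the standard library. It uses the textbook form (term frequency times log(N / document frequency)); scikit-learn's `TfidfVectorizer` applies smoothing and L2 normalization by default, so its numbers will differ. The function name `char_tfidf` and the sample names are made up for illustration.

```python
import math
from collections import Counter

def char_tfidf(names):
    """Textbook TF-IDF per character: tf = count / len(name), idf = log(N / df)."""
    n_docs = len(names)
    # Document frequency: number of names that contain each character.
    df = Counter(ch for name in names for ch in set(name))
    rows = []
    for name in names:
        counts = Counter(name)
        rows.append({ch: (c / len(name)) * math.log(n_docs / df[ch])
                     for ch, c in counts.items()})
    return rows

names = ["chrome", "chromium", "zip"]   # hypothetical software names
for name, row in zip(names, char_tfidf(names)):
    top = max(row, key=row.get)
    print(name, "->", top, round(row[top], 3))
```

Characters that occur in every name get a weight of zero, while characters that are rare across the collection stand out.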

    The script below shows how I apply TF-IDF to the software name field in my dataset.

    import pandas as pd
    from sklearn.feature_extraction.text import TfidfVectorizer

    # Import the dataset
    df = pd.read_csv("your dataset")

    # Extract the software name column
    field_extracted = df['softwarename']

    # Initialize the TF-IDF vectorizer at character level
    vectorizer = TfidfVectorizer(analyzer='char')
    vectors = vectorizer.fit_transform(field_extracted)
    # get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
    feature_names = vectorizer.get_feature_names_out()
    dense = vectors.todense()
    denselist = dense.tolist()
    result = pd.DataFrame(denselist, columns=feature_names)

    A snippet of the result:

    The result from the TF-IDF script above (with a mix of different languages, e.g. Korean, Chinese)

    In the above diagram, you can see that a calculation is performed to evaluate how "important" each character is in the software name. This can also be interpreted as how much of each character is present in each software name. In this way, you can describe the characteristics of each software name statistically, and these features can be fed into the machine learning model of your choice.

    Other features I extracted that I believe will also be meaningful to the model:

      The entropy of the Software Name

      import math
      from collections import Counter

      # Function for calculating entropy
      def eta(data, unit='natural'):
          base = {
              'shannon': 2.,
              'natural': math.exp(1),
              'hartley': 10.
          }

          if len(data) <= 1:
              return 0

          counts = Counter()
          for d in data:
              counts[d] += 1

          ent = 0
          probs = [float(c) / len(data) for c in counts.values()]
          for p in probs:
              ent -= p * math.log(p, base[unit])

          return ent

      entropy = [eta(x) for x in field_extracted]

        Space Ratio — How many spaces the software name has

        Vowel Ratio — How many vowels (aeiou) the software name has
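The two ratio features can be sketched as follows. The article does not show how they were computed, so this assumes each ratio is the count divided by the length of the name; the function names are hypothetical.

```python
def space_ratio(name):
    """Fraction of characters in the name that are spaces (assumed definition)."""
    return name.count(" ") / len(name) if name else 0.0

def vowel_ratio(name):
    """Fraction of characters in the name that are the vowels a, e, i, o, u."""
    vowels = set("aeiou")
    return sum(ch in vowels for ch in name.lower()) / len(name) if name else 0.0

for name in ["Adobe Acrobat Reader", "xkzqv32"]:   # made-up software names
    print(name, round(space_ratio(name), 3), round(vowel_ratio(name), 3))
```

Names in non-Latin scripts will have a vowel ratio of zero under this definition, which may itself be a useful signal or a source of bias depending on the fleet.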

        Finally, I ran the features listed above, together with labels, through a random forest classifier. You can use any classifier of your choice as long as it gives you satisfactory results.

        Thanks for reading!

        About the Author

        Elaine Hung

        Elaine is a machine learning enthusiast, digital forensic and incident response consultant. Interested in applying ML and NLP on cyber security topics.



        Fraud Detection In Machine Learning

        Fraud Detection with Machine Learning is possible because models can learn from past fraud data to recognize patterns and predict the legitimacy of future transactions. In most cases, it is more effective than humans due to the speed and efficiency of information processing. Some types of internet fraud are:

        1. ID forgery. Nowadays IDs are fabricated so well that it is almost impossible for humans to verify their legitimacy and prevent identity fraud. Through the use of AI, various features of the ID card's appearance can be analysed to assess the authenticity of the document. This allows companies to establish their own security criteria for requests that require certain ID documents.

        2. Bank loan scams. These may happen if a person contacts you and offers a loan scheme with suspiciously favourable conditions. The person contacting you will ask for your bank details or for payment upfront, without providing any proper company information, or while using an international contact number. Such fraud can be handled by AI using previous loan application records to filter out loan defaulters.

        3. Credit card fraud. This is the most common type of payment fraud, because card details are stored online, which makes them easier for criminals and hackers to access. Cards sent through the mail can also be easily intercepted. One way to filter such fraudulent transactions using machine learning is discussed below.

        4. Identity theft. Machine Learning for detecting identity theft helps check valuable identity documents such as passports, PAN cards, or driver's licenses in real time. Moreover, biometric information is sometimes required to improve security even more. These security methods need in-person authentication, which decreases the chance of fraud to a great extent.

        Model to predict fraud using credit card data:

        Here a very famous Kaggle dataset is used to demonstrate how fraud detection works using a simple neural network model. Imports:

        import pandas as pd
        import numpy as np
        import tensorflow as tf
        import keras
        from sklearn.preprocessing import StandardScaler
        from keras.models import Sequential
        from keras.layers import Dense
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import classification_report

          Have a look at the dataset

        data = pd.read_csv('creditcard.csv')
        # Standardize the transaction amount and drop the raw Amount and Time columns
        data['Amount_norm'] = StandardScaler().fit_transform(data['Amount'].values.reshape(-1, 1))
        data = data.drop(['Amount'], axis=1)
        data = data.drop(['Time'], axis=1)
        data = data[:-1]

          After some data cleaning, our dataset contains a total of 29 features (V1 to V28 plus the normalized amount) and one target, all having float values which are not empty. Our target is the Class column, which indicates whether a particular credit card transaction is fraudulent or not. The dataset is divided accordingly into train and test sets, keeping the usual 80:20 split ratio. (random_state is fixed to help you reproduce the split.)

        X = data.iloc[:, data.columns != 'Class']
        y = data.iloc[:, data.columns == 'Class']

        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

          We use the sequential model from keras library to build a neural network with 3 dense layers. The output layer contains only a single neuron which will use the sigmoid function to result in either a positive class or a negative class. The model is then compiled with adam optimizer, though it is highly suggested that you try out different values of hyper parameters by yourself, such as the number of units in each layer, activation, optimizer, etc. to see what works best for a given dataset.

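        model = Sequential()
        model.add(Dense(units=16, activation='relu', input_dim=29))
        model.add(Dense(units=16, activation='relu'))
        model.add(Dense(units=1, activation='sigmoid'))

        # The compile call was lost from the original listing. The adam optimizer is stated
        # in the text; the binary cross-entropy loss and accuracy metric are assumed here.
        model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
        model.fit(X_train, y_train, batch_size=32, epochs=15)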

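          This is the result after running the model for a few epochs. We see that the model reaches 99.97% accuracy very quickly. Because fraudulent transactions are only a tiny fraction of this dataset, accuracy alone can be misleading, so the per-class precision and recall are worth checking as well. Below, y_pred contains the predictions made by our model on the test data. The sigmoid output is a probability, so it is thresholded at 0.5 to obtain class labels before the summary of the performance is printed.

        y_pred = (model.predict(X_test) > 0.5).astype(int)
        print(classification_report(y_test, y_pred))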


        In this way, we were able to build a highly accurate model to identify fraudulent transactions. Such models come in very handy for risk management purposes.
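The per-class numbers in a classification report come from counting true and false positives and negatives. The sketch below, with made-up labels and probabilities rather than results from the dataset, shows how predicted probabilities are turned into labels and why accuracy can look excellent on imbalanced data even when no fraud is caught. The helper names are hypothetical.

```python
def threshold(probabilities, cutoff=0.5):
    """Turn predicted probabilities into 0/1 class labels."""
    return [1 if p >= cutoff else 0 for p in probabilities]

def precision_recall_accuracy(y_true, y_pred):
    """Compute precision and recall for the positive (fraud) class, plus accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs)
    return precision, recall, accuracy

# Made-up example: 2 fraud cases among 1000 transactions, all scored below the cutoff.
y_true = [1, 1] + [0] * 998
y_prob = [0.2, 0.4] + [0.01] * 998
y_pred = threshold(y_prob)
print(precision_recall_accuracy(y_true, y_pred))
```

In this toy case the accuracy is very high although recall for the fraud class is zero, which is why the precision and recall columns deserve attention.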


        Maximum Likelihood In Machine Learning


        In this article, we will discuss the likelihood function, the core idea behind that, and how it works with code examples. This will help one to understand the concept better and apply the same when needed.

        Let us dive into the likelihood first to understand the maximum likelihood estimation.

        What is the Likelihood?

        In machine learning, the likelihood measures how well a model, with a particular choice of parameters, explains the observed data. In simple words, as the name suggests, the likelihood is a function that tells us how likely the observed data points are under the assumed data distribution.

        For example, suppose there are two candidate parameter settings for a model. If the likelihood of the observed data is greater under the first setting than under the second, the first setting explains the data better and is the one we would prefer.

        After this discussion, a natural question may come to mind: if the likelihood function works like the probability function, then what is the difference?

        Difference Between Probability and Likelihood

        Although the working and intuition of probability and likelihood appear to be the same, there is a subtle difference. The likelihood treats the observed data as fixed and is a function of the model parameters: it tells us how well each choice of parameters fits the data.

        Probability, in simple words, describes the chance of some event happening under a fixed model or set of conditions; when it is expressed with respect to other events or conditions, it is known as conditional probability.

        Also, the probabilities of all possible outcomes sum to one and no single probability can exceed one, whereas a likelihood does not have to sum to one over the parameters and, for continuous data, can be greater than one.
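A quick way to see that a likelihood can exceed one is to evaluate a normal density with a small standard deviation. The values below are chosen purely for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Likelihood of the parameters (mu=0, sigma=0.1) for a single observation x=0.
density = normal_pdf(0.0, mu=0.0, sigma=0.1)
print(density)   # a density value, not a probability
```

The printed value is larger than one because a density only has to integrate to one over all possible x; it is not itself a probability.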

        What is Maximum Likelihood Estimation?

        After discussing the intuition of the likelihood function, it is clear that a higher likelihood is desirable, because it means the model fits the observed data better. The term maximum likelihood estimation therefore refers to choosing the model parameters that maximize the likelihood function.
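As a minimal sketch of "maximizing the likelihood function", consider estimating the success probability of a coin-like process from 0/1 observations. The data and the grid of candidate values are hypothetical, and a grid search is used only because it is easy to read; in this simple case the maximum can also be found analytically as the fraction of successes.

```python
import math

def bernoulli_log_likelihood(theta, observations):
    """Log-likelihood of success probability theta given 0/1 observations."""
    return sum(math.log(theta if x == 1 else 1.0 - theta) for x in observations)

observations = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]   # hypothetical: 7 successes in 10 trials
grid = [i / 100 for i in range(1, 100)]           # candidate values 0.01 .. 0.99

# The maximum likelihood estimate is the candidate with the highest log-likelihood.
theta_hat = max(grid, key=lambda t: bernoulli_log_likelihood(t, observations))
print(theta_hat)
```

The logarithm is used because it turns the product of per-observation probabilities into a sum without changing where the maximum lies.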

        Let us try to understand the same with an example.

        Let us suppose that we have a classification dataset in which the independent column is the marks that students achieved in a particular exam, and the target or dependent column is categorical, with Yes and No values representing whether or not a student was placed in the campus placements.

        Now, if we try to solve this problem with maximum likelihood estimation, the model first assigns to every data point a probability of its observed label, given the current parameters. The algorithm then adjusts the parameters to increase the combined likelihood of all the data points, which in a two-dimensional plot corresponds to moving the line that divides the dataset into two parts. The best-fit line is reached after some iterations (epochs), and once it is found, a new data point is classified according to where it falls relative to the line.

        Maximum Likelihood: The Base

        Maximum likelihood estimation is the basis of several machine learning and deep learning approaches used for classification problems. One example is logistic regression, where the fitted parameters define a best-fit line that is used to classify the data points. A closely related approach in deep learning is known as the perceptron trick.

        As shown in the above image, all the data observations are plotted in a two-dimensional diagram where the X-axis represents the independent column or the training data, and the y-axis represents the target variable. The line is drawn to separate both data observations, positives and negatives. According to the algorithm, the observations that fall above the line are considered positive, and data points below the line are regarded as negative data points.

        Maximum Likelihood Estimation: Code Example

        We can quickly implement the maximum likelihood estimation technique using logistic regression on any classification dataset. Let us try to implement the same.

        from sklearn.linear_model import LogisticRegression

        # X (features) and y (labels) are placeholders for your own classification dataset
        lr = LogisticRegression()
        lr.fit(X, y)

        The code above fits a logistic regression model to the given dataset. Plotting the data together with the fitted model then shows the distribution of the data and the best fit according to the algorithm.
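To show what the fitting step does internally, here is a minimal sketch of logistic regression trained by gradient ascent on the log-likelihood, written with only the standard library. It is an illustration under stated assumptions, not the routine a library such as scikit-learn uses: the marks and placement labels are made up, and the learning rate and step count are arbitrary choices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(w, b, xs, ys):
    """Sum of log-probabilities that the model assigns to the observed labels."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logistic(xs, ys, lr=1.0, steps=5000):
    """Gradient ascent on the average log-likelihood of a one-feature model."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [y - sigmoid(w * x + b) for x, y in zip(xs, ys)]
        w += lr * sum(e * x for e, x in zip(errors, xs)) / n
        b += lr * sum(errors) / n
    return w, b

# Hypothetical data: exam marks scaled to 0..1 and whether the student was placed.
marks = [0.35, 0.42, 0.50, 0.55, 0.61, 0.68, 0.74, 0.80, 0.88, 0.93]
placed = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

w, b = fit_logistic(marks, placed)
print("log-likelihood before:", log_likelihood(0.0, 0.0, marks, placed))
print("log-likelihood after: ", log_likelihood(w, b, marks, placed))
print("P(placed | marks=0.9):", sigmoid(w * 0.9 + b))
```

The log-likelihood after training is higher than at the starting parameters, which is exactly what "maximizing the likelihood" means here.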

        Key Takeaways

        The likelihood is a function that describes how well a model with given parameters fits the observed data points.

        Maximum likelihood differs from a plain probability calculation: probability describes the chance of outcomes under a fixed model, whereas the likelihood method tries to maximize the likelihood of the observed data over the model parameters.

        Maximum likelihood is an approach used for problems like density estimation and is the basis for algorithms like logistic regression.

        A closely related approach is known as the perceptron trick in deep learning.


        In this article, we discussed the likelihood function, maximum likelihood estimation, its core intuition, and working mechanism with practical examples associated with some key takeaways. This will help one understand the maximum likelihood better and more deeply and help answer interview questions related to the same very efficiently.


        Why Cyber Security Is Important?

        Rising Cost of Breaches

        Cybercriminals certainly profit from every firm and person they target. With every attack, firms not only lose funds, but hackers also gain confidence. According to recent statistics, the average cost of a single data breach for a wealthy organization is around £20,000. However, the real damage is not only the direct financial loss but also the cost of remediation and of repairing the company's reputation.

        Easy Access to Hacking Tools

        Hackers use sophisticated devices and tools to get into a company's system or network. The more skilled and well-funded a hacker is, the greater the risk they pose to a company. Also, the wide availability of hacking tools on the internet gives novice hackers room to learn a trick or two.

        Previous Data Breaches

        Hackers are skilled and know their work inside out. They make the most of the resources available on the internet, including information from prior data breaches, to gather the data they need. In fact, previous data breaches have fueled countless well-funded and coordinated cyber-attacks against big organizations. Even a big company such as Deloitte (the world's largest cybersecurity consultancy) fell victim to an attack in 2023.

        It seems like invading the privacy of big companies is a joke to hackers. To prevent these data breaches, companies need to work on their security measures, apply patches, and hire professionals to do damage control if needed.

        Things that can be affected by a breach of security:

        Education system: grades, history, research information and credit scores.

        Health system: medical records, history and equipment.

        Financial systems: paychecks, loan and bank accounts.

        Government system: databases, tax records, licenses, Social Security numbers.

        Communication systems:  sensitive information, client information, email, contact numbers and messages.

        What’s the Prevention?

        To protect yourself and your company, you need to know about the attacks and their risks. You can find the loopholes in the organization's security system and fix them with patches. Here are a few things that will help you understand why cybersecurity is important and how to stay secure.

        Do not open an attachment that you've received from an untrusted source.

        Some attachments are created with the purpose of gaining access to the machine. In fact, phishing emails are one of the biggest reasons for hackers' success.

        Never open an email that lands directly in the spam folder.

        Emails in your spam folder can be infected, and hackers can use them to trick their way into your system. It is recommended to open only emails that you've received from known sources.

        Use a reliable anti-malware program

        Whenever you get a pop-up message on your screen asking you to call a number because there is a virus on your system, ignore it. Hackers sometimes pretend to be from technical support and claim they want to help you with your system. Not all technical support is bogus; some are legitimate companies working to provide genuine help. However, you need to stay alert when contacting one.

        The bottom line:

        Every business needs to fight cybercrime by strengthening the security of the business. You need to take proper security measures to stay protected against cybersecurity threats. You can hire a professional or IT department to work on the vulnerabilities.

        We hope you will find this article useful and understand the need for cyber security.

        About the author

        Preeti Seth

        Understanding Distance Metrics Used In Machine Learning

        Clustering is an important technique used in the fields of artificial intelligence, deep learning, and data science. Today we are going to discuss distance metrics, which are the backbone of clustering. Distance metrics basically deal with finding the proximity or distance between data points and determining if they can be clustered together. In this article, we will walk through 4 types of distance metrics in machine learning and understand how they work in Python.

        Learning objectives

        In this tutorial, you will learn about the use cases of various distance metrics.

        You will also learn about the different types of distance metrics.

        Lastly, you will learn about the important role distance metrics play in data mining.

        What Are Distance Metrics?

        Distance metrics are a key part of several machine learning algorithms. These distance metrics are used in both supervised and unsupervised learning, generally to calculate the similarity between data points. An effective distance metric improves the performance of our machine learning model, whether that’s for classification tasks or clustering.

        Let's say you need to create clusters using a clustering algorithm such as K-Means, or you want to use the k-nearest neighbors algorithm (k-NN), which uses nearest neighbors to solve a classification or regression problem. How will you define the similarity between different observations? How can we say that two points are similar to each other? This will happen if their features are similar, right? When we plot these points, they will be closer to each other by distance.

        Hence, we can calculate the distance between points and then define the similarity between them. Here’s the million-dollar question – how do we calculate this distance, and what are the different distance metrics in machine learning? Also, are these metrics different for different learning problems? Do we use any special theorem for this? These are all questions we are going to answer in this article.

        Types of Distance Metrics in Machine Learning

        Euclidean Distance

        Manhattan Distance

        Minkowski Distance

        Hamming Distance

        Let’s start with the most commonly used distance metric – Euclidean Distance.

        Euclidean Distance

        Euclidean Distance represents the shortest distance between two points. It is the square root of the sum of squares of differences between corresponding elements.

        The Euclidean distance metric corresponds to the L2 norm of the difference between two vectors. The cosine similarity, by contrast, is proportional to the dot product of two vectors and inversely proportional to the product of their magnitudes.

        Many machine learning algorithms, including K-Means, use this distance metric to measure the similarity between observations. Let's say we have two points, as shown below:

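        So, the Euclidean Distance between these two points, A and B, will be:

        Formula for Euclidean Distance

        $d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$

        where A = (x_1, y_1) and B = (x_2, y_2). We use this formula when we are dealing with 2 dimensions. We can generalize this for an n-dimensional space as:

        $D_e = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$

        n = number of dimensions

        p_i, q_i = the i-th coordinates of the two data points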

        Let’s code Euclidean Distance in Python. This will give you a better understanding of how this distance metric works.

        I will be using the SciPy library, which contains pre-written code for most of the distance functions used in Python. We will use two sample points to calculate the different distance functions, starting with the Euclidean Distance between them:

        Python Code
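The original listing was embedded as a Gist and is not reproduced here, so the following is a minimal stand-in written with only the standard library so that it runs without SciPy. The two sample points are assumptions chosen for illustration; `scipy.spatial.distance.euclidean` computes the same quantity.

```python
import math

def euclidean_distance(p, q):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

# Hypothetical sample points in 3 dimensions.
point_1 = (1, 2, 3)
point_2 = (4, 5, 6)

print(euclidean_distance(point_1, point_2))
```

Each coordinate differs by 3, so the result is the square root of 27.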


        This is how we can calculate the Euclidean Distance between two points in Python. Let’s now understand the second distance metric, Manhattan Distance.

        Manhattan Distance

        Manhattan Distance is the sum of absolute differences between points across all the dimensions.

        We can represent Manhattan Distance as:

        Formula for Manhattan Distance

        Since the above representation is 2 dimensional, to calculate Manhattan Distance, we will take the sum of absolute distances in both the x and y directions. So, the Manhattan distance in a 2-dimensional space is given as:

        $d = |x_1 - x_2| + |y_1 - y_2|$

        And the generalized formula for an n-dimensional space is given as:

        $D_m = \sum_{i=1}^{n} |p_i - q_i|$

        n = number of dimensions

        p_i, q_i = the i-th coordinates of the two data points

        Now, we will calculate the Manhattan Distance between the two points:
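A minimal standard-library sketch of this calculation follows; the sample points are assumptions chosen for illustration, since the original Gist is not reproduced here.

```python
def manhattan_distance(p, q):
    """Sum of absolute coordinate differences across all dimensions."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

# Hypothetical sample points in 3 dimensions.
point_1 = (1, 2, 3)
point_2 = (4, 5, 6)

print(manhattan_distance(point_1, point_2))
```

Each of the three coordinates differs by 3, so the distance is 9.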


        Note that Manhattan Distance is also known as city block distance. SciPy has a function called cityblock that returns the Manhattan Distance between two points.

        Let’s now look at the next distance metric – Minkowski Distance.

        Minkowski Distance

        Minkowski Distance is the generalized form of Euclidean and Manhattan Distance.

        Formula for Minkowski Distance

        $D(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$

        Here, p represents the order of the norm, and it is also the meaning of the p parameter of SciPy's Minkowski Distance function. Let's calculate the Minkowski Distance of order 3 between the two points, and then repeat the calculation with orders 1 and 2.

        When the order p is 1, the formula reduces to the Manhattan Distance, so the Minkowski and Manhattan distances between the two points are the same.

        When the order p is 2, the formula reduces to the Euclidean Distance, so the Minkowski and Euclidean distances are the same.
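The three Minkowski calculations described above can be sketched with the standard library as follows. The sample points are assumptions chosen for illustration, and the helper mirrors the formula rather than SciPy's implementation.

```python
import math

def minkowski_distance(x, y, p):
    """Minkowski distance of order p: (sum |x_i - y_i|^p)^(1/p)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

# Hypothetical sample points in 3 dimensions.
point_1 = (1, 2, 3)
point_2 = (4, 5, 6)

print("order 3:", minkowski_distance(point_1, point_2, 3))
print("order 1:", minkowski_distance(point_1, point_2, 1))   # same as Manhattan
print("order 2:", minkowski_distance(point_1, point_2, 2))   # same as Euclidean
```

Comparing the last two lines with the Manhattan and Euclidean results for the same points confirms the two special cases.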

        So far, we have covered the distance metrics that are used when we are dealing with continuous or numerical variables. But what if we have categorical variables? How can we decide the similarity between categorical variables? This is where we can make use of another distance metric called Hamming Distance.

        Hamming Distance

        Hamming Distance measures the similarity between two strings of the same length. The Hamming Distance between two strings of the same length is the number of positions at which the corresponding characters are different.

        Let’s understand the concept using an example. Let’s say we have two strings:

        “euclidean” and “manhattan”

        Since the length of these strings is equal, we can calculate the Hamming Distance. We will go character by character and match the strings. The first character of both the strings (e and m, respectively) is different. Similarly, the second character of both the strings (u and a) is different, and so on.

        Look carefully: seven characters are different, whereas two characters (the last two characters) are the same:

        Hence, the Hamming Distance here will be 7. Note that the larger the Hamming Distance between two strings, the more dissimilar those strings will be (and vice versa).

        Python Code

        Let's see how we can compute the Hamming Distance of two strings in Python, using the two strings "euclidean" and "manhattan" from the example above. As we saw, the Hamming Distance between them is 7.

        Hamming Distance is only defined for strings of the same length. If we try to calculate it with SciPy for two strings of different lengths, it throws an error saying that the lengths of the arrays must be the same. Hence, Hamming distance only works when we have strings or arrays of the same length.
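A minimal standard-library sketch of both cases is shown below. Note that it returns the count of differing positions, whereas `scipy.spatial.distance.hamming` returns the proportion of differing positions, which would need to be multiplied by the string length to obtain the same number. The unequal-length strings are made up for illustration.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(ca != cb for ca, cb in zip(a, b))

print(hamming_distance("euclidean", "manhattan"))

# Strings of different lengths are rejected.
try:
    hamming_distance("data", "science")
except ValueError as err:
    print("error:", err)
```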

        These are some of the most commonly used similarity measures or distance metrics in Machine Learning.


        Distance metrics are a key part of several machine learning algorithms. They are used in both supervised and unsupervised learning, generally to calculate the similarity between data points. Therefore, understanding distance measures is more important than you might realize. Take k-NN, for example, a technique often used for supervised learning. By default, it often uses Euclidean distance to find the nearest neighbors of a point.

        By grasping the concept of distance metrics and their mathematical properties, data scientists can make informed decisions in selecting the appropriate metric for their specific problem.

        Key Takeaways

        Distance metrics are used in supervised and unsupervised learning to calculate similarity in data points.

        An effective distance metric improves the performance of the model, whether that's for classification tasks or clustering.

        The four types of distance metrics are Euclidean Distance, Manhattan Distance, Minkowski Distance, and Hamming Distance.

        Frequently Asked Questions

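        Q1. What are the L1 and L2 distance metrics?

        A. The L1 norm is calculated as the sum of the absolute values of the vector's components. The L2 norm is calculated as the square root of the sum of the squared components. Applied to the difference between two points, they give the Manhattan and Euclidean distances, respectively.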

        Q2. What distance metrics are used in KNN?

        A. Euclidean distance, the cosine similarity measure, Minkowski, correlation, and Chi-square are used in the k-NN classifier.

        Q3. What is a distance metric in clustering?

        A. A distance metric is what algorithms such as K-Means use to decide which points belong to the same cluster; k-NN uses it in the same way to find the nearest neighbors for classification.


        AI Vs. Machine Learning Vs. Deep Learning

        Since before the dawn of the computer age, scientists have been captivated by the idea of creating machines that could behave like humans. But only in the last decade has technology enabled some forms of artificial intelligence (AI) to become a reality.

        Interest in putting AI to work has skyrocketed, with a burgeoning array of AI use cases. Many surveys have found that upwards of 90 percent of enterprises are either already using AI in their operations today or plan to in the near future.

        Eager to capitalize on this trend, software vendors – both established AI companies and AI startups – have rushed to bring AI capabilities to market. Among vendors selling big data analytics and data science tools, two types of artificial intelligence have become particularly popular: machine learning and deep learning.

        While many solutions carry the "AI," "machine learning," and/or "deep learning" labels, confusion about what these terms really mean persists in the marketplace. The diagram below provides a visual representation of the relationships among these different technologies:

        As the graphic makes clear, machine learning is a subset of artificial intelligence. In other words, all machine learning is AI, but not all AI is machine learning.

        Similarly, deep learning is a subset of machine learning. And again, all deep learning is machine learning, but not all machine learning is deep learning.


        AI, machine learning and deep learning are each interrelated, with deep learning nested within ML, which in turn is part of the larger discipline of AI.

        Computers excel at mathematics and logical reasoning, but they struggle to master other tasks that humans can perform quite naturally.

        For example, human babies learn to recognize and name objects when they are only a few months old, but until recently, machines have found it very difficult to identify items in pictures. While any toddler can easily tell a cat from a dog from a goat, computers find that task much more difficult. In fact, captcha services sometimes use exactly that type of question to make sure that a particular user is a human and not a bot.

        In the 1950s, scientists began discussing ways to give machines the ability to “think” like humans. The phrase “artificial intelligence” entered the lexicon in 1956, when John McCarthy organized a conference on the topic. Those who attended called for more study of “the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.”

        Critics rightly point out that there is a big difference between an AI system that can tell the difference between cats and dogs and a computer that is truly intelligent in the same way as a human being. Most researchers believe that we are years or even decades away from creating an artificial general intelligence (also called strong AI) that seems to be conscious in the same way that human beings are, if it will ever be possible to create such a system at all.

        If artificial general intelligence does one day become a reality, it seems certain that machine learning will play a major role in the system’s capabilities.

        Machine learning is the particular branch of AI concerned with teaching computers to “improve themselves,” as the attendees at that first artificial intelligence conference put it. Another 1950s computer scientist named Arthur Samuel defined machine learning as “the ability to learn without being explicitly programmed.”

        In traditional computer programming, a developer tells a computer exactly what to do. Given a set of inputs, the system will return a set of outputs — just as its human programmers told it to.

        Machine learning is different because no one tells the machine exactly what to do. Instead, developers feed the machine data and allow it to learn on its own.
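
        The contrast can be made concrete with a small sketch. The temperature-conversion task, the function names, and the five sample readings below are illustrative assumptions, not part of the original article. The first function hard-codes the rule, as in traditional programming; the second is given only example inputs and outputs and estimates the rule from them with ordinary least squares.

```python
def fahrenheit_rule(celsius):
    """Traditional programming: the developer writes the rule explicitly."""
    return celsius * 9 / 5 + 32


def fit_line(xs, ys):
    """Learning from data: estimate slope and intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical example readings (inputs paired with known outputs).
celsius = [-10.0, 0.0, 10.0, 25.0, 40.0]
fahrenheit = [14.0, 32.0, 50.0, 77.0, 104.0]

slope, intercept = fit_line(celsius, fahrenheit)
learned_prediction = slope * 100.0 + intercept

print(fahrenheit_rule(100.0), learned_prediction)
```

        Nobody told the second function that the multiplier is 1.8 or that the offset is 32. It recovers values very close to both from the examples alone, which is the sense in which the machine learns on its own.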

        In general, machine learning takes three different forms: reinforcement learning, supervised learning, and unsupervised learning.

        Reinforcement learning is one of the oldest types of machine learning, and it is very useful in teaching a computer how to play a game.

        For example, Arthur Samuel created one of the first programs that used reinforcement learning. It played checkers against human opponents and learned from its successes and mistakes. Over time, the software became much better at playing checkers.

        Reinforcement learning is also useful for applications like autonomous vehicles, where the system can receive feedback about whether it has performed well or poorly and use that data to improve over time.
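
        As a rough illustration of learning from feedback, here is a minimal sketch of tabular Q-learning, one common reinforcement learning technique. It is not Samuel's checkers program, and the five-cell corridor, the reward of 1.0 at the goal, and the parameter values are all assumptions chosen to keep the example tiny. The agent moves at random, is rewarded only when it reaches the last cell, and gradually learns which action is best in each cell.

```python
import random

N_STATES = 5                  # corridor cells 0..4; cell 4 is the goal
ACTIONS = ("left", "right")


def step(state, action):
    """Move one cell; the only reward is 1.0 for reaching the goal."""
    if action == "right":
        next_state = state + 1
    else:
        next_state = max(0, state - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from trial and error with Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(len(ACTIONS))   # explore uniformly at random
            next_state, reward = step(state, ACTIONS[a])
            # Feedback: reward now plus discounted best value afterwards.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy: in each non-goal cell, pick the action with the higher value.
policy = [ACTIONS[q_table[s].index(max(q_table[s]))] for s in range(N_STATES - 1)]
print(policy)
```

        The program is never told that moving right is correct. It only receives a reward signal, and the learned values end up favoring the move toward the goal in every cell.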

        Supervised learning is particularly useful in classification applications such as teaching a system to tell the difference between pictures of dogs and pictures of cats.

        In this case, you would feed the application a whole lot of images that had been previously tagged as either dogs or cats. From that training data, the computer would draw its own conclusions about what distinguishes the two types of animals, and it would be able to apply what it learned to new pictures.
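
        The same idea can be sketched without images. The example below assumes each animal is described by two made-up measurements (weight in kilograms and shoulder height in centimeters) and uses a nearest-centroid classifier, which is just one simple supervised technique among many. The labeled samples play the role of the tagged pictures.

```python
def train_centroids(samples):
    """Average the feature vectors of each label in the training data."""
    sums = {}
    counts = {}
    for features, label in samples:
        totals = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            totals[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [total / counts[label] for total in totals]
        for label, totals in sums.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the new example."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], features))


# Hypothetical labeled training data: (weight_kg, height_cm), label.
training = [
    ((4.0, 25.0), "cat"), ((3.5, 23.0), "cat"), ((5.0, 27.0), "cat"),
    ((20.0, 55.0), "dog"), ((30.0, 60.0), "dog"), ((12.0, 45.0), "dog"),
]

centroids = train_centroids(training)
small_animal = predict(centroids, (4.5, 24.0))
large_animal = predict(centroids, (25.0, 58.0))
print(small_animal, large_animal)
```

        The labels in the training data are what make this supervised: the program is shown the right answer for every training example and generalizes from there to measurements it has not seen.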

        By contrast, unsupervised learning does not rely on human beings to label training data for the system. Instead, the computer uses clustering algorithms or other mathematical techniques to find similarities among groups of data.

        Unsupervised machine learning is particularly useful for the type of big data analytics that interests many enterprise leaders. For example, you could use unsupervised learning to spot similarities among groups of customers and better target your marketing or tailor your pricing.
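
        A minimal sketch of the customer-grouping idea follows, using k-means, one widely used clustering algorithm. The customer records (orders per month and average order value) are invented for illustration, and seeding the centroids with the first k points is a simplification to keep the run deterministic. No labels are supplied; the algorithm finds the groups itself.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iterations=100):
    """Group points into k clusters by repeated assign-and-average steps."""
    centroids = [list(p) for p in points[:k]]   # simple deterministic start
    labels = [None] * len(points)
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break                                # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical customers: (orders_per_month, average_order_value).
customers = [
    (1, 20), (2, 25), (1, 22),
    (10, 200), (12, 220), (11, 210),
]

labels, centroids = k_means(customers, k=2)
print(labels, centroids)
```

        On this toy data the occasional low-spend customers and the frequent high-spend customers end up in separate clusters, which is the kind of grouping a marketing or pricing team could then act on.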

        Some recommendation engines rely on unsupervised learning to tell people who like one movie or book what other movies or books they might enjoy. Unsupervised learning can also help identify characteristics that might indicate a person’s credit worthiness or likelihood of filing an insurance claim.

        Various AI applications, such as computer vision, natural language processing, facial recognition, text-to-speech, speech-to-text, knowledge engines, emotion recognition, and other types of systems, often make use of machine learning capabilities. Some combine two or more of the main types of machine learning, and in some cases, are said to be “semi-supervised” because they incorporate some of the techniques of supervised learning and some of the techniques of unsupervised learning. And some machine learning techniques — such as deep learning — can be supervised, unsupervised, or both.

        The phrase “deep learning” first came into use in the 1980s, making it a much newer idea than either machine learning or artificial intelligence.

        Deep learning describes a particular type of architecture that both supervised and unsupervised machine learning systems sometimes use. Specifically, it is a layered architecture where one layer takes an input and generates an output. It then passes that output on to the next layer in the architecture, which uses it to create another output. That output can then become the input for the next layer in the system, and so on. The architecture is said to be “deep” because it has many layers.
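
        The layer-to-layer flow described above can be sketched as a forward pass through a small stack of layers. The weights and biases below are arbitrary hand-picked numbers rather than trained values, and the use of a ReLU activation on every layer is an assumption made to keep the example short.

```python
def relu(values):
    """Activation: negative values become zero, positive values pass through."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One layer: a weighted sum of the inputs plus a bias for each output."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the output of each layer into the next one in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = relu(dense(activations, weights, biases))
    return activations


# Three hypothetical layers as (weights, biases); shapes are 2->2, 2->2, 2->1.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 1.0]),
    ([[1.0, 1.0], [-1.0, 2.0]], [0.0, 0.0]),
    ([[1.0, 0.5]], [-1.0]),
]

output = forward([2.0, 1.0], layers)
print(output)
```

        Real deep learning systems have many more layers and far larger weight matrices, and the weights are learned from data rather than written by hand, but the pattern of passing each layer's output to the next layer is the same.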

        To create these layered systems, many researchers have designed computing systems modeled after the human brain. In broad terms, they call these deep learning systems artificial neural networks (ANNs). ANNs come in several different varieties, including deep neural networks, convolutional neural networks, recurrent neural networks and others. These neural networks use nodes that are similar to the neurons in a human brain.

        Deep learning requires a great deal of computing power. Graphics processing units (GPUs), which were originally designed to render graphics, also excel at the type of calculations necessary for deep learning. As GPU performance has improved and costs have decreased, people have been able to create high-performance systems that can complete deep learning tasks in much less time and for much less cost than would have been the case in the past.

        Today, anyone can easily access deep learning capabilities through cloud services like Amazon Web Services, Microsoft Azure, Google Cloud and IBM Cloud.

        If you are interested in learning more about AI vs. machine learning vs. deep learning, Datamation has several resources that can help.
