Applications of Convolutional Neural Networks (CNN)

This article was published as a part of the Data Science Blogathon

What is CNN?

A Convolutional Neural Network (CNN) is a type of artificial neural network used in deep learning, chiefly for computer vision and image recognition. Typical applications include:

OCR and image recognition

Detecting objects in self-driving cars

Social media face recognition

Image analysis in medicine

The term “convolutional” refers to a mathematical operation that combines two functions to produce a third. It involves multiplying the elements of one function by those of another and summing the results. Convolution describes how the shape of one function is modified by another. In other words, it captures the relationships between elements and how they work together.
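To make the idea concrete, here is a minimal sketch of a discrete one-dimensional convolution. The function name `convolve_1d` and the sample values are illustrative assumptions, not something taken from a particular library.

```python
import numpy as np

def convolve_1d(f, g):
    """Discrete convolution: out[n] = sum over k of f[k] * g[n - k]."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    out = np.zeros(len(f) + len(g) - 1)
    # Every pair of elements is multiplied and added to the matching output slot.
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

signal = [1.0, 2.0, 3.0]   # hypothetical input function
kernel = [0.0, 1.0, 0.5]   # hypothetical second function
result = convolve_1d(signal, kernel)
print(result)  # [0.  1.  2.5 4.  1.5]
```

The result agrees with NumPy's built-in `np.convolve(signal, kernel)`, which computes the same sum.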

In image recognition, a CNN is used to:

Sort visual content (explain what it “sees”)

Recognize the objects in the scenery (for example, eyes, nose, lips, ears on the face)

Form groups of recognized objects (e.g., eyes with eyes, noses with noses)

Another prominent use of CNNs is in laying the groundwork for various types of data analysis.

In Optical Character Recognition (OCR), a CNN classifies and clusters individual elements such as letters and digits, and the OCR system combines these elements into a logical whole. CNNs are also used to recognize and transcribe spoken words.

CNN’s classification capabilities are used in the sentiment analysis operation.

Let us now go over the mechanics of the Convolutional Neural Network.

How Does Convolutional Neural Network work?

Convolutional Neural Network structure consists of four layers:

Convolutional layer

The convolutional layer is where the action begins. The convolutional layer is designed to discover image features. Usually, it progresses from the general (i.e., shapes) to specific (i.e., identifying elements of an object, recognizing the face of a certain man, etc.).

Rectified Linear Unit layer (aka ReLU)

This layer is considered an extension of the convolutional layer. The goal of ReLU is to introduce non-linearity into the feature maps. It sets negative values to zero and passes positive values through unchanged, which trims weak responses from the picture in order to improve feature extraction.

Pooling layer

The pooling layer reduces the number of values passed on to the following layers, i.e., it performs downsampling. In other words, it keeps the most important aspects of the information obtained.

Fully connected layer

It is a standard feed-forward neural network. It is the final stage, where the features extracted by the earlier layers are combined to produce the classification result.
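The four layers above can be sketched end to end in a few lines of NumPy. This is only an illustrative forward pass under stated assumptions: a single-channel 6×6 image, one 3×3 filter, 2×2 max-pooling and random (untrained) weights. All names are hypothetical.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it is a cross-correlation.
    """
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Set negative values to zero."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    h, w = x.shape[0] // size, x.shape[1] // size
    x = x[:h * size, :w * size]
    return x.reshape(h, size, w, size).max(axis=(1, 3))

def forward(image, kernel, fc_weights, fc_bias):
    """Convolution -> ReLU -> pooling -> fully connected layer."""
    features = max_pool(relu(conv2d(image, kernel)))
    return features.ravel() @ fc_weights + fc_bias

rng = np.random.default_rng(0)
image = rng.random((6, 6))                # toy grayscale image
kernel = rng.standard_normal((3, 3))      # one untrained filter
fc_weights = rng.standard_normal((4, 2))  # 2x2 pooled map -> 2 class scores
fc_bias = np.zeros(2)

scores = forward(image, kernel, fc_weights, fc_bias)
print(scores.shape)  # (2,)
```

In a real network the filter and the fully connected weights would be learned from data rather than drawn at random.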

Applications of Convolutional Neural Networks

Image Classification – Search Engines, Social Media, Recommender Systems

CNN image classification serves the following purposes:

Deconstruct an image and find its distinguishing features. The system employs a supervised machine learning classification algorithm for this purpose.

Reduce the description of the image to its essential features. This is done with the help of an unsupervised machine learning algorithm.

Image tagging

The most basic type of image classification algorithm is image tagging. The image tag is a term or a phrase that describes the images and makes them easier to find. This method is used by big companies like Facebook, Google, and Amazon. It is also one of the fundamental elements of visual search. Tagging involves recognition of objects and even sentiment analysis of the image tone.

Visual Search

This method involves comparing an input image to the images in a database. The visual search evaluates the image and looks for other photos that have comparable features.

Recommender engines

Another field where image classification and object identification can be used is recommender engines. Amazon, for example, employs CNN image recognition to make suggestions in the “you might also like” area. The suggestion is based on the user’s expressed behavior. The products are matched based on visual criteria, such as red shoes and red lipstick for a red outfit. Pinterest employs CNN image recognition in a different way. The company focuses on matching visual features, which results in simple visual matching enhanced by tagging.

Face Recognition – Social Media, Identification, and Surveillance

Face recognition deserves its own section. This subset of image recognition deals with more complex images. Such images could include human faces or other living beings such as animals, fish, and insects.

The distinction between straight image recognition and face recognition is based on operational complexity — the additional layer of work required.

The shape of the face and its features are recognized first, followed by basic object recognition.

The features of the face are then examined further to determine its essential characteristics. For example, these could be the shape of the nose, the skin tone and texture, or the presence of scars, hair, or other surface irregularities.

The sum of these characteristics is then combined into a representation of a specific human being’s appearance. This procedure entails studying a large number of samples that each present the subject in a different way (for instance, with or without sunglasses).

The input image is then compared to the database, and the system recognizes a specific face.

Face recognition is used in social media platforms such as Facebook for both social networking and entertainment.

Face recognition in social networking serves to streamline the often tedious process of tagging people in photos. This feature is especially useful when you need to tag hundreds of images from a conference or when there are far too many faces to tag by hand. So, if you’re planning to build a social network, keep this feature in mind.

Facial detection in entertainment lays the groundwork for further transformations and manipulations. The most notable examples are Facebook Messenger filters and Snapchat Looksery filters. The filters start from the face’s auto-generated basic layout and add new elements or effects.

Facial recognition technology is gaining traction as a viable method of personal identification.

Face recognition cannot be used to verify a person in the same way that fingerprints and legal documents can. In cases where there is limited information, however, face recognition can be useful in identifying the person, for instance from surveillance camera footage or a covert video recording.

Medical Image Computing – Predictive Analytics, Healthcare Data Science

Healthcare is the industry where all of the cutting-edge technology is put to the test.

If you want to test the usefulness of a certain technology, try employing it in a healthcare setting. Image recognition is no exception.

The most fascinating image recognition CNN use case is medical image computing.

Medical imaging involves a great deal of further data analysis that builds on the initial image recognition.

CNN medical image classification detects anomalies in X-ray and MRI images with better accuracy than the human eye.

These systems can display the series of photos as well as the differences between them. This feature lays the groundwork for future predictive analytics.

Medical image classification relies on massive datasets such as Public Health Records, which serve as the training basis for the algorithms, together with patients’ confidential data and test results. Together they form an analytical platform that monitors the current status of the patient and forecasts outcomes.

Health Risk Assessment Using Predictive Analytics

Convolutional Neural Network Predictive Analytics is used in this field.

Here’s how CNN Health Risk Assessment works:

A CNN processes data that has a grid topology, that is, data in which neighbouring points are spatially correlated. The grid is two-dimensional in the case of images. The grid is one-dimensional in the case of time series or textual data.

The convolution algorithm is then used to identify some aspects of the input.

Take into account the variations of input.

Determine variable interactions that are sparse.

Use the same settings for all of a model’s functions.
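The last two points, sparse interactions and shared settings (weights), can be illustrated on a one-dimensional grid. The series and the 3-weight filter below are made-up values, used only as a sketch.

```python
import numpy as np

# Hypothetical time series (a 1-D grid) and a 3-weight filter.
series = np.array([1.0, 2.0, 4.0, 7.0, 11.0, 16.0])
kernel = np.array([-1.0, 0.0, 1.0])

# Sparse interactions: each output looks at only 3 neighbouring inputs.
# Parameter sharing: the same 3 weights are reused at every position.
conv_out = np.array([series[i:i + 3] @ kernel for i in range(len(series) - 2)])
print(conv_out)  # [3. 5. 7. 9.]

# Number of weights needed to map 6 inputs to 4 outputs.
conv_params = kernel.size                    # 3 shared weights
dense_params = len(series) * len(conv_out)   # 24 weights in a dense layer
print(conv_params, dense_params)  # 3 24
```

A fully connected layer producing the same four outputs would need a separate weight for every input-output pair, which is why convolution scales better to large inputs.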

Health Risk Assessment applications are a broad category, so we’ll focus on the most notable:

HRA is a predictive application that computes the likelihood of specific events. Based on patient data, this use case includes disease progression or complications. It looks for similar PHR, analyses the patient’s data, looks for patterns, and calculates potential outcomes. This system can be used for routine health checks.

The framework can be expanded by including a treatment plan. In this case, the prediction determines the best way to treat the symptoms.

The HRA system can also be used to investigate the specific environment and identify potential hazards for those who work there. This method is used to assess dangerous situations. In Australia, for example, officials are studying sun activity to determine the level of radiation threat.

Drug Discovery Using Predictive Analytics

Another major healthcare field that makes extensive use of CNNs is drug discovery. It is also one of the most inventive uses of convolutional neural networks in general.

RNN (Recurrent Neural Network) and stock market prediction are examples of pure data tweaking, whereas drug discovery and CNN are not.

The problem is that drug discovery and development is a time-consuming and costly process. In drug discovery, scalability and cost-effectiveness are critical.

The process of developing new drugs lends itself well to the implementation of neural networks. During the development of a new drug, there is a large amount of data to consider.

The following stages are involved in the drug discovery process:

This is a clustering and classification problem involving the analysis of observed medical effects.

Machine learning anomaly detection may be useful in hit discovery. The algorithm searches the compound database for new activities that can be used for specific purposes.

Then, using the Hit to Lead process, the results are narrowed down to the most relevant. That’s what dimensionality reduction and regression are all about.

Then there’s Lead Optimization, which is the process of combining and testing lead compounds to find the best approaches to them. The stages involve the examination of the organism’s chemical and physical effects.

Following that, the development shifts to live testing. Machine learning algorithms are relegated to the background here and are used to structure incoming data.

CNN optimizes and streamlines the drug discovery process at critical stages. It allows for a reduction in the time required to develop cures for emerging diseases.

Precision Medicine Using Predictive Analytics

A similar approach can be used with existing drugs when developing a treatment plan for patients. Precision medicine aims to find the most effective way to treat a disease.

Supply chain management, predictive analytics, and user modeling are all part of precision medicine.

This is how it works:

From the standpoint of data, the patient is a collection of states that are affected by a variety of factors (symptoms and treatments).

The addition of variables (treatment types) has specific effects in both the short and long term.

Each variable has its own set of statistics regarding its impact on a symptom.

Data is combined to form an assumption about the best course of action based on the available information.

The various outcomes and changes in the patient’s condition are then considered. This is how the assumption is validated. This stage is handled by recurrent neural networks because it necessitates the analysis of data point sequences.


Convolutional Neural Networks reveal and describe hidden data in an understandable manner.

Even in their most basic uses, neural networks demonstrate how much can be accomplished with their assistance. The manner in which CNN recognizes photographs reveals a great deal about the composition and execution of the visuals. Convolutional Neural Networks, on the other hand, uncover novel medications, which is just one of many amazing examples of how artificial neural networks are making the world a better place.


The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.



Deep Learning For Computer Vision – Introduction To Convolution Neural Networks


The power of artificial intelligence is beyond our imagination. Robots have already reached a testing phase in some of the most powerful countries of the world. Governments and large companies are spending billions on developing these ultra-intelligent machines, and the recent emergence of robots has gained the attention of many research houses across the world.

Does it excite you as well? For me, learning about robots and developments in AI started with deep curiosity and excitement. Let’s learn about computer vision today.

The earliest research in computer vision started way back in the 1950s. Since then, we have come a long way but still find ourselves far from the ultimate objective. But with neural networks and deep learning, we have become empowered like never before.

Applications of deep learning in vision have taken this technology to a different level and made sophisticated things like self-driving cars possible in the near future. In this article, I will also introduce you to Convolution Neural Networks, which form the crux of deep learning applications in computer vision.

Note: This article is inspired by Stanford’s Class on Visual Recognition. Understanding this article requires prior knowledge of Neural Networks. If you are new to neural networks, you can start here. Another useful resource on basics of deep learning can be found here.

You can also learn Convolutional neural Networks in a structured and comprehensive manner by enrolling in this free course: Convolutional Neural Networks (CNN) from Scratch

Table of Contents

Challenges in Computer Vision

Overview of Traditional Approaches

Review of Neural Networks Fundamentals

Introduction to Convolution Neural Networks

Case Study: Increasing power of CNNs in IMAGENET competition

Implementing CNNs using GraphLab (Practical in Python)

1. Challenges in Computer Vision (CV)

As the name suggests, the aim of computer vision (CV) is to imitate the functionality of the human eye and the brain components responsible for your sense of sight.

Actions such as recognizing an animal, describing a view or differentiating among visible objects are a cake-walk for humans. You’d be surprised to know that it took decades of research to discover and impart the ability of detecting an object to a computer with reasonable accuracy.

Let’s get familiar with it a bit more:

Object detection is considered to be the most basic application of computer vision. The rest of the developments in computer vision are achieved by making small enhancements on top of this. In real life, every time we (humans) open our eyes, we unconsciously detect objects.

Since it is super-intuitive for us, we fail to appreciate the key challenges involved when we try to design systems similar to our eye. Let’s start by looking at some of the key roadblocks:

Variations in Viewpoint

The same object can have different positions and angles in an image depending on the relative position of the object and the observer.

There can also be different positions. For instance look at the following images:

Though it’s obvious to us that these are the same object, it is not very easy to teach this aspect to a computer (robots or machines).

Difference in Illumination

Though this image is so dark, we can still recognize that it is a cat. Teaching this to a computer is another challenge.

Hidden parts of images

Here, only the face of the puppy is visible and that too partially, posing another challenge for the computer to recognize.

Background Clutter

If you observe carefully, you can find a man in this image. As simple as it looks, it’s an uphill task for a computer to learn.

These are just some of the challenges which I brought up so that you can appreciate the complexity of the tasks which your eye and brain duo performs with such utter ease. Breaking up these challenges and solving them individually is possible today in computer vision. But we’re still decades away from a system which can get anywhere close to our human eye (which can do everything!).

This brilliance of our human body is the reason why researchers have been trying to break the enigma of computer vision by analyzing the visual mechanics of humans or other animals. Some of the earliest work in this direction was done by Hubel and Wiesel with their famous cat experiment in 1959. Read more about it here.

This was the first study which emphasized the importance of edge detection for solving the computer vision problem. They were awarded the Nobel Prize for their work.

Before diving into convolutional neural networks, let’s take a quick overview of the traditional or rather elementary techniques used in computer vision before deep learning became popular.

2. Overview of Traditional Approaches

Various techniques other than deep learning are available for computer vision. They work well for simpler problems, but as the data becomes huge and the task becomes complex, they are no substitute for deep CNNs. Let’s briefly discuss two simple approaches.

KNN (K-Nearest Neighbours)

Each image is compared with all images in the training data. The top K images with the smallest distances are selected. The majority class of those top K is predicted as the output class of the image.

Various distance metrics can be used, like L1 distance (sum of absolute differences), L2 distance (square root of the sum of squared differences), etc.


Here the same dog is on the right side in the first image and on the left side in the second. Though it’s the same dog, KNN would give a large non-zero distance between the 2 images.

Similar to above, other challenges mentioned in section 1 will be faced by KNN.
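A minimal sketch of the KNN idea on tiny made-up "images" (flattened to vectors) looks like this. The function name `knn_predict` and the toy data are assumptions for illustration. The last lines show why a shifted object hurts: the same pattern moved sideways gives a large pixel-wise distance.

```python
import numpy as np

def knn_predict(train_x, train_y, query, k=3, metric="l1"):
    """Predict the majority class among the k nearest training images."""
    diff = train_x - query
    if metric == "l1":
        dists = np.abs(diff).sum(axis=1)          # sum of absolute differences
    else:
        dists = np.sqrt((diff ** 2).sum(axis=1))  # L2 distance
    nearest = np.argsort(dists)[:k]
    return np.bincount(train_y[nearest]).argmax()

# Toy training set: six 4-pixel images from two classes.
train_x = np.array([[0, 0, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0],
                    [9, 9, 9, 9], [8, 9, 9, 8], [9, 8, 9, 9]], dtype=float)
train_y = np.array([0, 0, 0, 1, 1, 1])

prediction = knn_predict(train_x, train_y, np.array([8.0, 8.0, 9.0, 9.0]))
print(prediction)  # 1

# The same 2x2 blob on the left and on the right of a 2x4 image.
left = np.array([[1, 1, 0, 0], [1, 1, 0, 0]], dtype=float)
right = np.array([[0, 0, 1, 1], [0, 0, 1, 1]], dtype=float)
shift_distance = np.abs(left - right).sum()
print(shift_distance)  # 8.0
```

Every pixel in the shifted pair disagrees, so the L1 distance is as large as it can be for these binary images even though the object is identical.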

Linear Classifiers

They use a parametric approach in which every pixel value is assigned a learned weight for each class.

The class score is a weighted sum of the pixel values, with the dimension of the weight matrix depending on the number of outcomes (classes).

Intuitively, we can understand this in terms of a template. The weights for a class form a template image which is matched with every image. This will also face difficulty in overcoming the challenges discussed in section 1, as a single template is difficult to design for all the different cases.
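As a rough sketch, a linear classifier for CIFAR-10-sized images is just one matrix multiplication. The weights below are random and untrained, and the names `W`, `b` and `image` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes = 10
num_pixels = 32 * 32 * 3                  # a flattened 32x32 colour image

# One row of W per class: each row acts as a template for that class.
W = rng.standard_normal((num_classes, num_pixels)) * 0.01
b = np.zeros(num_classes)

image = rng.random(num_pixels)            # stand-in for a real image
scores = W @ image + b                    # weighted sum of pixels per class
predicted_class = int(np.argmax(scores))

print(W.shape, scores.shape)  # (10, 3072) (10,)
```

Each row of `W` can be reshaped back to 32×32×3 and viewed as the template image mentioned above.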

I hope this gives some intuition into the challenges faced by approaches other than deep learning. Please note that more sophisticated techniques can be used than the ones discussed above but they would rarely beat a deep learning model.

3. Review of Neural Networks Fundamentals

Let’s discuss some properties of neural networks. I will skip the basics of neural networks here as I have already covered them in my previous article – Fundamentals of Deep Learning – Starting with Neural Networks.

Once your fundamentals are sorted, let’s learn in detail some important concepts such as activation functions, data preprocessing, initializing weights and dropouts.

Activation Functions

There are various activation functions which can be used and this is an active area of research. Let’s discuss some of the popular options:

Sigmoid Function

Sigmoid activation, also used in logistic regression, squashes the input space from (-inf,inf) to (0,1).

But it has various problems and it is almost never used in CNNs:

Saturated neurons kill the gradient

If you observe the sigmoid curve carefully, when the input is below -5 or above 5, the output will be very close to 0 or 1 respectively. In these regions the gradients are almost zero: the tangents are almost parallel to the x-axis, i.e. ~0 slope.

Since gradients get multiplied during back-propagation, this small gradient will virtually stop back-propagation into the layers further back, thus killing the gradient.
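The saturation effect is easy to check numerically. The sketch below uses the standard derivative of the sigmoid, s(x)·(1 − s(x)); the variable names are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

grad_at_0 = sigmoid_grad(0.0)   # 0.25, the largest value the gradient can take
grad_at_5 = sigmoid_grad(5.0)   # roughly 0.0066, the neuron is saturated

# Back-propagating through three saturated neurons multiplies the gradients.
chained = grad_at_5 ** 3
print(grad_at_0, grad_at_5, chained)
```

Even in the best case the gradient is scaled by 0.25 per layer, and in the saturated region the product shrinks towards zero within a few layers.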

Outputs are not zero-centered

As you can see, all the outputs are between 0 and 1. As these become inputs to the next layer, the gradients on that layer’s weights will be either all positive or all negative. So the path to the optimum will be zig-zag. I will skip the mathematics here. Please refer to the Stanford class mentioned above for details.

Taking the exp() is computationally expensive

Though not a big drawback, it has a slight negative impact

tanh activation

It is generally preferred over sigmoid because it solves problem #2, i.e. the outputs are in the range (-1,1) and are zero-centered.

But it still saturates and kills the gradient, and is thus not a recommended choice.

ReLU (Rectified Linear Unit)

Gradient won’t saturate in the positive region

Computationally very efficient as only simple thresholding is required

Empirically found to converge faster than sigmoid or tanh

Output is not zero-centered and is always non-negative

Gradient is killed for x<0. A few techniques like leaky ReLU and parametric ReLU are used to overcome this, and I encourage you to look them up

Gradient is not defined at x=0. But this can be easily handled using sub-gradients and poses few practical challenges, as x=0 is generally a rare case

To summarize, ReLU is mostly the activation function of choice. If the caveats are kept in mind, it can be used very efficiently.
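Here is a small sketch of ReLU and leaky ReLU together with their gradients. The slope `alpha=0.01` is a commonly used default and an assumption here, and at x=0 the sketch simply picks the negative-side value as the sub-gradient.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def relu_grad(x):
    """1 for x > 0, else 0 (sub-gradient 0 chosen at x = 0)."""
    return (x > 0).astype(float)

def leaky_relu(x, alpha=0.01):
    """Like ReLU, but lets a small signal through for negative inputs."""
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.01):
    return np.where(x > 0, 1.0, alpha)

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))             # [0. 0. 3.]
print(relu_grad(x))        # [0. 0. 1.]
print(leaky_relu(x))       # [-0.02  0.    3.  ]
print(leaky_relu_grad(x))  # [0.01 0.01 1.  ]
```

With leaky ReLU the gradient for negative inputs is small but not zero, so the neuron can still recover during training.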

Data Preprocessing

For images, generally the following preprocessing steps are done:

Same Size Images: All images are converted to the same size, generally a square shape.

Mean Centering: For each pixel position, the mean value across all training images is subtracted from that pixel. Sometimes (but rarely) mean centering is done per red, green and blue channel instead.

Note that normalization is generally not done in images.
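Both flavours of mean centering can be sketched in NumPy as follows. The random array stands in for a real training set, and the array layout (images, height, width, channels) is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for 100 training images of size 32x32 with 3 colour channels.
images = rng.integers(0, 256, size=(100, 32, 32, 3)).astype(np.float64)

# Per-pixel mean centering: one mean value for every pixel position.
mean_image = images.mean(axis=0)             # shape (32, 32, 3)
centered = images - mean_image

# Per-channel mean centering: one mean value for each of R, G and B.
channel_mean = images.mean(axis=(0, 1, 2))   # shape (3,)
centered_by_channel = images - channel_mean

print(mean_image.shape, channel_mean.shape)  # (32, 32, 3) (3,)
```

The same `mean_image` (or `channel_mean`) computed on the training set should also be subtracted from validation and test images.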

Weight Initialization

There can be various techniques for initializing weights. Let’s consider a few of them:

All zeros

This is generally a bad idea because in this case all the neurons will generate the same output initially and identical gradients would flow back in back-propagation

The results are generally undesirable as the network won’t train properly.

Gaussian Random Variables

The weights can be initialized from a random gaussian distribution with 0 mean and a small standard deviation (0.1 to 1e-5)

This works for shallow networks, i.e. ~5 hidden layers, but not for deep networks

In case of deep networks, the small weights make the outputs small and as you move towards the end, the values become even smaller. Thus the gradients will also become small, resulting in the gradient being killed at the end.

Note that you need to experiment to find the standard deviation of the gaussian distribution that works well for your network.

Xavier Initialization

It suggests that variance of the gaussian distribution of weights for each neuron should depend on the number of inputs to the layer.

The recommended variance is 1/n_in (i.e. a standard deviation of sqrt(1/n_in)), where n_in is the number of inputs to the layer. So the numpy code for initializing the weights of a layer with n_in inputs and n_out outputs is: np.random.randn(n_in, n_out)*np.sqrt(1/n_in)

More recent research suggests that for ReLU neurons, the recommended initialization is: np.random.randn(n_in, n_out)*np.sqrt(2/n_in). Read this blog post for more details.

One more thing must be remembered while using ReLU as the activation function: the weight initialization might be such that some of the neurons never get activated because their input is negative. This is something that should be checked. You might be surprised to know that 10-20% of the ReLUs might be dead at a particular time while training and even in the end.
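The effect of the initialization scale can be seen in a small experiment: push random data through a stack of ReLU layers and look at the spread of the final activations. This is only a sketch under assumptions (10 layers, 256 units per layer, no biases), and the helper name `final_activation_std` is hypothetical.

```python
import numpy as np

def final_activation_std(scale_fn, n_layers=10, width=256, seed=0):
    """Std of the activations after n_layers ReLU layers.

    scale_fn(n_in) returns the standard deviation used for the weights.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((64, width))          # a batch of 64 random inputs
    for _ in range(n_layers):
        W = rng.standard_normal((width, width)) * scale_fn(width)
        x = np.maximum(x @ W, 0.0)                # linear layer followed by ReLU
    return x.std()

small_gaussian = final_activation_std(lambda n_in: 0.01)
xavier = final_activation_std(lambda n_in: np.sqrt(1.0 / n_in))
he = final_activation_std(lambda n_in: np.sqrt(2.0 / n_in))

print(small_gaussian, xavier, he)
```

With a fixed small standard deviation the activations collapse towards zero, the sqrt(1/n_in) scaling shrinks them more slowly, and the sqrt(2/n_in) scaling keeps them at a roughly constant scale across ReLU layers.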

These were just some of the concepts I discussed here. Some more concepts can be of importance like batch normalization, stochastic gradient descent, dropouts which I encourage you to read on your own.

4. Introduction to Convolution Neural Networks

Before going into the details, let’s first try to get some intuition into why deep networks work better.

As we learned from the drawbacks of earlier approaches, they are unable to cater to the vast amount of variations in images. Deep CNNs work by consecutively modeling small pieces of information and combining them deeper in network.

One way to understand them is that the first layer will try to detect edges and form templates for edge detection. Then subsequent layers will try to combine them into simpler shapes and eventually into templates of different object positions, illumination, scales, etc. The final layers will match an input image with all the templates and the final prediction is like a weighted sum of all of them. So, deep CNNs are able to model complex variations and behaviour giving highly accurate predictions.

There is an interesting paper on visualization of deep features in CNNs which you can go through to get more intuition – Understanding Neural Networks Through Deep Visualization.

For the purpose of explaining CNNs and finally showing an example, I will be using the CIFAR-10 dataset, which you can download from here. This dataset has 60,000 images with 10 labels and 6,000 images of each type. Each image is colored and 32×32 in size.

A CNN typically consists of 3 types of layers:

Convolution Layer

Pooling Layer

Fully Connected Layer

You might find local response normalization layers in some older CNNs (such as AlexNet), but they are rarely used these days. We’ll consider the three layer types one by one.

Convolution Layer

Since convolution layers form the crux of the network, I’ll consider them first. Each layer can be visualized in the form of a block or a cuboid. For instance in the case of CIFAR-10 data, the input layer would have the following form:

Here you can see, this is the original image which is 32×32 in height and width. The depth here is 3 which corresponds to the Red, Green and Blue colors, which form the basis of colored images. Now a convolution layer is formed by running a filter over it. A filter is another block or cuboid of smaller height and width but same depth which is swept over this base block. Let’s consider a filter of size 5x5x3.

We start this filter from the top left corner and sweep it till the bottom right corner. This filter is nothing but a set of weights, i.e. 5x5x3 = 75 weights + 1 bias = 76 parameters. At each position, the weighted sum of the pixels is calculated as W^T X + b and a new value is obtained. A single filter will result in a volume of size 28x28x1.

Note that multiple filters are generally run at each step. Therefore, if 10 filters are used, the output would look like:

Here the filter weights are parameters which are learned during the back-propagation step. You might have noticed that we got a 28×28 block as output when the input was 32×32. Why so? Let’s look at a simpler case.

Suppose the initial image had size 6x6xd and the filter has size 3x3xd. Here I’ve kept the depth as d because it can be anything and it’s immaterial as it remains the same in both. Since the depth is the same, we can have a look at the front view of how the filter would work:

Here we can see that the result would be a 4x4x1 volume block. Notice there is a single output for the entire depth at each location of the filter. But you need not do this visualization all the time. Let’s define a generic case where the image has dimension NxNxd and the filter has FxFxd. Also, let’s define another term, stride (S), which is the number of cells (in the above matrix) to move in each step. In the above case, we had a stride of 1 but it can be a higher value as well. So the size of the output will be:

output size = (N – F)/S + 1

You can validate the first case where N=32, F=5, S=1. The output had 28 pixels which is what we get from this formula as well. Please note that some S values might result in non-integer result and we generally don’t use such values.
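The formula is simple enough to wrap in a helper. The function name `conv_output_size` is my own, and raising an error for non-integer results is one possible design choice, in line with the remark above that such stride values are generally avoided.

```python
def conv_output_size(n, f, s=1):
    """Output size (N - F)/S + 1 for an NxN input, FxF filter and stride S."""
    if (n - f) % s != 0:
        raise ValueError(f"stride {s} does not fit: (N - F)/S is not an integer")
    return (n - f) // s + 1

print(conv_output_size(32, 5, 1))  # 28
print(conv_output_size(6, 3, 1))   # 4
```

For example, `conv_output_size(32, 5, 2)` raises a `ValueError`, because (32 − 5)/2 is not a whole number.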

Let’s consider an example to consolidate our understanding. Starting with the same image as before of size 32×32, we need to apply 2 convolution layers consecutively: first 10 filters of size 7 with stride 1, and next 6 filters of size 5 with stride 2. Before looking at the solution below, just think about 2 things:

What should be the depth of each filter?

What will be the resulting size of the image at each step?

Here is the answer:

Notice here that the size of the images is getting shrunk consecutively. This will be undesirable in case of deep networks where the size would become very small too early. Also, it would restrict the use of large size filters as they would result in faster size reduction.
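The example can be worked through in code. One caveat: for the second layer, (26 − 5)/2 + 1 = 11.5 is not an integer, so this sketch assumes the result is rounded down, which is the convention used by common deep learning frameworks. The helper name `conv_shape` is hypothetical.

```python
def conv_shape(in_shape, num_filters, f, s):
    """Return (filter shape, output shape) for a convolution without padding."""
    h, w, d = in_shape
    filter_shape = (f, f, d)   # filter depth always matches the input depth
    out_shape = ((h - f) // s + 1, (w - f) // s + 1, num_filters)
    return filter_shape, out_shape

filter1, shape1 = conv_shape((32, 32, 3), num_filters=10, f=7, s=1)
filter2, shape2 = conv_shape(shape1, num_filters=6, f=5, s=2)

print(filter1, shape1)  # (7, 7, 3) (26, 26, 10)
print(filter2, shape2)  # (5, 5, 10) (11, 11, 6)
```

So the first filters have depth 3 (the colour channels) and the second filters have depth 10 (the number of filters in the previous layer).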

To prevent this, we generally use a stride of 1 along with zero-padding of size (F-1)/2. Zero-padding is nothing but adding additional zero-value pixels towards the border of the image.

Consider the example we saw above with 6×6 image and 3×3 filter. The required padding is (3-1)/2=1. We can visualize the padding as:

Here you can see that the image now becomes 8×8 because of padding of 1 on each side. So now the output will be of size 6×6 same as the original image.
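The same padding can be reproduced with NumPy's `np.pad`. The 6×6 array below is a made-up image used only to check the sizes.

```python
import numpy as np

image = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 image
f = 3                                              # filter size
p = (f - 1) // 2                                   # padding of 1 on each side

# Add a border of zero-valued pixels around the image.
padded = np.pad(image, pad_width=p, mode="constant", constant_values=0)
out_size = (padded.shape[0] - f) // 1 + 1          # stride 1

print(padded.shape)  # (8, 8)
print(out_size)      # 6
```

The original pixels sit untouched in the middle of the padded array, i.e. `padded[1:-1, 1:-1]` equals `image`.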

Now let’s summarize a convolution layer as follows:

Input size: W1 x H1 x D1

Hyper-parameters:

K: #filters

F: filter size (FxF)

S: stride

P: amount of padding

Output size: W2 x H2 x D2

W2 = (W1 – F + 2P)/S + 1

H2 = (H1 – F + 2P)/S + 1

D2 = K

#parameters = (F.F.D).K + K

F.F.D : Number of parameters for each filter (analogous to volume of the cuboid)

(F.F.D).K : Volume of each filter multiplied by the number of filters

+K: adding K parameters for the bias term
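The parameter count can be checked against the 5x5x3 filter discussed earlier. The function name `conv_layer_params` is just an illustrative choice.

```python
def conv_layer_params(f, d, k):
    """Number of learnable parameters: (F*F*D) weights per filter, plus K biases."""
    return f * f * d * k + k

print(conv_layer_params(f=5, d=3, k=1))    # 76
print(conv_layer_params(f=5, d=3, k=10))   # 760
```

The first result matches the 76 parameters counted by hand for a single 5x5x3 filter.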

Some additional points to be taken into consideration:

K should be set as powers of 2 for computational efficiency

F is generally taken as odd number

F=1 might sometimes be used and it makes sense because there is a depth component involved

Filters might be called kernels sometimes

Having understood the convolution layer, let’s move on to the pooling layer.

Pooling Layer

When we use padding in a convolution layer, the image size remains the same. So, pooling layers are used to reduce the size of the image. They work by sampling in each layer using filters. Consider the following 4×4 layer. If we use a 2×2 filter with stride 2 and max-pooling, we get the following response:

Here you can see that each of the four 2×2 blocks is reduced to a single value, its maximum. Generally, max-pooling is used but other options like average pooling can be considered.
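Here is a small sketch of 2×2 pooling with stride 2 on a 4×4 layer. The numbers in the layer are made up for illustration.

```python
import numpy as np

layer = np.array([[1, 3, 2, 1],
                  [4, 6, 5, 0],
                  [7, 2, 9, 8],
                  [1, 0, 3, 4]], dtype=float)

# Split the 4x4 layer into four 2x2 blocks:
# axes are (block row, row inside block, block column, column inside block).
blocks = layer.reshape(2, 2, 2, 2)

max_pooled = blocks.max(axis=(1, 3))    # largest value in each block
avg_pooled = blocks.mean(axis=(1, 3))   # average of each block

print(max_pooled)  # [[6. 5.] [7. 9.]]
print(avg_pooled)  # [[3.5 2. ] [2.5 6. ]]
```

Either way the 4×4 layer shrinks to 2×2, and pooling adds no learnable parameters.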

Fully Connected Layer

At the end of convolution and pooling layers, networks generally use fully-connected layers in which each pixel is considered as a separate neuron just like a regular neural network. The last fully-connected layer will contain as many neurons as the number of classes to be predicted. For instance, in CIFAR-10 case, the last fully-connected layer will have 10 neurons.

5. Case Study: AlexNet

I recommend reading the prior section multiple times and getting a hang of the concepts before moving forward.

In this section, I will discuss the AlexNet architecture in detail. To give you some background, AlexNet is the winning solution of the IMAGENET Challenge 2012. This is one of the most reputed computer vision challenges, and 2012 was the first time that a deep learning network was used for solving this problem.

Also, this resulted in a significantly better result as compared to previous solutions. I will share the network architecture here and review all the concepts learned above.

The detailed solution has been explained in this paper. I will explain the overall architecture of the network here. AlexNet consists of an 11-layer CNN with the following architecture:

Here you can see 11 layers between input and output. Let’s discuss each one of them individually. Note that the output of each layer will be the input of the next layer, so you should keep that in mind.

Layer 0: Input image

Size: 227 x 227 x 3

Note that in the paper referenced above, the network diagram has 224x224x3 printed which appears to be a typo.

Layer 1: Convolution with 96 filters, size 11×11, stride 4, padding 0

Size: 55 x 55 x 96

(227-11)/4 + 1 = 55 is the size of the outcome

96 depth because 1 set denotes 1 filter and there are 96 filters

Layer 2: Max-Pooling with 3×3 filter, stride 2

Size: 27 x 27 x 96

(55 – 3)/2 + 1 = 27 is size of outcome

depth is same as before, i.e. 96 because pooling is done independently on each layer

Layer 3: Convolution with 256 filters, size 5×5, stride 1, padding 2

Size: 27 x 27 x 256

Because of padding of (5-1)/2=2, the original size is restored

256 depth because of 256 filters

Layer 4: Max-Pooling with 3×3 filter, stride 2

Size: 13 x 13 x 256

(27 – 3)/2 + 1 = 13 is size of outcome

Depth is same as before, i.e. 256 because pooling is done independently on each layer

Layer 5: Convolution with 384 filters, size 3×3, stride 1, padding 1

Size: 13 x 13 x 384

Because of padding of (3-1)/2=1, the original size is restored

384 depth because of 384 filters

Layer 6: Convolution with 384 filters, size 3×3, stride 1, padding 1

Size: 13 x 13 x 384

Because of padding of (3-1)/2=1, the original size is restored

384 depth because of 384 filters

Layer 7: Convolution with 256 filters, size 3×3, stride 1, padding 1

Size: 13 x 13 x 256

Because of padding of (3-1)/2=1, the original size is restored

256 depth because of 256 filters

Layer 8: Max-Pooling with 3×3 filter, stride 2

Size: 6 x 6 x 256

(13 – 3)/2 + 1 = 6 is size of outcome

Depth is same as before, i.e. 256 because pooling is done independently on each layer

Layer 9: Fully Connected with 4096 neurons

In this layer, each of the 6x6x256=9216 pixels is fed into each of the 4096 neurons, and the weights are determined by back-propagation.

Layer 10: Fully Connected with 4096 neurons

Similar to layer #9

Layer 11: Fully Connected with 1000 neurons

This is the last layer and has 1000 neurons because IMAGENET data has 1000 classes to be predicted.
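The sizes listed for layers 1 to 8 can be reproduced with the output-size formula used above, extended with the standard padding term: (W − F + 2P)/S + 1. The sketch below uses the filter sizes, strides, and padding values from the layer descriptions; the helper name `output_size` and the layer labels are illustrative.

```python
def output_size(input_size, filter_size, stride, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (input_size - filter_size + 2 * padding) // stride + 1


# (name, filter size, stride, padding, number of filters or None for pooling)
layers = [
    ("conv1", 11, 4, 0, 96),
    ("pool2", 3, 2, 0, None),
    ("conv3", 5, 1, 2, 256),
    ("pool4", 3, 2, 0, None),
    ("conv5", 3, 1, 1, 384),
    ("conv6", 3, 1, 1, 384),
    ("conv7", 3, 1, 1, 256),
    ("pool8", 3, 2, 0, None),
]

size, depth = 227, 3  # Layer 0: input image
shapes = []
for name, f, s, p, k in layers:
    size = output_size(size, f, s, p)
    if k is not None:  # pooling keeps the depth unchanged
        depth = k
    shapes.append((name, size, size, depth))
    print(name, size, "x", size, "x", depth)

fc_inputs = size * size * depth
print("inputs to layer 9:", fc_inputs)  # 9216
```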

I understand this is a complicated structure, but once you understand the layers, it’ll give you a much better understanding of the architecture. Note that you will find a different representation of the structure if you look at the AlexNet paper. This is because GPUs were not very powerful at that time, and the authors used 2 GPUs for training the network, so the processing was divided between the two.

ZFNet: winner of 2013 challenge

GoogLeNet: winner of 2014 challenge

VGGNet: a good solution from 2014 challenge

ResNet: winner of 2015 challenge designed by Microsoft Research Team

This video gives a brief overview and comparison of these solutions towards the end.

6. Implementing CNNs using GraphLab

Having understood the theoretical concepts, let’s move on to the fun part (practical) and make a basic CNN on the CIFAR-10 dataset which we’ve downloaded before.

I’ll be using GraphLab for the purpose of running algorithms. Instead of GraphLab, you are free to use alternative tools such as Torch, Theano, Keras, Caffe, TensorFlow, etc. But GraphLab allows a quick and dirty implementation as it takes care of the weight initialization and network architecture on its own.

We’ll work on the CIFAR-10 dataset which you can download from here. The first step is to load the data. This data is packed in a specific format which can be loaded using the following code:

import pandas as pd
import numpy as np
import cPickle

#Define a function to load each batch as dictionary:
def unpickle(file):
    with open(file, 'rb') as fo:
        batch = cPickle.load(fo)
    return batch

#Make dictionaries by calling the above function:
batch1 = unpickle('data/data_batch_1')
batch2 = unpickle('data/data_batch_2')
batch3 = unpickle('data/data_batch_3')
batch4 = unpickle('data/data_batch_4')
batch5 = unpickle('data/data_batch_5')
batch_test = unpickle('data/test_batch')

#Define a function to convert this dictionary into dataframe with image pixel array and labels:
def get_dataframe(batch):
    df = pd.DataFrame(batch['data'])
    df['image'] = df.as_matrix().tolist()
    df.drop(range(3072), axis=1, inplace=True)
    df['label'] = batch['labels']
    return df

#Define train and test files:
train = pd.concat([get_dataframe(batch1), get_dataframe(batch2), get_dataframe(batch3),
                   get_dataframe(batch4), get_dataframe(batch5)], ignore_index=True)
test = get_dataframe(batch_test)

We can verify this data by looking at the head and shape of the data as follows:

print train.head()

print train.shape, test.shape

Since we’ll be using graphlab, the next step is to convert this into a graphlab SFrame and run neural network. Let’s convert the data first:

import graphlab as gl

gltrain = gl.SFrame(train)
gltest = gl.SFrame(test)
model = gl.neuralnet_classifier.create(gltrain, target='label', validation_set=None)

Here it used a simple fully connected network with 2 hidden layers and 10 neurons each. Let’s evaluate this model on test data.


As you can see, we have a pretty low accuracy of ~15%. This is because it is a very basic network. Let’s try to make a CNN now. But if we go about training a deep CNN from scratch, we will face the following challenges:

The available data is too little to capture all the required features

Training deep CNNs generally requires a GPU, as a CPU is not powerful enough to perform the required calculations. Thus we won’t be able to run it on our system. We can probably rent an Amazon AWS instance.

To overcome these challenges, we can use pre-trained networks. These are networks like AlexNet which have been pre-trained on many images, so the weights for the deep layers have already been determined. The only challenge is to find a pre-trained network which has been trained on images similar to the ones we want to classify. If the pre-trained network was not built on images of a similar domain, the features will not make much sense and the classifier will not reach a high accuracy.

Before proceeding further, we need to convert these images into the size used in ImageNet which we’re using for classification. The GraphLab model is based on 256×256 size images. So we need to convert our images to that size. Let’s do it using the following code:

#Convert pixels to graphlab image format
gltrain['glimage'] = gl.SArray(gltrain['image']).pixel_array_to_image(32, 32, 3, allow_rounding = True)
gltest['glimage'] = gl.SArray(gltest['image']).pixel_array_to_image(32, 32, 3, allow_rounding = True)

#Remove the original column
gltrain.remove_column('image')
gltest.remove_column('image')
gltrain.head()

Here we can see that a new column of type graphlab image has been created, but the images are in 32×32 size. So we convert them to 256×256 using the following code:

#Convert into 256x256 size
gltrain['image'] = gl.image_analysis.resize(gltrain['glimage'], 256, 256, 3)
gltest['image'] = gl.image_analysis.resize(gltest['glimage'], 256, 256, 3)

#Remove old column:
gltrain.remove_column('glimage')
gltest.remove_column('glimage')
gltrain.head()

Now we can see that the image has been converted into the desired size. Next, we will load the ImageNet pre-trained model in graphlab and use the features created in its last layer into a simple classifier and make predictions.

Let’s start by loading the pre-trained model.

#Load the pre-trained model:

Now we have to use this model to extract features which will be passed into a classifier. Note that the following operations may take a lot of computing time. I use a Macbook Pro 15″ and I had to leave it running for the whole night!

gltrain['features'] = pretrained_model.extract_features(gltrain)
gltest['features'] = pretrained_model.extract_features(gltest)

Let’s have a look at the data to make sure we have the features:


Although we have the features, notice that a lot of them are zeros. You can understand this as a result of the smaller data set: the ImageNet model was built on 1.2 million images, so many of the features learned from those images don’t make sense for this data and come out as zero.
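One way to quantify how sparse the extracted features are is to measure the fraction of zeros and to look for feature columns that are zero for every image. The sketch below uses a tiny made-up feature matrix in NumPy rather than the real GraphLab output; `zero_fraction` is a hypothetical helper.

```python
import numpy as np


def zero_fraction(features):
    """Fraction of entries in the feature matrix that are exactly zero."""
    features = np.asarray(features)
    return float(np.mean(features == 0))


# Hypothetical features: 2 images x 4 features
example = np.array([[0.0, 1.2, 0.0, 0.0],
                    [0.5, 0.0, 0.0, 2.1]])

print(zero_fraction(example))  # 0.625

# Columns that are zero for every image carry no information for the classifier
all_zero_columns = np.where(~example.any(axis=0))[0]
print(all_zero_columns)  # [2]
```

Columns like these are natural candidates when removing redundant features.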

simple_classifier = gl.classifier.create(gltrain, features = ['features'], target = 'label')

The various outputs are:

The final model selection is based on a validation set with 5% of the data. The results are:

So we can see that Boosted Trees Classifier has been chosen as the final model. Let’s look at the results on test data:


So we can see that the test accuracy is now ~50%. It’s a decent jump from 15% to 50% but there is still huge potential to do better. The idea here was to get you started and I will skip the next steps. Here are some things which you can try:

Remove the redundant features in the data

Perform hyper-parameter tuning in models

Search for pre-trained models which are trained on images similar to this dataset


Now, it’s time to take the plunge and actually play with some other real datasets. So are you ready to take on the challenge?

End Notes

In this article, we covered the basics of computer vision using deep Convolution Neural Networks (CNNs). We started by appreciating the challenges involved in designing artificial systems which mimic the eye. Then, we looked at some of the traditional techniques, prior to deep learning, and got some intuition into their drawbacks.

We moved on to understanding some aspects of tuning a neural network, such as activation functions, weight initialization, and data pre-processing. Next, we got some intuition into why deep CNNs should work better than traditional approaches and we understood the different elements present in a general deep CNN.

Subsequently, we consolidated our understanding by analyzing the architecture of AlexNet, the winning solution of ImageNet 2012 challenge. Finally, we took the CIFAR-10 data and implemented a CNN on it using a pre-trained AlexNet deep network.



Dynamic Character To Web Applications

Introduction to AngularJS Application


AngularJS extends HTML syntax and helps in creating applications more efficiently. Because HTML is mainly a static language, AngularJS is used to make it dynamic. AngularJS follows the MVC (Model View Controller) concept. The main idea behind MVC is to separate the data, logic, and view layers. The view receives data from the model and displays it to the user. When the user interacts with the application, the controller changes the data in the model, and the view displays the updated information once the model notifies it of the changes. In AngularJS, data is stored in properties of an object, controllers are JS classes, and the view is the DOM (Document Object Model).

Concepts of AngularJS Application

The concepts of AngularJS Application with their examples are as follows:

1. Directives to extend HTML attributes

2. Scope

It is used for the communication between controller and view. It binds the view to the view model and to the functions defined in the controller. AngularJS supports nested or hierarchical scopes. The scope is the data source for AngularJS, and properties can be added to or removed from it when required. All data manipulation and assignment happens through scope objects when doing CRUD operations.

3. Controllers

These are used to define the scope for the views. The scope can be thought of as the variables and functions that a view may bind to.

First Name: Last Name: Full Name: {{firstName + " " + lastName}}

var app = angular.module('myApp', []);
app.controller('myCtrl', function ($scope) {
    $scope.firstName = "James";
    $scope.lastName = "Anderson";
});

4. Data Binding

Example: When the user types into the text box, the changed value is shown in upper and lower case in the label. That is two-way data binding.

5. Services

6. Routing

It helps in dividing the app into multiple views and binding those views to controllers. It divides an SPA into multiple views to logically organize the app and make it more manageable. The otherwise() call sets the default route.

app.config(['$routeProvider', function ($routeProvider) {
    $routeProvider.
        when('/List', {
            templateUrl: 'Views/list.html',
            controller: 'ListController'
        }).
        when('/Add', {
            templateUrl: 'Views/add.html',
            controller: 'AddController'
        }).
        otherwise({
            redirectTo: '/List'
        });
}]);

7. Filters

These are used to extend the behavior of binding expressions and directives. They allow formatting data and values or applying certain conditions. Filters are invoked in HTML with a pipe inside expressions.

var app = angular.module('myApp', []);
app.controller("namesCtrl", function ($scope) {
    $scope.friends = [
        { name: "Karl", age: 27, city: "Bangalore" },
        { name: "Lewis", age: 55, city: "Newyork" },
    ];
});

8. Expressions

9. Modules

The module is the container of an application and application controllers belong to a module. It is a collection of functions and divides applications into small and reusable functional components. The module can be identified by a unique name and can be dependent on other modules.

{{ firstName + " " + lastName }}

10. Testing

To test AngularJS code, test frameworks such as Jasmine and Karma are widely used. These testing frameworks mainly support mocking and are highly configurable using JSON files with the help of various plugins.


AngularJS provides a framework for developing web applications quickly and efficiently. AngularJS code is readily unit-testable. It is mainly used for SPAs, which makes development faster. It is easy to understand and simple to learn for JavaScript developers. AngularJS is still useful for beginners, as they can grasp it easily.

Angular is gaining pace in front-end development as it makes development faster. Large applications can be handled easily in Angular, and it works better with components. Angular has strong areas and significant features to use, and its later versions have been released with new features and better performance.


The Best Social Networks For Private People

Social networking and privacy do not go hand-in-hand. After all, the key to a good social networking experience is sharing, and the key to good sharing is…lack of discrimination.

But what if you’re not a social butterfly, a broadcaster, or someone with a deep desire to be Internet famous? What if you want to use social media to share photos, videos, and status updates with your family and close friends—but not with the entire world? The good news is that you can still use social networks, even major ones such as Facebook and Twitter. You just have to be careful. And if you’d rather not wrestle with Facebook’s privacy settings, you can check out some ultra-exclusive social networks that really value your privacy.


If you read the news at all, you probably think that Facebook is antiprivacy. Critics say the social network has complicated privacy settings and that CEO Mark Zuckerberg has a lax view of privacy in general. But if you’re a private person who wants to share with friends and family, Facebook is the best major social network for you. Facebook operates on the friend-request model, which means that prospective friends must receive your approval one by one (unlike Twitter followers) before entering your neighborhood of Facebook.

Facebook offers shortcuts to lock down your account

Facebook’s privacy and security settings are complex, to say the least, and you can spend hours tweaking and perfecting them. But if you’re strapped for time, you should pay particular attention to a few key settings.

Use Facebook’s ‘View As’ feature to see how other people see your profile


Private people who want to share selectively with a tight-knit group of friends and family should probably just stay away from Twitter. Twitter is a great social network for public figures (and people who want to be public figures), because it essentially functions as a broadcasting platform. But if you’re looking to make or keep relationships, it’s not the most suitable network for your needs.

Twitter’s privacy settings are simple to set up.

There’s no way to limit your past tweets (public tweets always remain public)—and if you unprotect your account at some point, all of the previously protected tweets will become public, and will stay public forever.

Ultraprivate social networks

If you feel that Facebook and Twitter are too public, you may want to take a look at private social networks. The following social networks are designed for close-knit groups who really want to connect with each other—not social butterflies who want to broadcast their lives across the Internet.

Couple is a social network for pairs.

Couple: Formerly known as Pair, Couple is the ultimate private social network—a smartphone-based network designed expressly for couples. In fact, you can only have one friend on Couple: your significant other. Couple features a timeline that’s a bit like a souped-up text message exchange—you and your partner can add photos, reminders, important dates, drawings, and videos, along with regular text messages.

FamilyWall: If you’re looking for a slightly larger social network, FamilyWall helps you keep track of your entire family. At this private, Facebook-like social network for families, you can add dates and events, photos, videos, contacts, messages, and even Foursquare-style check-ins. You can also add “Family landmarks” such as schools, doctors, and fitness centers.

23snaps: Instead of posting photos of your children on Facebook or Instagram, try posting them to 23snaps, a smartphone-based social network that lets you create a unique, private online photostream. 23snaps lets you add photos, videos, and status updates to a special photostream of your child (you can add a stream for each child) and then share those photos with your friends and family. Another option is to co-manage a 23snaps account with your partner, so you can both add photos of your kids.

23snaps is a private, photo-sharing network for families.

Path: Perhaps the best-known private social network is Path. This smartphone-based social network limits your friends list to 150, the maximum number of friends a human being can realistically keep track of, according to studies. By virtue of being small, Path is one of the more private social networks you can join. But you’ll have to choose your friends wisely. Path also may not be as private as it once was. Users this week complained that a 2-month-old feature of the Path app that lets you invite contacts to join the network is actually spamming their address books with mass texts. Path says the texts are the result of user error.

Nextdoor: If you want to restrict your social network communication to people you know in real life, the neighborhood social network Nextdoor might be right for you. Nextdoor requires all members to verify their address (the service sends you a physical postcard with a code on it) before allowing them to join their neighborhood’s group. As a result of this structure, the only people you can talk to on Nextdoor are those who live within shouting distance of your house. Nextdoor turns your physical neighborhood into a digital network.

Privacy…the choice is yours

Privacy-minded people don’t have to give up social networking. Plenty of options exist for friends, families, and even couples who want to communicate privately. But the key is to make sure that you really want privacy. Some portion of the appeal of social networking to most people is exhibitionist; so before you go to ground, make absolutely sure that you don’t harbor any latent fantasies of seeing your videos go viral.

Integrate Mailchimp Using Ai With 1000+ Applications

Email marketing is one of the oldest ways to promote and market any brand. With time, it has evolved from sending plain text to sending engaging graphics. It is not only about sending emails to thousands of people and waiting for them to respond. It is about earning quality time in customers’ inboxes.

In this blog, let us discuss major MailChimp integrations and how they can benefit your business.

Major MailChimp Integrations You Need to Know

You might be curious to know: if MailChimp is efficient enough to manage email marketing campaigns alone, why do we need to integrate it with other software in the first place?

In this technology-driven era, if you integrate two business-related software, and set up a trigger with some expected action (with the help of third-party software), then several workflows can be automated.

Similarly, if you integrate MailChimp with various useful software like Google Sheets, Salesforce, Slack, etc., then you can save your time and cost spent in accomplishing routine tasks. Let us explore some of the major MailChimp integrations that can truly help you in automating marketing campaigns and other relatable tasks.

MailChimp Salesforce Integration

Salesforce is a cloud-based Customer Relationship Management (CRM) software that helps in understanding customers and maintaining long-lasting relations with them. It helps in assuring that every customer interaction will contribute to the revenues of your business.

For instance, suppose you are using MailChimp for your latest email campaign. Your marketing team is rigorously sending emails to a targeted audience and at the same time informing the sales team about the day-to-day status. Sometimes, updating the status consumes a lot of time. For better coordination between the marketing and sales teams, and to automate the data syncing between them, you can integrate Salesforce with MailChimp.


MailChimp HubSpot Integration

Integrating MailChimp with HubSpot can help your business in automatically syncing data for better customer conversion rates, categorizing the targeted audience, and handling errors in databases. This integration will help in keeping track of every new user connected through email campaign and adding every new contact to an email list of a campaign that is added to HubSpot CRM.


MailChimp Facebook Integration

Integrating MailChimp with Facebook can help you in driving the major traffic for the growth of your business. You can automatically inform your followers and potential customers on Facebook about the new email campaigns you are going to start with MailChimp.

For instance, you are going to launch a new product of your brand. You can announce the grand launch on your Facebook Page with the help of attractive videos, eye-catchy visuals, and live sessions. This could help you attract millions of new people and update the old ones. By integrating Facebook Page with MailChimp, you can automatically inform people about anything related to your brand with help of emails and ask them to follow you on Facebook.


MailChimp Gmail Integration

Gmail is the most popular free web-based email service offered by Google. It has a crisp, clean, and clear interface that allows every individual to easily send and receive emails and further categorize them into different categories.

Integrating MailChimp with Gmail can help your business give a personalized touch to all the people mentioned in your mailing list. After setting up the integration, you can automate sending emails to the personal accounts of your subscribers for giving them regular updates. Further, it would automatically update your mailing once you label emails in Gmail.


MailChimp PayPal Integration

PayPal is one of the leading online payment gateways that allows individuals and businesses to send and receive money worldwide. You can easily pay for your cab services, online shopping, and much more in more than 100 currencies with PayPal.

Integrating MailChimp with PayPal can help your business automate the process of manually entering the details of every customer in the mailing list used by MailChimp. Every time new users get connected with your website and complete the payment process; the relevant details get automatically added to the mailing list for further communication.


Summing Up

In this technology-driven era, you cannot ignore the importance of workflow automation in your business. You need to automate various workflows for increasing the efficiency and productivity of your business.

You can easily automate routine workflows just by integrating two relevant software products. For integration and automation, we recommend the leading software, Appy Pie Connect. You can easily connect hundreds of software products with Appy Pie Connect. You don’t need to code a single line for workflow automation.


Food Delivery Time Prediction With Lstm Neural Network


More businesses are moving online these days, and consumers are ordering online instead of traveling to the store to buy. Zomato and Swiggy are popular online platforms for ordering food products. Other examples are Uber Eats, Food Panda, and Deliveroo, which also have similar services. They provide food delivery options. If the order is complete, a partner will pick up and deliver the meal to the given address via a delivery service. In online food-ordering businesses, delivery time is critical. As a result, estimated food delivery time prediction to reach the buyer’s location is critical. The LSTM neural network is one of the methods that may be implemented in this circumstance. Come on, let’s study the LSTM models in detail.

This article was published as a part of the Data Science Blogathon.

Objectives of Food Delivery Time Prediction

Make an accurate estimate of when the food will arrive, thus increasing customer confidence.

Plan delivery routes and driver schedules more efficiently by predicting how many orders will arrive so that delivery providers can use their resources better.

Make deliveries faster by looking at past delivery data and determining the attributes that affect them.

Grow business because of buyer satisfaction with the speed of delivery.

Based on these goals, we will use the LSTM Neural Network to develop a model that can estimate the delivery time of orders accurately based on the age of the delivery partner, the partner’s rating, and the distance between the restaurant and the buyer’s place. This article will guide you on predicting food delivery time using LSTM. Now, let’s make the prediction through the steps in the article.

Step 1: Import Library

import pandas as pd
import numpy as np
import plotly.express as px
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, LSTM

Pandas and NumPy libraries are used together for data analysis. NumPy provides fast mathematical functions for multidimensional arrays, while Pandas makes it easier to analyze and manipulate data with more complex data structures like DataFrame and Series. Meanwhile, the Plotly Express library makes it easy for users to create interactive visualizations in Python. It can use minimal code to create various charts, such as scatter plots, line charts, bar charts, and maps. The Sequential class is a type of model in Keras that allows users to create a neural network by adding layers to it in sequential order. Then, Dense and LSTM are to create layers in the Keras model and also customize their configurations.

Step 2: Read the Data

variables for the particular task at hand. And for this particular case, the appropriate dataset is on my github. The dataset given here is a cleaned version of the original dataset submitted by Gaurav Malik on Kaggle.

#reading dataset
data = pd.read_csv(url)
data.sample(5)

Let’s see detailed information about the dataset we use with the info() command.

dataset overview

Checking a dataset’s columns and null values is essential in any data analysis project. Let’s do it.


The dataset is complete with no null values, so let’s proceed!

Step 3: Haversine Formula

The Haversine formula is used to find the distance between two geographical locations. The formula refers to this Wikipedia page as follows:

It takes the latitude and longitude of two points and converts the angles to radians to perform the necessary calculations. We use this formula because the dataset doesn’t provide the distance between the restaurant and the delivery location. There are only latitude and longitude. So, let’s calculate it and then create a distance column in the dataset.
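For reference, the haversine distance is commonly written as follows, where φ₁, φ₂ are the latitudes and λ₁, λ₂ the longitudes of the two points in radians, and R is the earth's radius. The symbol names are the usual textbook ones and are not taken from the original article.

```latex
\begin{aligned}
a &= \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right)
   + \cos\varphi_1 \,\cos\varphi_2 \,\sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right) \\
c &= 2 \,\operatorname{atan2}\!\left(\sqrt{a},\ \sqrt{1 - a}\right) \\
d &= R \, c
\end{aligned}
```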

R = 6371 ## The earth's radius (in km)

def deg_to_rad(degrees):
    return degrees * (np.pi/180)

## The haversine formula
def distcalculate(lat1, lon1, lat2, lon2):
    d_lat = deg_to_rad(lat2-lat1)
    d_lon = deg_to_rad(lon2-lon1)
    a1 = np.sin(d_lat/2)**2
    a2 = np.cos(deg_to_rad(lat1)) * np.cos(deg_to_rad(lat2)) * np.sin(d_lon/2)**2
    a = a1 + a2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1-a))
    return R * c

# Create distance column & calculate the distance
data['distance'] = np.nan
for i in range(len(data)):
    data.loc[i, 'distance'] = distcalculate(data.loc[i, 'Restaurant_latitude'],
                                            data.loc[i, 'Restaurant_longitude'],
                                            data.loc[i, 'Delivery_location_latitude'],
                                            data.loc[i, 'Delivery_location_longitude'])

The parameters “lat” and “lon” stand for latitude and longitude. The deg_to_rad function converts degrees to radians. The variables a1 and a2 hold the two terms of the Haversine formula, and the variable a stores their sum. The variable c stores the central angle between the two points, which is multiplied by the earth’s radius R to produce the distance between the two location points.

We have added a distance column to the dataset. Now, we will analyze the effect of distance and delivery time.
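On larger datasets, a row-by-row loop can be slow. As an alternative sketch, the same haversine calculation can be applied to whole columns at once with NumPy. The function name `haversine_km` is hypothetical; the column names in the usage note are the ones used in this article.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; accepts scalars, arrays, or pandas Series."""
    # Convert all four inputs from degrees to radians
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    d_lat = lat2 - lat1
    d_lon = lon2 - lon1
    a = np.sin(d_lat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(d_lon / 2) ** 2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
    return EARTH_RADIUS_KM * c


# One degree of longitude along the equator is roughly 111.19 km
print(round(float(haversine_km(0.0, 0.0, 0.0, 1.0)), 2))  # 111.19
```

With the DataFrame from this article, the call would look like `data['distance'] = haversine_km(data['Restaurant_latitude'], data['Restaurant_longitude'], data['Delivery_location_latitude'], data['Delivery_location_longitude'])`.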

figure = px.scatter(data_frame = data, x="distance", y="Time_taken(min)", size="Time_taken(min)", trendline="ols", title = "Relationship Between Time Taken and Distance")

The graph shows that there is a consistent relationship between the time taken and the distance traveled for food delivery. This means that the majority of delivery partners deliver food within a range of 25–30 minutes, regardless of the distance.
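The `trendline="ols"` option draws an ordinary least squares line through the points. To see what such a line amounts to numerically, a straight-line fit can be sketched with `np.polyfit`. The five data points below are made up purely for illustration and are not taken from the dataset.

```python
import numpy as np

# Hypothetical values, for illustration only
distance = np.array([1.0, 2.0, 3.0, 4.0, 5.0])          # km
time_taken = np.array([22.0, 24.0, 26.0, 28.0, 30.0])   # minutes

# Degree-1 least squares fit: time_taken ≈ slope * distance + intercept
slope, intercept = np.polyfit(distance, time_taken, deg=1)
print(round(slope, 2), round(intercept, 2))  # 2.0 20.0
```

A slope close to zero would mean delivery time barely changes with distance.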

Next, we will explore whether the delivery partner’s age affects delivery time or not.

figure = px.scatter(data_frame = data, x="Delivery_person_Age", y="Time_taken(min)", size="Time_taken(min)", color = "distance", trendline="ols", title = "Relationship Between Delivery Partner Age and Time Taken")

The graph shows that younger partners deliver food faster than their older counterparts. Now let’s explore the correlation between delivery time and delivery partner ratings.

figure = px.scatter(data_frame = data, x="Delivery_person_Ratings", y="Time_taken(min)", size="Time_taken(min)", color = "distance", trendline="ols", title = "Relationship Between Delivery Partner Ratings and Time Taken")

The graph shows an inverse linear relationship. The higher the partner’s rating, the less time is needed to deliver food, and vice versa.

The next step will be to see whether the delivery partner’s vehicle affects the delivery time or not.

fig =, x="Type_of_vehicle", y="Time_taken(min)", color="Type_of_order", title = "Relationship Between Type of Vehicle and Type of Order")

The graph shows that the type of delivery partner’s vehicle and the type of food delivered do not significantly affect delivery time.

Through the analysis above, we can determine that the delivery partner’s age, the delivery partner’s rating, and the distance between the restaurant and the delivery location are the features that have the most significant impact on food delivery time.

Step 4: Build an LSTM Model and Make Predictions

Earlier, we identified three features that significantly affect the time taken: the delivery partner’s age, the delivery partner’s rating, and distance. These three features become the independent variables (x), while the time taken becomes the dependent variable (y).

x = np.array(data[["Delivery_person_Age", "Delivery_person_Ratings", "distance"]])
y = np.array(data[["Time_taken(min)"]])
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.20, random_state=33)
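To make the 80/20 split concrete, here is a simplified plain-Python sketch of the idea: shuffle the row indices with a fixed seed, then hold out a fraction for testing. The function simple_train_test_split is hypothetical and does not reproduce the exact shuffling of scikit-learn's train_test_split; the toy rows are made up.

```python
import math
import random


def simple_train_test_split(rows, targets, test_size=0.20, seed=33):
    """Shuffle row indices with a fixed seed, then hold out a test fraction."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(len(rows) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    # Index features and targets with the same indices so pairs stay aligned.
    x_train = [rows[i] for i in train_idx]
    x_test = [rows[i] for i in test_idx]
    y_train = [targets[i] for i in train_idx]
    y_test = [targets[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Made-up toy rows of [age, rating, distance] and delivery times in minutes.
toy_x = [[20 + i, 4.0 + 0.05 * i, 2.0 * i] for i in range(10)]
toy_y = [25 + i for i in range(10)]
xtr, xte, ytr, yte = simple_train_test_split(toy_x, toy_y)
print(len(xtr), "training rows,", len(xte), "test rows")
```

Fixing the seed keeps the split reproducible between runs, which is the role random_state plays in the article's call.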

Now, we need to train an LSTM neural network to predict food delivery time. The aim is to create a precise model that uses features like distance, delivery partner age, and rating to estimate food delivery time. The trained model can then be used to predict new data points or unseen scenarios.

model = Sequential()
model.add(LSTM(128, return_sequences=True, input_shape= (xtrain.shape[1], 1)))
model.add(LSTM(64, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))
model.summary()

The code block above explains:

The first line starts building the model architecture by creating an instance of the Sequential class. The following four lines define the layers of the model. The first layer is an LSTM layer with 128 units, which returns sequences and takes input of shape (xtrain.shape[1], 1). Here, xtrain is the input training data, and shape[1] is the number of features in the input data. The return_sequences parameter is set to True because another LSTM layer follows and needs the full sequence as input. The second layer is also an LSTM layer, but with 64 units and return_sequences set to False, indicating that this is the last LSTM layer. The next line adds a dense layer with 25 units, which reduces the output of the LSTM layers to a more manageable size. The following line adds a dense layer with one unit, which is the output layer of the model. Finally, model.summary() prints an overview of the layers and their parameter counts.
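As a sanity check on the architecture, the number of trainable parameters in each layer can be worked out by hand. The sketch below assumes the standard LSTM formulation with bias terms (four weight blocks: input gate, forget gate, cell candidate, and output gate) and fully connected Dense layers with bias. The helper names are hypothetical.

```python
def lstm_params(units, input_dim):
    """Weights in a standard LSTM layer: four blocks, each with input
    weights, recurrent weights, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(units, input_dim):
    """Weights in a fully connected layer: one weight per input per unit, plus biases."""
    return units * input_dim + units


layer_counts = {
    "LSTM(128)": lstm_params(128, 1),    # each timestep carries 1 value
    "LSTM(64)": lstm_params(64, 128),    # receives the 128-unit sequence
    "Dense(25)": dense_params(25, 64),
    "Dense(1)": dense_params(1, 25),
}
for name, count in layer_counts.items():
    print(f"{name}: {count}")
print("Total:", sum(layer_counts.values()))  # Total: 117619
```

Note that the parameter count does not depend on the number of features (timesteps), only on the size of each timestep and the number of units.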

Now let’s train the previously created model.

model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(xtrain, ytrain, batch_size=1, epochs=9)

‘adam’ is a popular optimization algorithm for deep learning models, and ‘mean_squared_error’ is a common loss function for regression problems. Setting batch_size = 1 means that the model updates its weights after each sample is processed during training. The epochs parameter is set to 9, meaning the model makes nine full passes over the training data.
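To illustrate the two ideas above, the following sketch computes a mean squared error by hand and counts how many weight updates a given batch size and epoch count imply. Both helper names are hypothetical, and the sample count of 1000 is a made-up figure, not the size of the article's training set.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def weight_updates(n_samples, batch_size, epochs):
    """Number of gradient steps: one per batch, repeated for every epoch."""
    return math.ceil(n_samples / batch_size) * epochs


# Made-up delivery times (minutes) and predictions.
print(mean_squared_error([30, 20], [28, 23]))             # 6.5
print(weight_updates(1000, batch_size=1, epochs=9))        # 9000
print(weight_updates(1000, batch_size=32, epochs=9))       # 288
```

With batch_size=1 the model takes one step per sample, which gives frequent updates but makes training slower than with larger batches.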

Finally, let’s test the model’s performance for predicting food delivery times given three input parameters (delivery partner age, delivery rating, and distance).

print("Food Delivery Time Prediction using LSTM")
a = int(input("Delivery Partner Age: "))
b = float(input("Previous Delivery Ratings: "))
c = int(input("Total Distance: "))
features = np.array([[a, b, c]])
print("Delivery Time Prediction in Minutes = ", model.predict(features))

The given result is a prediction of the delivery time for a hypothetical food delivery order based on the trained LSTM neural network model using the following input features:

Delivery Partner’s Age: 33

Previous Delivery Ratings: 4.0

Total distance: 7

The output of the prediction is shown as “Delivery Time Prediction in Minutes = [[36.913715]],” which means that the model has estimated that the food delivery will take approximately 36.91 minutes to reach the destination.
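The double brackets appear because the prediction is returned as a two-dimensional array with one row per input sample and one column per output unit. A small sketch of pulling out the single value and formatting it is shown below. The function format_prediction is hypothetical, and a nested list stands in for the array so the example runs without the trained model.

```python
def format_prediction(prediction):
    """Extract the single value from a 1x1 prediction and format it."""
    minutes = float(prediction[0][0])  # first sample, first output unit
    return f"Estimated delivery time: {minutes:.2f} minutes"


# Nested list standing in for the array shown in the article's output.
print(format_prediction([[36.913715]]))  # Estimated delivery time: 36.91 minutes
```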


This article starts by calculating the distance between the restaurant and the delivery location. Then, it analyzes previous delivery times for the same distance before predicting food delivery times in real-time using LSTM. Broadly speaking, in this post, we have discussed the following:

How to calculate distance using the haversine formula

How to find the features that affect food delivery time

How to use an LSTM neural network model to predict food delivery time

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

