Considerations in Time Series – Part I

In this article, I will introduce the topic of time series forecasting and provide a high-level overview of the concepts and practices used by forecasters when building time series models.

What is Time Series Data?

Time series data is simply any time-ordered dataset. The time component serves as the primary axis and the remaining data can be either univariate or multivariate (single-stream or multi-stream). Typically when we think of time series we think of equally spaced, discrete measurements taken successively over time. But there are cases where the time intervals are inconsistent.

For example, when sampling from sensor data the records can be either time-driven or event-driven. A time-driven record is where a measurement or reading is taken at evenly-spaced time intervals. An event-driven record is (just as the name implies) when a measurement or reading is triggered by some event. In an event-driven sampling scenario we could expect to see inconsistent time intervals in the data.  

Time series data is the fastest-growing data category due to the proliferation of sensors, IoT devices, and mobile technologies. And the granularity is often high resolution. Sometimes, as in the case of autonomous vehicles, new information is generated every millisecond.

Time series are typically plotted as line charts (or run sequence plots) with time on the x-axis. Below is an example of a time series plot of financial trade activity measured every 5 minutes.

Typical Time Series Run Sequence Plot Generated From an AWS QuickSight Dashboard

The Fallacies of Forecasting

Forecasting has been around a long time. We have been trying to predict the future since ancient times by enlisting the help of prophets, oracles, and soothsayers. Their predictions came at a high cost and were ambiguous and unreliable at best. But now, thanks to mathematics and statistics, we have a way to gain insights about the future without the high cost of a drug-induced maiden.

Pythia – The Oracle at Delphi (The Early Days of Forecasting)

Today, all we need is data, a statistician, and a computer to be able to make inferences about the future. This is true if (and only if) you have the right data and you are asking the right question. In reality, forecasting can be incredibly complex and difficult even for the most sophisticated statistician. It all depends on the nature and behavior of the data and what you are trying to achieve with it.

Generally speaking, reliable and well-behaved data produces reliable and well-behaved forecasts. Less well-behaved data requires additional information or alternative modeling techniques to explain the behavior. Sometimes forecasts are little better than random guessing. But let’s look at the scenarios where forecasting has proven useful.

Who Needs Forecasting?

Forecasting is a well known tool in the business world where it is used heavily in financial and operational planning. If you are buying parts or products from an offshore factory, and it takes 8 weeks for them to manufacture and deliver to your door, then you want to have some idea of how many “widgets” you need to buy. If you are a financial portfolio manager, you want to know the best time to buy, sell, and trade your assets. If you are the CEO of a company, you are interested in knowing the forecasted financial health of your business. Below are some of the most common forecasting use cases that are in practice today:

  • Retail Operations: Predicting demand for a product in a physical or online store.
  • Service Operations: Predicting airline flight capacity or taxi fleet activity.
  • Warehousing: Predicting raw material requirements and inventory SKU counts.
  • Staffing: Predicting workforce requirements and hiring activities.
  • IT: Predicting IT infrastructure utilization and compute requirements.
  • Internet: Predicting web-traffic patterns.
  • Business: Predicting revenue, sales, expenses and cash flow.
  • Industrial IoT: Predicting machine maintenance and failure.
  • Financial: Predicting financial indicator performance from stocks, bonds, funds, and exchange rates.

Properties of Time Series Data

When we talk about time series there are a few properties we need to look for and take into consideration when we formulate a forecast.

Stationary Data

Stationary data is data that maintains the same statistical distribution over time, with a constant mean and standard deviation. We can identify stationary data by looking at the run sequence plot and observing whether a trend line exists or whether there are any seasonal patterns. In forecasting, it is preferred to work with stationary data. If the data is non-stationary, we perform a differencing operation that replaces each data point with the difference between it and the value at the previous time interval.

When we make our data stationary, we are essentially removing the effect of time and turning our dataset into a standard statistical distribution that is much easier to work with when building statistically based forecasting models.


Trend

Identifying a trend is easy to do from a run sequence plot: is the data increasing or decreasing over time? If there is a lot of noise in the data, you can perform a smoothing operation (such as a moving average) to make the trend easier to identify.


Seasonality

Seasonality is when cyclical patterns are detected in the data. There are many ways to verify and check for seasonality. It can often be identified with a simple run sequence plot, but you may also want to check with a seasonal subseries plot or multiple boxplots. Sometimes an autocorrelation plot can also indicate seasonality.

Autocorrelation Plot Indicating Seasonality
Seasonal Subseries Plot Indicating Seasonality


Autocorrelation

Autocorrelation occurs when data is correlated to itself, meaning the value of any given data point is correlated to preceding values at a specified time interval. This is often the case in time series. You can evaluate autocorrelation by looking at an autocorrelation plot or performing a Durbin-Watson test.

The plot shown below is an autocorrelation plot with a 95% confidence band. This plot shows a strong, positive, and slow-decay autocorrelation relationship between time intervals.

Partial Autocorrelation

Partial autocorrelation is slightly more nuanced than total autocorrelation in that it takes into account the correlations of the intervening lag values and removes them from the equation. A partial autocorrelation plot strives to show the true, direct correlation of a lag on any given observation.

The plot shown below indicates significant positive correlations at lags 1 and 2, with a (potential) negative correlation at lag 10. The correlations at other lags are assumed to contribute only indirectly to the total correlation at that time.

Preparing Time Series for Forecasting

The initial task before we start modeling is to understand the nature of the dataset and see if we can reduce it to its simplest, elemental pieces. This is called decomposition.


Decomposition

Decomposition is the process of extracting all the information from a time series that can be explained by trends, seasonality, and cyclical patterns. What is left should be random and statistically stationary.

The goal of decomposition is to model each of the components. A decomposition model can be assembled as an additive or multiplicative model. An additive model is used when variations around the trend do not vary with the level (or amplitude) of the observation. If a proportional relationship exists then a multiplicative model would be an appropriate choice.

Forecasting with decomposition may or may not work well. It all depends on the behavior of the data. However it is a good place to start to understand your dataset and can lead you to selecting the right forecast model.


Differencing

Differencing is a method of transforming a time series to remove seasonality and trends in order to make the dataset stationary. Differencing can be performed multiple times on the same dataset to remove both seasonality and trend cycles.

Data differencing is an important transformation technique that does two things: It makes the data stationary and stabilizes the mean of the time series.
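A quick illustration with pandas on a small made-up quarterly series: a single `diff()` removes the linear trend, while a lag-4 `diff(4)` removes the period-4 seasonal cycle:

```python
import pandas as pd

# Made-up quarterly sales with an upward trend and a period-4 seasonal bump
sales = pd.Series([10, 14, 12, 16, 14, 18, 16, 20, 18, 22, 20, 24])

first_diff = sales.diff()       # consecutive differences: removes the trend
seasonal_diff = sales.diff(4)   # year-over-year differences: removes the cycle
```

On this toy series the seasonal differences are a constant 4, i.e. the seasonal pattern has been fully removed and only the trend increment remains.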

Time Series Forecast Modeling Techniques

There are a lot of forecasting techniques, and each technique comes with its own set of variations that are used depending on the data. Below I provide a list of extremely brief descriptions of the most common forecasting techniques in practice today.

The ARIMA Family

  • AR(p) – Autoregressive: The output variable depends linearly on its own previous values plus a stochastic (noise) term. Used on univariate data that is not always stationary. Parameter ‘p’ is the number of time-unit lags that significantly impact the current observation value.
  • VAR(p) – Vector Autoregressive: The output is a generalized AR model that captures the linear interdependencies among multiple time series by allowing for more than one evolving variable. Basically, a version of AR that allows for multiple time series to be expressed in a vector. It is used on multivariate data that may, or may not, be stationary. Parameter  ‘p’ is the number of time-unit lags that significantly impact the current observation value.
  • MA(q) – Moving Average: The output variable depends linearly on the previous values of a stochastic term. Think of it as a “filter” that reduces noise in the data. Parameter ‘q’ is the window size of the moving average function. Variations include “simple”, “cumulative”, “weighted”, and “exponential”.
  • ARMA(p, q) – Autoregressive Moving Average: Combines the AR and MA models: the output variable depends on its own past values and on a weighted sum of past error terms. Works well for short-term forecasting. Parameter ‘p’ is the number of time-unit lags that significantly impact the current observation value and parameter ‘q’ is the window size of the moving average function.
  • VARMA(p, q) – Vector Autoregressive Moving Average: Output is a generalized multivariate ARMA model. Parameter  ‘p’ is the number of time-unit lags that significantly impact the current observation value and parameter ‘q’ is the window size of the moving average function.
  • ARIMA(p, d, q) – Autoregressive Integrated Moving Average: Similar to an ARMA model but uses differencing to make the data stationary. An ARIMA model can be viewed as a filter that separates the signal from the noise, such that the signal can be extrapolated into the future to obtain forecasts. Parameter ‘p’ is the number of time-unit lags that significantly impact the current observation value, parameter ‘q’ is the window size of the moving average function, and parameter ‘d’ is the number of differencing passes applied.
  • VARIMA – Vector Autoregressive Integrated Moving Average: Similar to a VARMA model but adjusts for trends in the input data by using differencing.
  • SARIMA(p, d, q)(P, D, Q)m – Seasonal Autoregressive Integrated Moving Average: Similar to an ARIMA model but has the ability to account for seasonality and repeating signals. Parameters ‘p’, ‘d’, and ‘q’ are the lags, differencing, and window size specific for addressing the trend in the data, where ‘P’, ‘D’, ‘Q’ and ‘m’ are the seasonal parameters, with ‘m’ as the seasonal time step for a single seasonal period.
  • FARIMA or ARFIMA – Fractional Autoregressive Integrated Moving Average: Similar to an ARIMA model but allows for non-integer values of the differencing parameter. Good for long-range forecasting on non-stationary data.  
  • ARCH & GARCH – (Generalized) Autoregressive Conditional Heteroskedasticity: The output describes the variance of the current error term as a function of the actual sizes of the previous periods’ error terms. In a retail inventory application, a forecaster would use GARCH to measure the uncertainty of the sales forecasts and use that to set a safety stock value.

The Exponential Smoothing Family

  • ETS(𝞪) – Exponential Smoothing: The output variable is a weighted sum of past terms, with an exponentially decreasing weight applied to all past observations. Think of it as a low-pass filter to remove high-frequency noise. Requires stationary data and a smoothing parameter alpha ‘𝞪’ (between 0 and 1).
  • Double ETS – Double Exponential Smoothing: A double recursive version of ETS that removes trends in the data.
  • Triple ETS – Triple Exponential Smoothing: A triple recursive version of ETS that removes seasonality in the data.

The Neural Net Family

  • RNN – Recurrent Neural Network: RNNs randomly sample training and test windows from the dataset while feeding in lagged data to account for seasonality.
  • MDN – Mixture Density Network: Probabilistic deep neural network where the output is a mixture of Gaussians for all points in the forecast horizon. Good for data with seasonality and a large number of peaks and valleys. Large spikes are an indication of having many associative variables.
  • LSTM – Long Short-Term Memory Network: A type of RNN that has a long-term memory capability, controlled by a set of “gates”, that mitigates the vanishing gradient problem typical of simple RNNs. A related (but distinct) issue is the “exploding” gradient problem.
  • TDNN – Time Delay Neural Network: A feed-forward neural network that breaks the input into chunks and feeds them in one at a time.

Other Forecasting Models

  • Non-Parametric Time Series: Predicts the future value distribution of a given time series by sampling from past observations. Useful when data is intermittent or sparse, and can handle data with variable time deltas.


The time component of time series data poses a unique challenge for a data scientist and it requires them to take a different approach to forecast modeling. In this article, I briefly introduced the topic of time series and reviewed common properties and techniques to consider. In the next installment (Part II) I will go into model selection and evaluation with a walk-through example.

How to Deploy a Machine Learning Risk Classification Model with Amazon AWS

In this article, I am going to walk through a high-level example of how to deploy a machine learning model using Amazon Web Services (AWS). The beauty of using AWS is that all of their services are pay-per-use with extremely competitive rates. For a data scientist just playing around and learning the AWS suite of services there are many ways to implement model deployment architectures at minimal cost.

Risk Modeling with Application Data

The architecture proposed in this article is based upon a use case where a company (let’s call them Company X) collects a lot of user or application data coming from various sources. Below are all the ways in which Company X collects application data:

  • Webform: Company X has a website where users log-in, create a profile and fill out a webform application. This is the most common method applicants use.
  • Mobile App: Since there are more people in the world with a mobile phone than a computer, Company X decides to develop a mobile app service. This is a popular choice among a growing segment of applicants.
  • ChatBot: Company X also has a chatbot service deployed on their website. Applicants submit their information to a bot agent via a chat dialog. The bot agent collects applicant information and provides an interface for answering commonly asked questions throughout the application process.
  • By Phone: Applicants can phone in their application to a call center that utilizes both live agents and voice-bot agents. Voice-bot agents collect the bulk of the applicant data, but there are also live agents available to assist.
  • By Mail: Applicants also have the option of printing out and submitting a paper application. The paper application is manually entered into the back-end system for processing.

At Company X, applicant data is ingested and temporarily stored in a DynamoDB NoSQL database. Applications are streamed and routed (via AWS Lambda) into an Amazon S3 Bucket and an Amazon Redshift database.

Amazon S3 Bucket storage is a great option for storing data that is frequently accessed and for performing fast ad hoc queries with Amazon Athena. It’s important to remember that S3 stores data as objects and is not a relational database.

Amazon Redshift, on the other hand, is a flexible and scalable data warehouse that is optimized for aggregate analytical queries using SQL. You can choose to link your own SQL client tools or you can use the built-in Query Editor to perform fast complex queries.

Example Architecture for Data Ingestion with Cloud Hosted Analytics on AWS

Machine Learning and the Risk Auditing Process

Company X decides to develop a machine learning algorithm that scores applicants based on a three-tiered risk category (low, medium, and high risk). The categories are delineated based on the probability of an applicant achieving some desired (or undesired) state.

Risk modeling can be applied to many industries and in many different scenarios. Any time a company is interested in managing a particular outcome there is an opportunity to apply machine learning and advanced analytics to measure and predict the outcome. In the case of applicant data, you can achieve savings and efficiency by using machine learning algorithms to assist in making business decisions, such as whether or not an applicant should be accepted or rejected.

In this scenario, we want to train a model that will simulate the manual auditing process of accepting or rejecting an application. Applicant training data is tied to a label that marks the applicant status. The status can be a multi-tiered grade or a binary yes/no decision. The machine learning algorithm ‘learns’ over time how auditors make their decisions and attempts to mimic their decision-making process when given a new application.
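As a rough sketch of that idea (the features, labels, and model choice here are all hypothetical stand-ins, not Company X's actual algorithm), a scikit-learn classifier can learn a simple accept/reject rule from labeled historical decisions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical applicant features (e.g. income, debt ratio, history length)
# and hypothetical auditor decisions -- stand-ins for real labeled data
X = rng.normal(size=(500, 3))
y = (X[:, 0] - X[:, 1] > 0).astype(int)  # "accept" when income outweighs debt

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
accuracy = model.score(X_test, y_test)  # how well it mimics the auditors
```

In production the trained model artifact would be hosted behind a SageMaker endpoint (Step 2 below) rather than run locally like this.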

Step 1) Architect the Data Ingestion Pipeline using API Gateway

API Gateway is an easy and secure way to monitor and maintain your data ingestion process. It’s a pay-per-use service that keeps track of all your API communication and has a global reach. You can set up an API Gateway to receive data from all of your data ingestion touchpoints. Mobile and desktop clients, IoT devices, and bot services in voice, phone, and text are all sources of data that you can incorporate into your data ingestion architecture.

In our example, we are interested in using Amazon API Gateway for its log data, which will provide additional business insights and additional metrics from which to build better and more robust models.

Step 2) Build a Model & Launch an Endpoint in SageMaker

Amazon SageMaker is a machine learning platform that is equipped with a ton of features designed to streamline the machine learning development and launch process. Machine learning is hard but creating the infrastructure to support the process doesn’t have to be. Once you develop an algorithm in SageMaker, training and launching a model endpoint API in the cloud is as easy as writing a few lines of code.

SageMaker is a fully managed service that will handle all the messy business of provisioning containers and hosting your model endpoint for you. Amazon will take care of a number of tasks associated with provisioning a model into production and optimizing the required infrastructure according to your needs. For example:

  • Inference Pipelines: Models are often not standalone entities, but ensembles and stacks in which data is transformed multiple times before the final result is inferred. SageMaker provides an easy way to orchestrate multiple inferences and transformations into a single pipeline.
  • Auto Scaling: Set a CloudWatch alarm to trigger additional computing power by spinning up additional EC2 instances when you get a spike in model inference calls. This helps if you are interested in making real-time inferences and you have an unpredictable request load.
  • Elastic Inference (EI): Do you have a computationally heavy model inference but don’t like the high cost of hosting your endpoint on a GPU? Amazon EI is a service that will speed up the throughput of your deep learning models by provisioning an “accelerator” compute resource at the time the model is called. In many cases, you can ditch the GPU.
  • Neo: Neo can optimize your model across TensorFlow, Apache MXNet, PyTorch, ONNX, and XGBoost for deployment on ARM, Intel, and Nvidia processors. Train your model once and use it anywhere.

Another benefit to using SageMaker is that it promotes transparency and collaboration by centralizing model development activities on a single platform. If you are a manager or owner and you are interested in monitoring your Data Science team’s activities and utilization, then using SageMaker in conjunction with IAM, CloudWatch, and CloudTrail can give you insights into what data is being accessed, who (or what) is accessing that data, how it is being used, and how much these activities are costing.

If you are a data scientist, you will appreciate how easy it is to organize, annotate, and share your work with colleagues and managers by using the built-in Jupyter notebooks. The environment you configure in the notebook stays in the cloud, making it easy to pick up where you left off or collaborate on a model without having to deal with environment configurations.

Step 3) Create ETL Functions in Lambda

AWS Lambda is a serverless compute option in Amazon’s suite of compute services. Think of Lambda functions as the ‘neurons’ where the AWS solution architecture is the ‘brain’. Lambda functions don’t require any provisioning of resources. They are on-the-fly pieces of code used for quick and simple function execution and routing of data.

In our example, we use Lambda to perform the ETL pre-processing steps required to make the data readable to the model endpoint. ETL stands for extract, transform, and load. We are extracting the data from the source (the API Gateway), transforming the data into a readable format for the model, and loading the data into the model endpoint with an API call.
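As a minimal sketch of that ETL step (the field names and feature list here are hypothetical, and the actual SageMaker endpoint call is left as a comment rather than invented), a Lambda handler might look like:

```python
import json

# Hypothetical field mapping; a real application schema would differ
FEATURES = ["age", "income", "debt_ratio"]

def transform(record):
    """Extract and order model features from a raw application record (the 'T' in ETL)."""
    return [float(record.get(name, 0.0)) for name in FEATURES]

def lambda_handler(event, context):
    # With API Gateway proxy integration, 'body' arrives as a JSON string
    record = json.loads(event["body"])
    features = transform(record)
    # In production this CSV payload would be sent to the SageMaker model
    # endpoint via the sagemaker-runtime invoke_endpoint API in boto3
    payload = ",".join(str(v) for v in features)
    return {"statusCode": 200, "body": json.dumps({"payload": payload})}
```

Keeping the `transform` function pure (no AWS calls) makes the ETL logic easy to unit-test outside of Lambda.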

Step 4) Use DynamoDB as an Intermediate Database

Amazon’s NoSQL database service, DynamoDB, is a great way to manage streaming and unpredictable data. All transactions are preserved because each call to DynamoDB is a separate and recordable event. If you enable the ‘streams’ feature then you get functionality similar to Amazon Kinesis Firehose, allowing you to handle spikes in traffic by turning your data streams into a parallel compute MapReduce job.

In our example, Company X writes data directly into a DynamoDB, which is a valid solution. However, if Company X wants to deploy a machine learning model inference into the application process, I would recommend doing so before the data is written to DynamoDB. Why? For two reasons: 1) because you can take advantage of real-time analytics on model inference calls, and 2) it gives you the option to provide immediate feedback to the user or client if you choose to. This may or may not be possible depending on the compute needs of the model inference, but it is preferable if the option is available to you.

Example Architecture for Real-Time Model Inferencing on AWS

Advantages & Benefits

By integrating a machine learning risk classification model at the front of the data ingestion pipeline, Company X is able to make rapid decisions, speed up their auditing process, and gain more granular insights into the application process.



Applying AI to UAV Data for Oil & Gas Pipeline Inspections

What happens when you marry the $37 billion pipeline inspection industry, the $1 billion UAV industry, and the $4 billion AI industry? An opportunity for massive disruption. There are over 2.5 million miles of oil and gas pipelines in the U.S. alone, and most of it is over 50 years old. This requires a considerable maintenance and inspection effort. Today, pipeline failures result in $390 million in annual costs, with over 40 serious incidents per year and over 1000 injuries and fatalities in the past two decades.

Historically, pipeline inspection has been a tedious, manual process. Helicopter inspections have greatly accelerated the process, but at $3,000 per hour, a better answer was needed. That has come in the form of drones, or UAVs. UAVs range from 1st- and 2nd-generation models, with simple video recording, still photo capabilities, and manual piloting control, to the powerful new 7th-generation UAVs designed for commercial use: fully compliant safety and regulatory standards-based design, platform and payload interchangeability, automated safety modes, enhanced intelligent piloting models with full autonomy, full airspace awareness, and auto action (takeoff, landing, and mission execution).

UAVs have significantly lower costs per mile inspected, lower operational risks, are easily re-tasked, provide better imagery, reduce human expenses and exposure, and are far more environmentally friendly. Additionally, they can operate in inclement weather and day and night. The new larger drones are also able to carry a wide variety of sensors, including visible and multi-wavelength still, stereo and video cameras, thermal and near infrared imaging, LiDAR, Radar, and laser gas detectors and fluorosensors.

These sensors generate huge amounts of data, which are ideally suited for artificial intelligence and machine learning. AI can be used to examine the sensor data in real time or asynchronously and quickly alert operators to the primary problems that plague oil and gas pipelines: corrosion (50%), mechanical damage/third-party incidents (20%), storm and mud slide damage (12%), and material or equipment failure (9%). is ushering in the cognitive revolution with artificially intelligent solutions for UAVs in the Oil & Gas industry. Our approach intakes live or asynchronous UAV (or other ROV) data from Optical, Thermal, Near Infrared, LiDAR, Multispectral Analysis and Methane sensors. Raw sensor data is then prepared, including tagging and classification to prepare it for AI processing, and then processed by AI models trained and tuned specifically for pipeline and oilfield inspection use cases. Alerts generated by the AI model are then pushed via API to the customer’s Control Center dashboard or the UAV operator for immediate response.

Smart car inspections using AI

I’m always looking for use cases for AI and Machine Learning. I spoke with a good friend (Dane) about a use case with car inspections (his idea, by the way!). The problem for a car leasing company is that they had to perform a 10-point inspection of a car and log this information into backend systems. This included pictures of the car or sometimes just a manual visual inspection. Often, if the car had a dent greater than a certain size, they had the unfortunate job of telling the customer that they owed money to fix it in order to end the lease. This created a poor customer experience, because customers would complain that the car already had the dents before they leased it. Additionally, the information that was captured had to be keyed into backend systems, which caused an additional administrative burden.


A better experience would be for the inspector to help the customer and find out how they can help with their next car purchase instead of getting bogged down inspecting a car. The solution uses four 360° cameras to capture pictures of the car before it goes out on lease and after it is returned. This data would then be processed by machine learning to recognize dents without a human operator. Furthermore, you could send these pictures to the customer so they could see the before and after. This helps to build trust with the customer.

To make this work, a special area at the car leasing company would be created to pull in the car and capture the 360° pictures. These would be high-quality pictures that capture every part of the car. The pictures would then be stitched together to form a 3D image that can be zoomed in and out and navigated. This process would be the same for the after picture as well. The question is: why not just do this for the after picture and highlight only the damaged pieces? The reason is you want to be transparent with the customer.

Once the pictures are taken, the system applies a custom classifier machine learning model that detects dents on the car and provides an estimate of how much it will cost to fix them based upon historical data. The system then also uses Robotic Process Automation (maybe AWS RoboMaker, UiPath, etc.) to update legacy backend systems, so the car leasing company doesn’t need to rip out these systems for this to work. The customer then gets a notification of the issues and potential options of how to handle them. They can see the before and after with marked-up images. This also reduces the follow-up calls that the company gets from the customer.

Architecture (AWS)

Challenges to overcome: you need a decent image classification library based upon car make and model, with separate models for each. My reasoning here is that each type of car and model will be so different. If you have Mechanical Turk in the solution then you have a way of running a human-in-the-loop labelling operation. The recent “Amazon SageMaker Ground Truth” service from AWS could simplify this because it’s a single service that includes auto-labeling and human-in-the-loop review.

Other Applications for Solution:

  1. Insurance Claims: use the solution to take the “after” pictures. You could create an instant insurance payout amount and reduce the costs of manual underwriting.
  2. Car Rental: perform this service when a car is returned. The benefit is that you can reallocate headcount.

Further reading:

Build a chatbot in less than 10 minutes

With the new technology frontier of AI reaching new heights weekly, many businesses are trying to determine the best practical AI use cases for meeting business goals.

A common client concern is losing customers through abandoned shopping carts. Of shoppers using a desktop, 67% abandon their cart, and the number surges to 97% for mobile shoppers. Intercepting potential customers with a simple chatbot during the search / quote action can dramatically increase purchases. In 2016, Just Eat saw a 266% increase in conversion rates by adding a chatbot to their site.

“Simple” may sound like an exaggeration to those unfamiliar with chatbots. However, you may be pleasantly surprised to discover that there are numerous platforms on the market, such as API.AI, BotKit, Pandorabots, Chatfuel, and Amazon Lex, that enable users to spin up a chatbot in less than a day. Even non-technical teams can easily take action.

Today we’re going to be featuring one of the platforms our team has found to be the fastest plug-and-play chatbot solution: the Microsoft QnA platform. The QnA bot is so simple because it ingests existing FAQ content and uses the responses to reply to customers in real time. The platform utilizes a pluggable architecture that makes it easy to integrate with any site without expensive implementation fees. See the full step by step tutorial below with an example of a bot for the Marriott hotel group:

Build A Chatbot In Under An Hour: Microsoft QnA Bot Tutorial

  1. Visit the Microsoft QnA Platform site.
  2. Click “Create new service” in the header navigation menu.
  3. Name your service (chatbot) and share your FAQ page.
    • Note: The platform will recognize FAQs from a URL, a supported file such as .pdf or .xlsx, or you can start from scratch by typing up your questions and answers.
  4. On the next page you will see a digest of your knowledge base. At this step you can edit any QnA pair or add a new one.
  5. Once you are pleased with your knowledge base, click “Test” in the left-hand navigation. Here you will be able to engage directly with the bot to add phrasing alternatives and define preferred responses.
    • Note: Remember to click the “Save and retrain” button once you have completed your updates.
  6. Once you are satisfied with your knowledge base, click “Publish”.
  7. Use the HTTP request code to deploy your bot directly to your site.
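The “HTTP request code” in the final step boils down to a small JSON POST against your published knowledge base. Below is a minimal sketch of how such a request could be assembled in Python; the host, knowledge-base id, and endpoint key are placeholders, and the URL shape follows the QnA Maker generateAnswer endpoint, so check your own Publish page for the exact values.

```python
import json

def build_generate_answer_request(host, kb_id, endpoint_key, question):
    """Return the (url, headers, body) triple for a generateAnswer call.

    All three inputs are placeholders here -- copy the real values from
    the page shown after you click "Publish".
    """
    url = f"{host}/knowledgebases/{kb_id}/generateAnswer"
    headers = {
        "Authorization": f"EndpointKey {endpoint_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"question": question})
    return url, headers, body

if __name__ == "__main__":
    # Placeholder credentials -- nothing is sent over the network here.
    url, headers, body = build_generate_answer_request(
        "https://example.azurewebsites.net/qnamaker",
        "YOUR-KB-ID",
        "YOUR-ENDPOINT-KEY",
        "What time is check-in?",
    )
    print(url)
    print(body)
```

Sending the returned triple with any HTTP client (or from your site’s JavaScript) yields the bot’s best-matching FAQ answer as JSON.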

That’s it! Spinning up a chatbot is really that easy. You can always enhance your bot by adding additional URLs and by downloading and uploading chat logs. Some important recommendations for a successful roll-out:

  • Keep user experience the number one priority. The last thing you want to do is frustrate a customer before they make a purchase.
  • Be transparent that your customers are engaging with a chatbot.
  • Test your bot thoroughly and incorporate as many common phrases or questions as possible.

If you’d like additional instruction, Microsoft has a video tutorial available here and extensive documentation available here.

Interested in support for your chatbot implementation? Contact us at

How to understand AI Terms

Machine Learning is one of the most well-known subcategories of artificial intelligence. At a high level, computers learn just as humans do, through 1. ingesting, 2. learning, 3. categorizing, and 4. taking action. For example, computer image recognition, similar to human vision, is a specific type of Machine Learning that uses Deep Learning. Deep Learning removes human computation from the loop. When a computer is shown a picture of a car, it may not initially know how to make sense of the data. Over time, with more images of cars, the computer is able to accurately label the images.

Photo credit: Prowess Consulting

Photo Credit: XenonStack

The feature extraction and classification seen in the image above are performed by a Neural Network. An artificial neural network is an interconnected group of nodes, or artificial neurons, similar to the network of neurons in a brain.
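The smallest useful unit of such a network is a single artificial neuron. As a minimal sketch, here is a perceptron (one neuron with a step activation) learning the logical AND function; the training data, learning rate, and epoch count are illustrative choices, not from the article.

```python
# A single artificial neuron: weighted sum of inputs, plus a bias,
# passed through a hard step activation.

def step(x):
    return 1 if x >= 0 else 0

def predict(weights, bias, inputs):
    s = sum(w * x for w, x in zip(weights, inputs))
    return step(s + bias)

def train_perceptron(data, epochs=20, lr=0.1):
    """Classic perceptron rule: nudge weights toward each mistake."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, label in data:
            error = label - predict(weights, bias, inputs)
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias

# Four labeled examples of the AND function.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_DATA)
```

Stacking many such neurons into layers, with smoother activations and a gradient-based update rule, gives the feature-extracting networks pictured above.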

Using statistical conclusions to find patterns in data is known as Data Science. Statistical Machine Learning uses the same math as Data Science, but integrates it into algorithms that improve on their own. Most AI actions are initiated through Algorithms: procedures or formulas for solving problems by carrying out a sequence of specified actions. There are numerous algorithms used by Data Scientists. We’ll dive deeper into each of them in a future article, but here is a chart of their frequency of use:

Photo Credit: KD Nuggets

Decision Tree algorithms composed of hard if-then rules were the initial tools used for Natural Language Processing (NLP). NLP overlaps with Machine Learning under the artificial intelligence umbrella, but focuses on associative connections between written or spoken language. Probabilistic, statistical approaches are now the primary algorithms used for NLP, as they allow more flexibility in associating similar words.
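To make the “hard if-then rules” concrete, here is a tiny rule-based intent classifier of the kind early NLP systems used; the intents and keywords are made-up examples, not from any real system.

```python
import re

def rule_based_intent(text):
    """Classify a message with hard if-then rules on keywords."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    if "refund" in words or "return" in words:
        return "refund_request"
    if "order" in words and "where" in words:
        return "order_status"
    if "hello" in words or "hi" in words:
        return "greeting"
    return "unknown"
```

The brittleness is easy to see: “good morning” falls straight through to “unknown” because no rule names it, which is exactly why probabilistic methods that score word similarity took over.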

Finally, it’s important to understand the difference between supervised and unsupervised learning. Supervised Learning uses data with known labels to create models, then makes predictions for new inputs. Unsupervised Learning works off unlabeled data, finding structure that differentiates the given input data on its own.
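The distinction fits in a few lines of code. As a minimal sketch on made-up 2-D points: the supervised model uses the known “cat”/“dog” labels, while the unsupervised one sees only the coordinates and has to group them itself.

```python
import math

# Toy labeled data for the supervised case (labels are illustrative).
LABELED = [((1, 1), "cat"), ((1, 2), "cat"), ((8, 8), "dog"), ((9, 8), "dog")]

def predict_1nn(point):
    """Supervised: label a new point like its nearest labeled example."""
    return min(LABELED, key=lambda ex: math.dist(point, ex[0]))[1]

def kmeans2(points, iters=5):
    """Unsupervised: split unlabeled points into two groups (2-means)."""
    c0, c1 = points[0], points[-1]  # naive initialisation
    for _ in range(iters):
        g0 = [p for p in points if math.dist(p, c0) <= math.dist(p, c1)]
        g1 = [p for p in points if math.dist(p, c0) > math.dist(p, c1)]
        c0 = tuple(sum(v) / len(g0) for v in zip(*g0))
        c1 = tuple(sum(v) / len(g1) for v in zip(*g1))
    return g0, g1
```

On this toy data the unsupervised split recovers the same two groups the labels describe, but it can only say “group A” and “group B”, never “cat” and “dog”.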

Photo Credit: Leonardo Araujo dos Santos

As you likely recognized, the key to success in AI is having a large amount of data as your foundation. From there you can work with experts like our team at to recommend the best approach to leveraging that data to solve current pain points or generate new lines of business. Get in touch today by emailing us at