The ability to forecast future events and outcomes is essential to any business or organization.
Fortunately, there are several tools and procedures that enable us to do so. One of these is time series analysis. So what is time series analysis?
First, let us define a time series, and then time series analysis.
A time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data.
Time series analysis refers to the analysis of a time series to extract meaningful insights, identify trends and patterns, and forecast future events.
Time series forecasting is the use of a model to predict future values based on previously observed values.
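To make the two definitions concrete, here is a minimal sketch in plain Python. The dates and values are made up for illustration, and the moving average is only one of the simplest possible forecasting models, not the one Prophet uses.

```python
from datetime import date, timedelta

# Hypothetical daily observations, equally spaced one day apart.
start = date(2020, 1, 1)
values = [112, 118, 132, 129, 121, 135, 148]
series = [(start + timedelta(days=i), v) for i, v in enumerate(values)]


def moving_average_forecast(observations, window=3):
    """Predict the next value as the mean of the last `window` observed values."""
    recent = [value for _, value in observations[-window:]]
    return sum(recent) / len(recent)


# The forecast is for the first date after the observed series.
next_day = series[-1][0] + timedelta(days=1)
forecast = moving_average_forecast(series, window=3)
print(next_day, round(forecast, 2))
```

The series is a list of (date, value) pairs in time order, and the forecast is computed only from previously observed values.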
Applications of Time Series Analysis
- To identify trends or patterns
- To identify seasonal changes and cycles
- To understand the past in order to predict the future
- To forecast future events and outcomes
- Example: forecasting the market for products that fluctuate seasonally, such as commodities and clothing retail
Now let us see some packages we can use to do time series analysis in Python and in Julia.
For Python
- Pandas: for data manipulation using datetime features
- Statsmodels: a collection of tools and algorithms for statistical modeling and time series analysis
- FbProphet (Facebook Prophet): a simple, easy-to-use package made by Facebook for time series forecasting
- PmProphet: a package similar to fbprophet for time series forecasting
- PyFlux
For Julia
- DataFrames.jl: for data manipulation
- StatsModels.jl
- TimeSeries.jl: a lightweight framework for working with time series data in Julia
Working with Facebook Prophet
Facebook Prophet utilizes an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects for forecasting time series data.
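As a rough illustration of what "additive" means here, the toy sketch below builds a series by summing a trend, a weekly pattern, and a holiday bump. All functions and numbers are hypothetical, and this is not how Prophet fits its components; it only shows that each component contributes a separate term to the total.

```python
import math


def trend(t):
    # Hypothetical linear growth: 100 units at t=0, plus 0.5 per day.
    return 100.0 + 0.5 * t


def weekly_seasonality(t):
    # Hypothetical smooth pattern that repeats every 7 days.
    return 10.0 * math.sin(2 * math.pi * t / 7)


def holiday_effect(t, holidays=frozenset({24})):
    # Hypothetical one-off bump on the listed day indices.
    return 25.0 if t in holidays else 0.0


def additive_model(t):
    # The components are simply summed, which is what "additive" means.
    return trend(t) + weekly_seasonality(t) + holiday_effect(t)


# Four weeks of daily values from the toy model.
series = [additive_model(t) for t in range(28)]
print([round(v, 1) for v in series[:7]])
```

In a multiplicative model the seasonal term would scale the trend instead of being added to it.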
To Install
pip install pystan
pip install fbprophet
Steps/Workflow For Using FB Prophet
- Initialize the model :: model = Prophet()
- Rename the columns to ds and y
- Fit the dataset :: model.fit(df)
- Create the dates to predict :: model.make_future_dataframe(periods=365)
- Predict :: model.predict(future_dates)
- Plot :: model.plot(predictions)
Let us start.
# Load EDA Pkgs
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
# Load FB Prophet
import fbprophet
dir(fbprophet)
# Load our Dataset
df = pd.read_csv("flights_data.csv")
df.head()
df.plot()
# First-order differencing: y'(t) = y(t) - y(t-1)
df['no_of_flights'] = df['no_of_flights'] - df['no_of_flights'].shift(1)
df.plot()
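The differencing step above can be sketched in plain Python to show what it does to the values. The flight counts below are made up; the point is that the first entry has no previous value to subtract, which is why pandas produces a missing value there and why the tutorial later drops the first row with df[1:].

```python
def first_difference(values):
    """Return y(t) - y(t-1) for each t; the first entry has no predecessor, so it is None."""
    if not values:
        return []
    diffs = [None]
    for previous, current in zip(values, values[1:]):
        diffs.append(current - previous)
    return diffs


flights = [100, 104, 110, 118, 128]  # hypothetical counts with an upward trend
print(first_difference(flights))
```

After differencing, the series describes the change from one step to the next instead of the level itself.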
from fbprophet import Prophet
# Features of Prophet
dir(Prophet)
# Initialize the Model
model = Prophet()
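Parameters
- growth: linear or logistic trend
- seasonality_mode: additive or multiplicative seasonality
- holidays: a DataFrame of holiday names and dates to model as special events
- changepoints: dates at which the trend is allowed to change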
df.columns
# Prophet expects the columns to be named ds (dates) and y (values)
df.rename(columns={'Dates':'ds','no_of_flights':'y'},inplace=True)
df.head()
# Drop the first row, which is NaN after differencing
df = df[1:]
df.head()
# Fit our Model to our Data
model.fit(df)
# Shape of Dataset
df.shape
# Create Future Dates of 365 days
future_dates = model.make_future_dataframe(periods=365)
# Shape after adding 365 days
future_dates.shape
future_dates.head()
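To see why the shape grows by 365 rows, here is a plain-Python sketch of the idea behind the future dataframe: the historical dates are kept and daily dates are appended after the last one. The helper name and dates are hypothetical, and this is a simplification assuming daily data, not Prophet's own code.

```python
from datetime import date, timedelta


def make_future_dates(history_dates, periods):
    """Return the historical dates followed by `periods` extra daily dates."""
    last = max(history_dates)
    future = [last + timedelta(days=i) for i in range(1, periods + 1)]
    return sorted(history_dates) + future


# Three hypothetical historical dates, then one year of daily dates after them.
history = [date(2019, 12, 29) + timedelta(days=i) for i in range(3)]
extended_dates = make_future_dates(history, periods=365)
print(len(history), len(extended_dates))
```

The model then predicts a value for every date in the extended list, both past and future.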
# Make Prediction with our Model
prediction = model.predict(future_dates)
prediction.head()
Narrative
- yhat: the predicted forecast
- yhat_lower: the lower bound of the prediction's uncertainty interval
- yhat_upper: the upper bound of the prediction's uncertainty interval
# Plot Our Predictions
model.plot(prediction)
Narrative
- The data is trending
- Black dots: the actual data points in our dataset
- Deep blue line: the predicted forecast (the predicted values)
- Light blue band: the uncertainty interval between yhat_lower and yhat_upper
# Visualize Each Component [Trends,Weekly]
model.plot_components(prediction)
Cross Validation
- For measuring forecast error by comparing the predicted values with the actual values
- initial: the size of the initial training period
- period: the spacing between cutoff dates
- horizon: the forecast horizon (ds minus cutoff)
- By default, the initial training period is set to three times the horizon, and cutoffs are made every half a horizon
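The way initial, period, and horizon interact can be sketched with plain dates. The function below is a simplified illustration of the idea, not Prophet's implementation, and the date range and durations are made up: it works backwards from the last date that still leaves a full horizon of data to compare against, stepping by period, and stops once less than initial of training data would remain.

```python
from datetime import date, timedelta


def generate_cutoffs(first_date, last_date, initial, period, horizon):
    """Work backwards from the last usable cutoff, stepping by `period`."""
    cutoffs = []
    # The latest cutoff that still leaves a full horizon of actual values.
    cutoff = last_date - horizon
    # Keep only cutoffs with at least `initial` of training data before them.
    while cutoff >= first_date + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)


cutoffs = generate_cutoffs(
    first_date=date(2015, 1, 1),
    last_date=date(2019, 12, 31),
    initial=timedelta(days=730),
    period=timedelta(days=180),
    horizon=timedelta(days=365),
)
for c in cutoffs:
    print(c)
```

For each cutoff, a model is trained on the data up to that date and its forecasts for the following horizon are compared with the actual values.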
# Load Pkgs
from fbprophet.diagnostics import cross_validation
df.shape
cv = cross_validation(model, initial='35 days', period='180 days', horizon='365 days')
cv.head()
Performance Metrics
from fbprophet.diagnostics import performance_metrics
df_pm = performance_metrics(cv)
df_pm
Visualizing Performance Metrics
- horizon: how far into the future the prediction was made (ds minus cutoff); the plot shows the error metric against it
from fbprophet.plot import plot_cross_validation_metric
plot_cross_validation_metric(cv,metric='rmse')
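For reference, the rmse metric plotted above is the root mean squared error. A minimal plain-Python sketch of the basic formula follows; the values are made up, and Prophet computes the metric over windows of the cross-validation results, which this sketch does not reproduce.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between paired actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


y = [10.0, 12.0, 14.0, 16.0]     # hypothetical actual values
yhat = [11.0, 11.0, 15.0, 19.0]  # hypothetical predictions
print(round(rmse(y, yhat), 4))
```

Because the errors are squared before averaging, a single large miss raises the RMSE more than several small ones.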
You can also get the code and the dataset here.
For a video tutorial, check the video below on YouTube.
Thanks for reading.
Jesus Saves
By Jesse E. Agbe(JCharis)