
Time Series Analysis with Facebook’s Prophet

The ability to forecast future events and outcomes is essential to any business or organization.

Fortunately, there are several tools and procedures that enable us to do so. One of these procedures is time series analysis.

Let us first define what a time series is, and then what time series analysis is.

A time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive, equally spaced points in time. Thus it is a sequence of discrete-time data.

Time series analysis refers to analyzing a time series to extract meaningful insights, identify trends and patterns, and forecast future events.

Time series forecasting is the use of a model to predict future values based on previously observed values.
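As a minimal illustration of forecasting from previously observed values, the sketch below uses two very simple baseline "models": a naive forecast that repeats the last observation, and a moving average over the last few points. The function names are made up for this example, and the numbers are the first five monthly flight counts from the dataset used later in this post.

```python
from statistics import mean


def naive_forecast(history):
    """Predict the next value as the last observed value."""
    return history[-1]


def moving_average_forecast(history, window=3):
    """Predict the next value as the mean of the last `window` values."""
    return mean(history[-window:])


# First five monthly flight counts from the dataset used below
flights = [594924, 545332, 617540, 594492, 614802]

print(naive_forecast(flights))
print(moving_average_forecast(flights))
```

Baselines like these are mainly useful as a yardstick: a more elaborate model such as Prophet should at least beat them.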

Applications of Time Series Analysis

  • To identify trends or patterns
  • To identify seasonal changes and cycles
  • To understand the past in order to predict the future
  • For forecasting future events and outcomes
  • Example: forecasting the market for products that fluctuate seasonally, such as commodities and clothing retail

Now let us see some packages we can use to do time series analysis in Python and in Julia.

For Python

  • Pandas: for data manipulation using datetime features
  • Statsmodels: A collection of tools and algorithms for doing statistical modeling and time series analysis
  • FbProphet (Facebook Prophet): a simple, easy-to-use package made by Facebook for time series forecasting
  • PmProphet: a package similar to fbprophet for time series forecasting
  • Pyflux

 

For Julia

  • DataFrames.jl : for data manipulations
  • StatsModels.jl
  • TimeSeries.jl : a lightweight framework for working with time series data in Julia.

 

Working with Facebook Prophet

Facebook Prophet forecasts time series data using an additive model in which non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects.
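In the notation commonly used for Prophet, this additive decomposition can be written as below, where g(t) is the trend, s(t) the seasonal component, h(t) the holiday effect and ε_t the error term. The symbols are only a compact way of stating the sentence above.

```latex
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
```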

To Install

pip install pystan
pip install fbprophet

Steps/Workflow For Using FB Prophet

  • Initialize the model :: model = Prophet()
  • Rename the columns to ds and y
  • Fit the dataset :: model.fit(df)
  • Create the dates to predict :: model.make_future_dataframe(periods=365)
  • Predict :: model.predict(future_dates)
  • Plot :: model.plot(predictions)

Let us start.

In [55]:
# Load EDA Pkgs
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
In [56]:
# Load FB Prophet
import fbprophet
In [57]:
dir(fbprophet)
Out[57]:
['Prophet',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 'diagnostics',
 'forecaster',
 'hdays',
 'make_holidays',
 'models',
 'plot']
In [58]:
# Load our Dataset
df = pd.read_csv("flights_data.csv")
In [59]:
df.head()
Out[59]:
Dates no_of_flights
0 2005-01-01 594924
1 2005-02-01 545332
2 2005-03-01 617540
3 2005-04-01 594492
4 2005-05-01 614802
In [60]:
df.plot()
Out[60]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f96e9a03d68>
In [62]:
# First-order differencing: y'(t) = y(t) - y(t-1)
df['no_of_flights'] = df['no_of_flights'] - df['no_of_flights'].shift(1)
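The differencing step above replaces each value with its change from the previous month, which removes most of the trend. Here is a minimal pure-Python sketch of the same operation on the first five rows of the dataset, plus the cumulative sum that undoes it. In pandas, `Series.diff()` gives the same result as subtracting `shift(1)`.

```python
from itertools import accumulate

# First five monthly flight counts from df.head()
flights = [594924, 545332, 617540, 594492, 614802]

# First-order differencing: d[t] = y[t] - y[t-1]. The first value has no
# predecessor, which is why pandas reports NaN for it.
diffs = [curr - prev for prev, curr in zip(flights, flights[1:])]
print(diffs)  # [-49592, 72208, -23048, 20310]

# Undo the differencing: start from the first original value and accumulate.
restored = list(accumulate(diffs, initial=flights[0]))
print(restored == flights)  # True
```

Because the model is then fit on the differences, its forecasts are month-over-month changes rather than flight counts. A cumulative sum like the one above is needed to get back to the original scale.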
In [63]:
df.plot()
Out[63]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f96e9492898>
In [64]:
from fbprophet import Prophet
In [65]:
# Features of Prophet
dir(Prophet)
Out[65]:
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'add_country_holidays',
 'add_group_component',
 'add_regressor',
 'add_seasonality',
 'construct_holiday_dataframe',
 'fit',
 'fourier_series',
 'initialize_scales',
 'linear_growth_init',
 'logistic_growth_init',
 'make_all_seasonality_features',
 'make_future_dataframe',
 'make_holiday_features',
 'make_seasonality_features',
 'parse_seasonality_args',
 'piecewise_linear',
 'piecewise_logistic',
 'plot',
 'plot_components',
 'predict',
 'predict_seasonal_components',
 'predict_trend',
 'predict_uncertainty',
 'predictive_samples',
 'regressor_column_matrix',
 'sample_model',
 'sample_posterior_predictive',
 'sample_predictive_trend',
 'set_auto_seasonalities',
 'set_changepoints',
 'setup_dataframe',
 'validate_column_name',
 'validate_inputs']
In [66]:
# Initialize the Model
model = Prophet()

Parameters

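  • growth : 'linear' or 'logistic' trend
  • seasonality_mode : 'additive' or 'multiplicative'
  • holidays : a DataFrame of holiday dates to include in the model
  • changepoints : dates at which the trend is allowed to change (chosen automatically if not given)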
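These options are passed as keyword arguments to the `Prophet` constructor. The dictionary below is only a sketch: it spells out what we understand to be the defaults in fbprophet 0.x, so treat the exact values as assumptions and check them against your installed version.

```python
# Assumed defaults for fbprophet 0.x; verify against your installed version.
prophet_kwargs = {
    "growth": "linear",               # or "logistic" (needs a 'cap' column in df)
    "seasonality_mode": "additive",   # or "multiplicative"
    "holidays": None,                 # DataFrame with 'holiday' and 'ds' columns
    "changepoints": None,             # explicit list of changepoint dates
    "n_changepoints": 25,             # number of automatically placed changepoints
    "changepoint_prior_scale": 0.05,  # larger values allow a more flexible trend
}

# With fbprophet installed, the model would be created like this:
# model = Prophet(**prophet_kwargs)
print(sorted(prophet_kwargs))
```

Calling `Prophet()` with no arguments, as done in this post, is equivalent to accepting all of these defaults.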
In [67]:
df.columns
Out[67]:
Index(['Dates', 'no_of_flights'], dtype='object')
In [68]:
# Prophet expects the columns to be named ds (dates) and y (values)
df.rename(columns={'Dates':'ds','no_of_flights':'y'},inplace=True)
In [69]:
df.head()
Out[69]:
ds y
0 2005-01-01 NaN
1 2005-02-01 -49592.0
2 2005-03-01 72208.0
3 2005-04-01 -23048.0
4 2005-05-01 20310.0
In [70]:
# Drop the first row, which is NaN after differencing
df = df[1:]
In [71]:
df.head()
Out[71]:
ds y
1 2005-02-01 -49592.0
2 2005-03-01 72208.0
3 2005-04-01 -23048.0
4 2005-05-01 20310.0
5 2005-06-01 -5607.0
In [72]:
# Fit our Model to our Data
model.fit(df)
/usr/local/lib/python3.6/dist-packages/fbprophet/forecaster.py:880: FutureWarning: Series.nonzero() is deprecated and will be removed in a future version.Use Series.to_numpy().nonzero() instead
  min_dt = dt.iloc[dt.nonzero()[0]].min()
INFO:fbprophet:Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this.
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
Out[72]:
<fbprophet.forecaster.Prophet at 0x7f96e980c240>
In [73]:
# Shape of Dataset
df.shape
Out[73]:
(35, 2)
In [74]:
# Create future dates: 365 extra days after the last observation (freq defaults to daily)
future_dates = model.make_future_dataframe(periods=365)
In [75]:
# Shape after adding 365 days
future_dates.shape
Out[75]:
(400, 1)
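`make_future_dataframe` keeps the historical dates and appends `periods` new ones at a frequency that defaults to daily, which is why 35 monthly rows become 400 rows. The following is a rough standard-library imitation of that behaviour, not Prophet's actual code; `make_future_dates` and `add_months` are hypothetical helpers written for this sketch.

```python
from datetime import date, timedelta


def add_months(d, n):
    """Return the first-of-month date n months after d (d must be day 1)."""
    months = d.year * 12 + (d.month - 1) + n
    return date(months // 12, months % 12 + 1, 1)


def make_future_dates(history, periods, step=timedelta(days=1)):
    """History plus `periods` extra dates spaced `step` after the last one."""
    last = history[-1]
    return history + [last + step * i for i in range(1, periods + 1)]


# Assumption: 35 contiguous monthly observations starting 2005-02-01,
# as in the differenced dataset
history = [add_months(date(2005, 2, 1), i) for i in range(35)]
future = make_future_dates(history, periods=365)

print(len(history), len(future))  # 35 400
print(history[-1], future[-1])    # 2007-12-01 2008-11-30
```

For monthly data like this one, passing `freq='MS'` (month start) with a smaller `periods`, for example `model.make_future_dataframe(periods=12, freq='MS')`, asks for monthly future dates instead of daily ones.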
In [76]:
future_dates.head()
Out[76]:
ds
0 2005-02-01
1 2005-03-01
2 2005-04-01
3 2005-05-01
4 2005-06-01
In [77]:
# Make Prediction with our Model
prediction = model.predict(future_dates)
In [78]:
prediction.head()
Out[78]:
ds trend yhat_lower yhat_upper trend_lower trend_upper additive_terms additive_terms_lower additive_terms_upper yearly yearly_lower yearly_upper multiplicative_terms multiplicative_terms_lower multiplicative_terms_upper yhat
0 2005-02-01 -1165.542097 -56137.569128 -48840.518562 -1165.542097 -1165.542097 -51229.325339 -51229.325339 -51229.325339 -51229.325339 -51229.325339 -51229.325339 0.0 0.0 0.0 -52394.867436
1 2005-03-01 -1064.155913 68024.947887 75266.690673 -1064.155913 -1064.155913 72711.688639 72711.688639 72711.688639 72711.688639 72711.688639 72711.688639 0.0 0.0 0.0 71647.532726
2 2005-04-01 -951.906923 -27091.175977 -19899.206467 -951.906923 -951.906923 -22409.048094 -22409.048094 -22409.048094 -22409.048094 -22409.048094 -22409.048094 0.0 0.0 0.0 -23360.955017
3 2005-05-01 -843.278873 13581.351020 20774.735962 -843.278873 -843.278873 18062.172577 18062.172577 18062.172577 18062.172577 18062.172577 18062.172577 0.0 0.0 0.0 17218.893703
4 2005-06-01 -731.029888 -9191.182715 -2021.777104 -731.029888 -731.029888 -4783.426593 -4783.426593 -4783.426593 -4783.426593 -4783.426593 -4783.426593 0.0 0.0 0.0 -5514.456481

Narrative

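  • yhat : the forecast (predicted value)
  • yhat_lower : the lower bound of the uncertainty interval around the forecast
  • yhat_upper : the upper bound of the uncertainty interval around the forecast (the interval is 80% wide by default)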
In [79]:
# Plot Our Predictions
model.plot(prediction)
Out[79]:

Narrative

  • Trending data
  • Black dots : the actual data points in our dataset
  • Deep blue line : the forecast (yhat)
  • Light blue band : the uncertainty interval (yhat_lower to yhat_upper)
In [80]:
# Visualize each component [trend, yearly]; weekly and daily seasonality were disabled at fit time
model.plot_components(prediction)
Out[80]:

Cross Validation

  • Measures forecast error by comparing the predicted values with the actual values
  • initial : the size of the initial training period
  • period : the spacing between cutoff dates
  • horizon : the forecast horizon (ds minus cutoff)
  • By default, the initial training period is set to three times the horizon, and cutoffs are made every half a horizon
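To make the three settings concrete, the sketch below shows one way the cutoff dates can be derived: the last cutoff sits one horizon before the final observation, and earlier cutoffs step back by `period` as long as at least `initial` worth of data precedes them. This is a simplified reading of the procedure, not Prophet's implementation, and `generate_cutoffs_sketch` is a name made up here. With the settings used in the cells that follow, it yields the same first and last cutoffs that Prophet reports in its log.

```python
from datetime import date, timedelta


def generate_cutoffs_sketch(first, last, initial, period, horizon):
    """Walk back from (last - horizon) in steps of `period`."""
    cutoffs = []
    cutoff = last - horizon
    # Keep a cutoff only if at least `initial` of history precedes it
    while cutoff >= first + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)


cutoffs = generate_cutoffs_sketch(
    first=date(2005, 2, 1),       # first row after differencing
    last=date(2007, 12, 1),       # assumed last row (35 contiguous monthly rows)
    initial=timedelta(days=35),
    period=timedelta(days=180),
    horizon=timedelta(days=365),
)
for c in cutoffs:
    print(c)
```

Each cutoff produces one forecast: the model is refit on the data up to the cutoff and then asked to predict the following `horizon`.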
In [81]:
# Load Pkgs
from fbprophet.diagnostics import cross_validation
In [82]:
df.shape
Out[82]:
(35, 2)
In [85]:
cv = cross_validation(model,initial='35 days', period='180 days', horizon = '365 days')
INFO:fbprophet:Making 4 forecasts with cutoffs between 2005-06-09 00:00:00 and 2006-12-01 00:00:00
/usr/local/lib/python3.6/dist-packages/fbprophet/forecaster.py:880: FutureWarning: Series.nonzero() is deprecated and will be removed in a future version.Use Series.to_numpy().nonzero() instead
  min_dt = dt.iloc[dt.nonzero()[0]].min()
INFO:fbprophet:n_changepoints greater than number of observations.Using 3.0.
INFO:fbprophet:n_changepoints greater than number of observations.Using 7.0.
INFO:fbprophet:n_changepoints greater than number of observations.Using 12.0.
INFO:fbprophet:n_changepoints greater than number of observations.Using 17.0.
In [86]:
cv.head()
Out[86]:
ds yhat yhat_lower yhat_upper y cutoff
0 2005-07-01 1.948018e+05 1.948018e+05 1.948018e+05 18766.0 2005-06-09
1 2005-08-01 2.467115e+05 2.467115e+05 2.467115e+05 2943.0 2005-06-09
2 2005-09-01 -1.478723e+06 -1.478723e+06 -1.478723e+06 -56651.0 2005-06-09
3 2005-10-01 7.975796e+05 7.975796e+05 7.975796e+05 18459.0 2005-06-09
4 2005-11-01 -1.422165e+06 -1.422165e+06 -1.422165e+06 -26574.0 2005-06-09

Performance Metrics

In [87]:
from fbprophet.diagnostics import performance_metrics
In [88]:
df_pm = performance_metrics(cv)
In [89]:
df_pm
Out[89]:
horizon mse rmse mae mape coverage
36 31 days 2.360641e+10 153643.779575 107080.589795 9.390467 0.25
1 53 days 3.071503e+10 175257.040266 124013.765055 27.752806 0.25
13 57 days 1.519668e+10 123274.807990 70278.591188 20.897181 0.25
25 58 days 1.519676e+10 123275.157531 70374.301170 20.915781 0.25
37 62 days 1.529483e+10 123672.252793 75285.556365 21.001878 0.00
2 84 days 5.060115e+11 711344.877443 369861.535524 6.569980 0.00
14 85 days 5.057078e+11 711131.321520 363675.818013 6.426673 0.00
26 89 days 5.057265e+11 711144.495988 365685.701271 6.454806 0.00
38 90 days 5.056965e+11 711123.436944 364860.870327 6.422381 0.00
3 114 days 1.518813e+11 389719.511827 204122.923711 10.698840 0.00
15 116 days 1.538945e+11 392293.864123 223717.125484 11.797301 0.00
27 119 days 1.538809e+11 392276.517897 222690.437978 11.790447 0.00
39 121 days 1.539220e+11 392328.893093 223788.989957 11.947126 0.00
4 145 days 4.890830e+11 699344.660837 377906.467443 14.524369 0.00
16 146 days 4.870420e+11 697883.958634 356795.446110 13.471765 0.00
28 150 days 4.870378e+11 697880.958183 356162.575273 13.449076 0.00
40 151 days 4.869478e+11 697816.470939 353128.601892 13.365524 0.00
5 175 days 1.226784e+11 350254.695772 179337.189957 28.456418 0.00
17 177 days 1.246104e+11 353001.970178 199838.773794 33.154485 0.00
29 180 days 1.246329e+11 353033.820010 201753.454191 33.265146 0.00
41 182 days 1.246279e+11 353026.728628 201446.300473 33.944912 0.00
6 206 days 4.089364e+10 202221.751284 124973.895468 16.752662 0.00
18 207 days 3.901413e+10 197519.952164 106876.654854 12.139106 0.00
30 211 days 3.899939e+10 197482.640748 105933.841468 12.096678 0.00
42 212 days 3.904900e+10 197608.190355 108043.595615 11.494869 0.00
7 237 days 8.439535e+10 290508.777794 154548.396641 3.367390 0.00
19 238 days 8.433974e+10 290413.050396 151867.564034 3.363117 0.00
31 242 days 8.435817e+10 290444.781947 152993.792083 3.321496 0.00
43 243 days 8.434777e+10 290426.866906 152653.601809 3.888585 0.00
8 265 days 7.637268e+10 276356.068359 145614.390983 2.855098 0.00
20 269 days 7.644462e+10 276486.203150 148783.942693 2.788798 0.00
32 270 days 7.642622e+10 276452.921622 147660.146001 2.762350 0.00
44 274 days 7.656877e+10 276710.625653 151005.599060 2.119942 0.00
9 296 days 8.272218e+10 287614.641825 156467.326110 7.479451 0.00
21 299 days 8.302992e+10 288149.131077 161868.158725 7.745266 0.00
33 301 days 8.304893e+10 288182.115182 163020.884310 7.832486 0.00
45 304 days 8.288088e+10 287890.398435 158662.752095 7.789264 0.25
10 326 days 7.962580e+10 282180.430297 155799.974909 8.571577 0.25
22 330 days 7.981446e+10 282514.535016 157962.737030 8.674430 0.25
34 331 days 7.978664e+10 282465.286964 155366.458781 8.569479 0.50
46 335 days 7.976320e+10 282423.803177 153781.998045 8.523398 0.50
11 357 days 7.453610e+10 273013.004877 149058.828813 30.048753 0.50
23 360 days 7.420236e+10 272401.099524 144833.373034 29.997206 0.50
35 362 days 7.420238e+10 272401.131186 144869.565863 30.028110 0.50
47 365 days 7.434315e+10 272659.398736 149838.777512 30.655850 0.25
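The metric columns are easier to interpret after computing them by hand. The sketch below does so for the five `cv` rows shown earlier, with `yhat` rounded as displayed. Unlike `performance_metrics`, it averages over all rows at once instead of using a rolling window over the horizon, so its numbers are not expected to match the table.

```python
from math import sqrt

# (yhat, yhat_lower, yhat_upper, y) copied from cv.head()
rows = [
    (1.948018e+05, 1.948018e+05, 1.948018e+05, 18766.0),
    (2.467115e+05, 2.467115e+05, 2.467115e+05, 2943.0),
    (-1.478723e+06, -1.478723e+06, -1.478723e+06, -56651.0),
    (7.975796e+05, 7.975796e+05, 7.975796e+05, 18459.0),
    (-1.422165e+06, -1.422165e+06, -1.422165e+06, -26574.0),
]

n = len(rows)
errors = [y - yhat for yhat, _, _, y in rows]

mse = sum(e ** 2 for e in errors) / n   # mean squared error
rmse = sqrt(mse)                        # root mean squared error
mae = sum(abs(e) for e in errors) / n   # mean absolute error
# Mean absolute percentage error, as a fraction of the actual value
mape = sum(abs(e / y) for e, (_, _, _, y) in zip(errors, rows)) / n
# Share of actual values that fall inside [yhat_lower, yhat_upper]
coverage = sum(lo <= y <= hi for _, lo, hi, y in rows) / n

print(f"mse={mse:.4g} rmse={rmse:.4g} mae={mae:.4g} mape={mape:.4g} coverage={coverage}")
```

A coverage of 0 for these rows reflects that none of the actual values fall inside the reported interval, which here has collapsed to a single point.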

Visualizing Performance Metrics

  • horizon : how far past the cutoff (the end of the training data) each prediction was made; this is the x-axis of the plot
In [90]:
from fbprophet.plot import plot_cross_validation_metric
In [91]:
plot_cross_validation_metric(cv,metric='rmse')
Out[91]:

 

You can also get the code and the dataset here.

To get a video tutorial you can check the video below on youtube.

Thanks for reading.


Thanks For Your Attention

Jesus Saves

By Jesse E. Agbe(JCharis)
