Data is everywhere, but when building a website or testing a product you often need quick sample data to test-drive your app. This is where the Faker library comes into play. The idea behind Faker is quite simple: generate random data for a given set of fields. The library has been ported to several languages under similar names, such as Faker.js for JavaScript, PHP’s Faker, Perl’s Data::Faker, and Ruby’s Faker.
In this tutorial we will build a web application using Streamlit and Faker. The app generates fake data that we can download as CSV or JSON and use for our own tasks.
Let us see the requirements. To build this app we need just three packages:
- streamlit: For our web app
- pandas: For previewing the data and converting it into CSV and JSON
- faker: For generating our fake data
Installation of Packages
You can install these packages using pip as shown below:
pip install streamlit pandas faker
OK, let us check out the basic structure of our app. The app has two main sections. The first section generates a basic, simple profile. The second section generates a customizable, field-specific profile where you can select which fields you need.
Let us check the code
# Core Pkgs
import streamlit as st
import streamlit.components.v1 as stc

# Data Pkgs
import pandas as pd
from faker import Faker
fake = Faker()

# Utils
import base64
import time
timestr = time.strftime("%Y%m%d-%H%M%S")
We will create individual functions to make our work simpler.
# Fxn to Download Into A Specified Format
def make_downloadable_df_format(data,format_type="csv"):
    if format_type == "csv":
        datafile = data.to_csv(index=False)
    elif format_type == "json":
        datafile = data.to_json()
    b64 = base64.b64encode(datafile.encode()).decode()  # B64 encoding
    st.markdown("### ** Download File 📩 ** ")
    new_filename = "fake_dataset_{}.{}".format(timestr,format_type)
    # Embed the encoded file in a link so the browser can download it
    href = f'<a href="data:file/{format_type};base64,{b64}" download="{new_filename}">Click Here!</a>'
    st.markdown(href, unsafe_allow_html=True)

# Generate A Simple Profile
def generate_profile(number,random_seed=200):
    Faker.seed(random_seed)
    data = [fake.simple_profile() for i in range(number)]
    df = pd.DataFrame(data)
    return df

# Generate A Customized Profile Per Locality
def generate_locale_profile(number,locale,random_seed=200):
    locale_fake = Faker(locale)
    Faker.seed(random_seed)
    data = [locale_fake.simple_profile() for i in range(number)]
    df = pd.DataFrame(data)
    return df
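The download helper works by base64-encoding the file contents and embedding them in an HTML link. Here is a minimal sketch of just that step, using only the standard library. The name `build_download_link` and the sample CSV text are made up for illustration and are not part of the app.

```python
import base64
import time


def build_download_link(text, format_type="csv", label="Click Here!"):
    """Return an HTML anchor that embeds `text` as a base64 data URI."""
    # Encode the file contents so they can live inside the href attribute.
    b64 = base64.b64encode(text.encode()).decode()
    # Timestamped file name in the same style the app uses.
    timestr = time.strftime("%Y%m%d-%H%M%S")
    filename = "fake_dataset_{}.{}".format(timestr, format_type)
    return '<a href="data:file/{};base64,{}" download="{}">{}</a>'.format(
        format_type, b64, filename, label
    )


# Made-up CSV content, standing in for data.to_csv(index=False).
sample_csv = "username,name\njdoe,John Doe\n"
link = build_download_link(sample_csv)
print(link)
```

In the app, a string like this is what gets handed to st.markdown with unsafe_allow_html=True so that the browser renders it as a clickable download link.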
Next we will create individual inputs to receive data from the front end using Streamlit widgets and then pass them to our functions. Let us see the code for that:
number_to_gen = st.sidebar.number_input("Number",10,5000)
locale = st.sidebar.multiselect("Select Locale",localized_providers,default="en_US")
dataformat = st.sidebar.selectbox("Save Data As",["csv","json"])
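The widgets above refer to `localized_providers`, which is not defined in the snippets shown here. A minimal sketch, assuming it is simply a list of Faker locale codes: the five locales below are a hand-picked illustration, and `clean_locale_selection` is a hypothetical helper (not part of the app) that guards against an empty selection from the multiselect.

```python
# Assumed definition: a short, hand-picked list of Faker locale codes.
localized_providers = ["en_US", "en_GB", "fr_FR", "de_DE", "ja_JP"]


def clean_locale_selection(selected, available, fallback="en_US"):
    """Keep only known locales; fall back to a default if nothing is left."""
    chosen = [loc for loc in selected if loc in available]
    return chosen or [fallback]


# Unknown codes are dropped, and an empty selection falls back to en_US.
print(clean_locale_selection(["fr_FR", "xx_XX"], localized_providers))
print(clean_locale_selection([], localized_providers))
```

If you want every locale rather than a short list, Faker appears to expose one as `AVAILABLE_LOCALES` in its `faker.config` module, but check that against the version you have installed.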
The received data will then be passed to our generate_locale_profile() function accordingly. Below is the code within our main function.
def main():
    st.title("Fake Data Generator")
    menu = ["Home","Customize","About"]
    choice = st.sidebar.selectbox("Menu",menu)

    if choice == "Home":
        st.subheader("Home")
        number_to_gen = st.sidebar.number_input("Number",10,5000)
        locale = st.sidebar.multiselect("Select Locale",localized_providers,default="en_US")
        dataformat = st.sidebar.selectbox("Save Data As",["csv","json"])
        df = generate_locale_profile(number_to_gen,locale)
        st.dataframe(df)
        st.write(df['sex'].value_counts())
        # st.expander was called st.beta_expander in older Streamlit releases
        with st.expander("📩: Download"):
            make_downloadable_df_format(df,dataformat)

    elif choice == "Customize":
        st.subheader("Select Your Fields")
        locale = st.sidebar.multiselect("Select Locale",localized_providers,default="en_US")
        profile_options_list = ['username', 'name', 'sex', 'address', 'mail', 'birthdate', 'job', 'company', 'ssn', 'residence', 'current_location', 'blood_group', 'website']
        profile_fields = st.sidebar.multiselect("Fields",profile_options_list,default='username')
        number_to_gen = st.sidebar.number_input("Number",10,10000)
        dataformat = st.sidebar.selectbox("Save Data As",["csv","json"])
        custom_fake = Faker(locale)
        data = [custom_fake.profile(fields=profile_fields) for i in range(number_to_gen)]
        df = pd.DataFrame(data)
        st.dataframe(df)
        with st.expander("🔍: View JSON "):
            st.json(data)
        with st.expander("📩: Download"):
            make_downloadable_df_format(df,dataformat)

    else:
        st.subheader("About")
        st.success("Built with Streamlit")
        st.info("Jesus Saves @JCharisTech")
        st.text("By Jesse E.Agbe(JCharis)")


if __name__ == '__main__':
    main()
We have seen how easy it is to build something cool using Streamlit and the Python Faker library.
You can also check out the video tutorial below and the code from here.
Thanks For Your Time
Jesus Saves
By Jesse E.Agbe(JCharis)