Generate realistic data with Faker

Does this sound familiar? You make the perfect UI, add a few entries to the database and deploy.

And then?

CSS carnage. Once you start adding more and more entries to the database, the UI falls apart. This has happened to me too many times, so I started looking for a solution.

Initial solution

I started by filling out my forms manually a bunch more times. This didn’t scale well and, quite frankly, it bored me. Next up: scripts to add data. I wrote a script to create a User object and stuck it inside a for-loop. This created something I call the lorem-ipsum problem: you stuff your site so full of fake data that it all becomes noise.

So I created a couple of different objects and cycled between them. Then it started to look a bit better.

But then? If you’re adding a selection of elements, why not create entirely random objects each time? Enter Faker.


Faker is a Python package that generates fake data for you. It can generate many different kinds of fields, but the ones I use most are names, addresses and telephone numbers. Even better, it supports localisation! Check out their docs here or scroll down to see how I used it.


First we need to install it with pip:

pip install Faker

You start by creating a Faker() object and you’re good to go!

from faker import Faker
fake = Faker()

fake.name()
# 'Lucy Cechtelar'

fake.address()
# '426 Jordy Lodge
#  Cartwrightshire, SC 88120-6700'

Faker gives you different data each time, straight out of the box:

for _ in range(10):
    print(fake.name())

# 'Adaline Reichel'
# 'Dr. Santa Prosacco DVM'
# 'Noemy Vandervort V'
# 'Lexi O'Conner'
# 'Gracie Weber'
# 'Roscoe Johns'
# 'Emmett Lebsack'
# 'Keegan Thiel'
# 'Wellington Koelpin II'
# 'Ms. Karley Kiehn V'

Faker groups its generators into something called a Provider. The date_time provider, for example, can generate random dates of birth:

>>> for _ in range(5):
...     fake.date_of_birth()
...
datetime.date(…, 1, 17)
datetime.date(…, 7, 21)
datetime.date(…, 4, 12)
datetime.date(…, 5, 12)
datetime.date(…, 8, 19)

How I used it

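import random
from datetime import datetime, timezone

from faker import Faker

fake = Faker()

# db and User are defined elsewhere in the application
to_add = []
user = db.session.query(User).first()
for i in range(100):
    # Fields with a fixed set of valid values are picked with random.choice
    contact_freq = random.choice(['every_week', 'two_weeks', 'monthly', 'three_months'])
    course = random.choice(['Mathematics', 'Chemistry', 'Computer Science', 'Physics'])
    # Random moment within the last year, as a Unix timestamp
    fake_timestamp = fake.date_time_between(
        start_date='-1y', end_date='now').timestamp()
    # Render the timestamp as a UTC string for the database
    last_interact = datetime.fromtimestamp(
        fake_timestamp, tz=timezone.utc).strftime('%Y-%m-%dT%H:%M:%S Z')

    params = {
        "first_name": fake.first_name(),
        "last_name": fake.last_name(),
        "email": fake.free_email(),
        "course": course,
        "description": fake.paragraph(nb_sentences=2),
        "last_interaction": last_interact,
        "interaction_frequency": contact_freq,
        "notes": None,
    }
    to_add.append(params)  # collected for the bulk insert, which is not shown here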

This generates 100 objects with random data. Here I use Faker to get random dates, names, emails and paragraphs. The date_time_between method picks a date within a given range; here, any time in the past year. I then format it as a UTC timestamp string for the database.
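
The UTC formatting step can be pulled out into a small helper, which makes it easy to check on its own. This is a sketch using only the standard library; to_utc_string is a hypothetical name, not something from Faker or my script:

```python
from datetime import datetime, timedelta, timezone


def to_utc_string(moment: datetime) -> str:
    """Format a datetime as 'YYYY-MM-DDTHH:MM:SS Z' in UTC.

    astimezone() treats a naive datetime as local time.
    """
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S Z")


# 12:30 at UTC+2 is 10:30 in UTC.
example = datetime(2021, 3, 4, 12, 30, 0, tzinfo=timezone(timedelta(hours=2)))
print(to_utc_string(example))  # 2021-03-04T10:30:00 Z
```

A naive datetime, such as the ones date_time_between returns by default, can be passed straight in and is interpreted as local time.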

Some fields, such as contact_freq, need to come from a fixed selection. Faker would not be appropriate here, so I used random.choice. The same goes for the course field.
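
If the same option lists are needed in more than one script, it may help to keep them as constants and pick from them in one place. A minimal sketch using only the standard library; pick_fixed_fields and the constant names are made up for illustration:

```python
import random

# Allowed values for the fields that must come from a fixed selection.
CONTACT_FREQUENCIES = ["every_week", "two_weeks", "monthly", "three_months"]
COURSES = ["Mathematics", "Chemistry", "Computer Science", "Physics"]


def pick_fixed_fields(rng: random.Random) -> dict:
    """Choose one valid value for each fixed-selection field."""
    return {
        "interaction_frequency": rng.choice(CONTACT_FREQUENCIES),
        "course": rng.choice(COURSES),
    }


# A dedicated Random instance keeps the picks independent of other code.
rng = random.Random(42)
samples = [pick_fixed_fields(rng) for _ in range(3)]
print(samples)
```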

The insert runs really quickly thanks to a special function of SQLAlchemy allowing you to rapidly insert multiple records at the same time. I’ll talk about that in another post.

The results

Adding more data to my website caused my tables to display incorrectly. That was a quick fix at this stage, but I was planning on copying the layout across the whole website. Good job Faker showed me before it went live.

Hopefully this short piece inspires you to play around with Faker and helps you generate more realistic-looking demo sites.