Faker is a Python library for generating fake data. This article uses Faker with Django to create some initial data for our database. We'll start by configuring Faker with Django and then look at producing data.
Next, we'll look at Faker's localization support and at creating our own bespoke provider. It is all wrapped up in a Django custom management command that generates data for our database.
What is the purpose of Dummy Data?
Dummy data is used for testing and operational purposes. It lets you see how your code reacts to various inputs and verify what you've built. Faker is an open-source Python library that generates dummy data in a variety of formats.
Compatibility
Faker dropped support for Python 2 in version 4.0.0, and from version 5.0.0 it only supports Python 3.6 and higher. If you still need Python 2 compatibility, install version 3.0.1 in the interim and consider updating your codebase to Python 3 so you can take advantage of Faker's newer capabilities. Please check the extended docs for further information, especially if you're upgrading from version 2.0.4 or lower, as there may be breaking changes.
This package was previously known as fake-factory, which was deprecated at the end of 2016. A lot has changed since then, so make sure your project and its dependencies don't rely on the old package.
How do I install the Dummy Data Faker Package?
You can install the Faker package with pip as follows:
pip install Faker
Initial Setup
Call the Faker() constructor to create and initialize a Faker generator.
from faker import Faker as fk
the_fake = fk()
Once Faker is installed and the generator is set up, you can create whatever data you need.
Creating Fake Names
To create a full name, use the name() method. If you only want the first or last name, use first_name() or last_name() instead. Each call produces a random name.
The following code shows how these methods work.
the_fake.first_name()
the_fake.last_name()
the_fake.name()
You can also use a for loop to create several names with the name() method:
for _ in range(10):
    print(the_fake.name())
Creating Addresses
You can use the address() method to generate addresses, as shown below.
the_fake.address()
Creating Random Text
Faker offers several methods for generating random text. For example, the text() method creates a single short paragraph.
the_fake.text()
How to use the Faker package to create reproducible Dummy Data
You might want to reproduce the same data set in some circumstances. You can do this by seeding the generator. In current Faker releases, seed() is called on the Faker class itself; calling it on an instance raises a TypeError.
fk.seed(98)
print(the_fake.first_name())
How to use the Faker package to generate unique data
You can use the generator's .unique property to ensure that the generated dummy data is unique. For instance, to generate 50 unique names, run the following code.
names = [the_fake.unique.name() for _ in range(50)]
Using the Faker Package to Generate Currency-Related Dummy Data
The following Faker methods generate currency-related and cryptocurrency-related dummy data:
- cryptocurrency(): generates the name of a cryptocurrency together with its code.
- cryptocurrency_name(): generates the name of a cryptocurrency.
- cryptocurrency_code(): generates a cryptocurrency code.
- currency(): generates a currency name together with its code.
- currency_name(): generates a currency name.
- currency_code(): generates a currency code.
Let's put some of these methods into practice and see what happens.
the_fake.cryptocurrency_name()
the_fake.cryptocurrency()
the_fake.currency()
the_fake.currency_name()
Using the Faker Package on the Command-Line
You can also use the Faker package from the command line after installing it, by typing the commands directly into the command prompt.
When Faker is installed in your environment, the script is called faker; in development, you can use python -m faker instead. It accepts the following options and arguments:
- -h, --help: shows a help message
- --version: shows the program's version number
- -o FILENAME: redirects the output to the specified filename
- -l LOCALE: uses a localized provider, such as de_DE
- -r REPEAT: generates the specified number of outputs
- -s SEP: prints the specified separator after each generated output
- -i {my.custom_provider other.custom_provider}: a list of additional custom providers to use. Note that this is the import path of the package containing your Provider class, not the custom Provider class itself.
- fake: the name of the fake to generate output for, such as address, email, or text
- [fake argument…]: optional arguments to pass to the fake; for instance, the profile fake takes an optional list of comma-separated field names as its first argument
Examples
$ faker -l de_DE address
$ faker -r=3 -s=";" name
What are Providers, and what do they do?
So far we've used name(), first_name(), last_name(), address(), and other Faker generator methods. Many of these are packaged in "providers". Standard Providers ship with Faker, while Community Providers are created and maintained by the community.
Standard Providers help create relevant fake data for areas such as credit card, date time, internet, person, profile, bank, and so on. The Faker documentation has the full list of Standard Providers and their methods.
Community Providers include Credit Score, Air Travel, Vehicle, Music, Microservice, and others. You can even write your own provider and add it to a Faker generator. The Faker documentation also lists the Community Providers and their methods.
Each generator property (such as name, address, and lorem) is called a "fake". A Faker generator has many of them, packaged in "providers".
from faker import Faker as fk
from faker.providers import internet
the_fake = fk()
the_fake.add_provider(internet)
print(the_fake.ipv4_private())
Creating a provider
from faker import Faker as fk
the_fake = fk()
# start by importing a similar provider or using the default one
from faker.providers import BaseProvider
# then create a new provider class
class TeacherProvider(BaseProvider):
    def student(self):
        return 'student'
# then add the new provider to the faker instance
the_fake.add_provider(TeacherProvider)
# now you can use:
the_fake.student()
How do I make the Lorem Provider my own?
If you don't want to use the default lorem ipsum word list, you can supply your own. The following example shows how to do it with a list of words picked from cakeipsum:
from faker import Faker as fk
the_fake = fk()
the_word_list = [
    'beans', 'oat', 'sugar',
    'Lollipop', 'bar', 'Gummies',
    'cheesecake', 'Jelly', 'danish',
    'pie', 'grains', 'Ice', 'sesame'
]
# sentence built from the default word list
the_fake.sentence()
# sentence built from the custom word list
the_fake.sentence(ext_word_list=the_word_list)
Optimizations
The Faker constructor takes a performance-related argument called use_weighting. It specifies whether Faker should try to make the frequency of generated values match real-world frequencies.
If use_weighting is False, all items have an equal chance of being chosen, and the selection process is significantly faster. The default value is True.
How To Use Faker Package To Create Localized Dummy Data
You can generate localized fake data by passing the desired locale as an argument to the Faker generator. Multiple locales are also supported; in that case, pass the locales as a Python list.
faker.Faker can take a locale as an argument and return localized data. If no localized provider is found, the factory falls back to the default en_US locale, that is, English as used in the United States.
Let's write some code to generate five UK names.
from faker import Faker as fk
the_fake = fk('en_GB')
for _ in range(5):
    print(the_fake.name())
Or names from Italy:
from faker import Faker as fk
the_fake = fk('it_IT')
for _ in range(5):
    print(the_fake.name())
faker.Faker also supports multiple locales, a feature introduced in v3.0.0.
from faker import Faker as fk
the_fake = fk(['it_IT', 'en_US', 'ja_JP'])
for _ in range(5):
    print(the_fake.name())
The providers package in the source code contains the list of available Faker locales. Localizing Faker is an ongoing process, and the maintainers need your support. They encourage you to create a localized provider for your locale and submit a Pull Request (PR).
How Do I Make A Fake Dataset With The Faker Package?
We'll make a 50-person fake dataset with attributes such as job, company, residence, username, name, address, current location, mail, and so forth. We'll create this data with the profile Standard Provider and store it in a pandas DataFrame.
from faker import Faker as fk
import pandas as pd
the_fake = fk()
# build one profile dictionary per person
profileData = [the_fake.profile() for _ in range(50)]
pandas_d = pd.DataFrame(profileData)
# in a notebook, this displays the DataFrame
pandas_d
Alternatives to Creating Dummy Data in Python
There are a few ways to create dummy data in Python. Below, we examine the details for some of the options available:
Fauxfactory
It can be used to quickly test your code with random fake data such as strings, numbers, dates, times, IP addresses, and so on. More information is available in the FauxFactory documentation.
Using the random module from the NumPy package
If you need pseudo-random numbers, you can use NumPy's random module to produce them. It includes functions such as rand(), randint(), and choice().
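The three functions mentioned above can be sketched as follows. The seed value, sizes, and the list of colors are arbitrary choices for illustration, and NumPy is assumed to be installed.

```python
import numpy as np

# Seed the global NumPy generator so the run is repeatable.
np.random.seed(0)

# rand(): floats drawn uniformly from [0, 1).
uniform_floats = np.random.rand(3)

# randint(): integers from 1 (inclusive) to 7 (exclusive).
dice_rolls = np.random.randint(1, 7, size=5)

# choice(): one element picked from a sequence.
picked_color = np.random.choice(["red", "green", "blue"])

print(uniform_floats, dice_rolls, picked_color)
```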
Factory Boy
Factory Boy already ships with Faker integration. Simply use Factory Boy's factory.Faker method:
import factory
from schoolapp.models import Student
class StudentFactory(factory.Factory):
    class Meta:
        model = Student
    title = factory.Faker('sentence', nb_words=4)
    author_name = factory.Faker('name')
Accessing the random instance
The generator's .random property returns the random.Random instance used to generate the values:
from faker import Faker as fk
the_fake = fk()
the_fake.random
the_fake.random.getstate()
By default, all generators share the same instance of random.Random, which can be accessed with from faker.generator import random. This may be useful for plugins that want to affect all faker instances.
How to generate unique values
You can ensure that any generated values are unique for a specific instance by using the generator's .unique attribute.
from faker import Faker as fk
the_fake = fk()
the_names = [the_fake.unique.first_name() for _ in range(50)]
assert len(set(the_names)) == len(the_names)
Calling the_fake.unique.clear() clears the values that have already been seen. To avoid infinite loops, Faker raises a UniquenessException after a number of failed attempts to find a unique value. Beware of the birthday paradox: collisions are more likely than you would expect.
from faker import Faker as fk
the_fake = fk()
for i in range(3):
    # Raises a UniquenessException on the third call,
    # because only two distinct boolean values exist
    the_fake.unique.boolean()
In addition, only hashable arguments and return values can be used with .unique.
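To get a feel for the birthday paradox mentioned above, the sketch below computes the chance that at least two draws coincide when picking uniformly at random from a pool of possible values. The helper function is hypothetical and uses only plain arithmetic; it assumes every value is equally likely, which Faker does not guarantee.

```python
def collision_probability(pool_size: int, draws: int) -> float:
    """Chance that at least two of `draws` uniform picks coincide."""
    if draws > pool_size:
        return 1.0
    all_distinct = 1.0
    for already_taken in range(draws):
        # The next pick must avoid every value taken so far.
        all_distinct *= (pool_size - already_taken) / pool_size
    return 1.0 - all_distinct


# Classic case: 23 people, 365 possible birthdays.
print(round(collision_probability(365, 23), 3))  # 0.507
```

The smaller the pool of possible values relative to the number of values requested, the more retries .unique needs before it eventually gives up.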
Seeding the Generator
When using Faker for unit testing, you'll often want to generate the same data set. For convenience, Faker provides a seed() method that seeds the shared random number generator. Calling the same methods with the same version of Faker and the same seed yields the same results.
from faker import Faker as fk
the_fake = fk()
# seed() is called on the class, not on the instance
fk.seed(3323)
print(the_fake.name()) # e.g. Susan Johnson (the exact name depends on the Faker version)
Each generator can also be switched to its own instance of random.Random, separate from the shared one, by using the seed_instance() method, which works in the same way. For example:
from faker import Faker as fk
the_fake = fk()
the_fake.seed_instance(3323)
print(the_fake.name()) # e.g. Susan Johnson (the exact name depends on the Faker version)
Please note that because the datasets are constantly being updated, results are not guaranteed to be consistent across patch versions. If you hardcode results in your tests, make sure you pin the Faker version down to the patch number.
If you're using pytest, you can seed the faker fixture by defining a faker_seed fixture. For further information, see the pytest fixture docs.
Conclusion
We learned how to create various kinds of data using Python's Faker library. We looked at generating names, personal profiles, and currency data. We also learned how to generate unique data and how to reproduce the same fake data. Finally, we looked into providers and saw that it is possible to create localized data for a specific locale.
We can do so much more with this package. We've given a few examples of how to make fake data. We hope it will help you test your app and reduce the time spent looking for real data.