Zipf Distribution in Python

In this tutorial you will learn:

• What is Zipf Distribution?
• Zipf Distribution Implementation in Python
• Visualization of Zipf Distribution

Zipf Distribution

The Zipf distribution is a discrete analogue of the Pareto distribution. It is also known as the zeta distribution, because its normalizing constant is the Riemann zeta function. It is specified by a probability mass function in which the probability of a positive integer k is proportional to k^(-a), where the distribution parameter a is greater than 1. The distribution takes its name from Zipf’s law, the observation that many types of data studied in the physical and social sciences can be approximated by a Zipfian distribution. It belongs to the family of discrete power-law probability distributions commonly used in linguistics, insurance and the modelling of rare events. Because it is a power law, the Zipf distribution follows a straight line when it is plotted on a double-logarithmic diagram.
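To make the probability mass function concrete, here is a minimal sketch that evaluates p(k) = k^(-a) / zeta(a) with NumPy alone. The helper name zipf_pmf and the n_terms parameter are made up for this illustration. The zeta function is only approximated here, by a partial sum plus an integral estimate of the remaining tail.

```python
import numpy as np

def zipf_pmf(k, a, n_terms=1_000_000):
    """Approximate p(k) = k**(-a) / zeta(a) for integers k >= 1 and a > 1.

    Hypothetical helper for illustration only; zeta(a) is approximated.
    """
    n = np.arange(1, n_terms + 1, dtype=float)
    # partial sum of n**(-a) plus an integral estimate of the tail beyond n_terms
    zeta_a = np.sum(n ** -a) + n_terms ** (1.0 - a) / (a - 1.0)
    return np.asarray(k, dtype=float) ** -a / zeta_a

# probabilities of the first five values for a = 2
probs = zipf_pmf(np.arange(1, 6), a=2.0)
print(probs)
```

The probabilities fall off as a power of k, so small values dominate while very large values remain possible.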

Zipf Distribution in Python

To sample from the Zipf distribution, the random module of Python’s NumPy library provides a built-in function, “zipf()”. It takes two parameters. The first, ‘a’, is the distribution parameter; it is required and must be a float greater than 1. The second, “size”, is the shape of the array desired as output from zipf(), which can be 1D, 2D or n-dimensional as required by the programmer; it is optional, and when it is omitted zipf() returns a single value.
To observe the results of the Zipf distribution, let’s take an example. Here we will generate a 1D array of size 4 with distribution parameter 1.5. In the code below we first import the random module and then call zipf() with an output size of 4 and distribution parameter ‘a’ equal to 1.5.
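The constraint on ‘a’ can be checked directly. The sketch below assumes a current NumPy release, where zipf() rejects values of ‘a’ that are not greater than 1 with a ValueError and where “size” may be left out. The variable names are chosen only for this illustration.

```python
from numpy import random

# values of 'a' that are not greater than 1 should be rejected
rejected = []
for a in (1.0, 0.5):
    try:
        random.zipf(a=a, size=3)
    except ValueError as err:
        rejected.append(a)
        print(f"a={a} rejected: {err}")

# with size omitted, a single value is returned instead of an array
single = random.zipf(a=2.0)
print(single)
```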

```python
# importing the random module
from numpy import random

# applying the Zipf function
res_arr = random.zipf(size=4, a=1.5)

# printing the results
print('1D array of size 4 having Zipf distribution with distribution parameter 1.5 :\n')
print(res_arr)
```

Let’s take another example. Here we will generate a 2D array of Zipf-distributed values with shape (5,2) and distribution parameter equal to 2.5.
```python
# importing the random module
from numpy import random

# here we are using Zipf function to generate Zipf distribution
# of size 5 x 2 with distribution parameter 2.5
res_arr = random.zipf(size=(5, 2), a=2.5)
print('2D Zipf Distribution as output from Zipf() function:\n')

# printing the result
print(res_arr)
```

In this example we will generate a 3D array of Zipf-distributed values with shape (2,3,4) and distribution parameter equal to 1.1.
```python
# importing the random module
from numpy import random

# here we are using Zipf function to generate Zipf distribution of size 2 x 3 x 4
res = random.zipf(size=(2, 3, 4), a=1.1)
print('3D Zipf Distribution as output from Zipf() function:\n')

# printing the result
print(res)
```
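The examples above draw different numbers on every run. If repeatable output is wanted, one option is NumPy's newer Generator interface, which also offers a zipf() method. This is a sketch of that alternative, not the approach used elsewhere in this tutorial, and the seed value 42 is arbitrary.

```python
import numpy as np

# create a seeded generator; the seed value is an arbitrary choice
rng = np.random.default_rng(seed=42)
first = rng.zipf(a=2.5, size=(5, 2))

# re-creating the generator with the same seed reproduces the same draws
rng = np.random.default_rng(seed=42)
second = rng.zipf(a=2.5, size=(5, 2))

print(first.shape)
print(np.array_equal(first, second))
```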

Visualization of Zipf Distribution

In this example we will visualize the Zipf distribution with distribution parameter 2. We will use the displot function of the seaborn library to plot a one-dimensional sample from the discrete Zipf distribution as a smoothed density curve.

```python
# importing all the required modules and packages
from numpy import random
import matplotlib.pyplot as mpl
import seaborn as sb

# here we are using Zipf function to generate 3000 samples with distribution parameter 2
samples = random.zipf(size=3000, a=2)

# distplot is deprecated in recent seaborn releases; displot with kind='kde'
# draws the same kind of density curve that distplot(hist=False) produced
sb.displot(samples, kind='kde')

# plotting the graph
mpl.show()
```
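The earlier statement that a Zipf distribution follows a straight line on a double-logarithmic diagram can also be checked numerically. The sketch below uses an arbitrary seed and sample size. It fits a line to the logarithm of the observed frequencies of the values 1 to 10 against the logarithm of those values. For a power law the fitted slope should come out close to -a, although sampling noise means it will not match exactly.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability
a = 2.0
samples = rng.zipf(a, size=100_000)

# observed relative frequency of each value k = 1..10
ks = np.arange(1, 11)
counts = np.array([(samples == k).sum() for k in ks])
freqs = counts / samples.size

# straight-line fit in log-log space; the slope estimates -a
slope, intercept = np.polyfit(np.log(ks), np.log(freqs), 1)
print(f"fitted log-log slope: {slope:.2f} (expected to be near {-a})")
```

The same ks and freqs arrays could be passed to a log-log plot to see the straight line directly.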
