Data Visualization
Data visualization allows us to quickly interpret data and to adjust different variables to see their effect.
Why is data visualization needed?
- Observe patterns
- Identify extreme values that could be anomalies
- Easy interpretation
Matplotlib
- Matplotlib is a 2D plotting library that produces good-quality figures
- Although it has its origins in emulating the MATLAB graphics commands, it is independent of MATLAB
- It makes heavy use of NumPy and other extension code to provide good performance even for large arrays
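As a minimal sketch of how Matplotlib works directly with NumPy arrays, the snippet below plots a sine curve. The arrays x and y, the number of points, and the labels are illustrative choices and are not part of the Toyota example.

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative data: 1,000 evenly spaced points between 0 and 2*pi
x = np.linspace(0, 2 * np.pi, 1000)
y = np.sin(x)  # NumPy evaluates the whole array in one call

plt.plot(x, y)  # Matplotlib accepts NumPy arrays directly
plt.title("sine curve from a numpy array")
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.show()
```

The same pattern (build arrays, call a plotting function, label, show) is used by every plot in this section.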
Let's now read the data into a DataFrame using pandas.
You can download the dataset below.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Read Toyota.csv (expected in the working directory) into a DataFrame
data = pd.read_csv('Toyota.csv')
print(data)
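Before plotting, it can help to check what was actually loaded. The sketch below uses a few made-up rows in place of Toyota.csv so that it runs on its own; the values are invented for illustration, and FuelType is an assumed column name (Age, Price and KM are the columns used in this tutorial).

```python
import io
import pandas as pd

# Made-up rows standing in for Toyota.csv; the values are illustrative only.
# FuelType is an assumed column name, not confirmed by the dataset.
sample_csv = io.StringIO(
    "Price,Age,KM,FuelType\n"
    "10000,12,20000,petrol\n"
    "9000,24,45000,diesel\n"
    "7500,48,90000,petrol\n"
    "6000,60,120000,cng\n"
)
data = pd.read_csv(sample_csv)

print(data.head())   # first rows, a quick sanity check of the load
print(data.shape)    # (number of rows, number of columns)
print(data.dtypes)   # columns to be plotted as numbers should be numeric
```

If a column that should be numeric shows up with dtype object, it most likely contains non-numeric entries that need cleaning before plotting.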
Scatter plot
A scatter plot is a set of points that represents the values obtained for two different variables, plotted on the horizontal and vertical axes.
Uses of scatter plot
- Scatter plots are used to convey the relationship between two numerical variables
- Scatter plots are sometimes called correlation plots because they show how two variables are correlated
plt.scatter(data['Age'], data['Price'], c="red")
plt.title("scatter plot of price vs age of the cars")
plt.xlabel("age(months)")
plt.ylabel("price(euros)")
plt.show()
The scatter function takes the x-axis values, the y-axis values, and a color. Syntax: plt.scatter(x, y, c="color")
title sets the title of the plot, and xlabel and ylabel set the labels of the x and y axes.
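To make the link between a scatter plot and correlation concrete, here is a small sketch on synthetic data (not the Toyota dataset). The slope, noise level and random seed are arbitrary assumptions; np.corrcoef computes the Pearson correlation coefficient, and the marker and alpha arguments are optional extras of plt.scatter.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data (not the Toyota dataset): price falls as age rises, plus noise
rng = np.random.default_rng(0)
age = rng.uniform(1, 80, size=200)                         # months
price = 20000 - 150 * age + rng.normal(0, 1500, size=200)  # euros

# Pearson correlation coefficient between the two variables
r = np.corrcoef(age, price)[0, 1]

# alpha makes overlapping points easier to see
plt.scatter(age, price, c="red", marker="o", alpha=0.5)
plt.title(f"synthetic price vs age (r = {r:.2f})")
plt.xlabel("age(months)")
plt.ylabel("price(euros)")
plt.show()
```

A downward-sloping cloud of points corresponds to a negative coefficient; a cloud with no visible trend would give a value near zero.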
Histogram
- It is a graphical representation of data using bars of different heights
- A histogram groups numbers into ranges (bins), and the height of each bar depicts the frequency of values in that bin
Uses of histogram
To represent the frequency distribution of numerical variables
plt.hist(data['KM'], color='green', edgecolor='white', bins=5)
plt.title('histogram of kilometer')
plt.xlabel("kilometer")
plt.ylabel('frequency')
plt.show()
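To see what bins=5 does, the sketch below captures the values that plt.hist returns: the count per bin, the bin edges, and the drawn bars. The kilometer values are synthetic (uniform random numbers with an arbitrary seed), not the KM column of the dataset.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic kilometer readings (not the Toyota dataset)
rng = np.random.default_rng(1)
km = rng.uniform(0, 200000, size=500)

# plt.hist returns the count per bin, the bin edges, and the bar patches
counts, edges, patches = plt.hist(km, bins=5, color="green", edgecolor="white")
plt.title("histogram of synthetic kilometer values")
plt.xlabel("kilometer")
plt.ylabel("frequency")

# Bar i spans edges[i] to edges[i + 1], and its height is counts[i]
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:9.0f} to {right:9.0f}: {int(n)} values")

plt.show()
```

With an integer bins argument the range of the data is split into that many equal-width intervals, so five bins always produce six edges.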
Bar plot
A bar plot presents categorical data with rectangular bars whose lengths are proportional to the counts they represent.
Uses of bar plot
To represent the frequency distribution of categorical variables
A bar diagram makes it easy to compare sets of data between different groups
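# Frequency of each fuel type, in the same order as the labels below
counts = [979, 120, 12]
fueltype = ('petrol', 'diesel', 'cng')
index = np.arange(len(fueltype))  # bar positions 0, 1, 2
plt.bar(index, counts, color=['red', 'blue', 'cyan'])
plt.title("bar plot for fuel types")
plt.xlabel("fuel types")
plt.ylabel('frequency')
# Label each bar position with its fuel type, rotated to be vertical
plt.xticks(index, fueltype, rotation=90)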
plt.show()
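The counts above are typed in by hand. As a hedged alternative, they could be computed from a categorical column with pandas value_counts. The sketch below uses a small made-up DataFrame so that it runs on its own; the column name FuelType and the ten rows are assumptions for illustration, not values from the dataset.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up fuel types standing in for a categorical column of the dataset;
# the column name FuelType is an assumption.
data = pd.DataFrame({"FuelType": ["petrol"] * 6 + ["diesel"] * 3 + ["cng"]})

# value_counts tallies each category, sorted from most to least frequent
counts = data["FuelType"].value_counts()

# Passing the category names as x lets Matplotlib label the bars itself
plt.bar(counts.index.tolist(), counts.values, color=["red", "blue", "cyan"])
plt.title("bar plot for fuel types")
plt.xlabel("fuel types")
plt.ylabel("frequency")
plt.show()
```

Because the category names are passed as the x values, no separate index array or xticks call is needed here.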