Progress Log 58 (Python 3): Numpy: Basic Statistics – .mean(), .median(), .corrcoef(), .std(), Basic Plots with Matplotlib
I’m taking a course entitled “Introduction to Python for Data Science” by Microsoft on edX, and all of the concepts described below come from that course.
Numpy: Basic Statistics
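Using NumPy, you can calculate the following for your data (assume you imported numpy as np and that np_2d is a 2D array whose columns are the variables):
- mean: np.mean(np_2d) averages every element of the array. To get the mean of a single column, select it first, e.g. np.mean(np_2d[:, 0]).
- median: np.median(np_2d[:, 0])
- correlation: np.corrcoef(np_2d[:, 0], np_2d[:, 1]) returns the correlation matrix of two columns. Note that np.corrcoef(np_2d) on its own treats each row, not each column, as a variable.
- standard deviation: np.std(np_2d[:, 0])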
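To make the functions listed above concrete, here is a minimal sketch. The array contents and the names height and weight are made up for illustration and are not from the course.

```python
import numpy as np

# Hypothetical data: each row is one person,
# column 0 is height in metres, column 1 is weight in kilograms.
np_2d = np.array([[1.70, 65.0],
                  [1.80, 80.0],
                  [1.60, 55.0],
                  [1.90, 90.0]])

# Select each column so the statistics describe a single variable.
height = np_2d[:, 0]
weight = np_2d[:, 1]

mean_height = np.mean(height)
median_height = np.median(height)
std_height = np.std(height)  # population standard deviation (ddof=0) by default

# corrcoef returns a 2x2 matrix; the off-diagonal entries hold the
# correlation between height and weight.
corr = np.corrcoef(height, weight)

# Column-wise means in a single call, using the axis argument.
column_means = np.mean(np_2d, axis=0)

print(mean_height, median_height, std_height)
print(corr)
print(column_means)
```

Selecting a column before calling the function keeps units from being mixed: np.mean(np_2d) without a column or axis would average heights and weights together.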
Basic Plots with Matplotlib
- Matplotlib is a data visualization library for Python.
- .pyplot is one of the subpackages of matplotlib. It is conventionally imported with import matplotlib.pyplot as plt.
- .plot(x, y): used to plot the data as a line. The first list corresponds to the horizontal axis, and the second list to the vertical axis. The plot function gives Python all the ingredients to build a plot, but it does not display anything yet.
- .show(): used to actually display the plot.
- In the course example, the years are plotted on the x-axis and the population on the y-axis.
- .scatter(x, y): used to plot the data as a scatter plot, where each pair of values is drawn as an individual point.
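The plotting steps above can be sketched as follows. The year and pop values are sample figures for illustration only, and the "Agg" backend line is an assumption added so the script also runs on a machine without a display. Remove that line to get an interactive window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove for a pop-up window
import matplotlib.pyplot as plt

# Sample data for illustration: years and population in billions.
year = [1950, 1970, 1990, 2010]
pop = [2.519, 3.692, 5.263, 6.972]

# Line plot: first list on the horizontal axis, second on the vertical axis.
plt.figure()
line_artists = plt.plot(year, pop)
plt.show()  # displays the figure when an interactive backend is in use

# Scatter plot: the same data drawn as individual points.
plt.figure()
points = plt.scatter(year, pop)
plt.show()
```

Calling plt.figure() before each plot starts a fresh figure, so the scatter points are not drawn on top of the line.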