Progress Log 58 (Python 3): Numpy: Basic Statistics – .mean(), .median(), .corrcoef(), .std(), Basic Plots with Matplotlib

I’m taking a course entitled “Introduction to Python for Data Science” by Microsoft at edX and all of the concepts described below come from that course.


Numpy: Basic Statistics

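Using Numpy, you can calculate the following for your data (assume you imported numpy as np and that np_2d is a 2D Numpy array; np_2d[:, 0] selects its first column and np_2d[:, 1] its second):

  • mean: np.mean(np_2d[:, 0])
  • median: np.median(np_2d[:, 0])
  • correlation: np.corrcoef(np_2d[:, 0], np_2d[:, 1]) returns the matrix of correlation coefficients between the two columns
  • standard deviation: np.std(np_2d[:, 0])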
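
To make the four functions concrete, here is a minimal sketch. The array contents (heights in meters and weights in kilograms) are made up for illustration and are not data from the course; only the np.mean, np.median, np.std, and np.corrcoef calls correspond to the list above.

```python
import numpy as np

# Hypothetical data: one row per person, columns are height (m) and weight (kg).
np_2d = np.array([[1.60, 55.0],
                  [1.70, 65.0],
                  [1.80, 75.0],
                  [1.90, 85.0]])

heights = np_2d[:, 0]  # first column
weights = np_2d[:, 1]  # second column

mean_height = np.mean(heights)        # arithmetic mean of the heights
median_height = np.median(heights)    # middle value of the sorted heights
std_height = np.std(heights)          # standard deviation (population form by default)
corr = np.corrcoef(heights, weights)  # 2x2 matrix of correlation coefficients

print(mean_height, median_height, std_height)
print(corr)
```

The off-diagonal entries of corr (corr[0, 1] and corr[1, 0]) hold the correlation between height and weight; the diagonal is the correlation of each column with itself.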






Basic Plots with Matplotlib

  • Matplotlib is a data visualization library for Python.
  • .pyplot is one of the subpackages of matplotlib. It is conventionally imported with import matplotlib.pyplot as plt.

Line Chart

  • .plot(x, y): describes what to plot. The first list corresponds to the horizontal axis, and the second list to the vertical axis. Calling it gives Python all the ingredients to build a plot, but nothing is displayed yet.
  • .show(): used to actually display the plot.

  • In the course's example, the years are plotted on the x-axis and the population on the y-axis.
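
A line chart along these lines could be sketched as follows. The names year and pop and the values in them are illustrative assumptions, not the course's exact data. The matplotlib.use("Agg") line is also an assumption, added only so the script runs on a machine without a display; remove it to get an interactive window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove for a pop-up window
import matplotlib.pyplot as plt

# Illustrative values only: years and world population in billions.
year = [1950, 1970, 1990, 2010]
pop = [2.5, 3.7, 5.3, 6.9]

# First list goes on the horizontal axis, second list on the vertical axis.
lines = plt.plot(year, pop)

# Nothing is displayed until show() is called.
plt.show()
```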

Scatter plot

  • .scatter(x, y): plots the data as a scatter plot. Each (x, y) pair is drawn as an individual point, and the points are not connected by a line.
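
For comparison with the line chart, here is a minimal scatter plot sketch. The x and y values are invented for illustration, and the matplotlib.use("Agg") line is again an assumption that lets the script run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove for a pop-up window
import matplotlib.pyplot as plt

# Made-up sample data for illustration.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

# Each (x, y) pair becomes one marker; no line is drawn between them.
points = plt.scatter(x, y)

plt.show()
```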




