R-squared, or the coefficient of determination, represents the proportion of the variance in a dependent variable that is explained by the independent variable(s) of a regression model. It reflects the goodness of fit of the derived model, with values typically ranging from 0 to 1, where values closer to 1 indicate a better fit. However, the presence of outliers or collinearity among predictors can affect how well this measure reflects model accuracy. The embedded video demonstrates how to derive the R-squared value with a calculator and with R software.
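As a rough illustration of the definition above, R-squared can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses NumPy and a small made-up data set; the function name `r_squared` and the sample values are assumptions for illustration and are not taken from the video.

```python
import numpy as np

def r_squared(y_obs, y_pred):
    """Return 1 - SS_res / SS_tot for observed and predicted values."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_obs - y_pred) ** 2)        # unexplained variation
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)  # total variation
    return 1.0 - ss_res / ss_tot

# Made-up illustrative data (not from the source)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit a straight line by least squares and score it
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept
r2 = r_squared(y, y_hat)
print(round(r2, 4))
```

For a simple linear regression with an intercept, this value equals the square of the Pearson correlation between x and y, which gives a quick cross-check.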

Python Pandas

Pandas is an open-source data analysis library for Python that was developed in 2008 and released in 2010. The library offers many query features that apply to a wide range of data, including financial and environmental data. The short video illustrates how Pandas can be used to analyse hydrological data.
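A minimal sketch of the kind of query pandas supports is shown here, assuming a small hypothetical table of daily discharge records. The column names, site labels, threshold and values are invented for illustration and do not come from the video.

```python
import pandas as pd

# Hypothetical daily discharge records (m^3/s) for two gauging sites
flow = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03",
                            "2020-01-01", "2020-01-02", "2020-01-03"]),
    "site": ["A", "A", "A", "B", "B", "B"],
    "discharge": [12.5, 15.0, 30.2, 4.1, 3.8, 5.6],
})

# Filter rows with a query expression, then summarise per site
high_flow = flow.query("discharge > 10")
mean_by_site = flow.groupby("site")["discharge"].mean()

print(high_flow)
print(mean_by_site)
```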

Statistical functions in R and Python

Exploratory data analysis and preliminary statistical analysis rely on basic statistical functions. R and Python offer various packages for specific analyses, and Python libraries such as pandas and NumPy include built-in statistical functions. Below is a collection of basic statistical functions in R and Python.
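On the Python side, a few of these basic functions might look like the sketch below. The sample values are made up. Note that pandas computes the sample standard deviation by default (ddof=1), while NumPy defaults to the population form (ddof=0).

```python
import numpy as np
import pandas as pd

# Made-up sample values for illustration
values = pd.Series([2.0, 4.0, 4.0, 5.0, 7.0, 8.0])

summary = {
    "mean": values.mean(),
    "median": values.median(),
    "std_sample": values.std(),                    # pandas default: ddof=1
    "std_population": np.std(values.to_numpy()),   # NumPy default: ddof=0
    "min": values.min(),
    "max": values.max(),
    "q25": values.quantile(0.25),                  # first quartile
}

for name, result in summary.items():
    print(f"{name}: {result:.3f}")
```

Calling `values.describe()` returns several of these statistics in one step.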

Column calculation in R

Base R and the dplyr package can help calculate column values. The following example uses the mutate() function to calculate discharge (cubic meters per second) for different sites from simulated velocity (meters per second) and area (square meters) data.
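For comparison, the same column calculation can be sketched in Python, where the pandas `assign()` method plays a role similar to dplyr's mutate(). This is an analogue, not the R example itself. It assumes velocity in meters per second and cross-sectional area in square meters, so that discharge = velocity × area is in cubic meters per second. The site names and values are simulated.

```python
import pandas as pd

# Simulated measurements for three hypothetical sites
sites = pd.DataFrame({
    "site": ["S1", "S2", "S3"],
    "velocity": [0.5, 1.2, 0.8],   # m/s (assumed unit)
    "area": [10.0, 4.0, 12.5],     # m^2 (assumed unit)
})

# Add a derived column, similar in spirit to dplyr::mutate()
sites = sites.assign(discharge=lambda d: d["velocity"] * d["area"])  # m^3/s

print(sites)
```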

Python Plotly library

The Plotly library contains effective tools for visualising data. Time series data for several categories can be depicted with animated bar charts. The code works for data that holds the categories in one column and the values in another. The animated bar chart below presents the life expectancy values of selected countries, with y-axis values ranging from 0 to 100.

