Database Clinic: Neo4j is a fine online course. The content is easy to follow, but it doesn’t cover enough of the fundamental concepts and theory behind Neo4j. The version of Neo4j used in the videos is not the most current one because the course was published in 2017, so some lines of the code didn’t work. Neo4j Desktop made browsing the database easier than using a web browser. This course is a good resource for exploring what a graph database can do with flat data.
Epidemiology: The Basic Science of Public Health is a great fundamentals course offered by the University of North Carolina at Chapel Hill. It is a 6-week course covering the history of epidemiology, how to measure disease frequency, various study designs, measures of association, and causality.
The videos are short and concise. The quizzes are great for reviewing what I just learned from the video lectures. The content is quite heavy, but the instruction is well organized and covers all the fundamental concepts. The definitions of the measures of health outcomes are still unclear to me, but this course was very helpful for learning the basics.
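To make the measures of association a little more concrete, here is a minimal Python sketch that computes a risk ratio, a risk difference, and an odds ratio from a 2x2 table. It is not taken from the course: the function name and the counts are made up for illustration, and it uses only the standard textbook definitions.

```python
def association_measures(a, b, c, d):
    """Compute basic measures of association from a 2x2 table.

    a: exposed, with disease       b: exposed, without disease
    c: unexposed, with disease     d: unexposed, without disease
    (Hypothetical helper for illustration, not from the course.)
    """
    risk_exposed = a / (a + b)      # proportion of exposed who got the disease
    risk_unexposed = c / (c + d)    # proportion of unexposed who got the disease
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "risk_ratio": risk_exposed / risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
        "odds_ratio": (a * d) / (b * c),
    }


# Made-up counts: 30 of 100 exposed and 10 of 100 unexposed developed the disease.
result = association_measures(30, 70, 10, 90)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

A risk ratio above 1 means the disease was more common in the exposed group than in the unexposed group in this made-up table.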
The content of this book is an extremely useful resource for learning and understanding statistical concepts and techniques. It is great to see how each concept is implemented in Python and R, but only code snippets are provided for many of the examples in the book. Fortunately, the publisher provides the complete code for each chapter. The widely used Python packages (pandas, numpy, scipy, statsmodels, sklearn, matplotlib, seaborn, and more) and R libraries can be easily located in each chapter and in the index.
R libraries used:
library(boot) # Bootstrap Functions
library(ca) # Simple, Multiple and Joint Correspondence Analysis
library(cluster) # “Finding Groups in Data”: Cluster Analysis
library(corrplot) # Visualization of a Correlation Matrix
library(dplyr) # A Grammar of Data Manipulation
library(ellipse) # Functions for Drawing Ellipses and Ellipse-Like Confidence Regions
library(FNN) # Fast Nearest Neighbor Search Algorithms and Applications
library(ggplot2) # Create Elegant Data Visualisations Using the Grammar of Graphics
library(gmodels) # Various R Programming Tools for Model Fitting
library(klaR) # Classification and Visualization
library(lmPerm) # Permutation Tests for Linear Models
library(lubridate) # Make Dealing with Dates a Little Easier
library(MASS) # Support Functions and Datasets for Venables and Ripley’s MASS
library(matrixStats) # Functions that Apply to Rows and Columns of Matrices (and to Vectors)
library(mclust) # Gaussian Mixture Modelling for Model-Based Clustering, Classification, and Density Estimation
library(mgcv) # Mixed GAM Computation Vehicle with Automatic Smoothness Estimation
library(pwr) # Basic Functions for Power Analysis
library(randomForest) # Breiman and Cutler’s Random Forests for Classification and Regression
library(rpart) # Recursive Partitioning and Regression Trees
library(tidyr) # Tidy Messy Data
library(vioplot) # Violin Plot
library(xgboost) # Extreme Gradient Boosting
This book was published in 2017. It covers the fundamentals of the Python language. O’Reilly offers companion videos.
This book is a little outdated, but it is great for beginners who want to grasp the fundamentals. The companion videos use a text editor and Python in a command-line environment. They are great for learning how to use command-line arguments, but that workflow is less useful now that Jupyter notebooks have become dominant.
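For anyone who has only worked in notebooks, here is a minimal sketch of what handling command-line arguments looks like in Python. It uses the standard-library argparse module and is not taken from the book or the videos: the argument names and the greeting are made up for illustration.

```python
import argparse


def build_parser():
    """Describe the arguments this hypothetical script accepts."""
    parser = argparse.ArgumentParser(description="Greet someone from the command line.")
    parser.add_argument("name", help="name of the person to greet")
    parser.add_argument("--times", type=int, default=1,
                        help="how many times to print the greeting")
    return parser


def make_greeting(name, times):
    """Build the greeting text, one line per repetition."""
    return "\n".join(f"Hello, {name}!" for _ in range(times))


def main(argv=None):
    # With argv=None, argparse reads the real command line (sys.argv[1:]).
    args = build_parser().parse_args(argv)
    print(make_greeting(args.name, args.times))


# An explicit list is passed here so the sketch also runs inside a notebook.
# In a real script, call main() with no argument instead.
main(["Ada", "--times", "2"])
```

Saved under a hypothetical file name such as greet.py, with the last line changed to main(), it would be run from a terminal as: python greet.py Ada --times 2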