An updated reading list - 2020-05-27

Tags: study career

Well, it seems my last blog post about things I'd like to do in NYC has been blown to bits by COVID. While in lockdown, I've been going through my reading list and curating some good content that I have come across over the years.

The list

I have grouped the list into a few key areas:

  • Management and process
  • Processes and checklists
  • Machine learning
  • Metrics
  • Text analysis
  • SQL
  • Web
  • Data quality

Each article entry starts with a header, followed by the link, some tags, and a brief description of the article's content.

It might be cool to update the site with a separate section for the reading list. It would also be cool if the tags I currently have set up for blog posts could include these tags and be sorted alphabetically.

Management and process

12 manager readmes

https://hackernoon.com/12-manager-readmes-from-silicon-valleys-top-tech-companies-26588a660afe

  • management, communication, teaming
  • Ideas on how to communicate to others how best to work together

Busy person patterns

https://hillside.net/plop/2006/Papers/Library/PLoP%20Busy%20Person%20Pattern%20v8.pdf

  • time management, get stuff done
  • Exploration of common strategies to address getting work done. Appendix especially useful.

The Guerrilla guide to interviewing

https://www.joelonsoftware.com/2006/10/25/the-guerrilla-guide-to-interviewing-version-30/

  • management
  • How to interview and hire talent

Yes, and...

https://tomcritchlow.com/2019/11/18/yes-and/

  • consulting, presenting
  • Leveraging lessons from improv acting to think on your feet faster in the business realm
  • Four detailed sections not read

Improve your social skills

https://www.improveyoursocialskills.com/foundations/where-are-you-going

  • social skills, personal
  • Writing and lessons to help you reflect on your social situation and goals
  • Detailed sections not read

Why I keep a research blog

http://gregorygundersen.com/blog/2020/01/12/why-research-blog/

  • writing, learning
  • A PhD researcher's reflection on writing as a valuable method of learning

Processes and checklists

The Joel test, 12 steps to better code

https://www.joelonsoftware.com/2000/08/09/the-joel-test-12-steps-to-better-code/

  • process, software development
  • Twelve key software development practices businesses should implement

Some items from my reliability list

http://rachelbythebay.com/w/2019/07/21/reliability/

  • software development, process
  • Rachel's site, like Joel's mentioned above, is a goldmine of blog posts. Some more considerations for building reliable software

Do nothing scripting

https://blog.danslimmon.com/2019/07/15/do-nothing-scripting-the-key-to-gradual-automation/

  • process, efficiency
  • An approach to partially automate repetitive tasks and reduce working memory requirement
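To make the do-nothing idea concrete, here is a minimal Python sketch. The step functions and the `run_procedure` helper are hypothetical names of my own, not code from the linked article. Each step only prints an instruction and waits for confirmation, so any single step can later be swapped for real automation without changing the rest.

```python
def step_export_report():
    """Hypothetical manual step: returns the instruction, automates nothing."""
    return "Export last week's report from the dashboard as CSV."


def step_email_report():
    """Hypothetical manual step: returns the instruction, automates nothing."""
    return "Email the CSV to the team distribution list."


def run_procedure(steps, confirm=input):
    """Print each step's instruction and wait for confirmation before moving on.

    `confirm` defaults to input(), so a person presses Enter between steps.
    Pass another callable to run the procedure non-interactively.
    """
    completed = []
    for number, step in enumerate(steps, start=1):
        print(f"Step {number}: {step()}")
        confirm("Press Enter when done... ")
        completed.append(step.__name__)
    return completed
```

Calling `run_procedure([step_export_report, step_email_report])` walks through the steps interactively. The script holds the checklist, so you no longer need to keep it in your head.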

Machine learning

Rules of machine learning

http://martin.zinkevich.org/rules_of_ml/rules_of_ml.pdf

  • software development, process, machine learning, data analysis
  • Big guide from Google about best practices for working with production machine learning pipelines

ML vs Econometrics y(x) vs betas

https://scholar.harvard.edu/files/sendhil/files/jep.31.2.87.pdf

  • data analysis, statistics, machine learning
  • Good overview of differences in approach for machine learning vs econometric analysis. Betas (the coefficient estimates) and understanding data assumptions are more important in econometrics

Metrics

Optipedia

https://www.optimizely.com/optimization-glossary/

  • metrics
  • Great repository of different metric concepts and processes in web A/B testing

Why not accuracy?

https://stats.stackexchange.com/questions/312780/why-is-accuracy-not-the-best-measure-for-assessing-classification-models

  • metrics, data analytics
  • Great answer to why we eschew accuracy for other measures
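One common way to see the problem is with imbalanced classes. The sketch below uses made-up labels and is my own illustration, not necessarily the argument made in the linked answer. A model that always predicts the majority class scores high on accuracy while finding none of the positives.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of actual positives that the model identified."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


# Made-up data: 95 negatives and 5 positives
y_true = [0] * 95 + [1] * 5
# A "model" that always predicts the majority class
y_pred = [0] * 100

print(accuracy(y_true, y_pred))  # 0.95
print(recall(y_true, y_pred))    # 0.0
```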

Why you should summarise your data with the geometric mean

https://medium.com/@JLMC/understanding-three-simple-statistics-for-data-visualizations-2619dbb3677a

  • data analysis, metrics
  • Discussion of alternatives to the arithmetic mean for reducing your data to a single average. Would appear to work well for highly imbalanced or skewed data
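A small sketch of the difference, using Python's standard `statistics` module (`geometric_mean` needs Python 3.8 or later). The sample values are made up for illustration and span several orders of magnitude.

```python
import statistics

# Made-up, heavily skewed sample
values = [1, 10, 100, 1000]

arithmetic = statistics.mean(values)
geometric = statistics.geometric_mean(values)

print(arithmetic)           # 277.75, pulled up by the largest value
print(round(geometric, 2))  # 31.62, closer to the "typical" magnitude
```

The geometric mean only works for positive values, so zeros or negatives need handling before using it.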

Text analysis

The absolute minimum software developers should know about Unicode

https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/

  • software development, text data
  • Working with text data and conversions as a relatively new programmer can be dangerous, similar to working with dates and time zones.

Unicode in Python, demystified

http://farmdev.com/talks/unicode/

  • software development, text data
  • Similar to the above, good resource to understand Unicode and best practice
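A minimal Python 3 sketch of the core idea in both articles: text is a sequence of code points, bytes are an encoding of it, and decoding with the wrong encoding silently produces garbage. The sample string is my own.

```python
text = "caf\u00e9"  # "café", written with an escape so the code point is explicit

utf8_bytes = text.encode("utf-8")
print(len(text))        # 4 code points
print(len(utf8_bytes))  # 5 bytes, because é takes two bytes in UTF-8

# Decoding with the wrong encoding does not fail, it just gives mojibake
mojibake = utf8_bytes.decode("latin-1")
print(mojibake)         # cafÃ©

# Decoding with the encoding that was used to write the bytes round-trips
round_trip = utf8_bytes.decode("utf-8")
print(round_trip == text)  # True
```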

How to code and analyse verbatim text

https://measuringu.com/code-verbatim/

  • data analysis, text data
  • Decent beginner's guide to processing basic features from text data to use in other analysis (such as predicting NPS)

Self supervised representation learning in NLP

https://amitness.com/2020/05/self-supervised-learning-nlp/

  • text data, data analysis
  • Overview of more advanced machine learning concepts for representing and pre-processing text data

SQL

PostgreSQL exercises

https://pgexercises.com/

  • SQL, learning
  • Good introductory exercises for PostgreSQL, can do it all in the browser

SQL murder mystery

https://mystery.knightlab.com/walkthrough.html

  • SQL, learning
  • Interactive game using SQL, can complete in browser or with downloaded database

Web

Why do we need flask, celery, redis?

https://news.ycombinator.com/item?id=22901856

  • software development
  • Both the link and the comment section are good. Explains the concepts of the three technologies and how the process is similar to ordering takeout food

Data quality

Starting a data quality checklist

https://medium.com/@TWB_BI/starting-a-data-quality-checklist-2d500e97ab5c

  • data analysis, data cleaning
  • Another good checklist guide of what to look for and request when working with new data sources

Quartz guide to bad data

https://qz.com/572338/the-quartz-guide-to-bad-data/

  • data analysis, data cleaning
  • Reasonable checklist for typical problems that can arise in data sources and how to proceed