Alexander Klapheke

I’m a data scientist and self-taught coder (Python, R, Haskell) with a background in linguistics and math and over 10 years’ experience explaining technical concepts to laypeople.

Foreword

CC-BY

I should not talk so much about myself if there were anybody else whom I knew as well.

As a preface to this blog, let me introduce myself: I’m a linguist turned data scientist living in the Boston area. While a lot of my work deals with natural language, I’m interested in the gamut of data and what knowledge can be gleaned from it. I won’t comment much on pop data science or world events, but I hope that through reasoned analysis I can provide clarity about whatever data I can get hold of.

I entered data science after leaving academia, but the transition wasn’t abrupt. I’d learned to code some years prior, and ran experiments as a graduate student, punctiliously collecting small data sets, modeling them in R, and encapsulating the results in manuscripts and slides. The tools I use have changed (Python largely displacing R), and the data sets are some orders of magnitude larger, but my work in graduate school laid the flagstones for a data science career.

This blog will mainly comprise overviews of the data projects I’m working on, essays on how I approach data, particularly language data, and posts about how I use various tools: not just programming languages and text editors, but also models and theorems. I hope not only to showcase what insights can be loosed from a data set (and what illusory insights come of unsound analyses), but also to serve as an aid and reference to other data scientists, not least among them my future self. A full portfolio of my work can be found on my homepage.