Current Projects

Published

September 8, 2023

I generally have a couple of big projects I’m working on that are “works in progress” and haven’t made it to a blog post yet. Links to the drafts are included, but they are just drafts.

What I’m working on now:

Family History Project

Tombstones (in R)- I’m working on a project for my father that will culminate in a website for his genealogy research. There are a couple of different parts that I’m working on independently.

  • Linking photos of family gravestones to an Excel sheet that records the GPS location of the tombstones. This combined dataset will be used to generate a leaflet map (and is currently called leaflet_tester). I will probably separate this into two or three portions, though.

    • the data cleanup and the actual matching (done and posted)

    • leaflet styling (done and posted)

    • web scraping portion to add more information about some of the ancestors (done and posted)

    • My father also gave me several papers he’s written about family members. I want to write code to determine which family members are mentioned in each document and match them to his Excel sheet. (not started)
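
Since the document-matching step has not been started, here is a minimal sketch of one way it could begin, written in Python rather than R. It assumes the names from the Excel sheet have already been exported to a plain list and that each paper is available as text; the names, file names, and the find_mentions helper are all hypothetical placeholders, not part of the actual project.

```python
import re

def find_mentions(text, names):
    """Return the names from `names` that appear in `text` as whole phrases."""
    found = []
    for name in names:
        # Allow any run of whitespace between name parts (line breaks,
        # double spaces) and match case-insensitively on word boundaries.
        parts = (re.escape(part) for part in name.split())
        pattern = r"\b" + r"\s+".join(parts) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(name)
    return found

# Hypothetical stand-ins for the Excel sheet's name column and the papers.
names = ["Mary Ann Smith", "John Smith", "Eliza Brown"]
documents = {
    "paper_1.txt": "John  Smith settled here in 1850 with his sister, Mary Ann Smith.",
    "paper_2.txt": "Little is known about Eliza Brown or the Smithson family.",
}

# Map each document to the family members it mentions.
mentions = {doc: find_mentions(text, names) for doc, text in documents.items()}
for doc, found in mentions.items():
    print(doc, "->", found)
```

Exact matching like this would miss nicknames, initials, and maiden names, so a real version would probably need fuzzy matching or a table of name variants on top of it.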

I’ve spent a surprisingly large amount of time today learning how to handle “large” files. The photos are added to the tombstone repo using Git LFS, in case anyone wants to run my code. There are about 2 GB of photos, compressed to ~300 MB. The map with photos is about 500 MB, which is also too large for GitHub Pages. That’s okay, since I plan to host it on a different website devoted entirely to my father’s family history research.

Credit Card Fraud

I’m also exploring the credit card fraud project in other languages. I previously wrote two R Tidymodels tutorials using this data. They can be found here and here.

By using the same dataset, I can focus on learning the new language instead of thinking about the higher-level aspects of the analysis. I’m also able to spot mistakes much more easily.

Credit Card Fraud (Tableau)- I’ve been working on replicating the credit card fraud EDA in Tableau. I have an adequate draft, but I need to go back and fix the tooltips. There were some interesting results concerning the geographic variables. I made some assumptions when I did the modeling in R that weren’t correct. I will write up some lessons I learned while working in Tableau, but I also want to see how I could have caught those incorrect assumptions while working in R.

Credit Card Fraud (Python)- I am rebuilding my credit card fraud project in Python to improve my skills. I’m also learning how to use Python within RStudio and the ins and outs of the reticulate package.