Trent Hauck

This is a simple website where I share things I'm interested in and keep notes for myself. I hope you enjoy it. If you'd like to contact me, please email me at trent@trenthauck.com. I'm also on Twitter at @trent_hauck and LinkedIn at trent-hauck.

If you're interested in hiring me, please check out my resume.

Empty Relation Propagation

2024-06-23

This is a small vignette on how to propagate empty relations in Apache DataFusion.

Tries

2024-06-07

This is a small vignette on how to implement a trie in Python.
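
As a rough illustration of the idea, here is a minimal dictionary-backed trie. It is a sketch only, not necessarily the implementation in the post; the `Trie` class and its `insert`, `contains`, and `starts_with` methods are names assumed for this example.

```python
class Trie:
    """A minimal trie node; each node maps a character to a child node."""

    def __init__(self):
        self.children = {}    # character -> Trie
        self.is_word = False  # True if an inserted word ends at this node

    def insert(self, word):
        """Walk down the trie, creating nodes as needed, and mark the end."""
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def _find(self, prefix):
        """Return the node reached by following prefix, or None if absent."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word):
        """True only if word was inserted as a complete word."""
        node = self._find(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        """True if any inserted word begins with prefix."""
        return self._find(prefix) is not None
```

Inserting "data" and "datafusion" makes `contains("data")` true and `contains("dat")` false, while `starts_with("dat")` is true because both words share that prefix.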

Recent DataFusion Contributions

2024-06-03

This is a summary of my recent contributions to Apache DataFusion.

Edit Distance

2024-06-02

This is a small vignette on how to calculate the edit distance between two strings in Python.
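
For a sense of what this involves, here is a minimal sketch of the Levenshtein distance using dynamic programming. It assumes unit cost for insertions, deletions, and substitutions; the post may use a different variant, and the function name `edit_distance` is chosen for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b, keeping two rows."""
    # prev[j] is the distance between the first i-1 characters of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete ca
                    curr[j - 1] + 1,     # insert cb
                    prev[j - 1] + cost,  # substitute, or match for free
                )
            )
        prev = curr
    return prev[-1]
```

For example, `edit_distance("kitten", "sitting")` is 3: two substitutions and one insertion.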

Native Spark Accelerators

2024-06-01

These are quick notes summarizing the different Spark Accelerators.

Posterior Trace Querying

2024-05-22

This shows how you can use a database to query posterior traces to perform inference on the results.

Node Traversal

2024-05-20
https://www.trenthauck.com/node-traversal.html

This is a small vignette on how to do pre-, post-, and in-order traversal of a tree in Python.
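
As a quick sketch of the three orders on a binary tree (an assumption for this example; the post may structure things differently), the recursive generators below differ only in where the node's own value is yielded. The `Node` class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def pre_order(node):
    """Visit the node, then its left subtree, then its right subtree."""
    if node is None:
        return
    yield node.value
    yield from pre_order(node.left)
    yield from pre_order(node.right)


def in_order(node):
    """Visit the left subtree, then the node, then the right subtree."""
    if node is None:
        return
    yield from in_order(node.left)
    yield node.value
    yield from in_order(node.right)


def post_order(node):
    """Visit the left subtree, then the right subtree, then the node."""
    if node is None:
        return
    yield from post_order(node.left)
    yield from post_order(node.right)
    yield node.value
```

For a tree with root 4, left child 2 (with children 1 and 3), and right child 5, pre-order gives 4, 2, 1, 3, 5; in-order gives 1, 2, 3, 4, 5; and post-order gives 1, 3, 2, 5, 4.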

PyMC AB Test Example

2024-05-19
https://www.trenthauck.com/pymc-ab-test.html

This is a simple example of how to use PyMC to determine if there's a difference in click-through rates between two versions of a website.

The Best Way to Write Python is in Rust

2024-05-18
https://www.trenthauck.com/pycascades-2024.html

This is a presentation I've given a couple of times sharing how and why to use Rust to write Python extensions, including a special topic on Data Engineering applications with Arrow.

Adding `unhex` to Apache DataFusion Comet

2024-05-02
https://github.com/apache/datafusion-comet/pull/342

This was my first commit in Apache DataFusion Comet, which is a Spark Accelerator based on Apache DataFusion. There are behavioral changes between Spark 3.2 and versions thereafter. Moreover, Spark 3.4 introduces a `failOnError` argument, which requires shimming different code for different Spark versions.