Posts

Data-proximate Computing with Coiled Functions

The original version of this post appears on blog.coiled.io. Coiled Functions make it easy to improve performance and reduce costs by moving your computations next to your cloud data. It’s common practice for data scientists and researchers to analyze data on their local work computer (often a laptop). This works great for data that’s stored locally on their machine. However, data is increasingly moving to cloud storage systems like AWS S3 and Google Cloud Storage....

Distributed printing

The original version of this post appears on blog.coiled.io. Dask makes it easy to print whether you’re running code locally on your laptop, or remotely on a cluster in the cloud. One of the most basic things programmers do is print text to their screen. Printing is often used for things like debugging:

    # ...
    print("Made it here...")
    # ...

or to signal progress:

    for i in range(10):
        print(f"Done with iteration {i}")

However, when running code at scale on a Dask cluster, even simple print calls can become non-intuitive....

Upstream testing in Dask

The original version of this post appears on blog.coiled.io. Dask has deep integrations with other libraries in the PyData ecosystem like NumPy, pandas, Zarr, PyArrow, and more. Part of providing a good experience for Dask users is making sure that Dask continues to work well with this community of libraries as they push out new releases. This post walks through how Dask maintainers proactively ensure Dask continuously works with its surrounding ecosystem....

Better living through saved replies on GitHub

Using saved replies on GitHub can improve your developer life.

Triaging Bug Reports

Triaging bug reports is an important part of maintaining a library. This post lists some of my thoughts on how to triage effectively. I should note that what’s outlined here may not apply to your situation. Best practices vary depending on the size of your project, its funding model, the culture of the larger surrounding ecosystem, etc. The post is based on my experience maintaining Dask and other libraries in the PyData ecosystem....