Talk Python to Me is a weekly podcast hosted by developer and entrepreneur Michael Kennedy. We dive deep into the popular packages and software developers, data scientists, and incredible hobbyists doing amazing things with Python. If you're new to Python, you'll quickly learn the ins and outs of the community by hearing from the leaders. And if you've been Pythoning for years, you'll learn about your favorite packages and the hot new ones coming out of open source.
#503: The PyArrow Revolution
April 28, 2025
01:08:36
Pandas is at the core of virtually all data science done in Python, that is, virtually all data science. Since its beginning, Pandas has been based upon NumPy. But changes are afoot to update those internals, and you can now optionally use PyArrow. PyArrow comes with a ton of benefits, including its columnar format, which makes answering analytical questions faster, support for a range of high-performance file formats, inter-machine data streaming, faster file I/O, and more. Reuven Lerner is here to give us the low-down on the PyArrow revolution.
Episode sponsors
NordLayer
Auth0
Talk Python Courses
Links from the show
Reuven: github.com/reuven
Apache Arrow: github.com
Parquet: parquet.apache.org
Feather format: arrow.apache.org
Python Workout Book: manning.com
Pandas Workout Book: manning.com
Pandas: pandas.pydata.org
PyArrow CSV docs: arrow.apache.org
Future string inference in Pandas: pandas.pydata.org
Pandas NA/nullable dtypes: pandas.pydata.org
Pandas `.iloc` indexing: pandas.pydata.org
DuckDB: duckdb.org
Pandas user guide: pandas.pydata.org
Pandas GitHub issues: github.com
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm
--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy