Copy/Paste a DataFrame with pd-replicator

Originally published in Towards Data Science.

Data science is extremely iterative. Those iterations can involve moving data back and forth between different platforms. Maybe you pull some data from your BI tool, run some analysis in Jupyter, then move the output into a spreadsheet to share with your teammates. Then you find something you need to fix and end up running the whole thing again.

The thing is, when you’re juggling this data around, you’re spending mental energy on something inherently tedious. I built pd-replicator to make it as simple as possible to move data out of Jupyter and into a spreadsheet, which I usually find to be the most painful part of data juggling.

Overview

pd-replicator adds a copy button to a DataFrame output in Jupyter, which lets you copy that DataFrame to your clipboard in a format that allows it to be pasted directly into a spreadsheet.

It’s a simple pip package that requires minimal setup to get working. It works on all flavors of Jupyter, including classic Jupyter, JupyterLab, and Google Colaboratory. It works on both local installs and remotely hosted setups such as AWS SageMaker or JupyterHub.

Here’s an example of what it looks like in action:

Jupyter Demo

Once you’ve copied the DataFrame, pasting it into a spreadsheet is as easy as selecting a cell and hitting paste:

Excel Demo
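To make "a format that can be pasted directly into a spreadsheet" concrete, here is a minimal sketch of tab-separated text, which is the layout pandas' own DataFrame.to_clipboard() produces by default. This is an illustration of the general idea, not pd-replicator's actual internals; the sample data and variable names are made up for the example.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"city": ["Lima", "Oslo"], "sales": [120, 95]})

# Tab-separated text: one row per line, one tab between cells.
# Spreadsheets generally split pasted text on tabs and newlines,
# so each value lands in its own cell.
tsv = df.to_csv(sep="\t", index=False)
print(tsv)
```

Pasting that text into a spreadsheet cell should fill a 3x2 block: a header row followed by the two data rows.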

Once it’s installed and set up, using it is as straightforward as wrapping your chosen DataFrame with replicator() to add a copy button to the output. The copy button widget also has a dropdown that lets you choose exactly what you want to copy from the DataFrame.

I’ve included some instructions below, but they’re also available in the readme.

Installation

Installation can be done through pip:

> pip install pd-replicator

ipywidgets must be set up in order for the button/dropdown to display correctly:

> pip install ipywidgets
> jupyter nbextension enable --py widgetsnbextension

To use with JupyterLab, an additional step is required:

> jupyter labextension install @jupyter-widgets/jupyterlab-manager

Usage

Wrap replicator() around any pandas DataFrame/Series to display the replicator copy button above the DataFrame/Series:

from pd_replicator import replicator
replicator(df)

For remotely hosted instances, the native option should be set to False:

from pd_replicator import replicator
replicator(df, native=False)

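Setting native=False copies to the clipboard through your browser using JavaScript, rather than through the system copy method used by pandas’ DataFrame.to_clipboard(), which runs on the machine hosting the kernel and so cannot reach your local clipboard.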
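If you switch between local and remote notebooks, one option is a small wrapper that picks the native setting for you. The sketch below is only a convenience pattern: with_copy_button and its remote parameter are hypothetical names, not part of pd-replicator, and it assumes native=True is the right setting for a local install.

```python
import pandas as pd


def with_copy_button(df, remote=False):
    """Hypothetical helper: wrap df with replicator(), using native=False on remote hosts."""
    # Imported lazily so the helper can be defined before pd-replicator is installed.
    from pd_replicator import replicator

    # Assumption: native=True suits local installs; remote hosts need native=False.
    return replicator(df, native=not remote)


# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"city": ["Lima", "Oslo"], "sales": [120, 95]})

# In a notebook cell on a remotely hosted instance, end the cell with:
# with_copy_button(df, remote=True)
```

Keeping the choice in one place means a notebook only needs a single flag changed when it moves between your laptop and a hosted environment.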

Feedback

If you run into a problem with pd-replicator that you can’t resolve, please feel free to create an issue on GitHub!