Copy/Paste a DataFrame with pd-replicator
Originally published in Towards Data Science.
Data Science is extremely iterative. Those iterations can involve lots of back and forth of data between different platforms. Maybe you pull some data from your BI tool, run some analysis in Jupyter, then move the output into a spreadsheet to share with your teammates. Then you find something you need to fix and end up running the whole thing again.
The thing is, when you’re juggling this data around, you’re losing mental energy/health on something inherently boring. I built pd-replicator to help make it as simple as possible to move data out of Jupyter and into a spreadsheet, which I usually find to be the most painful part of data juggling.
pd-replicator adds a copy button to a DataFrame output in Jupyter, which lets you copy that DataFrame to your clipboard in a format that allows it to be pasted directly into a spreadsheet.
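To give a feel for what a spreadsheet-friendly clipboard format can look like, here is a minimal sketch using plain pandas. It assumes tab-separated text, which spreadsheet applications generally split into cells on paste. It is not necessarily what pd-replicator does internally, and the example DataFrame is made up for illustration.

```python
import pandas as pd

# Made-up example data, purely for illustration
df = pd.DataFrame({"city": ["Oslo", "Lima"], "population_m": [0.7, 10.1]})

# Tab-separated text: tabs separate columns, newlines separate rows,
# which is what spreadsheets typically expect from pasted text
tsv = df.to_csv(sep="\t", index=False)
print(tsv)
```

Pasting text in this shape into a cell usually fills a block of cells, one per value, rather than dumping everything into a single cell.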
It’s a simple pip package and requires minimal setup to get working. It works on all flavors of Jupyter, including classic Jupyter, JupyterLab, and Google Colaboratory. It also works on both local setups and remotely hosted ones such as AWS SageMaker or JupyterHub.
Here’s an example of what it looks like in action:
Once you’ve copied the DataFrame, pasting it into a spreadsheet is as easy as selecting a cell and hitting paste:
Once it’s installed and set up, using it is as straightforward as wrapping your chosen DataFrame with replicator() to add a copy button to the output. The copy button widget also has a dropdown that lets you choose exactly what you want to copy from the DataFrame.
I’ve included some instructions below, but they’re also available in the readme.
Installation can be done through pip:
> pip install pd-replicator
ipywidgets must be set up in order for the button/dropdown to display correctly:
> pip install ipywidgets
> jupyter nbextension enable --py widgetsnbextension
To use with JupyterLab, an additional step is required:
> jupyter labextension install @jupyter-widgets/jupyterlab-manager
Wrap replicator() around any pandas DataFrame/Series to display the replicator copy button above the DataFrame/Series:
from pd_replicator import replicator
replicator(df)
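As a slightly fuller sketch of how this might fit into a notebook cell, the example below builds a small summary table and wraps it. The data and the `with_copy_button` helper are hypothetical and not part of pd-replicator. The import fallback is only a convenience so the cell still runs where the package is not installed.

```python
import pandas as pd

try:
    from pd_replicator import replicator
except ImportError:
    replicator = None  # pd-replicator is not installed in this environment


def with_copy_button(df):
    """Hypothetical helper: wrap df with replicator() when it is available,
    otherwise hand back the plain DataFrame so the cell still shows output."""
    if replicator is None:
        return df
    return replicator(df)


# Made-up data, purely for illustration
scores = pd.DataFrame({"team": ["a", "a", "b"], "score": [1, 3, 5]})

# Mean score per team, kept as a regular two-column DataFrame
summary = scores.groupby("team", as_index=False)["score"].mean()

# In Jupyter, make this the last expression in the cell
with_copy_button(summary)
```

Wrapping only the final summary, rather than every intermediate DataFrame, keeps the copy button on the output you actually intend to share.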
For remotely hosted instances, the native option should be set to False:
from pd_replicator import replicator
replicator(df, native=False)
If you run into a problem with pd-replicator that you can’t resolve, please feel free to create an issue on GitHub!