pandas

@pandas_dev

Official account of the pandas project

Hey we're the fastest at writing 2300 Parquet files, we can do it in 0 minutes! Oh, wait

#duckdb parquet writer is embarrassingly fast.

pandas Reposted

For a second I thought this headline was about @pandas_dev and was like HELLSSS YEAH! sfstandard.com/2024/06/12/san…

We're happy to announce the release of #pandas 2.2.2. You can install it with `pip install pandas` or `mamba install -c conda-forge pandas`. Thanks to all contributors and sponsors who made this release possible! The release notes can be found at: pandas.pydata.org/docs/whatsnew/…


We're happy to announce the release of #pandas 2.2.1. You can install it with `pip install pandas` or `mamba install -c conda-forge pandas`. Thanks to all contributors and sponsors who made this release possible! The release notes can be found at: pandas.pydata.org/docs/whatsnew/…


pandas Reposted

How fast can a CSV file be processed? I explain in detail comparing many options such as @pandas_dev, @duckdb, @DataPolars, #Python, #R, #rustlang and more in this new blog post: datapythonista.me/blog/how-fast-…


We're happy to announce the release of #pandas 2.2.0. You can install it with `pip install pandas` or `mamba install -c conda-forge pandas`. Thanks to all contributors and sponsors who made this release possible! The release notes can be found at: pandas.pydata.org/docs/whatsnew/…


We are excited to announce a release candidate for #pandas 2.2.0 has just been released. If all goes well, we'll release #pandas 2.2.0 in about 2 weeks. Full list of changes and contributors: pandas.pydata.org/docs/dev/whats…


We're happy to announce the release of #pandas 2.1.4. You can install it with `pip install pandas` or `mamba install -c conda-forge pandas`. Thanks to all contributors and sponsors who made this release possible! The release notes can be found at: pandas.pydata.org/docs/whatsnew/…


pandas Reposted

A new edition of the data manipulation, analysis and visualization in #python course at Ghent University by @jorisvdbossche and myself is scheduled on 29, 31 January and 2 February 2024. For more information, see shorturl.at/qDVW9 #pandas #seaborn #matplotlib


We're happy to announce the release of #pandas 2.1.2. You can install it with `pip install pandas` or `mamba install -c conda-forge pandas`. Thanks to all contributors and sponsors who made this release possible! The release notes can be found at: pandas.pydata.org/docs/whatsnew/…


We're happy to announce the release of #pandas 2.1.1. You can install it with `pip install pandas` or `mamba install -c conda-forge pandas`. You can find what's new in this version in the release notes. Thanks to all contributors and sponsors who made this release possible!


Can #pandas be lazy? There has been some discussion and a proof of concept about it recently.


pandas Reposted

I can't answer that: read the source code of every project you find interesting! For me, I was a @pandas_dev user, wanted to improve IO support with Stata files, and dug into the code to figure out how it worked. My first PR 🥹 github.com/pandas-dev/pan…


This is the (much more efficient) workaround which you're encouraged to use instead. Nice one, @CaioLCastro! Use `concat` a single time outside the loop, rather than multiple times inside it.

we use a hybrid approach. Append to a list, and then concat.
res = []
for x in itersomething:
    res.append(calculations)
pd.concat(res)
df.append is gruesomely inefficient so maybe it is best to remove
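
The list-then-concat pattern described in the two posts above can be sketched as follows. This is a minimal illustration, not code from either post: the function name `build_frame`, the column names, and the per-chunk data are all hypothetical stand-ins for whatever the loop actually computes.

```python
import pandas as pd


def build_frame(n_chunks: int) -> pd.DataFrame:
    """Collect per-iteration DataFrames in a list, then concatenate once."""
    pieces = []  # plain Python list; appending to it is cheap
    for i in range(n_chunks):
        # Hypothetical per-iteration result: a small two-row DataFrame.
        piece = pd.DataFrame({"chunk": [i, i], "value": [i * 10, i * 10 + 1]})
        pieces.append(piece)
    # A single concat outside the loop copies the data only once.
    return pd.concat(pieces, ignore_index=True)


result = build_frame(3)
print(result)
```

Calling `concat` (or the old `DataFrame.append`, which was removed in pandas 2.0) inside the loop would copy all previously accumulated rows on every iteration, which is why the single call outside the loop tends to be much faster.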



#pandas has two internal ways to store strings: NumPy and PyArrow (faster). pandas 3.0 will change the default, and strings will use PyArrow when, for example, calling read_csv. You can get this change now in pandas 2.1 with: `pandas.options.future.infer_string = True`


We're happy to announce the release of #pandas 2.1.0. You can install it with `pip install pandas` or `mamba install -c conda-forge pandas`. You can find what's new in this version in the release notes. Thanks to all contributors and sponsors who made this release possible!


Do you know how to extend #pandas with a fast language like #rustlang? Core developer @datapythonista shows you how in this step by step tutorial at @EuroSciPy m.youtube.com/watch?v=iUEzNm…


Do you want to learn more about #pandas 2.0 and beyond? Core developers @jorisvdbossche and Richard Shadrach gave a talk about it at @EuroSciPy youtu.be/NK7RuG4rQpI


We are better than SQL. Except when SQL is better.

am i the only one who likes both pandas and SQL



Artificial intelligence may not be so intelligent if it uses pandas .apply() when not strictly necessary. Our operations are usually vectorized (very fast), .apply() is usually not, so it may be very slow. Avoid loops and apply if a pandas operation exists for what you need.

I love GPT-4 code assistant but it uses .apply() for every bit of pandas code and I ain't about it
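
To make the difference concrete, here is a minimal sketch comparing a row-wise `.apply()` with the equivalent vectorized expression. The DataFrame and its column names are invented for this example.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "quantity": [1, 2, 3]})

# Row-wise apply: calls a Python function once per row.
slow_total = df.apply(lambda row: row["price"] * row["quantity"], axis=1)

# Vectorized: one operation over whole columns, executed in compiled code.
fast_total = df["price"] * df["quantity"]

# Both approaches produce the same values.
print(slow_total.equals(fast_total))
```

On three rows the difference is invisible, but the per-row Python call in `.apply(..., axis=1)` grows with the number of rows, while the column multiplication stays in compiled code.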


