Skip to content

Add back getattr for ExtensionArrays #10278

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dcherian
Copy link
Contributor

@dcherian dcherian commented May 1, 2025

@ilan-gold
Copy link
Contributor

  1. (fix): remove _getattr__ method for PandasExtensionArray #10250 (comment) should be considered here i.e., using __getattribute__ like I had had before the removal
  2. @martinfleis PandasExtensionArray was also removed from the public API, but getattr was just a pass-through to the underlying array, which is now used in e.g., Variable.data instead of the PandasExtensionArray wrapper. So I would caution being 100% the two are linked.

@martinfleis
Copy link
Contributor

I did not have time to pinpoint a specific commit but it has to be one of the commits merged in last 7 days, where this one felt like the only sensible option. But I might be wrong.

@ilan-gold
Copy link
Contributor

@martinfleis I will have a look then! @dcherian What is xarray's breaking change policy? I was under the impression that there was none given the date-versioning, but should there be stronger guarantees?

@dcherian
Copy link
Contributor Author

dcherian commented May 1, 2025

We try hard to not break things. Though in. this case, we're still figuring out how to handle extension arrays.

Seems like there are two options:

  1. add back the passthrough
  2. xvec extracts the underlying array.

I lean toward one, since that makes sense for a wrapper. I don't see any way other than __getattribute__ sadly, i thought __getattr__ would just work.

@ilan-gold
Copy link
Contributor

@dcherian My caution/suspicion was warranted. I just checked out xvec and xarray before the removal of getattr. The test is still broken. What I did notice is that the dtype of df[c] on the breaking line is preserved now, which is more in line with the breaking change in: #9671. However, this is probably an upgrade for you @martinfleis - the xarray object now preserves the geometry data type because it was a coordinate and therefore under-the-hood, wrapped in an indexing adapter which #9671 addressed. So the conversion is likely no longer necessary.

@ilan-gold
Copy link
Contributor

@dcherian As for bringing back this feature, I am neutral on it because it is not in the public API. I liked your suggestion of removing it because it means that things are explicit now. But of course, if you want to bring it back, __getattribute__ is a must.

@martinfleis I will open a PR

@ilan-gold
Copy link
Contributor

@dcherian It would be great if we could merge this. We have lost the nbytes implementation so I think it might be only a matter of time until someone complains:

import geopandas, geodatasets

geopandas.read_file(geodatasets.get_path('geoda.malaria')).to_xarray()

@@ -142,3 +142,6 @@ def __array__(
return np.asarray(self.array, dtype=dtype, copy=copy)
else:
return np.asarray(self.array, dtype=dtype)

def __getattr__(self, attr: str) -> Any:
return getattr(self.array, attr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return getattr(self.array, attr)
return getattr(super().__getattribute__("array"), attr)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants