I'm shocked that sets don't de-duplicate NaN values! (I guess it is because NaN has this crazy property that it is not equal to itself... NaN != NaN is True...)

>>> set((float('nan'), float('nan'), float('nan'))) 
{nan, nan, nan}

@villares That's correct behavior for sets in any language.


@chemoelectric the more you live...


@villares That's true.


@villares Nah, you just need to put it in a variable:

>>> a = float('nan')
>>> set((a,a,a))
{nan}
>>>


@dabeaz hahahah, crazier still!

I was naively converting a series from a dataframe into a set to see if I could grasp what were the "unique entries" and stumbled on this NaN thing (because there were lots of NaN values there)


@villares NaN is a whole universe of crazy.


@dabeaz @villares

>>> set([float('nan')] * 3)
{nan}
>>> set(float('nan') for _ in range(3))
{nan, nan, nan}

In terms of identity and equality, nan seems on the opposite end of the spectrum of None.

Every None is identical, while every nan is unique.

Maybe instead of using object() as a unique sentinel value, we could use float('nan') 🤔🙊


@treyhunner @dabeaz @villares

>>> set(float('nan') for _ in range(8)) | {'batman'}
{nan, nan, nan, nan, nan, nan, nan, nan, 'batman'}


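@dabeaz @villares this has to do with object identity. CPython caches ("interns") some immutable values, such as small integers, to save memory, so two names holding an equal value can end up pointing to the very same object. Sets check identity before equality, which is why one nan object reused several times collapses into a single element.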

```
a = 42
b = 42
c = float('nan')
d = float('nan')

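# CPython caches small integers, so both names refer to one object.
assert id(a) == id(b)
# Each float('nan') call builds a new object, so these ids differ.
assert id(c) != id(d)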
```

Floats are not cached this way: every float('nan') call creates a new object, and since nan != nan, separately created nan values are neither identical nor equal, so a set keeps all of them.
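
A minimal sketch of the identity-before-equality behaviour discussed in this thread, assuming the membership semantics described in the Python language reference (`x is e or x == e`). The variable names are illustrative only.

```python
nan = float('nan')    # one NaN object, reused below
other = float('nan')  # a second, separately created NaN object

# NaN never compares equal, not even to itself.
self_equal = (nan == nan)

# Containers test identity first, so the very same object is found...
same_object_found = nan in [nan]
# ...but a different NaN object is not: both `is` and `==` fail.
other_object_found = other in [nan]

same_set = {nan, nan}        # identical objects collapse into one element
distinct_set = {nan, other}  # neither identical nor equal: two elements

print(self_equal, same_object_found, other_object_found)
print(len(same_set), len(distinct_set))
```

This is why `set((a, a, a))` shrinks to a single element while `set((float('nan'), float('nan'), float('nan')))` does not.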


@dabeaz @villares or any reference, like from the math or numpy libraries:

>>> from math import nan
>>> set((nan, nan, nan))
{nan}


@villares thanks for this!

I'm still pretty new to Python, but I had just discovered yesterday that using "if x is not y" is functionally different from "if x != y" and that was throwing me off (!= compares values, whereas "is not" checks whether it's the same object in memory)
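
A small sketch of the difference between `is` and `==`, with made-up variable names. It assumes nothing beyond built-in lists and floats.

```python
a = [1, 2, 3]
b = [1, 2, 3]  # equal contents, but a separate list object
c = a          # a second name for the same list object

values_equal = (a == b)  # compares contents
same_object = (a is b)   # compares identity
alias_same = (a is c)    # c is just another name for a

# NaN is the odd case: identical to itself, yet not equal to itself.
n = float('nan')
nan_identical = (n is n)
nan_equal = (n == n)

print(values_equal, same_object, alias_same)
print(nan_identical, nan_equal)
```

The two lists are equal but not identical, while the NaN is identical to itself but not equal to itself.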

@Alexandre B A Villares 🐍 if you give it a minute of thought it all makes perfect sense.

NaN is a placeholder for something the computer can't represent (not a number). Two independent things I can't represent are not the same thing: `set([float('nan'), float('nan')])`. Two of the same thing, whether I can represent it or not, are the same thing: `x = float('nan'); set([x, x])`.

@aleabdo I know it makes sense, my shock is that I hadn't found out about it earlier.

I was trying to grasp the number of unique values in a "series" (column) containing strings from a GeoDataFrame (from osmnx). I converted it to a set and was surprised by the result (there were many NaN entries).
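
One way to count unique entries while treating every NaN as the same missing value, sketched with the standard library only. The helper `unique_values` and the sample column are hypothetical, not taken from osmnx or the original data.

```python
import math


def unique_values(values):
    """Return (distinct non-NaN values, whether any NaN was present)."""
    seen = set()
    has_nan = False
    for v in values:
        # Only floats can be NaN; strings go straight into the set.
        if isinstance(v, float) and math.isnan(v):
            has_nan = True
        else:
            seen.add(v)
    return seen, has_nan


# Made-up sample column: strings mixed with separately created NaN objects.
column = ["residential", float("nan"), "primary", float("nan"), "residential"]

naive = set(column)  # keeps each NaN object as its own element
distinct, has_missing = unique_values(column)

print(len(naive), sorted(distinct), has_missing)
```

When working directly with pandas, `Series.nunique()` leaves NaN out of the count by default, which may be the simpler route.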