国产在线看片无码不卡群交,久久男人AV资源网站无码软件 ,一区在线看

主頁 > 知識(shí)庫 > pandas按條件篩選數(shù)據(jù)的實(shí)現(xiàn)

pandas按條件篩選數(shù)據(jù)的實(shí)現(xiàn)

pandas中對(duì)DataFrame篩選數(shù)據(jù)的方法有很多的，以后會(huì)后續(xù)進(jìn)行補(bǔ)充，這里只整理遇到錯(cuò)誤的情況。

1.使用布爾型DataFrame對(duì)數(shù)據(jù)進(jìn)行篩選

使用一個(gè)條件對(duì)數(shù)據(jù)進(jìn)行篩選，代碼類似如下：

num_red=flags[flags['red']==1]

使用多個(gè)條件對(duì)數(shù)據(jù)進(jìn)行篩選，代碼類似如下：

stripes_or_bars=flags[(flags['stripes']>=1) | (flags['bars']>=1)]

常見的錯(cuò)誤代碼如下：

代碼一：

stripes_or_bars=flags[flags['stripes']>=1 or flags['bars']>=1]

代碼二：

stripes_or_bars=flags[flags['stripes']>=1 | flags['bars']>=1].

代碼三：

stripes_or_bars=flags[(flags['stripes']>=1) or (flags['bars']>=1)]

以上這三種代碼的錯(cuò)誤提示都是：ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). 中括號(hào)里面的邏輯式如何解析的暫時(shí)不清楚。貌似不能使用and、or及not。

除了使用組合的邏輯表達(dá)式之外，使用返回類型為布爾型值的函數(shù)也可以達(dá)到篩選數(shù)據(jù)的效果。示例如下：

import pandas as pd
import numpy as np
df=pd.DataFrame(np.array(range(10)).reshape((5,-1)))
df.columns=['0','1']
df=df[df['1'].isin([3,5,9])]

其df的結(jié)果如下：

2.iloc()方法、ix()方法和iloc()方法的區(qū)別

首先dataframe一般有兩種類型的索引：第一種是位置索引，即dataframe自帶的從0開始的索引，這種索引叫位置索引。另一種即標(biāo)簽索引，這種索引是你在創(chuàng)建datafram時(shí)通過index關(guān)鍵字，或者通過其他index相關(guān)方法重新給dataframe設(shè)置的索引。這兩種索引是同時(shí)存在的。一般設(shè)置了標(biāo)簽索引之后，就不在顯示位置索引，但不意味著位置索引就不存在了。

假設(shè)有如下幾行數(shù)據(jù)(截圖部分只是數(shù)據(jù)的一部分），很明顯，以下顯示的索引為標(biāo)簽索引。同時(shí)574(標(biāo)簽索引)行對(duì)應(yīng)的位置索引則為0，1593行對(duì)應(yīng)的位置索引為2，以此類推。

先來看loc(),其API網(wǎng)址http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.htm，函數(shù)名下方有一行解釋，Access a group of rows and columns by label(s) or a boolean array.. loc[] is primarily label based, but may also be used with a boolean array.

代碼一：

first_listing = normalized_listings.loc[[0,4]]

結(jié)果如下，可以看出其輸出的是dataframe中標(biāo)簽索引為0和4的兩行數(shù)據(jù)。注意，如果標(biāo)簽索引的類型為字符串，則在loc中也要用字符串的形式。

再來看iloc(),其API網(wǎng)址http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html，函數(shù)名下方的解釋為 Purely integer-location based indexing for selection by position. .iloc[] is primarily integer position based ( from 0 to length-1 of the axis), but may also be used with a boolean array.

代碼二:

first_listing = normalized_listings.iloc[[0,4]]

結(jié)果如下，可以看出其輸出的dataframe中第0行和第4行的數(shù)據(jù)，即按方法是按照位置索引取得數(shù)。注意使用位置索引的時(shí)候只能用整數(shù)（integer position,bool類型除外）

另外，還可以向loc和iloc中傳入bool序列，這樣就可以將前面介紹的boo表達(dá)式用到loc和iloc中。下面來看看怎么使用bool序列？

import pandas as pd
data=pd.DataFrame(data={'col1':[1,2,3,5,10],'col2':[50,90,67,75,100]},\

         index=['a','b','c','d','e'])
print(data)
#iloc[]示例,iloc似乎不能直接使用邏輯表達(dá)式的結(jié)果，我這里將其轉(zhuǎn)置成list之后就可以用了，原因暫且不明
data_1=data.iloc[list(data['col1']>5)]
print(data_1)
#loc[]示例，loc中可以直接使用邏輯表達(dá)式
data_2=data.loc[data['col1']>5]
print(data_2)

在iloc[]中，如果直接使用loc中的邏輯表達(dá)式而不進(jìn)行l(wèi)ist()轉(zhuǎn)化的話，會(huì)提示ValueError: iLocation based boolean indexing cannot use an indexable as a mask錯(cuò)誤。

如果查看上述兩段代碼中得到的first_listing。我們會(huì)發(fā)現(xiàn)兩處first_listing的類型均為datafrarm。loc和iloc除了能對(duì)行進(jìn)行篩選，還可以篩選列。如果在loc和iloc中設(shè)定了對(duì)列的篩選，則篩選之后得到的數(shù)據(jù)可能是datafrme類型，也有可能是Series類型。下面直接以代碼運(yùn)行結(jié)果進(jìn)行說明。

import pandas as pd
data=pd.DataFrame(data={'col1':[1,2,3,5,10],'col2':[50,90,67,75,100]},\

         index=['a','b','c','d','e'])
print(data)
#iloc[]示例 ,在使用iloc的時(shí)候，[]里面無論是篩選行還是篩選列，都只能使用數(shù)字形式的行號(hào)或列號(hào)。
#這里如果使用‘col2',這里會(huì)報(bào)錯(cuò)
data_1=data.iloc[[0,4],[1]]#當(dāng)需要篩選出多列或者希望返回的結(jié)果為DataFrame時(shí)，可以將列號(hào)用[]括起來。
print(data_1)
print(type(data_1))
data_2=data.iloc[[0,4],1]#當(dāng)只需要篩選出其中的一列時(shí)可以只寫一個(gè)列號(hào)，不加中括號(hào)，這種方法得到的是一個(gè)Series
print(data_2)
print(type(data_2))
#loc[]示例
data_3=data.loc[['a','e'],['col2']]
print(data_3)
print(type(data_3))
data_4=data.loc[['a','e'],'col2']
print(data_4)
print(type(data_4))

具體的代碼執(zhí)行結(jié)果如下：

最后看ix()方法，其API網(wǎng)址http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.ix.html,其解釋為 A primarily label-location based indexer, with integer position fallback.

代碼三：

first_listing = normalized_listings.ix[[0,4]]

結(jié)果如下似乎與loc（）方法的結(jié)果是相同的，但是從其給出的解釋來看，其好像是前兩個(gè)方法的集合。

到此這篇關(guān)于pandas按條件篩選數(shù)據(jù)的實(shí)現(xiàn)的文章就介紹到這了,更多相關(guān)pandas 條件篩選內(nèi)容請(qǐng)搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關(guān)文章希望大家以后多多支持腳本之家！

您可能感興趣的文章:

pandas 按日期范圍篩選數(shù)據(jù)的實(shí)現(xiàn)
使用pandas實(shí)現(xiàn)篩選出指定列值所對(duì)應(yīng)的行
使用pandas庫對(duì)csv文件進(jìn)行篩選保存
pandas條件組合篩選和按范圍篩選的示例代碼
使用Pandas對(duì)數(shù)據(jù)進(jìn)行篩選和排序的實(shí)現(xiàn)
Pandas 如何篩選包含特定字符的列

標(biāo)簽：郴州烏蘭察布大慶烏蘭察布海南平頂山合肥哈爾濱

巨人網(wǎng)絡(luò)通訊聲明：本文標(biāo)題《pandas按條件篩選數(shù)據(jù)的實(shí)現(xiàn)》，本文關(guān)鍵詞 pandas,按,條件,篩選,數(shù)據(jù),；如發(fā)現(xiàn)本文內(nèi)容存在版權(quán)問題，煩請(qǐng)?zhí)峁┫嚓P(guān)信息告之我們，我們將及時(shí)溝通與處理。本站內(nèi)容系統(tǒng)采集于網(wǎng)絡(luò)，涉及言論、版權(quán)與本站無關(guān)。