How to Remove Duplicate Rows in R (With Examples)

By Dr. Benjamin Anderson, July 23, 2023

You can use either of the following two methods to remove duplicate rows from a data frame in R:

Method 1: Use Base R

#remove duplicate rows across entire data frame
df[!duplicated(df), ]

#remove rows with duplicate values in specific columns of data frame
df[!duplicated(df[c('var1')]), ]

Method 2: Use dplyr

library(dplyr)

#remove duplicate rows across entire data frame
df %>%
  distinct(.keep_all = TRUE)

#remove rows with duplicate values in specific columns of data frame
df %>%
  distinct(var1, .keep_all = TRUE)

The following examples show how to use this syntax in practice with the following data frame:

#define data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                 position=c('Guard', 'Guard', 'Forward', 'Guard', 'Center', 'Center'))

#view data frame
df

  team position
1    A    Guard
2    A    Guard
3    A  Forward
4    B    Guard
5    B   Center
6    B   Center

Example 1: Remove Duplicate Rows Using Base R

The following code shows how to remove duplicate rows from a data frame using functions from base R:

#remove duplicate rows from data frame
df[!duplicated(df), ]

  team position
1    A    Guard
3    A  Forward
4    B    Guard
5    B   Center
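To see why this works, it can help to inspect what duplicated() returns on its own. The following is a minimal sketch that assumes the same example data frame; the names dup_flags and deduped are illustrative only and do not appear in the original tutorial.

```r
# Recreate the example data frame
df <- data.frame(team = c('A', 'A', 'A', 'B', 'B', 'B'),
                 position = c('Guard', 'Guard', 'Forward', 'Guard', 'Center', 'Center'))

# duplicated() returns TRUE for every row that repeats an earlier row
dup_flags <- duplicated(df)
dup_flags
# [1] FALSE  TRUE FALSE FALSE FALSE  TRUE

# Count how many rows would be dropped
sum(dup_flags)
# [1] 2

# Negating the flags keeps only the first occurrence of each row
deduped <- df[!dup_flags, ]

# For whole-row duplicates, unique() gives the same result
identical(deduped, unique(df))
# [1] TRUE
```

Because the original row names are preserved, the result still shows rows 1, 3, 4 and 5 rather than being renumbered.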

The following code shows how to remove rows that have duplicate values in a specific column of a data frame using base R:

#remove rows where there are duplicates in the 'team' column
df[!duplicated(df[c('team')]), ]

  team position
1    A    Guard
4    B    Guard
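By default, duplicated() keeps the first row it encounters for each value. If the last row is wanted instead, one option is the fromLast argument of duplicated(). The sketch below assumes the same example data frame; last_per_team and by_both are illustrative names.

```r
# Recreate the example data frame
df <- data.frame(team = c('A', 'A', 'A', 'B', 'B', 'B'),
                 position = c('Guard', 'Guard', 'Forward', 'Guard', 'Center', 'Center'))

# fromLast = TRUE scans from the bottom, so the last row per team is kept
last_per_team <- df[!duplicated(df['team'], fromLast = TRUE), ]
last_per_team
#   team position
# 3    A  Forward
# 6    B   Center

# To check several columns at once, list them inside c()
# (here this matches whole-row de-duplication, since df has only these two columns)
by_both <- df[!duplicated(df[c('team', 'position')]), ]
```

Which row survives depends on the current row order, so it may be worth sorting the data frame first if a particular row should be kept.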

Example 2: Remove Duplicate Rows Using dplyr

The following code shows how to remove duplicate rows from a data frame using the distinct() function from the dplyr package:

library(dplyr)

#remove duplicate rows from data frame
df %>%
  distinct(.keep_all = TRUE)

  team position
1    A    Guard
2    A  Forward
3    B    Guard
4    B   Center

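Note that the .keep_all = TRUE argument tells distinct() to keep all of the columns from the original data frame in the result, rather than only the columns used to identify duplicates.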

The following code shows how to use the distinct() function to remove rows that have duplicate values in a specific column of a data frame:

library(dplyr)

#remove rows where there are duplicates in the 'team' column
df %>%
  distinct(team, .keep_all = TRUE)

  team position
1    A    Guard
2    B    Guard
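To make the effect of .keep_all more concrete, the sketch below compares distinct() with and without it on the same example data frame. It assumes the dplyr package is installed; teams_only and first_per_team are illustrative names.

```r
library(dplyr)

# Recreate the example data frame
df <- data.frame(team = c('A', 'A', 'A', 'B', 'B', 'B'),
                 position = c('Guard', 'Guard', 'Forward', 'Guard', 'Center', 'Center'))

# Without .keep_all, only the columns passed to distinct() are returned
teams_only <- df %>%
  distinct(team)
teams_only
#   team
# 1    A
# 2    B

# With .keep_all = TRUE, the first row for each team is kept with all of its columns
first_per_team <- df %>%
  distinct(team, .keep_all = TRUE)
names(first_per_team)
# [1] "team"     "position"
```

Leaving out .keep_all is a convenient way to list the unique values of a column, while .keep_all = TRUE is the form to use when the rest of each row is still needed.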

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Remove Rows in R Based on a Condition
How to Remove Rows with NA in a Specific Column in R

