{"id":4177,"date":"2023-07-13T02:02:15","date_gmt":"2023-07-13T02:02:15","guid":{"rendered":"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/"},"modified":"2023-07-13T02:02:15","modified_gmt":"2023-07-13T02:02:15","slug":"membersihkan-data-di-sungai","status":"publish","type":"post","link":"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/","title":{"rendered":"How to Perform Data Cleaning in R (With Example)"},"content":{"rendered":"<p><\/p>\n<hr>\n<p><span style=\"color: #000000;\"><strong>Data cleaning<\/strong> refers to the process of transforming <a href=\"https:\/\/statorials.org\/id\/data-mentah\/\" target=\"_blank\" rel=\"noopener\">raw data<\/a> into data that is suitable for analysis or model building.<\/span><\/p>\n<p> <span style=\"color: #000000;\">In most cases, &#8220;cleaning&#8221; a dataset involves handling missing values and duplicate data.<\/span><\/p>\n<p> <span style=\"color: #000000;\">Here are the most common ways to &#8220;clean&#8221; a dataset in R:<\/span><\/p>\n<p> <span style=\"color: #000000;\"><strong>Method 1: Remove Rows with Missing Values<\/strong><\/span><\/p>\n<pre style=\"background-color: #ececec; font-size: 15px;\"> <strong><span style=\"color: #107d3f;\">library<\/span>(dplyr)\n\n<span style=\"color: #008080;\">#remove rows with any missing values\n<\/span>df %&gt;% na.<span style=\"color: #3366ff;\">omit<\/span>()\n<\/strong><\/pre>\n<p> <span style=\"color: #000000;\"><strong>Method 2: Replace Missing Values with Another Value<\/strong><\/span><\/p>\n<pre style=\"background-color: #ececec; font-size: 15px;\"> <strong><span style=\"color: #107d3f;\">library<\/span>(dplyr)\n<span style=\"color: #008000;\">library<\/span>(tidyr)\n\n<span style=\"color: #008080;\">#replace missing values in each numeric column with median value of column\n<\/span>df %&gt;% mutate(across(where(is.<span style=\"color: #3366ff;\">numeric<\/span>), ~replace_na(., median(., na.<span style=\"color: #3366ff;\">rm<\/span> = <span style=\"color: #008000;\">TRUE<\/span>))))\n<\/strong><\/pre>\n<p> <span style=\"color: #000000;\"><strong>Method 3: Remove Duplicate Rows<\/strong><\/span><\/p>\n<pre style=\"background-color: #ececec; font-size: 15px;\"> <strong><span style=\"color: #107d3f;\">library<\/span>(dplyr)\n\ndf %&gt;% distinct(.<span style=\"color: #3366ff;\">keep_all<\/span> = <span style=\"color: #008000;\">TRUE<\/span>)\n<\/strong><\/pre>\n<p> <span style=\"color: #000000;\">The following examples show how to use each of these methods in practice with the following data frame in R, which contains information about various basketball players:<\/span><\/p>\n<pre style=\"background-color: #ececec; font-size: 15px;\"> <strong><span style=\"color: #008080;\">#create data frame\n<\/span>df &lt;- data.<span style=\"color: #3366ff;\">frame<\/span>(team=c('A', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'),\n                 points=c(4, 4, NA, 8, 6, 12, 14, 86, 13, 8),\n                 rebounds=c(9, 9, 7, 6, 8, NA, 9, 14, 12, 11),\n                 assists=c(2, 2, NA, 7, 6, 6, 9, 10, NA, 14))\n\n<span style=\"color: #008080;\">#view data frame\n<\/span>df\n\n   team points rebounds assists\n1     A      4        9       2\n2     A      4        9       2\n3     B     NA        7      NA\n4     C      8        6       7\n5     D      6        8       6\n6     E     12       NA       6\n7     F     14        9       9\n8     G     86       14      10\n9     H     13       12      NA\n10    I      8       11      14\n<\/strong><\/pre>\n<h2> <span style=\"color: #000000;\"><strong>Example 1: Remove Rows with Missing Values<\/strong><\/span><\/h2>\n<p> <span style=\"color: #000000;\">We can use the following syntax to remove rows that have a missing value in any column:<\/span><\/p>\n<pre style=\"background-color: #ececec; font-size: 15px;\"> <strong><span style=\"color: #008000;\">library<\/span>(dplyr)\n\n<span style=\"color: #008080;\">#remove rows with missing values\n<\/span>new_df &lt;- df %&gt;% na.<span style=\"color: #3366ff;\">omit<\/span>()\n\n<span style=\"color: #008080;\">#view new data frame\n<\/span>new_df\n\n   team points rebounds assists\n1     A      4        9       2\n2     A      4        9       2\n4     C      8        6       7\n5     D      6        8       6\n7     F     14        9       9\n8     G     86       14      10\n10    I      8       11      14<\/strong><\/pre>\n<p> <span style=\"color: #000000;\">Notice that the new data frame does not contain any rows with missing values.<\/span><\/p>\n<h2> <span style=\"color: #000000;\"><strong>Example 2: Replace Missing Values with Another Value<\/strong><\/span><\/h2>\n<p> <span style=\"color: #000000;\">We can use the following syntax to replace any missing values with the median value of each column:<\/span><\/p>\n<pre style=\"background-color: #ececec; font-size: 15px;\"> <strong><span style=\"color: #008000;\">library<\/span>(dplyr)\n<span style=\"color: #008000;\">library<\/span>(tidyr)\n\n<span style=\"color: #008080;\">#replace missing values in each numeric column with median value of column\n<\/span>new_df &lt;- df %&gt;% mutate(across(where(is.<span style=\"color: #3366ff;\">numeric<\/span>), ~replace_na(., median(., na.<span style=\"color: #3366ff;\">rm<\/span> = <span style=\"color: #008000;\">TRUE<\/span>))))\n\n<span style=\"color: #008080;\">#view new data frame\n<\/span>new_df\n\n   team points rebounds assists\n1     A      4        9     2.0\n2     A      4        9     2.0\n3     B      8        7     6.5\n4     C      8        6     7.0\n5     D      6        8     6.0\n6     E     12        9     6.0\n7     F     14        9     9.0\n8     G     86       14    10.0\n9     H     13       12     6.5\n10    I      8       11    14.0<\/strong><\/pre>\n<p> <span style=\"color: #000000;\">Notice that the missing values in each numeric column have been replaced with the median value of that column.<\/span><\/p>\n<p> <span style=\"color: #000000;\">You can also replace <strong>median<\/strong> in the formula with <strong>mean<\/strong> to fill missing values with the mean value of each column instead.<\/span><\/p>\n<p> <span style=\"color: #000000;\"><strong>Note<\/strong>: We also had to load the <strong>tidyr<\/strong> package in this example because the <strong>replace_na()<\/strong> function comes from that package.<\/span><\/p>\n<h2> <span style=\"color: #000000;\"><strong>Example 3: Remove Duplicate Rows<\/strong><\/span><\/h2>\n<p> <span style=\"color: #000000;\">We can use the following syntax to remove duplicate rows from the data frame:<\/span><\/p>\n<pre style=\"background-color: #ececec; font-size: 15px;\"> <strong><span style=\"color: #008000;\">library<\/span>(dplyr)\n\n<span style=\"color: #008080;\">#remove duplicate rows\n<\/span>new_df &lt;- df %&gt;% distinct(.<span style=\"color: #3366ff;\">keep_all<\/span> = <span style=\"color: #008000;\">TRUE<\/span>)\n\n<span style=\"color: #008080;\">#view new data frame\n<\/span>new_df\n\n  team points rebounds assists\n1    A      4        9       2\n2    B     NA        7      NA\n3    C      8        6       7\n4    D      6        8       6\n5    E     12       NA       6\n6    F     14        9       9\n7    G     86       14      10\n8    H     13       12      NA\n9    I      8       11      14<\/strong><\/pre>\n<p> <span style=\"color: #000000;\">Notice that the second row has been removed from the data frame because every value in the second row duplicated the values in the first row.<\/span><\/p>\n<p> <span style=\"color: #000000;\"><strong>Note<\/strong>: You can find the complete documentation for the dplyr <strong>distinct()<\/strong> function <a href=\"https:\/\/dplyr.tidyverse.org\/reference\/distinct.html\" target=\"_blank\" rel=\"noopener\">here<\/a>.<\/span><\/p>\n<h2> <span style=\"color: #000000;\"><strong>Additional Resources<\/strong><\/span><\/h2>\n<p> <span style=\"color: #000000;\">The following tutorials explain how to perform other common tasks in R:<\/span><\/p>\n<p> <a href=\"https:\/\/statorials.org\/id\/data-resume-grup-r\/\" target=\"_blank\" rel=\"noopener\">How to Group and Summarize Data in R<\/a><br \/> <a href=\"https:\/\/statorials.org\/id\/tabel-ringkasan-di-r\/\" target=\"_blank\" rel=\"noopener\">How to Create Summary Tables in R<\/a><br \/> <a href=\"https:\/\/statorials.org\/id\/drop_na-di-sungai\/\" target=\"_blank\" rel=\"noopener\">How to Remove Rows with Missing Values in R<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Pembersihan data mengacu pada proses mengubah data mentah menjadi data yang sesuai untuk analisis atau pembuatan model. Dalam kebanyakan kasus, &#8220;pembersihan&#8221; kumpulan data melibatkan penanganan nilai yang hilang dan data duplikat. 
Berikut adalah metode paling umum untuk &#8220;membersihkan&#8221; kumpulan data di R: Metode 1: Hapus baris dengan nilai yang hilang library (dplyr) #remove rows with [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11],"tags":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v21.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Cara Melakukan Pembersihan Data di R (dengan Contoh) - Statorials<\/title>\n<meta name=\"description\" content=\"Tutorial ini menjelaskan cara melakukan pembersihan data pada dataset di R, dengan sebuah contoh.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/\" \/>\n<meta property=\"og:locale\" content=\"id_ID\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Cara Melakukan Pembersihan Data di R (dengan Contoh) - Statorials\" \/>\n<meta property=\"og:description\" content=\"Tutorial ini menjelaskan cara melakukan pembersihan data pada dataset di R, dengan sebuah contoh.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/\" \/>\n<meta property=\"og:site_name\" content=\"Statorials\" \/>\n<meta property=\"article:published_time\" content=\"2023-07-13T02:02:15+00:00\" \/>\n<meta name=\"author\" content=\"Benjamin anderson\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Ditulis oleh\" \/>\n\t<meta name=\"twitter:data1\" content=\"Benjamin anderson\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimasi waktu membaca\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 menit\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/\",\"url\":\"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/\",\"name\":\"Cara Melakukan Pembersihan Data di R (dengan Contoh) - Statorials\",\"isPartOf\":{\"@id\":\"https:\/\/statorials.org\/id\/#website\"},\"datePublished\":\"2023-07-13T02:02:15+00:00\",\"dateModified\":\"2023-07-13T02:02:15+00:00\",\"author\":{\"@id\":\"https:\/\/statorials.org\/id\/#\/schema\/person\/3d17a1160dd2d052b7c78e502cb9ec81\"},\"description\":\"Tutorial ini menjelaskan cara melakukan pembersihan data pada dataset di R, dengan sebuah contoh.\",\"breadcrumb\":{\"@id\":\"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/#breadcrumb\"},\"inLanguage\":\"id\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/statorials.org\/id\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Cara melakukan pembersihan data di r (dengan contoh)\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/statorials.org\/id\/#website\",\"url\":\"https:\/\/statorials.org\/id\/\",\"name\":\"Statorials\",\"description\":\"Panduan anda untuk kompetensi statistik!\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/statorials.org\/id\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"id\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/statorials.org\/id\/#\/schema\/person\/3d17a1160dd2d052b7c78e502cb9ec81\",\"name\":\"Benjamin 
anderson\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"id\",\"@id\":\"https:\/\/statorials.org\/id\/#\/schema\/person\/image\/\",\"url\":\"http:\/\/statorials.org\/id\/wp-content\/uploads\/2023\/10\/Dr.-Benjamin-Anderson-96x96.jpg\",\"contentUrl\":\"http:\/\/statorials.org\/id\/wp-content\/uploads\/2023\/10\/Dr.-Benjamin-Anderson-96x96.jpg\",\"caption\":\"Benjamin anderson\"},\"description\":\"Halo, saya Benjamin, pensiunan profesor statistika yang menjadi guru Statorial yang berdedikasi. Dengan pengalaman dan keahlian yang luas di bidang statistika, saya ingin berbagi ilmu untuk memberdayakan mahasiswa melalui Statorials. Baca selengkapnya\",\"sameAs\":[\"http:\/\/statorials.org\/id\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Cara Melakukan Pembersihan Data di R (dengan Contoh) - Statorials","description":"Tutorial ini menjelaskan cara melakukan pembersihan data pada dataset di R, dengan sebuah contoh.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/","og_locale":"id_ID","og_type":"article","og_title":"Cara Melakukan Pembersihan Data di R (dengan Contoh) - Statorials","og_description":"Tutorial ini menjelaskan cara melakukan pembersihan data pada dataset di R, dengan sebuah contoh.","og_url":"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/","og_site_name":"Statorials","article_published_time":"2023-07-13T02:02:15+00:00","author":"Benjamin anderson","twitter_card":"summary_large_image","twitter_misc":{"Ditulis oleh":"Benjamin anderson","Estimasi waktu membaca":"3 menit"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/","url":"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/","name":"Cara Melakukan Pembersihan Data di R 
(dengan Contoh) - Statorials","isPartOf":{"@id":"https:\/\/statorials.org\/id\/#website"},"datePublished":"2023-07-13T02:02:15+00:00","dateModified":"2023-07-13T02:02:15+00:00","author":{"@id":"https:\/\/statorials.org\/id\/#\/schema\/person\/3d17a1160dd2d052b7c78e502cb9ec81"},"description":"Tutorial ini menjelaskan cara melakukan pembersihan data pada dataset di R, dengan sebuah contoh.","breadcrumb":{"@id":"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/#breadcrumb"},"inLanguage":"id","potentialAction":[{"@type":"ReadAction","target":["https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/statorials.org\/id\/membersihkan-data-di-sungai\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/statorials.org\/id\/"},{"@type":"ListItem","position":2,"name":"Cara melakukan pembersihan data di r (dengan contoh)"}]},{"@type":"WebSite","@id":"https:\/\/statorials.org\/id\/#website","url":"https:\/\/statorials.org\/id\/","name":"Statorials","description":"Panduan anda untuk kompetensi statistik!","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/statorials.org\/id\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"id"},{"@type":"Person","@id":"https:\/\/statorials.org\/id\/#\/schema\/person\/3d17a1160dd2d052b7c78e502cb9ec81","name":"Benjamin anderson","image":{"@type":"ImageObject","inLanguage":"id","@id":"https:\/\/statorials.org\/id\/#\/schema\/person\/image\/","url":"http:\/\/statorials.org\/id\/wp-content\/uploads\/2023\/10\/Dr.-Benjamin-Anderson-96x96.jpg","contentUrl":"http:\/\/statorials.org\/id\/wp-content\/uploads\/2023\/10\/Dr.-Benjamin-Anderson-96x96.jpg","caption":"Benjamin anderson"},"description":"Halo, saya Benjamin, pensiunan profesor statistika yang menjadi guru Statorial yang berdedikasi. 
Dengan pengalaman dan keahlian yang luas di bidang statistika, saya ingin berbagi ilmu untuk memberdayakan mahasiswa melalui Statorials. Baca selengkapnya","sameAs":["http:\/\/statorials.org\/id"]}]}},"yoast_meta":{"yoast_wpseo_title":"","yoast_wpseo_metadesc":"","yoast_wpseo_canonical":""},"_links":{"self":[{"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/posts\/4177"}],"collection":[{"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/comments?post=4177"}],"version-history":[{"count":0,"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/posts\/4177\/revisions"}],"wp:attachment":[{"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/media?parent=4177"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/categories?post=4177"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/tags?post=4177"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}