{"id":3121,"date":"2023-07-19T03:04:16","date_gmt":"2023-07-19T03:04:16","guid":{"rendered":"https:\/\/statorials.org\/id\/tes-kereta-panda\/"},"modified":"2023-07-19T03:04:16","modified_gmt":"2023-07-19T03:04:16","slug":"tes-kereta-panda","status":"publish","type":"post","link":"https:\/\/statorials.org\/id\/tes-kereta-panda\/","title":{"rendered":"How to Create a Training and Test Set from a pandas DataFrame"},"content":{"rendered":"<p><\/p>\n<hr>\n<p><span style=\"color: #000000;\">When fitting a machine learning model to a dataset, we often split the dataset into two sets:<\/span><\/p>\n<p> <span style=\"color: #000000;\"><strong>1. Training set:<\/strong> used to train the model (70-80% of the original dataset)<\/span><\/p>\n<p> <span style=\"color: #000000;\"><strong>2. Test set:<\/strong> used to get an unbiased estimate of the model's performance (20-30% of the original dataset)<\/span><\/p>\n<p> <span style=\"color: #000000;\">In Python, there are two common ways to split a pandas DataFrame into a training set and a test set:<\/span><\/p>\n<p> <span style=\"color: #000000;\"><strong>Method 1: Use train_test_split() from sklearn<\/strong><\/span><\/p>\n<pre style=\"background-color: #ececec; font-size: 15px;\"><strong><span style=\"color: #008000;\">from<\/span> sklearn.<span style=\"color: #3366ff;\">model_selection<\/span> <span style=\"color: #008000;\">import<\/span> train_test_split\n\ntrain, test = train_test_split(df, test_size=<span style=\"color: #008000;\">0.2<\/span>, random_state=<span style=\"color: #008000;\">0<\/span>)<\/strong><\/pre>\n<p> <span style=\"color: #000000;\"><strong>Method 2: Use sample() from pandas<\/strong><\/span><\/p>\n<pre style=\"background-color: #ececec; font-size: 15px;\"><strong>train = df.<span style=\"color: #3366ff;\">sample<\/span>(frac=<span style=\"color: #008000;\">0.8<\/span>, random_state=<span style=\"color: #008000;\">0<\/span>)\ntest = df.
<span style=\"color: #3366ff;\">drop<\/span>(<span style=\"color: #3366ff;\">train.index<\/span>)<\/strong><\/pre>\n<p> <span style=\"color: #000000;\">The following examples show how to use each method with the following pandas DataFrame:<\/span><\/p>\n<pre style=\"background-color: #ececec; font-size: 15px;\"><strong><span style=\"color: #008000;\">import<\/span> pandas <span style=\"color: #008000;\">as<\/span> pd\n<span style=\"color: #008000;\">import<\/span> numpy <span style=\"color: #008000;\">as<\/span> np\n\n<span style=\"color: #008080;\">#make this example reproducible\n<\/span>np.<span style=\"color: #3366ff;\">random<\/span>.<span style=\"color: #3366ff;\">seed<\/span>(1)\n\n<span style=\"color: #008080;\">#create DataFrame with 1,000 rows and 3 columns\n<\/span>df = pd.<span style=\"color: #3366ff;\">DataFrame<\/span><span style=\"color: #3366ff;\">(<\/span>{'<span style=\"color: #ff0000;\">x1<\/span>': <span style=\"color: #3366ff;\">np.random.randint<\/span>(30, size=1000),\n                   '<span style=\"color: #ff0000;\">x2<\/span>': np.<span style=\"color: #3366ff;\">random<\/span>.<span style=\"color: #3366ff;\">randint<\/span>(12, size=1000),\n                   '<span style=\"color: #ff0000;\">y<\/span>': np.<span style=\"color: #3366ff;\">random<\/span>.<span style=\"color: #3366ff;\">randint<\/span>(2, size=1000)})\n\n<span style=\"color: #008080;\">#view first few rows of DataFrame<\/span>\ndf.
<span style=\"color: #3366ff;\">head<\/span>()\n\n   x1  x2  y\n0   5   1  1\n1  11   8  0\n2  12   4  1\n3   8   7  0\n4   9   0  0\n<\/strong><\/pre>\n<h3> <span style=\"color: #000000;\"><strong>Example 1: Use train_test_split() from sklearn<\/strong><\/span><\/h3>\n<p> <span style=\"color: #000000;\"><span style=\"color: #000000;\">The following code shows how to use the <strong>train_test_split()<\/strong> function from <strong>sklearn<\/strong> to split the pandas DataFrame into training and test sets:<\/span><\/span><\/p>\n<pre style=\"background-color: #ececec; font-size: 15px;\"><strong><span style=\"color: #008000;\">from<\/span> sklearn.<span style=\"color: #3366ff;\">model_selection<\/span> <span style=\"color: #008000;\">import<\/span> train_test_split\n\n<span style=\"color: #008080;\">#split original DataFrame into training and testing sets\n<\/span>train, test = train_test_split(df, test_size=<span style=\"color: #008000;\">0.2<\/span>, random_state=<span style=\"color: #008000;\">0<\/span>)\n\n<span style=\"color: #008080;\">#view first few rows of each set<\/span>\n<span style=\"color: #008000;\">print<\/span>(<span style=\"color: #3366ff;\">train.head<\/span>())\n\n     x1  x2  y\n687  16   2  0\n500  18   2  1\n332   4  10  1\n979   2   8  1\n817  11   1  0\n\n<span style=\"color: #008000;\">print<\/span>(<span style=\"color: #3366ff;\">test.head<\/span>())\n\n     x1  x2  y\n993  22   1  1\n859  27   6  0\n298  27   8  1\n553  20   6  0\n672   9   2  1\n\n<span style=\"color: #008080;\">#print size of each set<\/span>\n<span style=\"color: #008000;\">print<\/span>(train.<span style=\"color: #3366ff;\">shape<\/span>, test.
<span style=\"color: #3366ff;\">shape<\/span>)\n\n(800, 3) (200, 3)\n<\/strong><\/pre>\n<p> <span style=\"color: #000000;\">From the output we can see that two sets have been created:<\/span><\/p>\n<ul>\n<li> <span style=\"color: #000000;\">Training set: 800 rows and 3 columns<\/span><\/li>\n<li> <span style=\"color: #000000;\">Test set: 200 rows and 3 columns<\/span><\/li>\n<\/ul>\n<p> <span style=\"color: #000000;\">Note that <strong>test_size<\/strong> controls the proportion of observations from the original DataFrame that will belong to the test set, and the <strong>random_state<\/strong> value makes the split reproducible.<\/span><\/p>\n<h3> <span style=\"color: #000000;\"><strong>Example 2: Use sample() from pandas<\/strong><\/span><\/h3>\n<p> <span style=\"color: #000000;\">The following code shows how to use the <strong>sample()<\/strong> function from <b>pandas<\/b> to split the pandas DataFrame into training and test sets:<\/span><\/p>\n<pre style=\"background-color: #ececec; font-size: 15px;\"><strong><span style=\"color: #008080;\">#split original DataFrame into training and testing sets\n<\/span>train = df.<span style=\"color: #3366ff;\">sample<\/span>(frac=<span style=\"color: #008000;\">0.8<\/span>, random_state=<span style=\"color: #008000;\">0<\/span>)\ntest = df.<span style=\"color: #3366ff;\">drop<\/span>(<span style=\"color: #3366ff;\">train.index<\/span>)\n\n<span style=\"color: #008080;\">#view first few rows of each set<\/span>\n<span style=\"color: #008000;\">print<\/span>(<span style=\"color: #3366ff;\">train.head<\/span>())\n\n     x1  x2  y\n993  22   1  1\n859  27   6  0\n298  27   8  1\n553  20   6  0\n672   9   2  1\n\n<span style=\"color: #008000;\">print<\/span>(<span style=\"color: #3366ff;\">test.head<\/span>())\n\n    x1  x2  y\n9   16   5  0\n11  12  10  0\n19   5   9  0\n23  28   1  1\n28  18   0  1\n\n<span style=\"color: #008080;\">#print size of each set<\/span>\n<span style=\"color: #008000;\">print<\/span>(train.
<span style=\"color: #3366ff;\">shape<\/span>, test.<span style=\"color: #3366ff;\">shape<\/span>)\n\n(800, 3) (200, 3)\n<\/strong><\/pre>\n<p> <span style=\"color: #000000;\">From the output we can see that two sets have been created:<\/span><\/p>\n<ul>\n<li> <span style=\"color: #000000;\">Training set: 800 rows and 3 columns<\/span><\/li>\n<li> <span style=\"color: #000000;\">Test set: 200 rows and 3 columns<\/span><\/li>\n<\/ul>\n<p> <span style=\"color: #000000;\">Note that <b>frac<\/b> controls the proportion of observations from the original DataFrame that will belong to the training set, and the <strong>random_state<\/strong> value makes the split reproducible.<\/span><\/p>\n<h3> <span style=\"color: #000000;\"><strong>Additional Resources<\/strong><\/span><\/h3>\n<p> <span style=\"color: #000000;\">The following tutorials explain how to perform other common tasks in Python:<\/span><\/p>\n<p> <a href=\"https:\/\/statorials.org\/id\/python-regresi-logistik\/\" target=\"_blank\" rel=\"noopener\">How to Perform Logistic Regression in Python<\/a><br \/> <a href=\"https:\/\/statorials.org\/id\/kebingungan-matriks-python\/\" target=\"_blank\" rel=\"noopener\">How to Create a Confusion Matrix in Python<\/a><br \/> <a href=\"https:\/\/statorials.org\/id\/sklearn-python-presisi-seimbang\/\">How to Calculate Balanced Accuracy in Python<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Saat menyesuaikan model pembelajaran mesin ke kumpulan data, kami sering membagi kumpulan data menjadi dua kumpulan: 1. Training set: digunakan untuk melatih model (70-80% dari dataset asli) 2. 
Test set: digunakan untuk mendapatkan estimasi performa model yang tidak bias (20-30% dari dataset asli) Di Python, ada dua cara umum untuk membagi pandas DataFrame menjadi set [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11],"tags":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v21.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Cara membuat set pelatihan dan pengujian dari Pandas DataFrame - Statorials<\/title>\n<meta name=\"description\" content=\"Tutorial ini menjelaskan beberapa metode yang dapat Anda gunakan untuk membuat set pelatihan dan pengujian dari satu DataFrame pandas.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/statorials.org\/id\/tes-kereta-panda\/\" \/>\n<meta property=\"og:locale\" content=\"id_ID\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Cara membuat set pelatihan dan pengujian dari Pandas DataFrame - Statorials\" \/>\n<meta property=\"og:description\" content=\"Tutorial ini menjelaskan beberapa metode yang dapat Anda gunakan untuk membuat set pelatihan dan pengujian dari satu DataFrame pandas.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/statorials.org\/id\/tes-kereta-panda\/\" \/>\n<meta property=\"og:site_name\" content=\"Statorials\" \/>\n<meta property=\"article:published_time\" content=\"2023-07-19T03:04:16+00:00\" \/>\n<meta name=\"author\" content=\"Benjamin anderson\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Ditulis oleh\" \/>\n\t<meta name=\"twitter:data1\" content=\"Benjamin anderson\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimasi waktu membaca\" \/>\n\t<meta name=\"twitter:data2\" 
content=\"2 menit\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/statorials.org\/id\/tes-kereta-panda\/\",\"url\":\"https:\/\/statorials.org\/id\/tes-kereta-panda\/\",\"name\":\"Cara membuat set pelatihan dan pengujian dari Pandas DataFrame - Statorials\",\"isPartOf\":{\"@id\":\"https:\/\/statorials.org\/id\/#website\"},\"datePublished\":\"2023-07-19T03:04:16+00:00\",\"dateModified\":\"2023-07-19T03:04:16+00:00\",\"author\":{\"@id\":\"https:\/\/statorials.org\/id\/#\/schema\/person\/3d17a1160dd2d052b7c78e502cb9ec81\"},\"description\":\"Tutorial ini menjelaskan beberapa metode yang dapat Anda gunakan untuk membuat set pelatihan dan pengujian dari satu DataFrame pandas.\",\"breadcrumb\":{\"@id\":\"https:\/\/statorials.org\/id\/tes-kereta-panda\/#breadcrumb\"},\"inLanguage\":\"id\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/statorials.org\/id\/tes-kereta-panda\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/statorials.org\/id\/tes-kereta-panda\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/statorials.org\/id\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Cara membuat set kereta dan pengujian dari pandas dataframe\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/statorials.org\/id\/#website\",\"url\":\"https:\/\/statorials.org\/id\/\",\"name\":\"Statorials\",\"description\":\"Panduan anda untuk kompetensi statistik!\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/statorials.org\/id\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"id\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/statorials.org\/id\/#\/schema\/person\/3d17a1160dd2d052b7c78e502cb9ec81\",\"name\":\"Benjamin 
anderson\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"id\",\"@id\":\"https:\/\/statorials.org\/id\/#\/schema\/person\/image\/\",\"url\":\"http:\/\/statorials.org\/id\/wp-content\/uploads\/2023\/10\/Dr.-Benjamin-Anderson-96x96.jpg\",\"contentUrl\":\"http:\/\/statorials.org\/id\/wp-content\/uploads\/2023\/10\/Dr.-Benjamin-Anderson-96x96.jpg\",\"caption\":\"Benjamin anderson\"},\"description\":\"Halo, saya Benjamin, pensiunan profesor statistika yang menjadi guru Statorial yang berdedikasi. Dengan pengalaman dan keahlian yang luas di bidang statistika, saya ingin berbagi ilmu untuk memberdayakan mahasiswa melalui Statorials. Baca selengkapnya\",\"sameAs\":[\"http:\/\/statorials.org\/id\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Cara membuat set pelatihan dan pengujian dari Pandas DataFrame - Statorials","description":"Tutorial ini menjelaskan beberapa metode yang dapat Anda gunakan untuk membuat set pelatihan dan pengujian dari satu DataFrame pandas.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/statorials.org\/id\/tes-kereta-panda\/","og_locale":"id_ID","og_type":"article","og_title":"Cara membuat set pelatihan dan pengujian dari Pandas DataFrame - Statorials","og_description":"Tutorial ini menjelaskan beberapa metode yang dapat Anda gunakan untuk membuat set pelatihan dan pengujian dari satu DataFrame pandas.","og_url":"https:\/\/statorials.org\/id\/tes-kereta-panda\/","og_site_name":"Statorials","article_published_time":"2023-07-19T03:04:16+00:00","author":"Benjamin anderson","twitter_card":"summary_large_image","twitter_misc":{"Ditulis oleh":"Benjamin anderson","Estimasi waktu membaca":"2 
menit"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/statorials.org\/id\/tes-kereta-panda\/","url":"https:\/\/statorials.org\/id\/tes-kereta-panda\/","name":"Cara membuat set pelatihan dan pengujian dari Pandas DataFrame - Statorials","isPartOf":{"@id":"https:\/\/statorials.org\/id\/#website"},"datePublished":"2023-07-19T03:04:16+00:00","dateModified":"2023-07-19T03:04:16+00:00","author":{"@id":"https:\/\/statorials.org\/id\/#\/schema\/person\/3d17a1160dd2d052b7c78e502cb9ec81"},"description":"Tutorial ini menjelaskan beberapa metode yang dapat Anda gunakan untuk membuat set pelatihan dan pengujian dari satu DataFrame pandas.","breadcrumb":{"@id":"https:\/\/statorials.org\/id\/tes-kereta-panda\/#breadcrumb"},"inLanguage":"id","potentialAction":[{"@type":"ReadAction","target":["https:\/\/statorials.org\/id\/tes-kereta-panda\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/statorials.org\/id\/tes-kereta-panda\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/statorials.org\/id\/"},{"@type":"ListItem","position":2,"name":"Cara membuat set kereta dan pengujian dari pandas dataframe"}]},{"@type":"WebSite","@id":"https:\/\/statorials.org\/id\/#website","url":"https:\/\/statorials.org\/id\/","name":"Statorials","description":"Panduan anda untuk kompetensi statistik!","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/statorials.org\/id\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"id"},{"@type":"Person","@id":"https:\/\/statorials.org\/id\/#\/schema\/person\/3d17a1160dd2d052b7c78e502cb9ec81","name":"Benjamin 
anderson","image":{"@type":"ImageObject","inLanguage":"id","@id":"https:\/\/statorials.org\/id\/#\/schema\/person\/image\/","url":"http:\/\/statorials.org\/id\/wp-content\/uploads\/2023\/10\/Dr.-Benjamin-Anderson-96x96.jpg","contentUrl":"http:\/\/statorials.org\/id\/wp-content\/uploads\/2023\/10\/Dr.-Benjamin-Anderson-96x96.jpg","caption":"Benjamin anderson"},"description":"Halo, saya Benjamin, pensiunan profesor statistika yang menjadi guru Statorial yang berdedikasi. Dengan pengalaman dan keahlian yang luas di bidang statistika, saya ingin berbagi ilmu untuk memberdayakan mahasiswa melalui Statorials. Baca selengkapnya","sameAs":["http:\/\/statorials.org\/id"]}]}},"yoast_meta":{"yoast_wpseo_title":"","yoast_wpseo_metadesc":"","yoast_wpseo_canonical":""},"_links":{"self":[{"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/posts\/3121"}],"collection":[{"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/comments?post=3121"}],"version-history":[{"count":0,"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/posts\/3121\/revisions"}],"wp:attachment":[{"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/media?parent=3121"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/categories?post=3121"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/statorials.org\/id\/wp-json\/wp\/v2\/tags?post=3121"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}