mapplyで二重ループを回避してみた

12750 ワード

statistics R 統計学 R テキストリンク

Rのapply系の中でも異彩を放っているmapplyの使い方の例を書いてみました。以下の図のような、
1. 50m×50mの中に2000個の点が散らばっている。
2. それぞれの点について、半径5m以内にいる他の点の数を数える。
という例で考えてみます。

データの準備

library(dplyr)
set.seed(4)
n <- 2000
gx <- runif(n, min=0, max=50)
gy <- runif(n, min=0, max=50)
samp_dat <- data_frame(ID = paste("ID", 1:n, sep = "_"), gx, gy)

このデータから下のような結果がほしい。

ID	周りの数
ID_1	65
ID_2	37
ID_3	43
...	...
ID_2000	79

普通にループを使う

ループ1：対象の点と回りの点の距離を比べる。
ループ2：ループ1を全部の点に行う。

ループのみ

countR <- function(gx, gy, r){
 n_length <- length(gx)
 res <- numeric(n_length)
 for (i in 1:n_length){
  temp <- numeric(n_length)
   for (j in 1:n_length){
        if ((gx[i]-gx[j])^2 + (gy[i]-gy[j])^2 <= r^2) temp[j] <- 1
      }
      res[i] <- sum(temp)
      temp <- numeric(n_length)
    }
  res - 1
}

> system.time(res <- countR(samp_dat$gx, samp_dat$gy, r = 5))
   user  system elapsed
  6.989   0.027   7.015
> str(res)
 num [1:2000] 65 37 43 65 57 59 58 52 52 65 ...

mapplyを使う

１つ目のループをベクトル化。
２つ目のループにmapplyを使う。

mapply

r <- 5
countR2 <- function(n){
    temp <- (gx[n] - gx)^2 + (gy[n] - gy)^2 # gx[n]-gxでベクトル化
    temp[temp <= r^2] %>% length - 1
}

> system.time(res2 <- mapply(countR2, n = 1:n))
   user  system elapsed
  0.170   0.012   0.181

> str(res2)
 num [1:2000] 65 37 43 65 57 59 58 52 52 65 ...

短く書けたし、40倍くらい速くなりました。

Author And Source

この問題について(mapplyで二重ループを回避してみた), 我々は、より多くの情報をここで見つけました https://qiita.com/mattocci27/items/cc9d292d3083fb3902f4

著者帰属：元の著者の情報は、元のURLに含まれています。著作権は原作者に属する。

Content is automatically searched and collected through network algorithms . If there is a violation . Please contact us . We will adjust (correct author information ,or delete content ) as soon as possible .