Firebase×Bigqueryでクラシックリテンションを集計する
動作環境
FirebaseとBigQueryを統合し、FirebaseAnalyticsのデータをBigQueryにインポートしている環境です。
やりたいこと
初回起動(first_open)の日を起点に、2日目起動率、3日目起動率、7日目起動率(クラシックリテンション)を初回起動日別に取得し、ダッシュボード化したい。
リテンションの定義はアプリのリテンションをSQLで出して可視化するを参考にしました。
考え方
WITH句の中のstartで初回起動日とuser_pseudo_idの一覧を作成。
usedateで初回起動に限らない起動日とuser_pseudo_idの一覧を作成。
その後、起動日と初回起動日の差分を出すことで何日後に起動したかを導き出し、合計ユーザー数を出して初回起動数との比較でリテンションを計算しています。
クエリ
WITH start AS
(SELECT
user_pseudo_id,
EXTRACT(DATE FROM TIMESTAMP_MICROS(event_timestamp) AT TIME ZONE "Asia/Tokyo") AS start_date
FROM
`hogehoge.analytics_xxxxxxxx.events_*` ---BigQueryの集計対象となるテーブル名
WHERE
_TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE('Asia/Tokyo'), INTERVAL 14 DAY)) AND FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE('Asia/Tokyo'), INTERVAL 1 DAY))
AND event_name = "first_open"
AND EXTRACT(DATE FROM TIMESTAMP_MICROS(event_timestamp) AT TIME ZONE "Asia/Tokyo") BETWEEN DATE_SUB(CURRENT_DATE('Asia/Tokyo'), INTERVAL 14 DAY) AND DATE_SUB(CURRENT_DATE('Asia/Tokyo'), INTERVAL 1 DAY)
GROUP BY
user_pseudo_id,
start_date
),
usedate AS
(SELECT
user_pseudo_id,
EXTRACT(DATE FROM TIMESTAMP_MICROS(event_timestamp) AT TIME ZONE "Asia/Tokyo") AS use_date
FROM
`hogehoge.analytics_xxxxxxxx.events_*` ---BigQueryの集計対象となるテーブル名
WHERE
_TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE('Asia/Tokyo'), INTERVAL 14 DAY)) AND FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE('Asia/Tokyo'), INTERVAL 1 DAY))
GROUP BY
user_pseudo_id,
use_date
),
retention AS(
SELECT
start.user_pseudo_id,
start.start_date,
IFNULL(DATE_DIFF(usedate.use_date, start.start_date, day), 0) AS retention_day
FROM
start
INNER JOIN usedate ON start.user_pseudo_id = usedate.user_pseudo_id AND usedate.use_date >= start.start_date
GROUP BY
start.user_pseudo_id,
start.start_date,
retention_day),
retention_sum AS(
SELECT
retention.start_date,
retention.retention_day,
count(retention.user_pseudo_id) AS uu
FROM
retention
GROUP BY
retention.start_date,
retention.retention_day)
SELECT
start_date,
SUM(CASE WHEN retention_day = 0 THEN uu ELSE NULL END) AS startuu,
SUM(CASE WHEN retention_day = 1 THEN uu ELSE NULL END) / SUM(CASE WHEN retention_day = 0 THEN uu ELSE NULL END) AS oneday,
SUM(CASE WHEN retention_day = 2 THEN uu ELSE NULL END) / SUM(CASE WHEN retention_day = 0 THEN uu ELSE NULL END) AS twoday,
SUM(CASE WHEN retention_day = 3 THEN uu ELSE NULL END) / SUM(CASE WHEN retention_day = 0 THEN uu ELSE NULL END) AS threeday,
SUM(CASE WHEN retention_day = 7 THEN uu ELSE NULL END) / SUM(CASE WHEN retention_day = 0 THEN uu ELSE NULL END) AS sevenday
FROM
retention_sum
GROUP BY
start_date
ORDER BY
start_date ASC
アウトプット
WITH start AS
(SELECT
user_pseudo_id,
EXTRACT(DATE FROM TIMESTAMP_MICROS(event_timestamp) AT TIME ZONE "Asia/Tokyo") AS start_date
FROM
`hogehoge.analytics_xxxxxxxx.events_*` ---BigQueryの集計対象となるテーブル名
WHERE
_TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE('Asia/Tokyo'), INTERVAL 14 DAY)) AND FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE('Asia/Tokyo'), INTERVAL 1 DAY))
AND event_name = "first_open"
AND EXTRACT(DATE FROM TIMESTAMP_MICROS(event_timestamp) AT TIME ZONE "Asia/Tokyo") BETWEEN DATE_SUB(CURRENT_DATE('Asia/Tokyo'), INTERVAL 14 DAY) AND DATE_SUB(CURRENT_DATE('Asia/Tokyo'), INTERVAL 1 DAY)
GROUP BY
user_pseudo_id,
start_date
),
usedate AS
(SELECT
user_pseudo_id,
EXTRACT(DATE FROM TIMESTAMP_MICROS(event_timestamp) AT TIME ZONE "Asia/Tokyo") AS use_date
FROM
`hogehoge.analytics_xxxxxxxx.events_*` ---BigQueryの集計対象となるテーブル名
WHERE
_TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE('Asia/Tokyo'), INTERVAL 14 DAY)) AND FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE('Asia/Tokyo'), INTERVAL 1 DAY))
GROUP BY
user_pseudo_id,
use_date
),
retention AS(
SELECT
start.user_pseudo_id,
start.start_date,
IFNULL(DATE_DIFF(usedate.use_date, start.start_date, day), 0) AS retention_day
FROM
start
INNER JOIN usedate ON start.user_pseudo_id = usedate.user_pseudo_id AND usedate.use_date >= start.start_date
GROUP BY
start.user_pseudo_id,
start.start_date,
retention_day),
retention_sum AS(
SELECT
retention.start_date,
retention.retention_day,
count(retention.user_pseudo_id) AS uu
FROM
retention
GROUP BY
retention.start_date,
retention.retention_day)
SELECT
start_date,
SUM(CASE WHEN retention_day = 0 THEN uu ELSE NULL END) AS startuu,
SUM(CASE WHEN retention_day = 1 THEN uu ELSE NULL END) / SUM(CASE WHEN retention_day = 0 THEN uu ELSE NULL END) AS oneday,
SUM(CASE WHEN retention_day = 2 THEN uu ELSE NULL END) / SUM(CASE WHEN retention_day = 0 THEN uu ELSE NULL END) AS twoday,
SUM(CASE WHEN retention_day = 3 THEN uu ELSE NULL END) / SUM(CASE WHEN retention_day = 0 THEN uu ELSE NULL END) AS threeday,
SUM(CASE WHEN retention_day = 7 THEN uu ELSE NULL END) / SUM(CASE WHEN retention_day = 0 THEN uu ELSE NULL END) AS sevenday
FROM
retention_sum
GROUP BY
start_date
ORDER BY
start_date ASC
Author And Source
この問題について(Firebase×Bigqueryでクラシックリテンションを集計する), 我々は、より多くの情報をここで見つけました https://qiita.com/asucii/items/067ce82b1bb4ef06117e著者帰属:元の著者の情報は、元のURLに含まれています。著作権は原作者に属する。
Content is automatically searched and collected through network algorithms . If there is a violation . Please contact us . We will adjust (correct author information ,or delete content ) as soon as possible .