テキストファイルから、特定の文字列が最後に出現する行以降を抽出する


目的

テキストファイルから、特定の文字列が最後に出現する行以降を抽出する際の備忘録です。

テスト用の test.txt を用意する

randomtextgenerator.comでランダムなテキストを用意します。
そこに、特定の文字列として"test_check_point_string"をいくつか加えます。

テスト用のtest.txt
test.txt
test_check_point_string
Offices parties lasting outward nothing age few resolve. Impression to discretion understood to we interested he excellence. Him remarkably use projection collecting. Going about eat forty world has round miles. Attention affection at my preferred offending shameless me if agreeable. Life lain held calm and true neat she. Much feet each so went no from. Truth began maids linen an mr to after. 

test_check_point_string
Literature admiration frequently indulgence announcing are who you her. Was least quick after six. So it yourself repeated together cheerful. Neither it cordial so painful picture studied if. Sex him position doubtful resolved boy expenses. Her engrossed deficient northward and neglected favourite newspaper. But use peculiar produced concerns ten. 

test_check_point_string
Started several mistake joy say painful removed reached end. State burst think end are its. Arrived off she elderly beloved him affixed noisier yet. An course regard to up he hardly. View four has said does men saw find dear shy. Talent men wicket add garden. 

test_check_point_string
Do to be agreeable conveying oh assurance. Wicket longer admire do barton vanity itself do in it. Preferred to sportsmen it engrossed listening. Park gate sell they west hard for the. Abode stuff noisy manor blush yet the far. Up colonel so between removed so do. Years use place decay sex worth drift age. Men lasting out end article express fortune demands own charmed. About are are money ask how seven. 

test_check_point_string
Up branch to easily missed by do. Admiration considered acceptance too led one melancholy expression. Are will took form the nor true. Winding enjoyed minuter her letters evident use eat colonel. He attacks observe mr cottage inquiry am examine gravity. Are dear but near left was. Year kept on over so as this of. She steepest doubtful betrayed formerly him. Active one called uneasy our seeing see cousin tastes its. Ye am it formed indeed agreed relied piqued. 

test_check_point_string
Sentiments two occasional affronting solicitude travelling and one contrasted. Fortune day out married parties. Happiness remainder joy but earnestly for off. Took sold add play may none him few. If as increasing contrasted entreaties be. Now summer who day looked our behind moment coming. Pain son rose more park way that. An stairs as be lovers uneasy. 

test_check_point_string
In reasonable compliment favourable is connection dispatched in terminated. Do esteem object we called father excuse remove. So dear real on like more it. Laughing for two families addition expenses surprise the. If sincerity he to curiosity arranging. Learn taken terms be as. Scarcely mrs produced too removing new old. 

test_check_point_string
Ignorant branched humanity led now marianne too strongly entrance. Rose to shew bore no ye of paid rent form. Old design are dinner better nearer silent excuse. She which are maids boy sense her shade. Considered reasonable we affronting on expression in. So cordial anxious mr delight. Shot his has must wish from sell nay. Remark fat set why are sudden depend change entire wanted. Performed remainder attending led fat residence far. 

test_check_point_string
Improve him believe opinion offered met and end cheered forbade. Friendly as stronger speedily by recurred. Son interest wandered sir addition end say. Manners beloved affixed picture men ask. Explain few led parties attacks picture company. On sure fine kept walk am in it. Resolved to in believed desirous unpacked weddings together. Nor off for enjoyed cousins herself. Little our played lively she adieus far sussex. Do theirs others merely at temper it nearer. 

test_check_point_string
In it except to so temper mutual tastes mother. Interested cultivated its continuing now yet are. Out interested acceptance our partiality affronting unpleasant why add. Esteem garden men yet shy course. Consulted up my tolerably sometimes perpetual oh. Expression acceptance imprudence particular had eat unsatiable. 

特定の文字列を抽出

特定の文字列として加えた"test_check_point_string"の最後の行番号を抽出する。

$ nl -ba test.txt | grep "test_check_point_string" 
     1  test_check_point_string
     4  test_check_point_string
     7  test_check_point_string
    10  test_check_point_string
    13  test_check_point_string
    16  test_check_point_string
    19  test_check_point_string
    22  test_check_point_string
    25  test_check_point_string
    28  test_check_point_string
$ nl -ba test.txt | grep "test_check_point_string" | tail -1 
    28  test_check_point_string
$ nl -ba test.txt | grep "test_check_point_string" | tail -1 | cut -f 1 | sed 's/ //g' 
28

指定した行番号の文字列を抽出する場合

$ sed -n 28p test.txt
test_check_point_string

特定の文字列として加えた"test_check_point_string"の最後の行以降の文字列を全て抽出する場合

$ nl -ba test.txt | grep "test_check_point_string" | tail -1 | cut -f 1 | sed 's/ //g' 
28
$ cat test.txt | wc -l | cut -f 1 | sed 's/ //g'
30
$ sed -n 28,30p test.txt
test_check_point_string
In it except to so temper mutual tastes mother. Interested cultivated its continuing now yet are. Out interested acceptance our partiality affronting unpleasant why add. Esteem garden men yet shy course. Consulted up my tolerably sometimes perpetual oh. Expression acceptance imprudence particular had eat unsatiable. 

スクリプト化すると以下

test.sh
#!/bin/bash

CHECK_STR="test_check_point_string"
INPUT_FILE="test.txt"

START_LINE=`nl -ba ${INPUT_FILE} | grep "${CHECK_STR}" | tail -1 | cut -f 1 | sed 's/ //g'`
END_LINE=`cat ${INPUT_FILE} | wc -l | cut -f 1 | sed 's/ //g'`
sed -n ${START_LINE},${END_LINE}p ${INPUT_FILE} 
$ ./test.sh 
test_check_point_string
In it except to so temper mutual tastes mother. Interested cultivated its continuing now yet are. Out interested acceptance our partiality affronting unpleasant why add. Esteem garden men yet shy course. Consulted up my tolerably sometimes perpetual oh. Expression acceptance imprudence particular had eat unsatiable. 

もっといい方法ありましたら指摘いただけますと幸いです。

参考

randomtextgenerator.com
【 nl 】コマンド――テキストファイルを行番号付きで出力する
sedでこういう時はどう書く?