【HDU December 2015 school contest H】【Simulation STL-MAP STL-SET stringstream】Study Words: extract the 10 most frequent unlearned words from an article


Study Words
Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 32768/32768 K (Java/Others) Total Submission(s): 226    Accepted Submission(s): 80
Problem Description
Learning English is not easy, vocabulary troubles me a lot.
One day an idea came up to me: I download an article every day, choose the 10 most popular new words to study.
A word's popularity is calculated by the number of its occurrences.
Sometimes two or more words have the same number of occurrences; in that case the lexicographically smaller word has the higher popularity.
 
Input
T in the first line is the number of cases.
Each case has two parts.
<oldwords>
...
</oldwords>
<article>
...
</article>
Between <oldwords> and </oldwords> are some old words (no more than 10000) I have already learned, that is, I don't need to learn them any more.
Words between <oldwords> and </oldwords> contain letters ('a'~'z','A'~'Z') only, separated by blank characters (' ','\n' or '\t').
Between <article> and </article> is an article (contains fewer than 1000000 characters).
Only continuous letters ('a'~'z','A'~'Z') make up a word. Thus words like "don't" are regarded as two words, "don" and "t", and that's OK.
Treat uppercase as lowercase, so "Thanks" equals "thanks". No word will be longer than 100 characters.
As the article is downloaded from the internet, it may contain some Chinese words, which I don't need to study.
 
Output
For each case, output the top 10 new words I should study, one in a line.
If there are fewer than 10 new words, output all of them.
Output a blank line after each case.
 
Sample Input

   
   
   
   
2
<oldwords>
how aRe you
</oldwords>
<article>
--How old are you?
--Twenty.
</article>
<oldwords>
google cn huluobo net i
</oldwords>
<article>
:
I love google,dropbox,firefox very much.
Everyday I open my computer , open firefox , and enjoy surfing on the inter-
net.
But these days it's strange that searching "huluobo" is unavail-
able.
What's wrong with "huluobo"?
</article>

 
Sample Output

   
   
   
   
old
twenty

firefox
open
s
able
and
but
computer
days
dropbox
enjoy

 
#include<stdio.h>
#include<iostream>
#include<sstream>
#include<algorithm>
#include<ctype.h>
#include<string.h>
#include<vector>
#include<set>
#include<map>
using namespace std;
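int casenum,casei;
const int N=105;                // words are at most 100 characters long
char s[N];
char oldwords[]="</oldwords>";  // closing tag of the old-word list
char article[]="</article>";    // closing tag of the article
set<string>sot;                 // words already learned
map<string,int>mop;             // new word -> number of occurrences
map<string,int>::iterator it;
const int L=1e6+10;char ss[L];  // the whole article, normalized in place
vector<pair<int,string> >b;     // (-count, word) pairs used for sorting
int main()
{
    scanf("%d",&casenum);
    for(casei=1;casei<=casenum;++casei)
    {
        sot.clear();mop.clear();
        // Read tokens up to </oldwords>. The opening <oldwords> tag is inserted too,
        // which is harmless: it can never equal a purely alphabetic word.
        while(1)
        {
            scanf("%s",s);
            for(int i=0;s[i];++i)s[i]=tolower((unsigned char)s[i]);
            if(!strcmp(s,oldwords))break;
            sot.insert(s);
        }
        scanf("%s",s);getchar();    // skip <article> and the newline after it
        int l=0;
        while(1)
        {
            // gets() was removed in C11/C++14, so read with fgets() and strip the line ending.
            if(!fgets(ss+l,L-l,stdin))break;
            int len=strlen(ss+l);
            while(len>0&&(ss[l+len-1]=='\n'||ss[l+len-1]=='\r'))ss[l+--len]=0;
            if(!strcmp(ss+l,article))break;
            for(int i=l;ss[i];++i)
            {
                // Cast to unsigned char: bytes of Chinese characters are negative as plain char,
                // and passing a negative value to isalpha() is undefined behavior.
                if(!isalpha((unsigned char)ss[i]))ss[i]=' ';
                else ss[i]=tolower((unsigned char)ss[i]);
            }
            l+=len;
            ss[l++]=' ';            // a line break also separates words
        }ss[l]=0;

        stringstream cinn(ss);
        while(cinn>>s)
        {
            // count only words that are not in the old-word set
            if(sot.find(s)==sot.end())++mop[s];
        }
        b.clear();
        for(it=mop.begin();it!=mop.end();++it)
        {
            // negate the count so an ascending sort puts frequent words first;
            // ties then fall back to ascending lexicographic order of the word
            b.push_back(make_pair(-it->second,it->first));
        }
        sort(b.begin(),b.end());
        for(int i=0;i<min(10,(int)b.size());++i)cout<<b[i].second<<endl;
        puts("");
    }
    return 0;
}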
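
The solution above blanks out every non-letter in one large buffer and then lets a stringstream do the splitting. The same word rule (only consecutive letters form a word, uppercase is treated as lowercase) can also be applied character by character. The block below is a minimal sketch of that alternative, not the code that was submitted; `splitWords` is a hypothetical helper name, and it assumes the default "C" locale, where `isalpha` accepts only ASCII letters.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original solution): splits `text`
// into lowercase words made of consecutive ASCII letters.
std::vector<std::string> splitWords(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (char ch : text) {
        // Cast first: bytes of non-ASCII text are negative as plain char,
        // and passing a negative value to isalpha is undefined behavior.
        unsigned char c = static_cast<unsigned char>(ch);
        if (std::isalpha(c)) {
            current.push_back(static_cast<char>(std::tolower(c)));
        } else if (!current.empty()) {
            // Any non-letter ends the current word.
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}
```

With this rule, `splitWords("don't")` yields the two words `don` and `t`, as the problem statement requires.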

/*
【trick&&  】
1,        = =              ,               ,
	              。               ,       +4   。。。

2,        ,       Ascii     
3,  strcmp ,       ...
4,        ,            >_<
	<oldwords>
	</oldwords>
	<article>
	/article
	/article>
	</article
	</article>

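【Problem】
There are T test cases, each in the following format:
<oldwords>
...
</oldwords>
<article>
...
</article>

Find the 10 most frequent words in the article that have not been learned yet.
Requirements:
1, The old words contain only letters and are separated by blank characters.
2, Letter case is ignored.
3, The article may contain Chinese text, which is not studied.
4, Only consecutive letters form a word, so don't is split into the two words don and t.
5, There are at most 10000 old words.
6, Output the top 10 words ordered by (occurrences descending, lexicographic order ascending), or all of them if there are fewer than 10, followed by a blank line.
7, No word is longer than 100 characters.
8, The article is shorter than 1e6 characters.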

【Type】
Simulation, STL-SET, STL-MAP

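【Analysis】
The approach is as follows:

1, Read the input.
2, A SET stores the old words.
3, Split the article into words.
	Two tools can do the splitting:
	(1)stringstream cinn(s)
	(2)scanf(%[^])
4, A MAP counts the occurrences of each new word.
5, Copy the MAP into an array, sort by (occurrences, lexicographic order), and output the first 10 entries.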

【Time complexity && optimization】
O(1e6 log(1e6))

0msAC,        = =

*/
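
The ranking step of the solution stores (-count, word) pairs, so a plain ascending sort already produces the required order: more occurrences first, and the lexicographically smaller word first on ties. The block below is a small sketch of that idea in isolation; `topWords` and its parameter `k` are hypothetical names introduced only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the original solution): returns up to `k`
// words, most frequent first, ties broken by smaller lexicographic order.
std::vector<std::string> topWords(const std::map<std::string, int>& freq, std::size_t k) {
    std::vector<std::pair<int, std::string>> ranked;
    ranked.reserve(freq.size());
    for (const auto& entry : freq) {
        // Negating the count makes the default ascending pair comparison
        // put larger counts first, then compare the words themselves.
        ranked.emplace_back(-entry.second, entry.first);
    }
    std::sort(ranked.begin(), ranked.end());
    std::vector<std::string> result;
    for (std::size_t i = 0; i < ranked.size() && i < k; ++i) {
        result.push_back(ranked[i].second);
    }
    return result;
}
```

For the counts open=2, firefox=2, s=2, able=1, and=1, calling `topWords(freq, 4)` gives firefox, open, s, able, which matches the start of the second sample output.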