A TXT file contains 20 million rows of data, in the following format:
The Walking Dead _mother
Jinchantuoqiao _smile
Farewell My Concubine _love
Impunity _eternity
.......
Eight Immortals crossing _destiny
How can I quickly look up an idiom or an English word in this data? Please suggest an algorithm. Thank you.
Reply content:
Is your goal to check whether an idiom or English word exists, or to count how many times it appears?
I think that whichever method you choose, you will most likely have to scan the full text at least once. If you run this lookup very frequently, the fastest approach is to load the 20 million rows into memory and build an index over them. If you only run it once, the fastest approach is simply to read through the file a single time (counting the occurrences as you go).
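The two options above could be sketched roughly as follows. This is only a minimal illustration, assuming each row is an idiom followed by "_" and an English word, as in the sample rows; the function names (parse_line, build_index, count_once) are made up for this example, and rows without an underscore are skipped.

```python
def parse_line(line):
    """Split 'idiom _word' into (idiom, word); return None for malformed rows."""
    # Assumption: the LAST underscore separates the idiom from the English word.
    idiom, sep, word = line.rstrip("\n").rpartition("_")
    if not sep:
        return None
    return idiom.strip(), word.strip()


def build_index(lines):
    """Frequent lookups: one pass to build two in-memory hash indexes."""
    by_idiom, by_word = {}, {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        idiom, word = parsed
        by_idiom.setdefault(idiom, []).append(word)
        by_word.setdefault(word, []).append(idiom)
    return by_idiom, by_word


def count_once(lines, term):
    """One-off lookup: stream the rows once and count exact matches of term."""
    count = 0
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and term in parsed:
            count += 1
    return count
```

With a real file you would pass the open file object, for example `with open("data.txt", encoding="utf-8") as f: by_idiom, by_word = build_index(f)` (the file name is hypothetical). After that, each lookup such as `by_word.get("love", [])` is a single hash lookup on average, but keeping 20 million rows in Python dictionaries can take a considerable amount of memory. `count_once` uses almost no memory but rereads the whole file for every query.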
Set up Solr and build an index over the data; searching will then be much more efficient.
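Once the rows are indexed in Solr, a lookup is an HTTP request to the core's select handler. The sketch below only builds such a request URL; the host, the core name "idioms", and the field name "idiom" are assumptions for illustration and depend entirely on how you set up your own schema.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, field, phrase, rows=10):
    """Build a Solr /select URL that searches one field for a quoted phrase."""
    # Escape backslashes and double quotes so the phrase stays one quoted term.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": f'{field}:"{escaped}"', "rows": rows, "wt": "json"}
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# Hypothetical local Solr instance, core name, and field name.
url = solr_select_url("http://localhost:8983", "idioms", "idiom", "Farewell My Concubine")
```

Fetching that URL (for example with urllib.request.urlopen) should return the matching documents as JSON, provided the core and field actually exist on your server.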