Large file, limited memory
For example, there is a file:
ABC 56
DEF 100
RET 300
...
The file has two columns: the first column holds non-repeating keys, and the second column is a count (a number).
If the file is 2 GB or larger and only 1 GB of memory is available, how can I quickly locate the line "ABC 56"?
Could the experts here give a clear solution?
Replies (solutions)
I'm not sure what you mean.
If you just want to quickly find a line, you can open the file with vi or more;
then type /ABC and press Enter.
Use fopen, then fscanf.
Just read one line at a time; memory is not a limiting factor.
Does anyone know?
Reading line by line won't be efficient.
Is there a faster way?
My idea is to build a hash table and, using a hashing algorithm with collision handling, deduplicate the keys.
I wonder if you have any good ideas.
If you build a hash table, don't you have to read through the whole file first to hash its contents?
You can handle this with other tools; you don't necessarily have to implement an algorithm yourself.
For example, awk:
awk '/^ABC[ \t]+56/{print NR}' file
This prints the line number of the matching row.
I'd suggest the OP state the specific requirement; if you only need the line number, there are many options.
But if there are other requirements, awk is not necessarily the best choice.
Does anyone know?
Reading line by line won't be efficient.
Is there a faster way?
My idea is to build a hash table and, using a hashing algorithm with collision handling, deduplicate the keys.
I wonder if you have any good ideas.
But don't you have to read each line first before you can hash it? Reading line by line is too slow.
Yes, reading in blocks is more efficient than reading line by line.
The OP can refer to:
http://www.fantxi.com/blog/archives/php-read-large-file/
http://sjolzy.cn/php-large-file-read-operation.html
If you build a hash table, don't you have to read through the whole file first to hash its contents?
You can handle this with other tools; you don't necessarily have to implement an algorithm yourself.
For example, awk:
awk '/^ABC[ \t]+56/{print NR}' file
This prints the line number of the matching row.
I'd suggest the OP state the specific requirement; if you only need the line number, there are many options.
But if there are other requirements, awk is not necessarily the best choice.
The requirement is just fast lookup: for example, I want to find the number after ABC, or the number after DEF...
Does anyone know?
Reading line by line won't be efficient.
Is there a faster way?
My idea is to build a hash table and, using a hashing algorithm with collision handling, deduplicate the keys.
I wonder if you have any good ideas.
But don't you have to read each line first before you can hash it? Reading line by line is too slow.
How do you read in blocks? Can you give an example?