Quickly locate a row in a large file when available memory is smaller than the file
For example, given a file:
ABC 56
DEF 100
RET 300
...
The file has two columns: the first column contains unique (non-repeating) keys, and the second column is a count (a number).
If the file is 2 GB or larger but only 1 GB of memory is available, how can we quickly locate the line "ABC 56"?
Hoping one of the experts here can give a clear solution.
Tags: memory, large file
------Solution--------------------
fopen, then fscanf.
Just read one line at a time; memory is not a limiting factor.
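A minimal sketch of this line-by-line approach. The function name, the 256-byte line buffer, and the file name used in the example call are placeholder choices, not from the original post; the sketch assumes each line is shorter than 256 bytes.

```c
#include <stdio.h>
#include <string.h>

/* Scan the file one line at a time and return the 1-based line number of
 * the first line whose first column equals `key`; return 0 if absent and
 * -1 on error. Only one line is ever held in memory, so the file can be
 * far larger than available RAM. Assumes lines shorter than 256 bytes. */
long find_line(const char *path, const char *key)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    char line[256];
    char col1[128];
    long lineno = 0;
    long found = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        lineno++;
        /* extract the first whitespace-delimited column and compare */
        if (sscanf(line, "%127s", col1) == 1 && strcmp(col1, key) == 0) {
            found = lineno;
            break;
        }
    }
    fclose(fp);
    return found;
}
```

For the sample file above, a call like find_line("data.txt", "ABC") would return 1 (assuming the data is stored in a file named data.txt).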
------Solution--------------------
If you build a hash table, don't you have to read the file's contents first in order to hash them?
You can also handle this with existing tools; there is no need to hand-code an algorithm.
For example, with awk:
awk '$1 == "ABC" && $2 == 56 { print NR }' file
This prints the line number of the matching row.
It would help if the OP stated the exact requirement; if the goal is only to get the line number, there are many options.
But if there are other requirements, awk is not necessarily the best choice.
------Solution--------------------
Quote:
Does anyone know? If we read the file line by line, the performance will be poor.
Is there a faster way?
My idea is to build a hash table and use the hash algorithm's collision handling to deduplicate the keys.
Does anyone have a better idea?
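One way the hash-table idea could look in practice: read the file once, hash the key column of each line, and remember each line's byte offset; later lookups hash the key, fseek() to the stored offset, and re-read just that line. The index holds only (hash, offset) pairs, so it stays far smaller than the file. The FNV-1a hash, the table size, and linear probing below are illustrative choices, not from the thread; the sketch assumes fewer distinct keys than NBUCKETS.

```c
#include <stdio.h>
#include <string.h>
#include <stdint.h>

#define NBUCKETS (1 << 20)   /* power of two; ~16 MB of index, an arbitrary choice */

/* 64-bit FNV-1a hash of a string */
static uint64_t fnv1a(const char *s)
{
    uint64_t h = 14695981039346656037ULL;
    while (*s) { h ^= (unsigned char)*s++; h *= 1099511628211ULL; }
    return h;
}

/* Pass 1: record the byte offset of every line, keyed by the hash of its
 * first column. offsets[i] == -1 marks an empty bucket; collisions are
 * resolved by linear probing. Assumes fewer keys than NBUCKETS. */
static void build_index(FILE *fp, long *offsets, uint64_t *hashes)
{
    char line[256], key[128];
    for (size_t i = 0; i < NBUCKETS; i++) offsets[i] = -1;

    long pos = ftell(fp);
    while (fgets(line, sizeof line, fp) != NULL) {
        if (sscanf(line, "%127s", key) == 1) {
            uint64_t h = fnv1a(key);
            size_t i = h & (NBUCKETS - 1);
            while (offsets[i] != -1)          /* linear probing */
                i = (i + 1) & (NBUCKETS - 1);
            offsets[i] = pos;
            hashes[i] = h;
        }
        pos = ftell(fp);
    }
}

/* Lookup: return the byte offset of the line whose first column equals
 * `key`, or -1 if absent. Re-reads the candidate line to verify the key,
 * because two different keys can share a hash. */
static long lookup(FILE *fp, const long *offsets, const uint64_t *hashes,
                   const char *key)
{
    char line[256], col1[128];
    uint64_t h = fnv1a(key);
    size_t i = h & (NBUCKETS - 1);

    while (offsets[i] != -1) {
        if (hashes[i] == h) {
            fseek(fp, offsets[i], SEEK_SET);
            if (fgets(line, sizeof line, fp) != NULL &&
                sscanf(line, "%127s", col1) == 1 &&
                strcmp(col1, key) == 0)
                return offsets[i];
        }
        i = (i + 1) & (NBUCKETS - 1);
    }
    return -1;
}
```

The build pass still reads the whole file once, but after that every lookup costs one hash plus one seek, which is the payoff over re-scanning the file per query.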
Don't you still have to read each line first in order to hash it?
Reading one line at a time is too slow.
Yes, reading in blocks beats reading line by line.
------Solution--------------------
The OP can refer to:
http://www.fantxi.com/blog/archives/php-read-large-file/
http://sjolzy.cn/php-large-file-read-operation.html