Quickly locate a row in a large file if the internal memory is smaller than the file size

Source: Internet
Author: User
For example, suppose there is a file like this:
ABC 56
DEF 100
RET 300
...

The file has two columns: the values in the first column are unique, and the second column is a count (a number).

If the file is 2 GB or larger and only 1 GB of memory is available, how can I quickly locate the "ABC 56" line?

Could the experts here suggest a clear solution?

------Solution--------------------
Use fopen, then fscanf.
Reading one line at a time is fine. Memory is not the limiting factor, because only the current line has to be held in memory.
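
As a rough illustration of the line-at-a-time approach, here is a minimal sketch in Python rather than C. The function name find_line and the whitespace-separated two-column layout are assumptions made for this example, not something specified in the thread.

```python
def find_line(path, key):
    """Return (line_number, count) for `key`, or None if it is absent.

    Iterating over the file object yields one line at a time, so memory
    use stays small no matter how large the file is.
    """
    with open(path, "r", encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            fields = line.split()
            # Skip blank or malformed lines instead of failing on them.
            if len(fields) == 2 and fields[0] == key:
                return line_number, int(fields[1])
    return None
```

The scan stops at the first match, which is enough here because the first column is unique. The cost is still a full pass over the file in the worst case.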
------Solution--------------------
If you build a hash table, don't you have to read and hash the contents of the file first?

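You can also use existing tools; you do not necessarily have to implement an algorithm yourself.
For example, awk:
awk '$1 == "ABC" && $2 == 56 {print NR; exit}' file
This prints the line number of the matching row. awk is case-sensitive, so the key must be written exactly as it appears in the file; exit stops the scan at the first match, which is safe because the first column is unique.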

I suggest the original poster describe the specific requirements. If you only need the line number, there are many ways to do it.
But if there are other requirements, awk is not necessarily the best option.
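
If the requirement turns out to be many repeated lookups, one possible reading of the hash table idea is an on-disk index: a single streaming pass writes each key and the byte offset of its line into one of several small bucket files chosen by a hash of the key, so a later lookup scans only one bucket and then seeks directly to the line. The sketch below is in Python and is only an assumption about how this could be done; the names build_index and lookup, the bucket file naming, and the default of 64 buckets are all hypothetical.

```python
import os
import zlib


def _bucket_file(bucket_dir, key_bytes, n_buckets):
    # zlib.crc32 is stable across runs, unlike Python's built-in hash().
    bucket = zlib.crc32(key_bytes) % n_buckets
    return os.path.join(bucket_dir, "bucket_%04d.idx" % bucket)


def build_index(data_path, bucket_dir, n_buckets=64):
    """One streaming pass: record each key and the byte offset of its line."""
    os.makedirs(bucket_dir, exist_ok=True)
    outs = [open(os.path.join(bucket_dir, "bucket_%04d.idx" % i), "wb")
            for i in range(n_buckets)]
    try:
        offset = 0
        with open(data_path, "rb") as f:
            for line in f:
                fields = line.split()
                if len(fields) == 2:
                    bucket = zlib.crc32(fields[0]) % n_buckets
                    outs[bucket].write(
                        fields[0] + b" " + str(offset).encode("ascii") + b"\n")
                offset += len(line)  # byte offset of the next line
    finally:
        for out in outs:
            out.close()


def lookup(data_path, bucket_dir, key, n_buckets=64):
    """Return the full line for `key`, or None if the key is not indexed."""
    key_bytes = key.encode("utf-8")
    with open(_bucket_file(bucket_dir, key_bytes, n_buckets), "rb") as idx:
        for entry in idx:
            k, off = entry.split()
            if k == key_bytes:
                with open(data_path, "rb") as f:
                    f.seek(int(off))  # jump straight to the line
                    return f.readline().decode("utf-8").rstrip("\r\n")
    return None
```

Building the index still requires reading the whole file once, as the question above points out. The approach only pays off when the same file is queried many times, and n_buckets must be the same value at build time and at lookup time.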
------Solution--------------------
Quote:
Quote:

Does anyone know?
If the file is read line by line, it will not be efficient enough.
Is there a faster way?
My idea is to build a hash table and then use hash collisions to deduplicate.
I don't know if anyone has a better idea.

Don't you still have to read each line first in order to hash it?

Reading line by line is too slow.

Yes, reading in blocks works better for what you need.
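
Reading in blocks could look roughly like the following Python sketch, which pulls the file in fixed-size chunks and carries any partial line over to the next chunk. The function name find_in_blocks and the 1 MiB default block size are assumptions for illustration. Python's file objects are already buffered internally, so the speedup over plain line iteration may be modest in this language.

```python
def find_in_blocks(path, key, block_size=1 << 20):
    """Return the 1-based line number whose first field is `key`, or None."""
    key_bytes = key.encode("utf-8")
    line_number = 0
    tail = b""  # partial line carried over from the previous block
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            lines = (tail + block).split(b"\n")
            tail = lines.pop()  # the last piece may be an incomplete line
            for line in lines:
                line_number += 1
                fields = line.split()
                if fields and fields[0] == key_bytes:
                    return line_number
    # The file may end without a trailing newline.
    if tail:
        fields = tail.split()
        if fields and fields[0] == key_bytes:
            return line_number + 1
    return None
```

Only one block plus one partial line is held in memory at a time, so the 1 GB limit is not a concern regardless of the file size.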
------Solution--------------------
The original poster can refer to:
http://www.fantxi.com/blog/archives/php-read-large-file/

http://sjolzy.cn/php-large-file-read-operation.html