Another way to check randomly generated words
In the previous article, I introduced three ways of checking randomly generated words, all of which rely on the external program spell. Now I have found that Mac OS X ships the file /usr/share/dict/words, which holds a large number of English words:
apple@kissAir:dict$ ls -ldh words
lrwxr-xr-x 1 root wheel 4B 10 18 words -> web2
apple@kissAir:dict$ ls -lh web2
-r--r--r-- 1 root wheel 2.4M 9 10 web2
apple@kissAir:dict$ wc -n words
wc: illegal option -- n
usage: wc [-clmw] [file ...]
apple@kissAir:dict$ wc -l words
235886 words
There is one word per line, so the file holds more than 230,000 words in total. We can drop the spell program and write an is_spell? method that checks whether a word appears in this list. The code below adds way4. It no longer takes command-line arguments and instead uses the benchmark library to measure performance:
#!/usr/bin/ruby
# code by hopy 2014.12.08
# random create some words and check if a valid word!

require 'tempfile'
require 'benchmark'

words_path = "/usr/share/dict/words"
f = File.open(words_path, "r")
$lines = f.readlines
$lines.map! { |word| word.chomp }
f.close

def rand_words(n = 10000, min_len = 2, max_len = 12)
  chars = (("a".."z").to_a * max_len).freeze
  words = []
  srand
  n.times do |x|
    len = min_len + (rand * 1000).to_i % max_len
    idxes = []
    len.times { idxes << (rand * 100).to_i % 26 }
    chars.shuffle # returns a new array; the frozen chars is left unchanged
    words << chars.values_at(*idxes).join
    idxes.clear
  end
  words
end

# ret word that can spell or ret nil. (way1)
def spell_word(word)
  # spell prints only the words it does not recognize
  cmd = `echo #{word} | spell`.chomp
  if cmd == word
    return nil
  else
    return word
  end
end

# spell all words by tmpfile. (way2)
def spell_words(words)
  puts "using spell_words..."
  f = Tempfile.new("#{$$}_spell_blablabla")
  #f = File.open("spell_test", "w+")
  #f.write Marshal.dump(words)
  f.write words.join(" ")
  f.close
  cmd = `spell #{f.path}`
  no_spell_words = cmd.split("\n")
  words - no_spell_words
end

# spell all words by tmpfile and spell ret is also use tmpfile. (way3)
def spell_words2(words)
  puts "using spell_words2..."
  f_words = Tempfile.new("#{$$}_spell_words")
  f_ret = Tempfile.new("#{$$}_spell_ret")
  f_ret.close
  f_words.write words.join(" ")
  f_words.close
  cmd = `spell #{f_words.path} > #{f_ret.path}`
  f = File.open(f_ret.path)
  no_spell_words = f.read.split("\n")
  f.close
  words - no_spell_words
end

def is_spell?(word)
  $lines.include? word
end

# use is_spell? to check whether word can be spelled. (way4)
def spell_words3(words)
=begin
  words.each do |word|
    printf "#{word} " if is_spell?(word)
  end
=end
  words.select { |word| is_spell?(word) }
end

def sh_each_spell_word(spell_words)
  spell_words.each { |word| printf "#{word} " }
end

words_count = 2000
$words = nil
puts "words_count is 2000,now test..."

Benchmark.bm do |bc|
  bc.report("rand_words:\n") { $words = rand_words(words_count) }; puts ""
  bc.report("way1:spell_word:\n") { $words.each { |w| printf "#{w} " if spell_word(w) } }; puts ""
  bc.report("way2:spell_words:\n") { sh_each_spell_word(spell_words($words)) }; puts ""
  bc.report("way3:spell_words2:\n") { sh_each_spell_word(spell_words2($words)) }; puts ""
  bc.report("way4:spell_words3:\n") { sh_each_spell_word(spell_words3($words)) }; puts ""
end
However, Mac OS X does not include the spell program, and I don't know which package to install with brew. spell is available on Ubuntu, though, so I will run the test on my ThinkPad X61 tomorrow!
Tomorrow has arrived! I found that the words file on Ubuntu contains fewer words than the one on the Mac, only a little over 90,000, so I replaced it with the Mac's file. The output shows that way4 accepts more words than the spell program does:
wisy@wisy-ThinkPad-X61:~/src/ruby_src$ ./x.rb
words_count is 2000,now test...
       user     system      total        real
rand_words:
  0.050000   0.000000   0.050000 (  0.069850)
way1:spell_word:
ho of ts mu so or wag us to lo um ts pa pip mid hip vs no of oboe iv yr re so   0.330000   3.170000  13.480000 ( 29.903239)
way2:spell_words:
using spell_words...
ho of ts mu so or wag us to lo um ts pa pip mid hip vs no of oboe iv yr re so   0.000000   0.000000   0.080000 (  5.485613)
way3:spell_words2:
using spell_words2...
ho of ts mu so or wag us to lo um ts pa pip mid hip vs no of oboe iv yr re so   0.010000   0.010000   0.100000 (  4.854248)
way4:spell_words3:
ho of pob dob mu bo so sa or wag us jo aw to lo um li ca se pa ava bo sho pip mid til tue ya en hip no of di ug oboe io en yr re da eer so ym  36.580000   0.290000  36.870000 ( 37.444370)
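To see exactly where the two approaches disagree, the word lists printed in the run above can be compared with Ruby's array difference. This is only a small sketch: the two lists are copied by hand from that output, and the variable names are my own.

```ruby
# Words accepted by spell (way1 to way3), copied from the run above.
spell_output = %w[ho of ts mu so or wag us to lo um ts pa pip mid hip
                  vs no of oboe iv yr re so]

# Words accepted by is_spell? (way4), copied from the same run.
way4_output = %w[ho of pob dob mu bo so sa or wag us jo aw to lo um li ca
                 se pa ava bo sho pip mid til tue ya en hip no of di ug
                 oboe io en yr re da eer so ym]

# Array#- keeps duplicates from the left-hand side, so uniq is applied.
only_way4  = (way4_output - spell_output).uniq
only_spell = (spell_output - way4_output).uniq

puts "accepted only by way4:  #{only_way4.join(' ')}"
puts "accepted only by spell: #{only_spell.join(' ')}"
```

The difference runs in both directions: the words file accepts many short entries that spell rejects, while spell accepts a few strings (such as "ts" and "vs") that are not in the words file.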
The new method we wrote (way4) turned out to be the slowest!!! You never know until you try, and the result gave me quite a shock!
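The most likely reason is that Array#include? scans the array element by element, so each of the 2000 lookups may walk through all 235,886 entries. One possible remedy, sketched here under that assumption, is to load the word list into a Set, whose include? is a hash lookup. The names load_dictionary and spell_words_set are hypothetical helpers, not part of the script above, and I have not benchmarked this version on the X61.

```ruby
require 'set'

WORDS_PATH = "/usr/share/dict/words"

# Hypothetical helper: read a word list (one word per line) into a Set.
def load_dictionary(path)
  Set.new(File.readlines(path).map(&:chomp))
end

# Hypothetical variant of way4: keep only the words found in the Set.
def spell_words_set(words, dictionary)
  words.select { |word| dictionary.include?(word) }
end

# Small demo, run only if the system word list is present.
if File.exist?(WORDS_PATH)
  dictionary = load_dictionary(WORDS_PATH)
  p spell_words_set(%w[oboe wag zzqx pip], dictionary)
end
```

To compare it with the other methods, spell_words_set could be wrapped in one more bc.report line inside the Benchmark.bm block of the original script.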