Molecule to atoms

Source: Internet
Author: User

For a given chemical formula represented by a string, count the number of atoms of each element contained in the molecule and return an object.

    water = 'H2O'
    parse_molecule(water)                # return {H: 2, O: 1}

    magnesium_hydroxide = 'Mg(OH)2'
    parse_molecule(magnesium_hydroxide)  # return {Mg: 1, O: 2, H: 2}

    fremy_salt = 'K4[ON(SO3)2]2'
    parse_molecule(fremy_salt)           # return {K: 4, O: 14, N: 2, S: 4}

The task is to convert a molecular formula into a dictionary of atom counts. On Codewars it is rated 3 kyu. The difficulty lies in handling the many cases the formula rules allow, and in keeping every index in bounds while scanning the string.

My idea is roughly this: first convert square brackets and curly braces to parentheses, then expand the innermost group, then work outward through the enclosing brackets until the expression has no brackets left, which is easy to process. To find the innermost group, I look for the first ')' and then scan backward for the matching '(', and replace '(...)2' with its expanded result. Here 2 stands for whatever number follows the bracket. That number may be 1, in which case it is omitted from the formula, so we add the 1 back during the conversion. In the final pass over the bracket-free expression, an omitted 1 must likewise be counted.
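The bracket normalization and the search for the innermost group can be sketched on their own. This is only an illustration of the idea, not the code used in the solution: `normalize` and `find_innermost` are hypothetical helper names, and the sketch assumes the brackets in the formula are balanced.

```python
# Map '[' and '{' to '(' and ']' and '}' to ')'.
BRACKETS = str.maketrans('[{]}', '(())')


def normalize(formula):
    """Rewrite every bracket type as a parenthesis."""
    return formula.translate(BRACKETS)


def find_innermost(formula):
    """Return (open_idx, close_idx) of an innermost '(...)' pair, or None.

    The first ')' in the string always closes an innermost group, and the
    nearest '(' before it is the matching opener.
    """
    close_idx = formula.find(')')
    if close_idx == -1:
        return None
    open_idx = formula.rfind('(', 0, close_idx)
    if open_idx == -1:
        raise ValueError('unbalanced brackets in %r' % formula)
    return open_idx, close_idx


formula = normalize('K4[ON(SO3)2]2')
open_idx, close_idx = find_innermost(formula)
print(formula)                           # K4(ON(SO3)2)2
print(formula[open_idx:close_idx + 1])   # (SO3)
```

Using `str.find` and `str.rfind` avoids the explicit index loops, but the logic is the same: first closing bracket, then the nearest opening bracket to its left.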

The code is as follows:

    def parse_molecule(formula):
        formula_dict = {}
        # Replace [] and {} with ()
        for bracket in '[{':
            formula = formula.replace(bracket, '(')
        for bracket in ']}':
            formula = formula.replace(bracket, ')')

        while '(' in formula:
            # Look for the innermost (): first ')' and the '(' before it
            for i in range(len(formula)):
                if formula[i] == ')':
                    break
            for j in range(i - 1, -1, -1):
                if formula[j] == '(':
                    break
            # If the multiplier 1 was omitted, fill it in
            if i + 1 == len(formula) or not formula[i + 1].isdigit():
                sub_formula = formula[j:i + 1]
                # To avoid breaking the later replace, use a temporary variable.
                # If we wrote sub_formula = formula[j:i+1] + '1' directly,
                # sub_formula would be a string that does not occur in formula,
                # replace would do nothing, and the loop would never end.
                tmp = sub_formula + '1'
            else:
                sub_formula = formula[j:i + 2]
                tmp = sub_formula
            parsed_sub_formula = parse_paren(tmp)
            formula = formula.replace(sub_formula, parsed_sub_formula)

        # Process the bracket-free expression
        i = 0
        while i < len(formula):
            j = i + 1
            if j < len(formula) and formula[j].islower():
                j += 1
            tmp = formula[i:j]
            # Watch the boundaries so that j does not run past the end.
            # There is a small bug here: I assume a subscript has at most
            # two digits. With three digits, the third digit is treated as
            # an element with subscript 1.
            # I didn't expect this to pass.
            if j < len(formula) and formula[j].isdigit():
                k = j + 1
                if k < len(formula) and formula[k].isdigit():
                    formula_dict[tmp] = formula_dict.get(tmp, 0) + int(formula[j:k + 1])
                    i = k + 1
                else:
                    formula_dict[tmp] = formula_dict.get(tmp, 0) + int(formula[j])
                    i = j + 1
            elif j < len(formula) and formula[j].isupper():
                formula_dict[tmp] = formula_dict.get(tmp, 0) + 1
                i = j
            elif j == len(formula):
                formula_dict[tmp] = formula_dict.get(tmp, 0) + 1
                break

        return formula_dict


    def parse_paren(sub_formula):
        # sub_formula looks like '(SO3)2'; the last character is the multiplier
        result = {}
        times = int(sub_formula[-1])
        i = 1
        while i < len(sub_formula) - 2:
            j = i + 1
            if sub_formula[j].islower():
                j += 1
            tmp = sub_formula[i:j]
            if sub_formula[j].isdigit():
                k = j + 1
                # Again assumes a subscript has at most two digits
                if k < len(sub_formula) - 2 and sub_formula[k].isdigit():
                    result[tmp] = result.get(tmp, 0) + int(sub_formula[j:k + 1]) * times
                    i = k + 1
                else:
                    result[tmp] = result.get(tmp, 0) + int(sub_formula[j]) * times
                    i = j + 1
            elif sub_formula[j].isupper() or sub_formula[j] == ')':
                result[tmp] = result.get(tmp, 0) + 1 * times
                i = j

        # Flatten the counts back into a bracket-free formula string
        t = []
        for key, val in result.items():
            t.append(key)
            t.append(str(val))
        return ''.join(t)


    # For testing I deliberately added some messy formulas that still follow the rules
    print(parse_molecule('K4[ON(SO3)2]2'))
    print(parse_molecule('(H2O)H10'))
    print(parse_molecule('(OH123)2'))

Although it passed, the code still has bugs that I will fix when I have time (no idea when; this kata wore me out, so next time ... my level is still too low).
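One of the known bugs is the assumption that a subscript has at most two digits. The following is a minimal sketch of one way to remove that assumption in the bracket-free stage, by reading the whole run of digits instead of peeking one or two characters ahead. The names `read_count` and `count_flat` are hypothetical and do not appear in the code above, and the sketch assumes the input is already a valid bracket-free formula.

```python
def read_count(text, pos):
    """Read a run of digits starting at pos; return (count, next_pos).

    If there is no digit at pos, the count is the omitted 1.
    """
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    count = int(text[pos:end]) if end > pos else 1
    return count, end


def count_flat(formula):
    """Count atoms in a bracket-free formula such as 'H2O1H10'."""
    counts = {}
    i = 0
    while i < len(formula):
        j = i + 1
        # An element symbol is one uppercase letter plus optional lowercase letters.
        while j < len(formula) and formula[j].islower():
            j += 1
        element = formula[i:j]
        # Read the subscript, however many digits it has, and move past it.
        count, i = read_count(formula, j)
        counts[element] = counts.get(element, 0) + count
    return counts


print(count_flat('H2O1H10'))
print(count_flat('OH123'))
```

The same `read_count` idea would also apply to the multiplier after a closing bracket, which the code above reads as a single digit.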

It seems that regular expressions would handle this better, so stay tuned ...
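As a rough idea of what a regular-expression version could look like, here is a sketch that tokenizes the formula with `re.findall` and keeps a stack of counters, one per open bracket. It is not the follow-up solution promised above, only one possible approach. The name `parse_molecule_regex` is hypothetical, and the sketch assumes a well-formed formula with no spaces: it does not check that bracket types match or that brackets are balanced.

```python
import re
from collections import Counter

# One token is either an element with an optional count, an opening
# bracket, or a closing bracket with an optional multiplier.
TOKEN = re.compile(r'([A-Z][a-z]*)(\d*)|([(\[{])|([)\]}])(\d*)')


def parse_molecule_regex(formula):
    """Count atoms using regex tokens and a stack of per-group counters."""
    stack = [Counter()]
    for element, count, opener, closer, multiplier in TOKEN.findall(formula):
        if element:
            # An omitted count means 1.
            stack[-1][element] += int(count or 1)
        elif opener:
            # Start collecting atoms for a new bracketed group.
            stack.append(Counter())
        elif closer:
            # Fold the finished group into its parent, scaled by the multiplier.
            group = stack.pop()
            factor = int(multiplier or 1)
            for name, n in group.items():
                stack[-1][name] += n * factor
    return dict(stack[0])


print(parse_molecule_regex('K4[ON(SO3)2]2'))
print(parse_molecule_regex('(OH123)2'))
```

Because `\d*` matches a digit run of any length, this version has no limit on subscript or multiplier size, and it never rewrites the formula string.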
