Use Django to clean up data in the database.

Source: Internet
Author: User

Use Django to clean up data in the database.
Description of the data cleaning problem in the database: in my system, due to history and various reasons, the data recorded in the table in the database is problematic, there are repeated and incomplete data solutions: first, because the data volume is large, manual cleaning is definitely not feasible. Then, I want to write an SQL script to update according to the agreed rules. You can use a cursor to traverse the records in the table. However, SQL is a structured query language and is not process-oriented, so although it is possible, it is not convenient to use process-oriented such as C and python. Later, I wanted to create a new method in my project and then call it through the address bar of the browser, you can.
PS: Although Django's orm is very convenient, It is very embarrassing to use it on your own. There are some filtering conditions and syntax rules, I still have to search for examples on the Internet to find out how to use the SQL statement method. Fortunately, I used to execute native SQL statements during the cleaning process.
The Code is as follows:

   def datetimestr():    return datetime.now().strftime('%Y%m%d-%H%M%S')+'>>>'   def update_dcData(req):        log_path='apps/dc/l10n_reports'    update_dcData_log=open(log_path+'updateDcdataLog.log','w+')    sql_getProjectIDs='select a.project_id from' \                       ' (select count(*) num,project_id from dc_data where lableName=%s group by project_id) a,' \                       ' (select count(*) num ,project_id from management_project_target_lang group by project_id) b' \                       ' where a.project_id=b.project_id and a.num!=b.num order by project_id'        sql_getAllProjectIDs='select project_id from management_project'        sql_getLanguageSections='select b.name from management_project_target_lang a,management_l10nlanguage b' \                            ' where a.project_id=%s and a.l10nlanguage_id=b.id and b.id!="1"'                                sql_getRecords='select id, languageSection,value from dc_data where lableName =%s and project_id=%s and important="1"'        sql_addRecordByID='insert into dc_data(lableName,languageSection,type,value,project_id,task_id,' \                'important,unit,settlement,workload) ' \                'select lableName,languageSection,type,value,project_id,task_id,important,unit,settlement,workload ' \                'from dc_data where id=%s'     sql_updateLgs='update dc_data set languageSection=%s where id=%s'        sql_getLableNames='select lableName from dc_data where lableName like "%%_all" group by lableName'        update_dcData_log.write(datetimestr()+'sql_getLableNames'+'>>>'+sql_getLableNames+'\n')    update_dcData_log.write(datetimestr()+'sql_getRecords'+'>>>'+sql_getRecords+'\n')    update_dcData_log.write(datetimestr()+'sql_getProjectIDs'+'>>>'+sql_getProjectIDs+'\n')    update_dcData_log.write(datetimestr()+'sql_getLanguageSections'+'>>>'+sql_getLanguageSections+'\n')        update_dcData_log.write(datetimestr()+'sql_addRecordByID'+'>>>'+sql_addRecordByID+'\n')        update_dcData_log.write(datetimestr()+'sql_updateLgs'+'>>>'+sql_updateLgs+'\n')        context=Context({'msg':'Success'})    resp=render_to_response("report/clean_data.html", context,                               context_instance=RequestContext(req))        cursor=connection.cursor()    try:        cursor.execute(sql_getLableNames)        lableNames=cursor.fetchall()    except Exception,e:        update_dcData_log.write(datetimestr()+'execute sql_getLableNames error '+str(e)+'\n')        context=Context({'msg':'Error'})        return render_to_response("report/clean_data.html", context,                               context_instance=RequestContext(req))     for lableName in lableNames:        try:            cursor.execute(sql_getProjectIDs,[lableName[0]])            projectIDs=cursor.fetchall()        except Exception,e:            update_dcData_log.write(datetimestr()+'execute sql_getProjectIDs error '+str(e)+'\n')            context=Context({'msg':'Error'})            return render_to_response("report/clean_data.html", context,                               context_instance=RequestContext(req))         for pid in projectIDs:            try:                cursor.execute(sql_getRecords,[lableName[0],str(pid[0])])                records=cursor.fetchall()                cursor.execute(sql_getLanguageSections,[str(pid[0])])                languageSections=cursor.fetchall()            except Exception,e:                update_dcData_log.write(datetimestr()+'execute sql_getRecords or sql_getLanguageSections error '+str(e)+'\n')                context=Context({'msg':'Error'})                return render_to_response("report/clean_data.html", context,                               context_instance=RequestContext(req))             values,lgs=[],[]            baseValue=str(records[0][2])            baseID=str(records[0][0])            for item in records:                lgs.append(str(item[1]))                values.append(str(item[2]))                if baseValue!=str(item[2]):                    baseValue='false'            targetLgs=[str(item[0]) for item in languageSections]            if len(lgs)<1 or len(targetLgs)<1:                baseValue=='false'            if 'all' not in lgs:                try:                    cursor.execute(sql_addRecordByID,[baseID])                    cursor.execute(sql_updateLgs,['all',baseID])                    transaction.commit_unless_managed()                except Exception,e:                    update_dcData_log.write(datetimestr()+'execute sql_addRecordByID or sql_updateLgs error (all)'+str(e)+'\n')                    context=Context({'msg':'Error'})                    return render_to_response("report/clean_data.html", context,                                   context_instance=RequestContext(req))                                             update_dcData_log.write(datetimestr()+"all record is add into dc_data,the lableName and projectID were "+str(lableName[0])+'-'+str(pid[0])+'\n')                        if baseValue=='false':                update_dcData_log.write(datetimestr()+"please update this record mutually,the lableName and projectID were "+str(lableName[0])+'-'+str(pid[0])+'\n')            else:                if len(lgs)>len(targetLgs):                    update_dcData_log.write(datetimestr()+"the lableName languageSection length is longer than target numbers lableName and projectID were "+str(lableName[0])+'-'+str(pid[0])+'\n')                else:                    for lg in targetLgs:                        if lg not in lgs:                            try:                                cursor.execute(sql_addRecordByID,[baseID])                                cursor.execute(sql_updateLgs,[lg,baseID])                                transaction.commit_unless_managed()                            except Exception,e:                                update_dcData_log.write(datetimestr()+'execute sql_addRecordByID or sql_updateLgs error (lg)'+str(e)+'\n')                                context=Context({'msg':'Error'})                                return render_to_response("report/clean_data.html", context,                                   context_instance=RequestContext(req))                                                             update_dcData_log.write(datetimestr()+lg+" record is add into dc_data,the lableName and projectID were "+str(lableName[0])+'-'+str(pid[0])+'\n')                                                update_dcData_log.close()                             return  resp   



Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.