Notes on exchanging data with external commands

Source: Internet
Author: User

Two days ago I wanted to download some images. I analyzed the site with Fiddler, but the sheer number of JavaScript operations on the page gave me a headache.

Go has a V8 engine package, but compiling the engine is quite troublesome.

Another option is Selenium with Python, and that is what I settled on. Because Selenium had no release for Python 3.4 at the time, I had to reinstall Python 2.7.

As an aside, Python 3 really is much better than Python 2 in its organization and other respects. Unfortunately, the scarcity of third-party libraries is a serious weakness.

My idea at the time was this: use Python and Selenium to drive the browser, extract each page's image addresses and title, and pass them to Go, which does the downloading.

The reason for downloading with Go is that Python 2's network libraries, such as urllib, have problems: downloads often fail without returning any error. By contrast, in my experience Go's net library guarantees a complete download as long as the returned error value is nil.

I wrote a Python script that reads a URL from standard input and returns the title. It then visits each comic page and returns the image URLs, and finally outputs an end marker. Once the download is complete, it can accept the next URL.

You can start the Python script with exec.Command, then use the values returned by the StdinPipe() and StdoutPipe() methods of the *Cmd type to interact with the child process. Both pipes must be requested before the command is started. Call the Start() method to launch the command.

It is actually a very simple program: the Go side is under 80 lines, including logging, locking, and clipboard monitoring.

Then the trouble started. I found that when the title contains Chinese characters, the program goes wrong: the main program does not receive the title, and the child command fails.

After checking, I think there are two possible reasons:

1) A character-encoding problem. Python writes Chinese characters to standard output in the local (ANSI, GBK) encoding, but Go treats them as UTF-8.

2) Non-ASCII data is not supported when interacting with an external command.
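One way to probe the first hypothesis is to check whether the raw bytes read from the child are valid UTF-8. The sketch below uses the standard unicode/utf8 package; the function name checkEncoding and the sample title are made up for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// checkEncoding reports whether the bytes read from the child are valid UTF-8.
func checkEncoding(raw []byte) string {
	if utf8.Valid(raw) {
		return "valid UTF-8"
	}
	return "not valid UTF-8"
}

func main() {
	utf8Title := []byte("中文")                    // Go source strings are UTF-8
	gbkTitle := []byte{0xD6, 0xD0, 0xCE, 0xC4} // the same two characters in GBK
	fmt.Println(checkEncoding(utf8Title))
	fmt.Println(checkEncoding(gbkTitle))
}
```

A failed check suggests the child is writing in a local encoding such as GBK. A passing check proves less, since pure ASCII output is valid in both encodings. If GBK turns out to be the cause, the golang.org/x/text/encoding/simplifiedchinese package can convert such bytes to UTF-8.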

In the end, my solution was to encode the Unicode string as a UTF-8 byte string in the Python script and then base64-encode it; the Go side decodes the base64. This works whichever of the two causes is the real one. The extra step does no harm, because most of the time is spent downloading.

But I still don't know the actual cause.
