Building a simple image server with Python, Flask, and MongoDB
1. Preparations
Once pymongo is installed through pip or easy_install, you can call MongoDB from Python.
Then install Flask to act as the web server.
MongoDB itself must of course be installed too. On Ubuntu, especially Server 12.04, getting the latest version takes a few extra steps:
    sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
    echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list
    sudo apt-get update
    sudo apt-get install mongodb-10gen
If you think, as I do, that judging an uploaded file by its file name suffix is pure self-deception (like holding a yam and telling yourself it is a cucumber), you had better install the Pillow library as well.
    pip install Pillow
Or (more suitable for Windows users):
    easy_install Pillow
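To see why the file name suffix proves nothing, here is a minimal sketch that ignores the name and looks at the first bytes of the content instead. It is not the approach used in this article (Pillow does the real check later); the function name sniff_image_format and the signature table are illustrative assumptions, and they cover only JPEG, PNG, and GIF.

```python
# Byte signatures ("magic numbers") found at the start of each format's data.
SIGNATURES = [
    (b'\xff\xd8\xff', 'jpeg'),
    (b'\x89PNG\r\n\x1a\n', 'png'),
    (b'GIF87a', 'gif'),
    (b'GIF89a', 'gif'),
]


def sniff_image_format(data):
    """Return 'jpeg', 'png' or 'gif' based on the leading bytes, else None."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None


# A text file renamed to "cat.png" is still a text file.
fake_png = b'just some text pretending to be an image'
real_png_header = b'\x89PNG\r\n\x1a\n' + b'\x00' * 16

print(sniff_image_format(fake_png))
print(sniff_image_format(real_png_header))
```

A prefix check like this is easy to fool with a forged header, so treat it as an illustration of the idea rather than a replacement for a real image library.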
2. Building the server
2.1 Flask File Upload
The upload example on the official Flask website is split into two parts, which makes it awkward to follow. Here we start with the simplest possible version, which accepts any file without checking what it is.
    import flask

    app = flask.Flask(__name__)
    app.debug = True

    @app.route('/upload', methods=['POST'])
    def upload():
        f = flask.request.files['uploaded_file']
        print f.read()
        return flask.redirect('/')

    @app.route('/')
    def index():
        return '''
        <!doctype html>
Note: in the upload function, flask.request.files[KEY] returns the uploaded file object, where KEY is the name attribute of the file input in the form.
Because the content is printed to the server console, it is best to test with a plain text file.
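For reference, here is a minimal sketch of a form that matches this handler. Only the field name uploaded_file and the /upload route come from the code above; the rest of the markup is an assumption. The small parser simply confirms which names the form would submit.

```python
from html.parser import HTMLParser

# Hypothetical markup: the input name must match
# flask.request.files['uploaded_file'] in the upload handler.
UPLOAD_FORM = '''<!doctype html>
<html>
<body>
<form action="/upload" method="post" enctype="multipart/form-data">
  <input type="file" name="uploaded_file">
  <input type="submit" value="Upload">
</form>
</body>
</html>
'''


class FormInspector(HTMLParser):
    """Collect the form attributes and the names of its inputs."""

    def __init__(self):
        super().__init__()
        self.form = {}
        self.input_names = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'form':
            self.form = attrs
        elif tag == 'input' and 'name' in attrs:
            self.input_names.append(attrs['name'])


inspector = FormInspector()
inspector.feed(UPLOAD_FORM)
print(inspector.form['enctype'])
print(inspector.input_names)
```

Without enctype="multipart/form-data" the browser submits only the file name, and the file never reaches flask.request.files.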
2.2 Saving to MongoDB
If you are not too particular, this is all you need:
    import pymongo
    import bson.binary
    from cStringIO import StringIO

    app = flask.Flask(__name__)
    app.debug = True
    db = pymongo.MongoClient('localhost', 27017).test

    def save_file(f):
        content = StringIO(f.read())
        db.files.save(dict(
            content=bson.binary.Binary(content.getvalue()),
        ))

    @app.route('/upload', methods=['POST'])
    def upload():
        f = flask.request.files['uploaded_file']
        save_file(f)
        return flask.redirect('/')
Wrap the content in a bson.binary.Binary object and throw it into MongoDB.
Now upload a file again. In the mongo shell you can see it with db.files.find().
However, the content field is hardly readable to the naked eye: even for a plain text file, the mongo shell displays it Base64-encoded.
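If you want to check what a stored text file contains, the Base64 string can be decoded by hand. A small sketch using only the standard library (the sample bytes are made up for illustration):

```python
import base64

content = b'hello world'

# Roughly what the shell shows for the binary field's payload.
shown = base64.b64encode(content).decode('ascii')
print(shown)  # aGVsbG8gd29ybGQ=

# Decoding the displayed string recovers the original bytes.
recovered = base64.b64decode(shown)
print(recovered == content)
```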
2.3 Serving files
Given the ID of a file stored in the database (passed as part of the URI), return the content of that file to the browser, as shown below:
    import bson.objectid

    def save_file(f):
        content = StringIO(f.read())
        c = dict(content=bson.binary.Binary(content.getvalue()))
        db.files.save(c)
        return c['_id']

    @app.route('/f/<fid>')
    def serve_file(fid):
        f = db.files.find_one(bson.objectid.ObjectId(fid))
        return f['content']

    @app.route('/upload', methods=['POST'])
    def upload():
        f = flask.request.files['uploaded_file']
        fid = save_file(f)
        return flask.redirect('/f/' + str(fid))
After a file is uploaded, the upload function redirects to the page that serves it. Text files can now be previewed properly, as long as you do not mind the browser swallowing line breaks and runs of spaces (the response goes out with Flask's default text/html content type).
2.4 When the file cannot be found
There are two cases. In the first, the ID in the URL is not in a valid database ID format, and pymongo raises bson.errors.InvalidId. In the second, no object with that ID exists, and pymongo returns None.
For simplicity, both cases are handled the same way:
    @app.route('/f/<fid>')
    def serve_file(fid):
        import bson.errors
        try:
            f = db.files.find_one(bson.objectid.ObjectId(fid))
            if f is None:
                raise bson.errors.InvalidId()
            return f['content']
        except bson.errors.InvalidId:
            flask.abort(404)
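The two failure cases can also be illustrated without a database. The sketch below assumes only that an ObjectId is written as 24 hexadecimal characters; looks_like_object_id, lookup, and the dictionary standing in for the collection are hypothetical names used for illustration, not part of pymongo.

```python
import re


def looks_like_object_id(fid):
    """True if fid is exactly 24 hexadecimal characters."""
    return re.fullmatch(r'[0-9a-fA-F]{24}', fid) is not None


# Stand-in for the MongoDB collection, keyed by the hex id string.
fake_collection = {'5f1d7f3a9c1e4b2a3c4d5e6f': b'file content'}


def lookup(fid):
    """Return (status, body); malformed and unknown ids both give 404."""
    if not looks_like_object_id(fid):
        return 404, None          # case 1: not a valid id at all
    body = fake_collection.get(fid.lower())
    if body is None:
        return 404, None          # case 2: valid id, nothing stored
    return 200, body


print(lookup('not-an-id'))
print(lookup('0' * 24))
print(lookup('5f1d7f3a9c1e4b2a3c4d5e6f'))
```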
2.5 Serving the correct MIME type
From now on, uploads have to be strictly controlled: text files, dogs, and scissors cannot be uploaded.
As mentioned earlier, we use Pillow to determine whether a file really is an image.
    from PIL import Image

    allow_formats = set(['jpeg', 'png', 'gif'])

    def save_file(f):
        content = StringIO(f.read())
        try:
            mime = Image.open(content).format.lower()
            if mime not in allow_formats:
                raise IOError()
        except IOError:
            flask.abort(400)
        c = dict(content=bson.binary.Binary(content.getvalue()))
        db.files.save(c)
        return c['_id']
Now try uploading a text file: it should be rejected with a 400 error, while an image file uploads normally. Well, not quite normally: after the redirect, the server does not send the correct mimetype, so the preview is still a pile of garbled binary.
To solve this, the MIME type must be stored in the database together with the content, and the correct mimetype must be sent when the file is served.
    import bson.errors

    def save_file(f):
        content = StringIO(f.read())
        try:
            mime = Image.open(content).format.lower()
            if mime not in allow_formats:
                raise IOError()
        except IOError:
            flask.abort(400)
        c = dict(content=bson.binary.Binary(content.getvalue()), mime=mime)
        db.files.save(c)
        return c['_id']

    @app.route('/f/<fid>')
    def serve_file(fid):
        try:
            f = db.files.find_one(bson.objectid.ObjectId(fid))
            if f is None:
                raise bson.errors.InvalidId()
            return flask.Response(f['content'], mimetype='image/' + f['mime'])
        except bson.errors.InvalidId:
            flask.abort(404)
Of course, objects stored earlier have no mime attribute, so it is best to clear the old data first with db.files.drop() in the mongo shell.
2.6 Returning Not Modified based on the upload time
HTTP 304 Not Modified lets you make the most of the browser cache and save as much bandwidth as possible. Three things are required:
1) record the time at which the file was last uploaded
2) when the browser requests the file, put a timestamp string in the response header (Last-Modified)
3) when the browser requests the file again, try to read the timestamp from the request header (If-Modified-Since); if it matches the file's timestamp, answer with 304 directly
The code is:
    import datetime

    def save_file(f):
        content = StringIO(f.read())
        try:
            mime = Image.open(content).format.lower()
            if mime not in allow_formats:
                raise IOError()
        except IOError:
            flask.abort(400)
        c = dict(
            content=bson.binary.Binary(content.getvalue()),
            mime=mime,
            time=datetime.datetime.utcnow(),
        )
        db.files.save(c)
        return c['_id']

    @app.route('/f/<fid>')
    def serve_file(fid):
        try:
            f = db.files.find_one(bson.objectid.ObjectId(fid))
            if f is None:
                raise bson.errors.InvalidId()
            if flask.request.headers.get('If-Modified-Since') == f['time'].ctime():
                return flask.Response(status=304)
            resp = flask.Response(f['content'], mimetype='image/' + f['mime'])
            resp.headers['Last-Modified'] = f['time'].ctime()
            return resp
        except bson.errors.InvalidId:
            flask.abort(404)
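The code above compares ctime() strings, which works as long as the browser echoes the header back unchanged. HTTP itself defines a specific date format for Last-Modified and If-Modified-Since, so here is a sketch, using only the standard library, of one way to produce and compare such dates. The helper names http_date and not_modified are hypothetical, and the stored time is assumed to be a timezone-aware UTC datetime (the article's code stores a naive one from utcnow()).

```python
import datetime
from email.utils import format_datetime, parsedate_to_datetime

UTC = datetime.timezone.utc


def http_date(dt):
    """Format an aware UTC datetime as an HTTP date (second resolution)."""
    return format_datetime(dt.replace(microsecond=0), usegmt=True)


def not_modified(if_modified_since, stored_time):
    """True if the client's cached copy is at least as new as the file."""
    if not if_modified_since:
        return False
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return False  # unparseable header: just send the file again
    if client_time.tzinfo is None:
        client_time = client_time.replace(tzinfo=UTC)
    # HTTP dates carry no sub-second part, so drop it before comparing.
    return client_time >= stored_time.replace(microsecond=0)


uploaded = datetime.datetime(2015, 10, 21, 7, 28, 0, 123000, tzinfo=UTC)
header = http_date(uploaded)
print(header)
print(not_modified(header, uploaded))
print(not_modified('Tue, 20 Oct 2015 07:28:00 GMT', uploaded))
```

Truncating to whole seconds matters because the stored timestamp carries a sub-second part that an HTTP date cannot represent.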
Images already in the database then need a script that adds a timestamp to each of them.
Incidentally, a NoSQL database shows no particular advantage in this scenario; it is used almost exactly like a relational database.
2.7 Using SHA-1 to remove duplicates
Unlike cola in the refrigerator, in most cases you certainly do not want a big pile of identical images in the database. An image, together with its EXIF and other metadata, should be unique in the database. For this, a slightly stronger hash is a good way to detect duplicates.
The simplest way to achieve this is to create a unique index on a SHA-1 field, so that the database itself refuses to store the same thing twice.
Create the unique index on the MongoDB collection by running the following in the mongo console:
    db.files.ensureIndex({sha1: 1}, {unique: true})
If the database already contains more than one record, MongoDB reports an error. This seemingly harmless index operation complains that the database contains duplicate null values (in fact, the existing entries do not have this attribute at all). Unlike a typical RDB, MongoDB treats a null value and a missing attribute as the same value for indexing purposes, so these ghost attributes prevent the unique index from being created.
There are three solutions:
1) delete all the existing data (an irresponsible approach anywhere except on a test database!)
2) create a sparse index, which does not require the ghost attributes to be unique; multiple explicit null values, however, are still treated as duplicates (this works regardless of the existing data)
3) write a script that runs through the database once, reads every stored item, computes its SHA-1, and saves it back
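Option 3 can be sketched without a database. The example below uses a plain list of dictionaries as a stand-in for the collection (an assumption for illustration; a real script would read and write through pymongo), computes the SHA-1 of each record's content, and reports the records that would collide under a unique index. The function name backfill_sha1 is hypothetical.

```python
import hashlib


def backfill_sha1(records):
    """Add a 'sha1' field to every record; return (unique, duplicates)."""
    seen = set()
    unique, duplicates = [], []
    for record in records:
        digest = hashlib.sha1(record['content']).hexdigest()
        record['sha1'] = digest
        if digest in seen:
            duplicates.append(record)   # would violate the unique index
        else:
            seen.add(digest)
            unique.append(record)
    return unique, duplicates


records = [
    {'_id': 1, 'content': b'abc'},
    {'_id': 2, 'content': b'xyz'},
    {'_id': 3, 'content': b'abc'},  # same bytes as record 1
]
unique, duplicates = backfill_sha1(records)
print([r['_id'] for r in unique])
print([r['_id'] for r in duplicates])
print(records[0]['sha1'])
```

The duplicates found this way have to be deleted or merged before the unique index can be created.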
Which approach you pick is up to you. Assuming the problem has been solved and the index is in place, what remains is the Python code.
    import hashlib

    def save_file(f):
        content = StringIO(f.read())
        try:
            mime = Image.open(content).format.lower()
            if mime not in allow_formats:
                raise IOError()
        except IOError:
            flask.abort(400)
        sha1 = hashlib.sha1(content.getvalue()).hexdigest()
        c = dict(
            content=bson.binary.Binary(content.getvalue()),
            mime=mime,
            time=datetime.datetime.utcnow(),
            sha1=sha1,
        )
        try:
            db.files.save(c)
        except pymongo.errors.DuplicateKeyError:
            pass
        return c['_id']
Uploading now works. However, with the logic above, uploading a file that already exists returns a c['_id'] that does not exist in the database. To fix this, it is best to return sha1 instead, and to look files up by their SHA-1 rather than by ID when serving them.
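The idea of using the hash itself as the public identifier can be shown in a few lines. This is a minimal in-memory sketch, not the article's MongoDB code: the ContentStore class is hypothetical, and a dictionary plays the role of the collection with its unique index.

```python
import hashlib


class ContentStore(object):
    """Toy content-addressed store: the SHA-1 of the bytes is the key."""

    def __init__(self):
        self._by_sha1 = {}

    def put(self, content):
        """Store content once and return its SHA-1, whether new or not."""
        sha1 = hashlib.sha1(content).hexdigest()
        # setdefault keeps the first copy, much like inserting and then
        # ignoring a duplicate-key error.
        self._by_sha1.setdefault(sha1, content)
        return sha1

    def get(self, sha1):
        """Return the stored bytes, or None if nothing has that hash."""
        return self._by_sha1.get(sha1)

    def __len__(self):
        return len(self._by_sha1)


store = ContentStore()
first = store.put(b'same image bytes')
second = store.put(b'same image bytes')
print(first == second)
print(len(store))
```

Because the identifier is derived from the content, uploading the same bytes twice always yields an identifier that resolves to a stored record.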
The final result, which is also the complete source code for this article, is as follows:
    import hashlib
    import datetime
    import flask
    import pymongo
    import bson.binary
    import bson.objectid
    import bson.errors
    from cStringIO import StringIO
    from PIL import Image

    app = flask.Flask(__name__)
    app.debug = True
    db = pymongo.MongoClient('localhost', 27017).test
    allow_formats = set(['jpeg', 'png', 'gif'])

    def save_file(f):
        content = StringIO(f.read())
        try:
            mime = Image.open(content).format.lower()
            if mime not in allow_formats:
                raise IOError()
        except IOError:
            flask.abort(400)
        sha1 = hashlib.sha1(content.getvalue()).hexdigest()
        c = dict(
            content=bson.binary.Binary(content.getvalue()),
            mime=mime,
            time=datetime.datetime.utcnow(),
            sha1=sha1,
        )
        try:
            db.files.save(c)
        except pymongo.errors.DuplicateKeyError:
            pass
        return sha1

    @app.route('/f/<sha1>')
    def serve_file(sha1):
        try:
            f = db.files.find_one({'sha1': sha1})
            if f is None:
                raise bson.errors.InvalidId()
            if flask.request.headers.get('If-Modified-Since') == f['time'].ctime():
                return flask.Response(status=304)
            resp = flask.Response(f['content'], mimetype='image/' + f['mime'])
            resp.headers['Last-Modified'] = f['time'].ctime()
            return resp
        except bson.errors.InvalidId:
            flask.abort(404)

    @app.route('/upload', methods=['POST'])
    def upload():
        f = flask.request.files['uploaded_file']
        sha1 = save_file(f)
        return flask.redirect('/f/' + str(sha1))

    @app.route('/')
    def index():
        return '''
        <!doctype html>
3. References
Developing RESTful Web APIs with Python, Flask and MongoDB
http://www.slideshare.net/nicolaiarocci/developing-restful-web-apis-with-python-flask-and-mongodb
https://github.com/nicolaiarocci/eve