Fei-Fei Li: How We Teach Computers to Understand Pictures


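In early 2016, while I was still reading papers, I came across the term Artificial Intelligence for the first time. Back then I only thought the word was hard to spell and sounded pretty fancy. Cool. By 2017, artificial intelligence had become the media's buzzword of the year and had even pushed up a batch of AI concept stocks, such as iFlytek, whose P/E ratio stood at 365.29 as of 2017/12/29 15:00:00 Beijing time. During the holidays in April and May 2017, I listened to half a session of Andrew Ng's Machine Learning and then gave up halfway. Having missed two good chances to learn about a new technology, perhaps even a technological revolution, I decided to spend the first day of 2018 going through Fei-Fei Li's TED talk on Computer Vision. It has been a long time since I last practiced English dictation anyway.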

Let me show you something…
(Some pictures are shown, and a girl describes what is in each picture…)
The boy is …
Those are the …
That’s a big airplane …

This is a three-year-old child describing what she sees in a series of photos. She may still have a lot to learn about this world, but she’s already an expert at one very important task: to make sense of what she sees. Our society is more technologically advanced than ever. We send people to the moon, we make phones that talk to us or customise radio stations that can play only music we like. Yet our most advanced machines and computers still struggle at this task. So I’m here today to give you a progress report on the latest advances in our research in computer vision, one of the most frontier and potentially revolutionary technologies in computer science.

Yes, we have prototyped cars that can drive by themselves, but without smart vision, they cannot really tell the difference between a crumpled paper bag on the road, which can be run over, and a rock that size, which should be avoided. We have made fabulous megapixel cameras, but we have not delivered sight to the blind. Drones can fly over massive land, but don’t have enough vision technology to help us track the changes of the rainforests. Security cameras are everywhere, but they do not alert us when a child is drowning in a swimming pool. Photos and videos are becoming an integral part of global life. They’re being generated at a pace that’s far beyond what any human, or team of humans, could hope to view. And you and I are contributing to that at this TED. Yet our most advanced software is still struggling at understanding and managing this enormous content. So in other words, collectively as a society, we’re very much blind, because our smartest machines are still blind.

“Why is this so hard?” you may ask. Cameras can take pictures like this one by converting light into a two-dimensional array of numbers known as pixels, but these are just lifeless numbers. They do not carry meaning in themselves. Just as to hear is not the same as to listen, to take pictures is not the same as to see, and by seeing, we really mean understanding.
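A side note from the dictation: the "two-dimensional array of numbers" idea is easy to see in code. The snippet below is only a minimal sketch with a made-up 4x4 grayscale grid (the values and the `mean_brightness` helper are assumptions for illustration, not anything from the talk). The computer can do arithmetic on the pixels, yet nothing in the data says what the picture shows.

```python
# A hypothetical 4x4 grayscale "photo": 0 is black, 255 is white.
# The values are invented for illustration only.
image = [
    [0,   0,   0,   0],
    [0, 255, 255,   0],
    [0, 255, 255,   0],
    [0,   0,   0,   0],
]

height = len(image)     # number of rows
width = len(image[0])   # number of columns


def mean_brightness(img):
    """Average all pixel values: plain arithmetic, no understanding involved."""
    total = sum(sum(row) for row in img)
    return total / (len(img) * len(img[0]))


# The machine can report numbers about the picture, but not what it depicts.
print(f"{width}x{height} pixels, mean brightness {mean_brightness(image)}")
```

To a person the grid is "a white square on a black background"; to the program it is sixteen integers.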

In fact, it took Mother Nature 540 million years of hard work to do this task, and much of that effort went into developing the visual processing apparatus of our brains, not the eyes themselves. So vision begins with the eyes, but it truly takes place in the brain.

So for fifteen years now, starting from my Ph.D. at Caltech and then leading Stanford’s Vision Lab, I’ve been working with my mentors, collaborators and students to teach computers to see. Our research field is called computer vision and machine learning. It’s part of the general field of artificial intelligence.

So ultimately, we want to teach the machines to see just like we do: naming objects, identifying people, inferring 3D geometry of things, understanding relations, emotions, actions and intentions. You and I weave together entire stories of people, places and things the moment we lay our gaze on them. The first step towards this goal is to teach a computer to see objects, the building blocks of the visual world. In the simplest terms, imagine this teaching process as showing the computer some training images of a particular object, let’s say cats, and designing a model that learns from these training images. How hard can this be? (The answer will be shown tomorrow, haha.)
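While waiting for the answer, the "show training images, then design a model that learns from them" idea can be sketched in a few lines. This is only a toy under loud assumptions: the 2x2 "images" are invented numbers, the labels `cat` and `not_cat` are hypothetical, and the nearest-centroid rule is a stand-in chosen for brevity, not the method described in the talk.

```python
# Toy 2x2 grayscale "images", flattened to 4 pixel values each.
# All data is made up: "cat" examples are bright, "not_cat" examples are dark.
training_images = {
    "cat": [[200, 220, 210, 230], [190, 210, 205, 225]],
    "not_cat": [[20, 30, 25, 35], [40, 35, 30, 45]],
}


def train(examples_by_label):
    """Learn one average image (centroid) per label from the training images."""
    model = {}
    for label, examples in examples_by_label.items():
        n = len(examples)
        # zip(*examples) groups the same pixel position across all examples.
        model[label] = [sum(pixels) / n for pixels in zip(*examples)]
    return model


def predict(model, image):
    """Return the label whose centroid is closest in squared pixel distance."""
    def distance(centroid):
        return sum((p - c) ** 2 for p, c in zip(image, centroid))
    return min(model, key=lambda label: distance(model[label]))


model = train(training_images)
print(predict(model, [210, 215, 200, 220]))  # a bright, unseen image
print(predict(model, [25, 30, 40, 20]))      # a dark, unseen image
```

A rule this simple only works because the toy data separates cleanly by brightness. Real photos of cats vary in pose, lighting and background, so comparing raw pixels like this would not get far.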
