I am supportive of clear explanations of the building blocks, but I worry that repeatedly describing things as grade-school-math level gives the wrong impression about the actual learning curve for getting up to speed with CNNs. Yes, the building blocks are easy to understand, but actually understanding why a given network structure or optimization technique isn't working is a black art. And if you don't have a workstation with a $2k GPU or two, you're probably not going to have a good time.
I set up an automated script to launch an AWS g2 instance, train my neural net using tensorflow, copy the model to my personal computer, and spin down. It costs like $5-$10 to train/test most neural network models. My most expensive model cost around $100 and required a ton of time and resources - it took like 4 days or something.
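The workflow described above can be sketched as a small Python driver. This is only an outline of the shape of such a script - the AMI id, key file, hostname, and model path below are all placeholders, not real identifiers:

```python
def build_commands(instance_type="g2.2xlarge", ami="ami-PLACEHOLDER",
                   key="mykey.pem", host="EC2-HOST-PLACEHOLDER",
                   model="model.ckpt"):
    """Return the steps of a train-and-spin-down workflow as argv lists:
    launch a GPU instance, run training over ssh, copy the trained
    model back, and terminate the instance so billing stops."""
    return [
        ["aws", "ec2", "run-instances", "--image-id", ami,
         "--instance-type", instance_type, "--count", "1"],
        ["ssh", "-i", key, f"ubuntu@{host}", "python", "train.py"],
        ["scp", "-i", key, f"ubuntu@{host}:{model}", "."],
        ["aws", "ec2", "terminate-instances",
         "--instance-ids", "i-PLACEHOLDER"],
    ]

# In a real script you would run each step with subprocess.run(cmd),
# waiting for the instance to reach the "running" state before ssh-ing in.
for cmd in build_commands():
    print(" ".join(cmd))
```

The important part is the last step: terminating (not just stopping) the instance is what keeps the cost in the $5-$10 range.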
You really don't need a $2k workstation.
Of course, for personal use I do have a GTX 1080, because I like to game and play with tensorflow/caffe.
Spot instances are way cheaper. The only downside is you need to create an AMI every time before termination.
But also, AWS g2 has the NVIDIA GRID K520 with 4GB memory, so the performance isn't great.
I tried to get it released before, but was shut down by the "open source office", so I can't give you the exact script. However, h2o has a script that launches a cluster that's very similar.
Even if it may not matter in this case specifically, this is a terrible example. "sudo chmod 777" in public code is basically saying "I don't know what I'm doing, but go on, do the same thing yourself" :(
You don't need sudo, because it's your file. You just need "chmod 755", not "777". And you don't need chmod in the first place - just run "python script_demo.py".
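To make the 755-vs-777 distinction concrete, here is a small stdlib-only sketch (using a throwaway temp file) showing what those octal modes actually grant - each digit is owner/group/other, with read=4, write=2, execute=1:

```python
import os
import stat
import tempfile

# Create a throwaway file to experiment on.
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name

# 755 = rwxr-xr-x : only the owner may write.
os.chmod(path, 0o755)
mode = stat.S_IMODE(os.stat(path).st_mode)
print(oct(mode))                  # 0o755
print(bool(mode & stat.S_IWOTH))  # False: others cannot write

# 777 = rwxrwxrwx : anyone on the machine may overwrite the file.
os.chmod(path, 0o777)
mode = stat.S_IMODE(os.stat(path).st_mode)
print(bool(mode & stat.S_IWOTH))  # True: world-writable, the problem case

os.unlink(path)
```

And as noted above, the execute bit is irrelevant here anyway: `python script_demo.py` runs the file through the interpreter regardless of its mode.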
Very true, it doesn't need 777. It's also not my script; I just copied and pasted some code that fit what I was doing. Basically it follows the same format; the specific implementation varies.
Also, you are being rather pedantic here. If people can figure out how to take what I copied & pasted and turn it into a script, they probably know chmod 777 isn't great.
That said - and this is why I think this is ridiculous - you are assuming this matters. Going into the weeds here, to play along: the script is immediately run on a temporary, very recently launched EC2 instance, probably with the use of a .pem key, presumably even within an AWS security group that only allows your IP, and the instance is shut down following its execution.
I can't picture this being a security vulnerability at all. Calling it a terrible example is relative - I wrote this copied and pasted on a cell phone, trying to help someone. Honestly, I didn't even see the "sudo chmod 777"; I just pasted away.
As I said: "Even if it may not matter in this case". But what about the next time? (The poster obviously doesn't understand the code.) What about people who read this comment and follow the advice because it works? (Happens all the time with SO.) What about when your script becomes the standard deployment method for the company? (Happens everywhere.) What about people who don't know about file privileges either and blindly copy what they see here?
There are lots of things we do that you could say "who cares?" about: single-letter variables, comments, const-correctness, unnecessary N^2 algorithms. Usually it turns out that either you or someone you work with cares a few months or years afterwards. So just learn to do it correctly the first time - especially when the correct way takes less time than the "magic fix".
These things come about because people learn that the simple way to solve all permission problems is to just do everything as su (or root) and chmod everything to 777. The problem is that it does solve all problems (well, sweeps them under the rug) - I guess when the only tool you have is a hammer, everything looks like a nail.
Yes. On the other hand, there isn't a strong overlap between machine learning experts and security experts (I'm sure there will be in the coming years though, as security experts start using NNs to detect anomalous/dangerous behavior).
I know what you mean, but I disagree with "security experts". You don't need to be a security expert, just know the basics of how privileges work on your system. And if people don't, we just need to keep calling it out, because otherwise code like this ends up in production one day, where it actually matters.
Exactly. If you're an ML person, you're most likely working with Linux. And in that case, you need to be familiar with the permission types, because that's relevant even when installing various libraries, doing ssh, etc.
Basically, I couldn't get authorization to publish internal code. In this case, it was a combination of who was going to maintain it and ensuring nothing internal was leaked.
Question: I have a database of 1,000,000 vehicle pictures, organized by make and model. What would be the easiest way to play with this data, so that I can train a model to predict the make/model? I don't want to reinvent the wheel now that so many tutorials have been written and so much software is being released. What would be the easiest way to start?
This is the easiest way to set up a CNN and train it with your sample images (at least compared to Caffe, Tensorflow, and Theano). I say that because it's all GUI-based! Real convenient.
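Whatever tool you pick, most of them (Keras's `flow_from_directory`, Caffe's data-prep scripts, GUI trainers) expect the same on-disk layout: one subfolder per class, since your data is already organized by make and model that mapping is nearly free. A stdlib-only sketch of indexing such a tree - the folder names here are invented for illustration:

```python
import os

def index_dataset(root):
    """Walk a root folder whose subfolders are class names
    (e.g. 'ford_f150/', 'john_deere_6330/') and return a list of
    (image_path, label) pairs plus the sorted label list."""
    labels = sorted(
        d for d in os.listdir(root)
        if os.path.isdir(os.path.join(root, d))
    )
    samples = []
    for label in labels:
        class_dir = os.path.join(root, label)
        for name in sorted(os.listdir(class_dir)):
            # Keep only common image extensions.
            if name.lower().endswith((".jpg", ".jpeg", ".png")):
                samples.append((os.path.join(class_dir, name), label))
    return samples, labels
```

With a million images you would feed these paths to a batched loader rather than reading everything into memory at once.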
Good question. Detecting just one class (in this case, cats) will require negative examples. Finding good negatives is a somewhat challenging task because they should be pretty comprehensive, but if you create an account on ImageNet (http://image-net.org/), you can download thousands of images.
It's not cars, but mostly tractors. I have been running an online community for almost 10 years now and the images have been submitted and organized by the members.
> CNNs can be used to categorize other types of data too. The trick is, whatever data type you start with, to transform it to make it look like an image.
This is an interesting point, and I assume that 'make it look like an image' means the same thing as 'think of it as an image'. Can others here who work with CNNs regularly or professionally comment on whether the author's intuition is essentially correct (give or take some details, of course)?
It comes down to the characteristic architecture of convolutional nets - weight sharing - and the assumption this makes about the data. If by "image" one means something where you can expect any pattern (at some level in the hierarchy) to be equally likely to occur anywhere across an input dimension, then yes, this is true. Personally I would say that this is too narrow a definition of an image (too great an assumption), and, interestingly enough, perhaps too broad too. I am not a pro.
[Edit] Too broad in the sense that, intuitively, there is perhaps an implied assumption of continuity of the input function defining the image. Note that such assumptions can be made explicit with various so-called statistical priors incorporated into the network.
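A concrete version of "make it look like an image": audio is the classic case, where a 1-D waveform becomes a 2-D time-by-frequency array via a short-time Fourier transform, and a CNN then convolves over it like a picture. A minimal numpy sketch (the window length and hop size are arbitrary choices for illustration):

```python
import numpy as np

def naive_spectrogram(signal, win=64, hop=32):
    """Slice a 1-D signal into overlapping windows and take the FFT
    magnitude of each, yielding a 2-D 'image': time steps along one
    axis, frequency bins along the other."""
    frames = [
        signal[start:start + win]
        for start in range(0, len(signal) - win + 1, hop)
    ]
    # rfft of a real window of length `win` keeps win//2 + 1 frequency bins.
    return np.abs(np.fft.rfft(np.stack(frames), axis=1))

# A 440 Hz tone sampled at 8 kHz shows up as a bright band at one frequency.
t = np.arange(8000) / 8000.0
spec = naive_spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # 2-D, ready to be fed to a conv layer
```

The weight-sharing assumption discussed above translates here to: a pattern (say, a harmonic stack) is treated as equally meaningful wherever it occurs in time.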
Really great write-up. I've been trying to wrap my not-too-mathematically-talented head around convolutional filters, and this really helped in visualizing what is happening.
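For anyone else trying to visualize filters: a convolution is just a small grid of weights slid across the image, taking a dot product at each position. A numpy sketch with a hand-made Sobel-style vertical-edge kernel, showing how an edge "lights up" in the output:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution (strictly, cross-correlation, as in most
    deep learning libraries): slide the kernel over the image and take
    a dot product at every position."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Image: dark left half (0), bright right half (1) -> a vertical edge.
img = np.zeros((6, 6))
img[:, 3:] = 1.0

# Sobel-like vertical-edge detector: negative weights left, positive right.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])

response = conv2d_valid(img, kernel)
# The response is strongest in the output columns straddling the edge
# and exactly zero over the flat dark and bright regions.
```

In a trained CNN the kernels aren't hand-made like this; they are learned, but the sliding dot product is the same operation.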
Great post. I like how he didn't go into too much detail on the math of backprop etc. As a layperson, I find the conceptual understanding of ML more interesting.
If you google "Hinton machine learning" on YouTube, you will find Hinton's lectures. They are non-mathematical; he is a psychologist/math guy, and he is the inventor of almost all this stuff: backprop, dropout, etc.
You will find his lectures very entertaining and easy to understand. Being a psychologist whose desire is to make a computer operate like a human brain, he's more interested in how the brain actually works than in hacking ML code.
Hinton describes backprop, why he invented it, and exactly how it emulates the way the human brain works.
Hinton now works at Google; he is considered the modern-day 'godfather' of deep learning/ML.
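For the curious, the mechanics of backprop fit in a few lines: run the network forward, measure the error, then push the error gradient back through the chain rule and nudge each weight. A toy numpy sketch - a one-hidden-layer net learning XOR, with an arbitrary hidden size and learning rate:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])  # XOR truth table

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(10000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: chain rule, layer by layer (squared-error loss).
    d_out = (out - y) * out * (1 - out)   # gradient at output pre-activation
    d_h = (d_out @ W2.T) * h * (1 - h)    # gradient at hidden pre-activation
    # Gradient descent step.
    W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(axis=0)

print(np.round(out.ravel(), 2))
```

With enough iterations the outputs move toward [0, 1, 1, 0]; exact convergence depends on the random initialization, which is part of the "black art" mentioned upthread.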
>You will find his lectures very entertaining and easy to understand. Being a psychologist whose desire is to make a computer operate like a human brain, he's more interested in how the brain actually works than in hacking ML code.
>
>Hinton describes backprop, why he invented it, and exactly how it emulates the way the human brain works.
Of course, basically no actual neuroscientists or cognitive scientists think the brain really works via supervised backpropagation. So he has a bit of a holy war going on with the people who work on human learning proper rather than machine learning.
So it works just like I thought it would. Why are CNNs so hyped? Wasn't all this already known decades ago? Or is it just because we can now afford the computing power?
The basic CNN structure was in place, but as the saying goes, "the devil's in the details." Early CNNs were applied to problems such as handwritten character recognition, with rows of small grayscale image cells as inputs, and were much shallower, smaller models. Today's CNNs operate on full-resolution, multi-channel images and video, and can be orders of magnitude deeper and larger. For instance, ResNets have been trained with over 1200 layers on benchmark datasets. This would have been unthinkable even a couple of years ago. By way of comparison, even the state-of-the-art VGG network architecture of a couple of years ago originally had to be trained in stages to reach 16 and 19 layers for submission to ILSVRC 2014 (Xavier / MSRA initialization makes this unnecessary now). At the time, VGG and GoogLeNet (22 layers) were considered extraordinarily deep CNNs.
The underlying math was figured out a long time ago, but it's only been in recent years that we've had the computing power to test these out on lots of complicated, real-world classification problems, and had some incredible success.
I argued back in 2000 (a year after I got my computer engineering degree) that AI wouldn't take off until computing moved from single-threaded/single-core to multithreaded/multicore processing. The fact that we are only hearing about this stuff 15 years later makes me feel that assertion was largely right.
The biggest problem I see in AI is that the algorithms are generally fairly straightforward, but people haven't had the computing power to explore the problem space. We are seeing drastic improvement in things like video cards (routinely 1000+ cores) and data processing locality (map reduce). But processors have stagnated.
If we really want AI in any reasonable timescale, we need large arrays of general-purpose cores with a sane communication protocol that doesn't fixate on things like caching, we need a hybrid between Go and Erlang to do concurrent functional programming in a readable way with automagic scaling over a network, and we need all this yesterday. The fancy schmancy AI algorithms will become apparent when processing power is no longer the primary limitation, and at that point we can optimize them.
Decades ago I played around with neural nets but was frustrated because I either had to preprocess and normalize my inputs to the point where I didn't need a network anymore or I had to train a large network with so much data that it was not practical.
Having a cookbook approach with a catchy name and orders of magnitude more processing power have revived neural nets and now they are finally doing something useful.
Now everyone is jumping on the bandwagon so the field is progressing very quickly. Just because it's hyped doesn't mean it's not worth giving it a second look (although I'm still on the sidelines myself.)
> I suspect that BIG-OIL, and BIG NSA of today have stuff that is super good and advanced and most of what they leak to GIT HUB is just garbage
I don't think it works that way now. What I see is timely publishing of papers, code, and sometimes data. It's more advantageous to cooperate.
The bottleneck is not the algorithms, but expert knowledge of their fine-tuning and correct application. We have lots of algorithms already, and more are being published. They are not "garbage" if used properly.