Hacker News

Just spent a few minutes playing with the Watson Custom Model for vision flow, and to say I am totally disappointed would be an understatement. A few things I noticed:

1. You first need to register an account, and to my surprise there is no command line tool or REST API; the entire interface is written in HTML. Hmmm, are they expecting me to specify the network structure by pressing buttons?

2. Okay, next: after choosing the visual model, it leads you directly to a web page with a bunch of widgets where you can add classes and negatives. To a seasoned ML engineer, this whole interface is useless. The classification has to be done at the full-image level; there is no way to define the layers, the loss function, or any knobs to play around with in the network. To an amateur, this is also very confusing. What are they expecting us to drag into the negatives? If it were a logistic classifier I could understand, but for classifying whole images, what exactly do you expect us to put there?

3. By the way, to upload images they expect .zip format, and this is where I stopped. Do they seriously think I will now export this so-called "model" to CoreML and load it into my Xcode project?
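For what it's worth, the .zip requirement just means each class has to be bundled as an archive of example images before training. A minimal sketch of that packaging step (the function name and directory layout are my own, not anything from the Watson docs):

```python
import zipfile
from pathlib import Path

def package_class(image_dir: str, out_zip: str) -> int:
    """Bundle a directory of example images into a flat .zip archive,
    the upload format the training UI expects for one class.
    Returns the number of images packaged."""
    paths = sorted(Path(image_dir).glob("*.jpg"))
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in paths:
            zf.write(p, arcname=p.name)  # drop directory prefixes in the archive
    return len(paths)
```

So "positives.zip" is one class, "negatives.zip" is the counter-examples, and that's the entire data pipeline the UI exposes.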

If they had come up with this 5 years ago I might have played with it a little longer, but don't the IBM engineers keep up with what's going on at GOOGL, FB, or AMZN? I can't possibly imagine anyone using this to develop iPhone apps for image recognition, even if it's an offline flow.



Of course you can't imagine anyone using it because (a) you are not the target audience and (b) you are being deliberately contemptuous about the product because it was built by IBM.

If you simply re-read your own points from an objective standpoint, it should be apparent that this is geared towards individuals who have minimal or no machine learning (much less deep learning) experience, but who nevertheless feel they need features like custom image recognition in their application. Rather than spending time and money hiring a 'seasoned ML engineer' such as yourself, they can try this and see if it works well enough for their purposes. Everything from the HTML interface, the dearth of model customization, the absence of parameter tuning, etc. points to this use case. Yes, it will be tedious, time-consuming, and perhaps a bit unintuitive at first, but it will be nowhere near as difficult for them as building an equivalent data pipeline, neural network, and evaluation setup on specialized hardware using Tensorflow. From that perspective, this could be a great product for application developers.

Finally, there are tons of REST APIs that enumerate all the functionality found here. They are all part of the Watson Cloud catalog. This includes loading data, training, and deploying models. Moreover, is it really necessary to insult IBM engineers by insinuating that they haven't kept up with the broader paradigm shifts in the field? They build what they are told to build by management (just like at the Big 4).


Perhaps I'm not understanding the intent of this collaboration between Apple and IBM, but I would like to think that anyone who can write/publish iOS apps should have the aptitude to spend a couple of hours understanding the basics of deep learning. Would you honestly ship an app that contains an image-classification model trained using this flow? Please enlighten me. Or am I the only person who did due diligence on their product? Have you tried other web-interface versions of online model trainers, like SageMaker or Rekognition? Do you work for IBM?


I graduated with honours, did 3D graphics programming in the past, systems programming is one of my favourite areas, and I've worked at a few internationally well-known names.

Yet I can't get my head around neural networks and related concepts.

Just because it is easy for you, don't assume the same for everyone else.


REST API here: https://www.ibm.com/watson/developercloud/visual-recognition...

"no way to define the layers, the loss function, or any knobs to play around with the network" -- This is the point of the service: to enable the (vast group of) users who want to custom-classify images of X/Y/Z without having to understand the difference between momentum and learning rate, or hire people who do. If you do want full control of the model, you should look at Deep Learning as a Service - https://www.ibm.com/cloud/deep-learning
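To the GP's point about there being no REST API: classification is in fact callable over plain HTTP. A minimal sketch of building the v3 classify request URL -- note the host, path, parameter names, and version date here are my recollection of the linked docs, not verified, so check the current API reference before relying on them:

```python
from urllib.parse import urlencode

# Assumed endpoint for the Watson Visual Recognition v3 classify call;
# confirm against the current docs before use.
BASE = "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classify"

def classify_url(api_key: str, image_url: str, version: str = "2016-05-20") -> str:
    """Build the GET request URL for classifying a remote image."""
    query = urlencode({"api_key": api_key, "url": image_url, "version": version})
    return f"{BASE}?{query}"
```

From there it's a single GET request (curl works fine), and the response is JSON with per-class confidence scores.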


Command line tool here: https://www.ibm.com/cloud/cli


I think there is definitely space for improvement on points 2 and 3, but then you would need to collect your own data. Are you a developer? We should talk if you're considering CoreML, as we (Polarr) have ready-to-use, battle-tested CoreML models for various computer vision purposes (email in my profile).


I have used both CoreML and MPSCNN in the past; the pain point is typically how to translate a cloud-trained model to run directly on the phone. Caffe2 to CoreML is good so far, but the issue is that CoreML is a black box and crashes are often not decipherable. I'm looking more for a universal flow that can port models trained in TensorFlow or PyTorch, along with some way to debug intermediate results. Or better yet, existing mobile models that can be used off the shelf would be best. Just reached out!



