The Stanford online Machine Learning class
In case you haven't been following it, Stanford's computer science department began a grand experiment in online learning early this month: free, upper-division college courses, given online and open to the whole world. Three classes are offered: Artificial Intelligence, Machine Learning and Introduction to Databases.
They've sparked an incredible response: exact numbers don't seem to be available, but rumor has it that AI had about 130,000 enrollees, while ML had about 70,000. (Nobody seems to have published numbers for DB.) Update, a day later: @seemsArtless tweets that ML currently has 87,000 registered users.
Why so much interest? Surely there are lots of places to get free information (like Wikipedia) and even course lectures (like MIT). And there are plenty of places to take classes at relatively low cost, like local junior colleges or ed2go.
What's different about the Stanford classes is that they cover advanced material, in far more depth than you'd find at a junior college or typical online site. They offer graded homework so you can see how you're doing, and there are other students taking the class at the same time, so if you get stuck, there are all sorts of discussion groups you can turn to. It's one thing to read a textbook or watch a video by yourself; I find a class much more helpful, and judging by the response to the Stanford classes, I'm not alone in that.
I agonized over whether to take AI or Machine Learning. They both sounded so interesting! Since I couldn't decide, I initially signed up for both, figuring I'd drop one if the load was too great. By the end of the second week, I'd settled on Machine Learning. I was starting to dread the AI class's flash quizzes, which didn't always work right: sometimes the question wouldn't display at all, yet you still couldn't proceed until you'd answered it correctly. I was also getting frustrated with the lectures, which clearly were meant as a jumping-off point for students to go do their own outside reading.
On the other hand, I was really enjoying the Machine Learning lectures, and looking forward to each new one. And the real kicker: Machine Learning includes programming assignments, so students can implement the algorithms Professor Ng talks about in the lectures.
What's great about Machine Learning
Andrew Ng's video lectures are wonderfully clear, well paced and full of interesting content.
He uses a lot of graphs to help students visualize what's going on geometrically, rather than just relying on the equations. (Better yet, in the programming exercises he shows us how to create those graphs for ourselves.)
And he's great about flagging certain portions as possibly review ("you can skip this lecture if you already know linear algebra") or advanced ("this is some extra background for people who know calculus, but you can skip it and still do fine in the course").
The technology is simpler than that used in the AI course. If you have a slow net connection or travel a lot, you can download the lectures as mp4 files and watch them offline. You can download lecture slides as a PDF or PPT. Review questions (graded) are handled with simple HTML forms. All very simple, well-tested technology, and it works great. I've had no problems accessing the servers, submitting homework or anything else -- very impressive!
But the heart of the course is the programming exercises. ML is taught in GNU Octave, a language and environment for numerical computing and matrix operations. Students aren't absolutely required to use Octave, but it's highly recommended: Professor Ng says he's found that students learn much faster that way. Sounds good to me, and Octave looks like a useful skill, well worth acquiring. I'm having fun learning it.
The programming exercises come with a lot of scaffolding code plus a few files marked "Your code goes here". The actual amount of coding isn't large. But I'm finding that it does the job: it forces me to make sure I understand the matrix operations discussed in the lectures. And at the end, you come out with something that's actually useful! From the first few weeks, I have linear and logistic regression code that I could use to analyze and visualize all sorts of datasets. Now, at the end of week 4, we're halfway through writing a neural network to recognize handwritten numerals from image data. How cool is that?
Suggestions for improvement
The class is a huge success. Who would have thought that you could teach something this advanced on such a huge scale, so effectively?
I have only a couple of small suggestions -- ways the class could be even better next time.
- An errata page. In week 3, there was an error in the lecture and notes, a - instead of a +, that made one part of programming exercise 2 quite a bit trickier than it would otherwise have been. If I hadn't noticed that the slides used + in some places and - in others, I might never have gotten that part of the assignment working. Lots of other people ran into the same thing, and there were discussions in the Q&A forum ... but you wouldn't find them without coming up with clever search terms.
- The Q&A forum would be so much more useful if it were organized by topic, and/or by week. There are some great discussions there, but the only way of getting to them is by searching for the right terms. There's no way to browse discussions, see how people are doing on assignment 3, or look for errata and similar warnings. Better organization would help make the class more of a community, more like a real in-person class.
Hope for future expansion
I mentioned my suggestions because I fervently hope there is a "next time". These classes are a great service, and I hope the huge response isn't putting too much burden on the instructors.
"Common wisdom" among providers of online classes seems to be that there's no demand outside of enrolled university students for hard courses, courses with prerequisites, and especially courses that involve (shudder) math. Just look at the offerings from any online courseware or adult ed program -- they're long on art appreciation and "Introduction to MS Word", short on physics and econometrics. Even the for-pay online degree mills concentrate on humanities and business, not technical subjects.
Stanford's experiment has proven that "common wisdom" is wrong -- that tens of thousands of students will jump at the chance to take highly technical, mathematical courses. I'd love to see the model expanded to other subjects, such as statistics, economics, physics, geology and climate science.
And, yes, there is money to be made here. If this many people will take a free class, wouldn't quite a few of them be willing to pay? Most couldn't afford the $1000 that UC Extension classes cost -- but how about $100, comparable to other online education classes? Would people pay more if you offered college credit?
Online education providers, take note! There's a large, underserved market for scientific and technical classes out here in the long tail.
[ 17:31 Nov 05, 2011 | More education ]