I started with Machine Learning a while back and had a slightly hard time getting help from the local community. This was mostly because the Machine Learning community in general is well behind the state of the art in industry and research. This is true for almost all fields nowadays, but with Machine Learning the gap is more pronounced because of the recent fast-paced developments in the industry.
On the other hand, once you know what to study, things are much easier than in many other fields such as security. Here I outline the plan I followed to get to where I am (which isn’t too far ahead but still a little better than what most people know, IMHO).
So, here’s my guide for getting started with Machine Learning self-study.
- Start with Andrew Ng’s Coursera course, Machine Learning. That’s the advice almost everyone seems to give, and it’s great advice. The course is quite basic and eases you into the field with few prerequisites and not much depth. Be careful though: do not think that completing the course makes you an expert in Machine Learning. It skips quite a few areas and the skills needed to be above average. Because it does get you started with practical work, you are likely to think you’re done once you finish it.
- So, after you complete the course in its entirety, including the assignments, I suggest you start with Prof. Nando de Freitas’ undergrad course. This is a much more detailed course and gives you a very different view of ML than traditional outlines. Of course, you might have to brush up on your Probability, Calculus and Linear Algebra. You can’t really do anything without these three.
- For the above three, I suggest the following courses:
  - Probability: Probability for Life Sciences by UCLA’s Math Department. You can find videos for this easily.
  - Calculus: I strongly suggest you go with Virtual University Pakistan’s Calculus-I course by Dr. Faisal Shah Khan. It’s a great course but it’s in Urdu. If you don’t know Urdu, you will need to find another series. Please let me know in the comments about great resources for this.
  - Linear Algebra: Of course, this can only be done with Gilbert Strang’s Linear Algebra course from OCW.
- After that, you can start with the grad course and the second grad course by Prof. Nando de Freitas. Both have very detailed video lectures.
Of course, you also need to work with tools other than MATLAB. I strongly suggest the Python PyData stack. The full list would be:
- Python PyData full stack (plus go through their yearly videos as well)
That’s what I have so far. I might add more when I know more, inshaallah.
I’ve just started another Coursera course, this one about learning in general. The course is called Learning How to Learn: Powerful mental tools to help you master tough subjects. It’s a fairly easygoing course, as far as I can see. The assignments and quizzes are fairly straightforward for the most part, but the important bit is that the instructors share their own experiences of learning. I hope to get through this course; I am ambitious enough that I’ve even signed up for the paid “Signature Track” version.
One mental tool that I found really interesting is how to use the diffuse mode of thinking to get new ideas for difficult problems. It’s best explained in the videos through Edison’s example: he would sit in his chair and let his hand hang over the side while holding a few ball bearings in it. He would then relax and let his mind wander, drifting off towards sleep. The mind would shift to diffuse thinking and would eventually find some new avenue to explore to help solve the issue at hand. This usually happens when you’re about to fall asleep, and that is where the ball bearings come into play. They would fall to the floor with a bit of a racket, pulling him back from sleep so that he could grasp the fledgling ideas and put them on paper. Cool trick!
Here’s a mini howto on backing up files to a remote machine using rsync. It shows progress as it works and updates any remote files, while keeping files on the remote end that were deleted from your local folder.
rsync -v -r --update --progress -e ssh /media/nam/Documents/ firstname.lastname@example.org:/media/nam/backup/documents/
/media/nam/Documents/ is the local folder and
/media/nam/backup/documents/ is the backup folder on the machine with IP
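If you want to script the same backup, here is a minimal Python sketch that assembles the rsync command shown above as an argument list. The function names and the `user@backup-host` login are hypothetical placeholders, not part of the original howto; the `--dry-run` flag is an addition of mine that makes rsync report what it would transfer without copying anything.

```python
import shlex
import subprocess


def build_rsync_command(local_dir, remote, remote_dir, dry_run=False):
    """Assemble the rsync invocation from the howto as an argument list."""
    cmd = [
        "rsync",
        "-v",          # verbose output
        "-r",          # recurse into directories
        "--update",    # skip files that are newer on the receiving side
        "--progress",  # show transfer progress
        "-e", "ssh",   # use ssh as the remote shell
    ]
    if dry_run:
        # Assumption: added for safe testing; not in the original command.
        cmd.append("--dry-run")
    # No --delete flag, so files removed locally stay on the remote end.
    cmd += [local_dir, f"{remote}:{remote_dir}"]
    return cmd


def run_backup(cmd):
    """Run the assembled command and raise if rsync exits with an error."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "user@backup-host" is a hypothetical login; substitute your own.
    command = build_rsync_command(
        "/media/nam/Documents/",
        "user@backup-host",
        "/media/nam/backup/documents/",
        dry_run=True,
    )
    print(shlex.join(command))
```

The script only prints the command; call `run_backup(command)` once the printed line looks right.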
I am a teacher and researcher working in the field of computer science. Find out more about me on the About page or see my research output on the Publications page.
So, AdMob was acquired by Google a while ago, and it was recently announced that AdMob publisher reports would no longer be available through the old APIs. Instead, they now have to be retrieved through the AdSense API, which is based on OAuth 2.0 and thus a real pain for those just getting started.
Turns out, the process is quite straightforward but extremely poorly documented. You can go through the AdSense reporting docs, the Google API library and the OAuth 2.0 specs, but you would soon be lost. After spending a couple of days decoding the requirements, I worked out a bare-bones approach to accessing the stats. And here is how.
Well, that was short-lived. I moved away from WordPress — only to come back after around 6 months and one post. Seems like Octopress is too much of a hassle for someone as unstable as me. Maybe another time when I’m more focused.
We’re going to use the Hadoop tarball we compiled earlier to run a pseudo-cluster. That means we will run a one-node cluster on a single machine. If you haven’t already read the tutorial on building the tarball, please head over and do that first.
Getting started with Hadoop 2.2.0: Building
Start up your (virtual) machine and log in as the user ‘hadoop’. First, we’re going to set up the essentials required to run Hadoop. By the way, if you are running a VM, I suggest you kill the machine used for building Hadoop and restart from a fresh instance of Ubuntu to avoid any compatibility issues later. For reference, the OS we are using is 64-bit Ubuntu 12.04.3 LTS.
I wrote a tutorial on getting started with Hadoop back in the day (around mid 2010). Turns out that the distro has moved on quite a bit with the latest versions, so that tutorial is unlikely to work. I tried setting up Hadoop on a single-node “cluster” using Michael Noll’s excellent tutorial but that too was out of date. And of course, the official documentation on Hadoop’s site is lame.
Having struggled for two days, I finally got the steps smoothed out, and this is an effort to document them for future use.
I return with a minor post after another long break. This time, it’s about writing better English. Now, this isn’t humblebragging but I cannot be considered excellent at English writing — at least not by native standards. English is not my first language and I haven’t had much formal English education. I have, however, read a lot. Even if my English is not good, I can still point out some tips shared by experts.
Here’s the first one of those shared by Amanda Patterson on Writers Write. It’s a list of 45 words you can use to put emphasis on words without using the word “very”. I found it refreshingly helpful.
Bear in mind though that you cannot just go ahead and use a word without looking up its usage examples. Some words might have negative connotations even though the dictionary meanings look positive. For example, if you use the word ‘adequate’ to describe someone’s work, they might be offended even though the dictionary meaning is that of acceptable quality.
p.s. After writing this, I searched for the word “very” and found two instances where I had used the word myself. I replaced it with better alternatives.
So you’ve started working with Django and you love the admin interface that you get for free with your models. You deploy half of your app with the admin interface and are about to release when you figure out that anyone who can modify a model can do anything with it. There is no concept of “ownership” of records!
Let me give you an example. Let’s say we’re creating a little MIS for the computer science department where each faculty member can put in their courses and record the course execution (what was done per lecture). That would be a nice application. (In fact, it’s available open source on GitHub and that is what this tutorial is referring to.) However, the issue is that all instructors can access all the course records and there is no way of ensuring that an instructor can modify only the courses that s/he taught. This isn’t easily possible because the admin doesn’t have “row-level permissions”.
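To make the missing “ownership” rule concrete, here is a framework-free Python sketch of what row-level permissions would mean for the course example. This is not the Django API and not the code from the tutorial; the `User` and `Course` classes and both function names are hypothetical. In a Django project, logic like this would typically go into overrides of `ModelAdmin.get_queryset` and `ModelAdmin.has_change_permission`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    username: str
    is_superuser: bool = False


@dataclass(frozen=True)
class Course:
    title: str
    instructor: str  # username of the instructor who owns this record


def visible_courses(user, courses):
    """Return only the records this user should see in a listing."""
    if user.is_superuser:
        return list(courses)
    return [c for c in courses if c.instructor == user.username]


def can_modify(user, course):
    """Per-record check: superusers edit anything, others only their own."""
    return user.is_superuser or course.instructor == user.username


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    courses = [Course("Algorithms", "alice"), Course("Databases", "bob")]
    alice = User("alice")
    print([c.title for c in visible_courses(alice, courses)])
    print(can_modify(alice, courses[1]))
```

The point of the sketch is that the check depends on the individual record, not just on the model, which is exactly what the stock admin permissions do not express.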